PURPOSE
To design, develop, implement, and maintain scalable and robust data integration interfaces and data models required by Analysts, Data Product Owners and Data Scientists
KEY WORK OUTPUT AND ACCOUNTABILITIES
Drive automation in data integration and management:
- Track data consumption patterns, preparation and integration tasks to identify the most common, repeatable tasks
- Prioritize opportunities for automation to minimize manual and error-prone processes
- Improve productivity across the data-science/analytics team
- Promote reuse of content and data through a centralized portal, catalog or other system
- May be required to monitor schema changes, perform intelligent sampling and caching, and learn and use AI-enabled metadata management techniques
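The schema-monitoring duty above could look something like the following minimal sketch: compare two schema snapshots (column name to type) and report what was added, removed or retyped. All table, column and type names here are illustrative assumptions, not from any real system.

```python
# Minimal sketch of schema-change monitoring: diff two schema
# snapshots (column name -> type) and report the differences.
# Column names and types below are illustrative only.

def diff_schema(old: dict, new: dict) -> dict:
    """Return added, removed and retyped columns between two snapshots."""
    added = {c: t for c, t in new.items() if c not in old}
    removed = {c: t for c, t in old.items() if c not in new}
    retyped = {c: (old[c], new[c])
               for c in old.keys() & new.keys() if old[c] != new[c]}
    return {"added": added, "removed": removed, "retyped": retyped}

old = {"customer_id": "INT", "email": "VARCHAR", "signup_date": "DATE"}
new = {"customer_id": "BIGINT", "email": "VARCHAR", "region": "VARCHAR"}
changes = diff_schema(old, new)
```

In practice the snapshots would be read from an information-schema query or a metadata catalogue rather than hard-coded dicts.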
Build, manage and maintain data pipelines and architecture to provide a foundation for all analytics projects:
- Match appropriate data sources to identified analytics use cases
- Integrate data from multiple sources into a single source or system, and contribute to educational programs that increase data accessibility for the analytics team and its users
- Ensure data pipelines comply with applicable regulations and organisational governance standards
- Maintain data pipelines by resolving technical issues, including those related to team access
- Optimize data quality by flagging sources for review, filling gaps, developing proxy variables, etc., where appropriate and clarifying limitations with data science teams
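The data-quality duties above (flagging gaps and developing proxy variables) could be sketched as follows: scan records for missing values, flag them, and fill the gaps with a simple proxy such as the column median. The field name and the choice of median as a proxy are illustrative assumptions, not a prescribed method.

```python
# Minimal sketch of gap-filling for data quality: flag records with a
# missing field and fill gaps with a simple proxy (the column median).
# The field name "amount" is an illustrative assumption.
from statistics import median

def fill_gaps(rows, field):
    """Replace None in `field` with the median of observed values,
    returning the filled rows plus the indices that were flagged."""
    observed = [r[field] for r in rows if r[field] is not None]
    proxy = median(observed)
    flagged = [i for i, r in enumerate(rows) if r[field] is None]
    filled = [dict(r, **{field: proxy if r[field] is None else r[field]})
              for r in rows]
    return filled, flagged

rows = [{"amount": 10.0}, {"amount": None}, {"amount": 30.0}]
filled, flagged = fill_gaps(rows, "amount")
```

As the posting notes, any such imputation should be flagged for review and its limitations clarified with the data science teams.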
Communicate cross-functionally and with data science teams and marketing analysts:
- Collaborate with IT and other departments to gain access to enterprise-wide data systems
- Work with IT leaders, platform specialists and others to identify unknown sources of data or reconcile important differences in datasets
- Work closely with data scientists and marketing analysts to define data requirements for analytics projects
- Continuously support efforts to upskill and educate data scientists, marketing analysts and others within the marketing organization
Participate in efforts to improve data governance and ensure compliance:
- Recommend methods to continuously optimise data collection processes as well as tagging and the use of analytics tools
- Coordinate across regions/provinces/municipalities, where applicable, ensuring standardised data models and analytical methodologies
Assemble large, complex data sets that meet functional / non-functional business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation and loading of data from a wide variety of data sources using Python and GCP/Azure ‘Big Data’ technologies.
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
- Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
- Keep our data separated and secure across national boundaries through multiple data centres and GCP/Azure regions.
- Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data systems.
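The extraction, transformation and loading work described above can be illustrated with a minimal standard-library sketch: parse a raw CSV feed, cast and validate rows, and load them into a database. The source format, table name and column names are illustrative assumptions, not a real schema.

```python
# A minimal extract-transform-load sketch using only the standard
# library. Table and column names are illustrative assumptions.
import csv
import io
import sqlite3

RAW = """customer_id,country,spend
1,ZA,100.50
2,DE,80.00
3,ZA,19.50
"""

def extract(text):
    # Parse the raw CSV feed into a list of dicts.
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    # Cast types and keep only well-formed rows.
    out = []
    for r in rows:
        try:
            out.append((int(r["customer_id"]), r["country"], float(r["spend"])))
        except (KeyError, ValueError):
            continue  # a real pipeline would flag or route bad rows
    return out

def load(rows, conn):
    # Load into the target table and return a simple sanity metric.
    conn.execute("CREATE TABLE IF NOT EXISTS spend "
                 "(customer_id INTEGER, country TEXT, spend REAL)")
    conn.executemany("INSERT INTO spend VALUES (?, ?, ?)", rows)
    return conn.execute("SELECT SUM(spend) FROM spend").fetchone()[0]

conn = sqlite3.connect(":memory:")
total = load(transform(extract(RAW)), conn)
```

In production the same shape would sit behind an orchestrator, with the in-memory SQLite swapped for a cloud warehouse on GCP or Azure.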
Continuously learn and apply the latest fit-for-purpose open-source and proprietary tools and technologies to achieve results, including some or all of the following:
Cloud
- Microsoft Azure (must)
- AWS
- Google Cloud
Database and Data Management
- Microsoft SQL Server
- MySQL
- PostgreSQL
- MongoDB
Languages
- Python (must)
- R (must)
- SQL (must)
- Java
- C/C++
KNOWLEDGE REQUIRED TO DO THE JOB:
- ETL
- Advanced working knowledge of Python and experience with relational databases, including query authoring (SQL), as well as working familiarity with a variety of databases.
- Experience building and optimizing data pipelines, architectures and data sets.
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Strong analytic skills related to working with unstructured datasets.
- Build processes supporting data transformation, data structures, metadata, dependency and workload management.
- A successful history of manipulating, processing and extracting value from large, disconnected datasets.
- Strong project management and organizational skills.
- Experience supporting and working with cross-functional teams in a dynamic environment.
SKILLS / ABILITIES REQUIRED TO DO THE JOB:
- General data management and analytics skills, including:
- Present results in consumable formats, including dashboards, reports and data visualizations
- Conduct quantitative research/analysis, generate insights, and perform data modelling and data engineering, often through specific languages and/or statistical methodologies (e.g., Python, SQL)
- Edit databases, mine data and perform queries (e.g., Hadoop, Hive, SQL, Teradata), and work with data warehouses and other relational database storage environments
- Apply advanced analytics and data science concepts and/or methodologies such as predictive analytics, data modeling, forecasting and machine learning where appropriate
- Database management and/or platform knowledge, including:
- Operational data automation systems (e.g., ERP, OT and other data source platforms)
- Common spreadsheet applications (e.g., Excel), including pivot tables and charting
- Common BI tools and/or data analytics platforms (e.g., Excel, Power BI, Google Analytics)
- Ability to execute against use cases for data science, including:
- Digital channel and platform optimisation and measurement
- Support the measurement and analysis of cross-functional (internal and municipal) areas for strategic decision-making (KPIs, metrics, etc.)
- Cleaning and integrating multiple data sources to support analytics
PERSONAL ATTRIBUTES REQUIRED FOR THIS JOB:
- The ability to solve problems.
- The ability to approach a problem from different angles, to see if solutions can be found in different ways.
- The ability to work in an ever changing, unstructured environment.
- The ability to work as part of a team, with vastly differing skill sets and opinions.
- The ability to contribute ideas to the group.
- The ability to mentor and provide guidance for other team members.
- The ability to communicate across teams, stakeholders and functional areas.
- A systems approach to thinking, as opposed to a siloed approach. The candidate needs to understand how their work affects the greater system.
- The ability to work without supervision and take accountability for the work they deliver.
- The ability to liaise with a client, sifting through the fluff and extracting the actual requirements.
MINIMUM REQUIREMENTS:
- Bachelor’s or Honours degree in Computer Science or Engineering, or equivalent experience
- 4+ years of software development experience and data management disciplines, including data integration, modelling, optimisation and data quality, and/or other areas directly relevant to data engineering responsibilities and tasks
- Project Management (classic and agile) experience
- Proficient in Java, C/C++, or Python
Please note that should we not contact you within two weeks, you should consider your application unsuccessful.
Desired Skills:
- Data engineering
Desired Work Experience:
- 2 to 5 years
Desired Qualification Level:
- Degree