PURPOSE
To design, develop, implement, and maintain scalable and robust data integration interfaces and data models required by Analysts, Data Product Owners and Data Scientists
KEY WORK OUTPUT AND ACCOUNTABILITIES
Drive automation in data integration and management:
- Track data consumption patterns, preparation and integration tasks to identify the most common, repeatable tasks
- Prioritize opportunities for automation to minimize manual and error-prone processes
- Improve productivity across the data-science/analytics team
- Promote reuse of content and data through a centralized portal, catalog or other system
- May be required to monitor schema changes, perform intelligent sampling and caching, and learn and use AI-enabled metadata management techniques
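The schema-monitoring duty above could look something like the following minimal sketch: compare two schema snapshots (column name to type) and report what was added, removed or retyped. All table, column and type names here are illustrative assumptions, not from any real system.

```python
# Minimal sketch of schema-change monitoring: diff two schema
# snapshots (column name -> type) and report the differences.
# Column names and types below are illustrative only.

def diff_schema(old: dict, new: dict) -> dict:
    """Return added, removed and retyped columns between two snapshots."""
    added = {c: t for c, t in new.items() if c not in old}
    removed = {c: t for c, t in old.items() if c not in new}
    retyped = {c: (old[c], new[c])
               for c in old.keys() & new.keys() if old[c] != new[c]}
    return {"added": added, "removed": removed, "retyped": retyped}

old = {"customer_id": "INT", "email": "VARCHAR", "signup_date": "DATE"}
new = {"customer_id": "BIGINT", "email": "VARCHAR", "region": "VARCHAR"}
changes = diff_schema(old, new)
```

In practice the snapshots would be read from an information-schema query or a metadata catalogue rather than hard-coded dicts.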
Build, manage and maintain data pipelines and architecture to provide a foundation for all analytics projects:
- Match appropriate data sources to identified analytics use cases
- Integrate data from multiple sources into a single source or system, and contribute to educational programs that increase data accessibility for the analytics team and its users
- Ensure data pipelines comply with applicable regulations and organisational governance standards
- Maintain data pipelines by resolving technical issues, including those related to team access
- Optimize data quality by flagging sources for review, filling gaps, developing proxy variables, etc., where appropriate and clarifying limitations with data science teams
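The data-quality duties above (flagging gaps and developing proxy variables) could be sketched as follows: scan records for missing values, flag them, and fill the gaps with a simple proxy such as the column median. The field name and the choice of median as a proxy are illustrative assumptions, not a prescribed method.

```python
# Minimal sketch of gap-filling for data quality: flag records with a
# missing field and fill gaps with a simple proxy (the column median).
# The field name "amount" is an illustrative assumption.
from statistics import median

def fill_gaps(rows, field):
    """Replace None in `field` with the median of observed values,
    returning the filled rows plus the indices that were flagged."""
    observed = [r[field] for r in rows if r[field] is not None]
    proxy = median(observed)
    flagged = [i for i, r in enumerate(rows) if r[field] is None]
    filled = [dict(r, **{field: proxy if r[field] is None else r[field]})
              for r in rows]
    return filled, flagged

rows = [{"amount": 10.0}, {"amount": None}, {"amount": 30.0}]
filled, flagged = fill_gaps(rows, "amount")
```

As the posting notes, any such imputation should be flagged for review and its limitations clarified with the data science teams.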
Communicate cross-functionally and with data science teams and marketing analysts:
- Collaborate with IT and other departments to gain access to enterprise-wide data systems
- Work with IT leaders, platform specialists and others to identify unknown sources of data or reconcile important differences in datasets
- Work closely with data scientists and marketing analysts to define data requirements for analytics projects
- Continuously support efforts to upskill and educate data scientists, marketing analysts and others within the marketing organization
Participate in efforts to improve data governance and ensure compliance:
- Recommend methods to continuously optimise data collection processes as well as tagging and the use of analytics tools
- Coordinate across regions/provinces/municipalities, where applicable, ensuring standardised data models and analytical methodologies
Assemble large, complex data sets that meet functional / non-functional business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation and loading of data from a wide variety of data sources using Python and GCP/Azure ‘Big Data’ technologies.
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
- Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
- Keep our data separated and secure across national boundaries through multiple data centres and GCP/Azure regions.
- Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data systems.
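The extraction, transformation and loading work described above can be illustrated with a minimal standard-library sketch: parse a raw CSV feed, cast and validate rows, and load them into a database. The source format, table name and column names are illustrative assumptions, not a real schema.

```python
# A minimal extract-transform-load sketch using only the standard
# library. Table and column names are illustrative assumptions.
import csv
import io
import sqlite3

RAW = """customer_id,country,spend
1,ZA,100.50
2,DE,80.00
3,ZA,19.50
"""

def extract(text):
    # Parse the raw CSV feed into a list of dicts.
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    # Cast types and keep only well-formed rows.
    out = []
    for r in rows:
        try:
            out.append((int(r["customer_id"]), r["country"], float(r["spend"])))
        except (KeyError, ValueError):
            continue  # a real pipeline would flag or route bad rows
    return out

def load(rows, conn):
    # Load into the target table and return a simple sanity metric.
    conn.execute("CREATE TABLE IF NOT EXISTS spend "
                 "(customer_id INTEGER, country TEXT, spend REAL)")
    conn.executemany("INSERT INTO spend VALUES (?, ?, ?)", rows)
    return conn.execute("SELECT SUM(spend) FROM spend").fetchone()[0]

conn = sqlite3.connect(":memory:")
total = load(transform(extract(RAW)), conn)
```

In production the same shape would sit behind an orchestrator, with the in-memory SQLite swapped for a cloud warehouse on GCP or Azure.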
Continuously learn and apply the latest fit-for-purpose open-source and proprietary tools and technologies to achieve results, including some or all of the following:
Cloud
- Microsoft Azure (must)
- AWS
- Google Cloud
Database and Data Management
- Microsoft SQL Server
- MySQL
- PostgreSQL
- MongoDB
Languages
- Python (must)
- R (must)
- SQL (must)
- Java
- C/C++
KNOWLEDGE REQUIRED TO DO THE JOB:
- ETL
- Advanced working knowledge of Python and experience with relational databases, including query authoring (SQL), as well as working familiarity with a variety of databases.
- Experience building and optimizing data pipelines, architectures and data sets.
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Strong analytic skills related to working with unstructured datasets.
- Build processes supporting data transformation, data structures, metadata, dependency and workload management.
- A successful history of manipulating, processing and extracting value from large, disconnected datasets.
- Strong project management and organizational skills.
- Experience supporting and working with cross-functional teams in a dynamic environment.
SKILLS / ABILITIES REQUIRED TO DO THE JOB:
- General data management and analytics skills, including:
- Present results in consumable formats, including dashboards, reports and data visualizations
- Conduct quantitative research/analysis, generate insights, and perform data modelling and data engineering, often through specific languages and/or statistical methodologies (e.g., Python, SQL)
- Edit databases, mine data and perform queries (e.g., Hadoop, Hive, SQL, Teradata), and work with data warehouses and other relational database storage environments
- Apply advanced analytics and data science concepts and/or methodologies such as predictive analytics, data modeling, forecasting and machine learning where appropriate
- Database management and/or platform knowledge, including:
- Operational data automation systems (e.g., ERP, OT and other data source platforms)
- Common spreadsheet applications (e.g., Excel), including pivot tables and charting
- Common BI tools and/or data analytics platforms (e.g., Excel, Power BI, Google Analytics)
- Ability to execute against use cases for data science, including:
- Digital channel and platform optimisation and measurement
- Support the measurement and analysis of cross-functional (internal and municipal) areas for strategic decision-making (KPIs, metrics, etc.)
- Cleaning and integrating multiple data sources to support analytics
PERSONAL ATTRIBUTES REQUIRED FOR THIS JOB:
- The ability to solve problems.
- The ability to approach a problem from different angles, to see if solutions can be found in different ways.
- The ability to work in an ever changing, unstructured environment.
- The ability to work as part of a team, with vastly differing skill sets and opinions.
- The ability to contribute ideas to the group.
- The ability to mentor and provide guidance for other team members.
- The ability to communicate across teams, stakeholders and functional areas.
- A systems approach to thinking, as opposed to a siloed approach. The candidate needs to understand how their work affects the greater system.
- The ability to work without supervision and take accountability for the work they deliver.
- The ability to liaise with a client, sifting through the fluff and extracting the actual requirements.
MINIMUM REQUIREMENTS:
- Bachelor’s or Honours degree in Computer Science or Engineering, or equivalent experience
- 4+ years of software development experience and data management disciplines, including data integration, modelling, optimisation and data quality, and/or other areas directly relevant to data engineering responsibilities and tasks
- Project Management (classic and agile) experience
- Proficient in Java, C/C++, or Python
Please note that should we not contact you within two weeks, you should consider your application unsuccessful.
Desired Skills:
- Data engineering
Desired Work Experience:
- 2 to 5 years
Desired Qualification Level:
- Degree