Data Engineer

To design, develop, implement and maintain the scalable and robust data integration interfaces and data models required by Analysts, Data Product Owners and Data Scientists.

A formidable Data Engineer will demonstrate insatiable curiosity and outstanding interpersonal skills. The role is also responsible for employing machine learning techniques to create and sustain structures that allow for the analysis of data, while remaining familiar with dominant programming and deployment strategies in the field. The Data Engineer will drive automation in data integration and management, and build and manage data pipelines that provide a foundation for all analytics programmes.

Responsibilities

Drive automation in data integration and management:

Track data consumption patterns, preparation and integration tasks to identify the most common, repeatable tasks

Prioritize opportunities for automation to minimize manual and error-prone processes

Improve productivity across the data-science/analytics team

Promote reuse of content and data through a centralized portal, catalog or other system

May be required to monitor schema changes, perform intelligent sampling and caching, and learn and use AI-enabled metadata management techniques

Build, manage and maintain data pipelines and architecture to provide a foundation for all analytics projects:

Match appropriate data sources to identified analytics use cases

Integrate data from multiple sources into a single source or system, and contribute to educational programs that increase accessibility for the analytics team and users

Ensure data pipelines comply with applicable regulations and organisational governance standards

Maintain data pipelines by fixing technical issues related to team access

Optimize data quality by flagging sources for review, filling gaps and developing proxy variables where appropriate, and by clarifying limitations with data science teams

Communicate cross-functionally and with data science teams and marketing analysts:

Collaborate with IT and other departments to gain access to enterprise-wide data systems

Work with IT leaders, platform specialists and others to identify unknown sources of data or reconcile important differences in datasets

Work closely with data scientists and marketing analysts to define data requirements for analytics projects

Continuously support efforts to upskill and educate data scientists, marketing analysts and others within the marketing organization

Participate in efforts to improve data governance and ensure compliance:

Recommend methods to continuously optimise data collection processes as well as tagging and the use of analytics tools

Coordinate across regions/provinces/municipalities, where applicable, ensuring standardised data models and analytical methodologies

Assemble large, complex data sets that meet functional and non-functional business requirements.

Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.

Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using Python and the GCP and Azure ‘Big Data’ technologies.

Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.

Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.

Keep our data separated and secure across national boundaries, using multiple data centres and GCP and Azure regions.

Create data tools that help analytics and data science team members build and optimize our product into an innovative industry leader.

Work with data and analytics experts to strive for greater functionality in our data systems.

Continuously learn and apply the latest fit-for-purpose open-source and proprietary tools and technologies to achieve results, including some or all of the following:

Cloud

Microsoft Azure (must)

AWS

Google Cloud

Database and Data Management

Microsoft SQL Server

MySQL

PostgreSQL

MongoDB

Languages

Python (must)

R (must)

SQL (must)

Java

C/C++

Requirements

Bachelor’s or Honours degree in Computer Science or Engineering, or equivalent experience

4+ years of experience in software development and data management disciplines, including data integration, modelling, optimization and data quality, and/or other areas directly relevant to data engineering responsibilities and tasks

Project Management (classic and agile) experience

Proficient in Java, C/C++, or Python

Desired Skills:

Java

C++

Python

Desired Qualification Level:

Honours

Employer & Job Benefits:

To be discussed during the interview process

