Position Purpose:
- Data Engineers build and support data pipelines and the datamarts built from those pipelines. Both must be scalable, repeatable, and secure.
- The Data Engineer facilitates gathering data from a variety of sources, in the correct format, ensuring that it conforms to data quality standards and that downstream users can access it timeously.
- This role functions as a core member of an agile team.
- These professionals are responsible for the infrastructure that provides insights from raw data, handling and integrating diverse sources of data seamlessly.
- They enable solutions by handling large volumes of data in batch and real time, leveraging emerging technologies from both the big data and cloud spaces. Additional responsibilities include developing proofs of concept and implementing complex big data solutions with a focus on collecting, parsing, managing, analyzing, and visualizing large datasets.
- They know how to apply technologies to solve the problems of working with large volumes of data in diverse formats to deliver innovative solutions.
- Data Engineering is a technical job that requires substantial expertise in a broad range of software development and programming fields.
- These professionals have knowledge of data analysis, end-user requirements, and business requirements analysis, developing a clear understanding of the business need and incorporating it into a technical solution.
- They have a solid understanding of physical database design and the systems development lifecycle.
- This role must work well in a team environment.
Qualifications:
- IT Degree or Diploma (3-year qualification)
Desirable:
- AWS Certification at least to associate level
Experience:
- Business Intelligence (3-5 years)
- Extract Transform and Load (ETL) processes (3-5 years)
- Agile exposure (Kanban or Scrum) (2+ years)
Desirable:
- Retail Operations (3-5 years)
- Big Data (1+ years)
- Cloud AWS (1+ years)
Job objectives:
- Design and develop data feeds from an on-premises environment into a datalake hosted in AWS
- Design and develop the solution's programmatic transformations, correctly partitioning, formatting, and validating the data (see the sketch after this list)
- Design and develop programmatic transformations, combinations, and calculations to populate complex datamarts based on feeds from the datalake
- Provide operational support for data feeds and datamarts
- Design the infrastructure required to develop and operate datalake data feeds
- Design the infrastructure required to develop and operate datamarts, their user interfaces, and the feeds required to populate them.
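For illustration, a minimal PySpark sketch of the partition/format/validate pattern named in the objectives above; the bucket names, paths, and column names are hypothetical assumptions, not part of the role specification:

    # Minimal sketch: land a raw feed, validate data quality, then write
    # partitioned Parquet to the datalake. All paths and columns are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("feed-to-datalake").getOrCreate()

    # Raw feed as landed from the on-premises environment.
    raw = spark.read.csv("s3://example-landing/sales/", header=True, inferSchema=True)

    # Validate: rows missing mandatory keys are quarantined, not silently dropped.
    valid = raw.filter(F.col("sale_id").isNotNull() & F.col("sale_date").isNotNull())
    raw.subtract(valid).write.mode("append").parquet("s3://example-datalake/quarantine/sales/")

    # Format and partition: columnar Parquet, partitioned by date for
    # efficient downstream datamart queries.
    (valid
     .withColumn("sale_date", F.to_date("sale_date"))
     .write.mode("overwrite")
     .partitionBy("sale_date")
     .parquet("s3://example-datalake/curated/sales/"))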
Knowledge & Skills:
Knowledge:
- Creating data feeds from on-premises to AWS Cloud (1 year)
- Supporting data feeds in production on a break-fix basis (1 year)
- Creating data marts using Talend or a similar ETL development tool (2 years)
- Manipulating data using Python and PySpark (1 year)
- Processing data using the Hadoop paradigm, particularly on Amazon EMR, AWS's managed Hadoop platform (1 year)
- DevOps for Big Data and Business Intelligence, including automated testing and deployment (1 year); see the test sketch after this list
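For illustration, a minimal pytest sketch of the kind of automated test the DevOps item above implies; the transformation and the test data are hypothetical:

    # Sketch of an automated test for a pipeline transformation, per the DevOps
    # item above. The transformation and the data are hypothetical.
    import pytest
    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.window import Window

    @pytest.fixture(scope="session")
    def spark():
        return SparkSession.builder.master("local[1]").appName("tests").getOrCreate()

    def dedupe_latest(df):
        # Hypothetical transformation: keep the newest row per sale_id.
        w = Window.partitionBy("sale_id").orderBy(F.col("updated_at").desc())
        return df.withColumn("rn", F.row_number().over(w)).filter("rn = 1").drop("rn")

    def test_dedupe_latest_keeps_newest_row(spark):
        df = spark.createDataFrame(
            [(1, "2024-01-01"), (1, "2024-02-01"), (2, "2024-01-15")],
            ["sale_id", "updated_at"],
        )
        rows = {(r.sale_id, r.updated_at) for r in dedupe_latest(df).collect()}
        assert rows == {(1, "2024-02-01"), (2, "2024-01-15")}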
Skills:
- Talend (1 year)
- Python (1 year)
- Business Intelligence Data modelling (3 years)
- SQL (3 years)
Desirable:
- AWS: EMR, EC2, S3 (1 year)
- PySpark or Spark (1 year)
Desired Skills:
- Business Intelligence Data modelling
- SQL
- Python
- Talend