Position Purpose:

  • This role functions as a core member of an agile team.
  • These professionals are responsible for the infrastructure that provides insights from raw data, handling and integrating diverse sources of data seamlessly.
  • They enable solutions by handling large volumes of data in batch and in real time, leveraging emerging technologies from both the big data and cloud spaces.
  • Additional responsibilities include developing proofs of concept and implementing complex big data solutions with a focus on collecting, parsing, managing, analysing and visualising large datasets.
  • They know how to apply technologies to solve the problems of working with large volumes of data in diverse formats to deliver innovative solutions.
  • Data Engineering is a technical job that requires substantial expertise in a broad range of software development and programming fields.
  • These professionals have knowledge of data analysis, end-user requirements and business requirements analysis, which they use to develop a clear understanding of the business need and to incorporate it into a technical solution.
  • They have a solid understanding of physical database design and the systems development lifecycle.
  • The person in this role must work well in a team environment.

Qualifications:

  • 3-year IT Degree/Diploma
  • AWS Certification, at least at Associate level

Experience:

Essential:

  • 5+ years Business Intelligence experience
  • 2+ years Big Data experience
  • 5+ years' experience working with Extract, Transform and Load (ETL) processes
  • 2+ years Cloud (AWS) experience
  • At least 2 years Agile exposure – Kanban or Scrum

Desirable:

  • 5+ years Retail Operations experience
  • Experience working as a Technical Lead in the relevant area

Knowledge & Skills:

  • Creating data feeds from on-premises environments to the AWS Cloud (2 years)
  • Supporting data feeds in production on a break-fix basis (2 years)
  • Creating data marts using Talend or a similar ETL development tool (4 years)
  • Manipulating data using Python and PySpark (2 years)
  • Processing data using the Hadoop paradigm, particularly EMR, AWS’s distribution of Hadoop (2 years)
  • DevOps for Big Data and Business Intelligence, including automated testing and deployment (2 years)

Further technical skills required:

  • Talend (1 year)
  • AWS: EMR, EC2, S3 (1 year)
  • Python (1 year)
  • PySpark or Spark (1 year) – Desirable
  • Business Intelligence Data modelling (3 years)
  • SQL (3 years)

Job objectives:

  • Design and develop data feeds from an on-premises environment into a data lake in the AWS Cloud
  • Design and develop programmatic transformations of the solution by correctly partitioning and formatting the data and validating its quality (a rough sketch follows this list)
  • Design and develop programmatic transformations, combinations and calculations to populate complex data marts based on feeds from the data lake
  • Provide operational support for data mart feeds and data marts
  • Design the infrastructure required to develop and operate data lake feeds
  • Design the infrastructure required to develop and operate data marts, their user interfaces and the feeds required to populate the data lake
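
To illustrate the partitioning, formatting and validation work described in the second objective, here is a minimal PySpark sketch of a feed transformation into a data lake. The S3 paths, column names and validation rules are illustrative assumptions only and are not taken from this posting.

    # Minimal sketch only: paths, columns and quality rules are hypothetical.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("example-feed-to-datalake").getOrCreate()

    # Read a raw CSV feed landed from the on-premises source (hypothetical path).
    raw = spark.read.option("header", "true").csv("s3://example-landing/sales/")

    # Format: cast types and derive a partition column.
    formatted = (
        raw.withColumn("sale_date", F.to_date("sale_date", "yyyy-MM-dd"))
           .withColumn("amount", F.col("amount").cast("double"))
           .withColumn("sale_month", F.date_format("sale_date", "yyyy-MM"))
    )

    # Validate data quality: drop rows with missing keys or non-positive amounts.
    valid = formatted.filter(F.col("sale_id").isNotNull() & (F.col("amount") > 0))

    # Partition and write to the data lake in a columnar format (hypothetical path).
    valid.write.mode("overwrite").partitionBy("sale_month").parquet(
        "s3://example-datalake/sales/"
    )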
