Main Responsibilities:

The performance of the Data Engineer is described and measured by the following:
  • Define a structured approach to problem solving and deliver against it
  • Create role-specific design standards, patterns, and principles
  • Assist in planning and managing the team's workload to ensure delivery
  • Load large, complex data sets and make the data available to other data engineers
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing models for greater scalability
  • Work with other data engineers and data modelers to design, implement, and manage data vaults, data transformations, and the data pipeline
  • Identify, design, and implement vault access layers to enable BI products to leverage the data within data vaults
  • Monitor and fine-tune data vaults and data transformations on the Cloudera Hadoop stack
  • Use modern development and modelling techniques and tools to implement BI and data management solutions, including data quality, metadata and reference data
  • Engage with a wide range of technical stakeholders including data scientists, data analysts, business analysts, other data engineers and solutions architects
  • Support data stewards to establish and enforce guidelines for data collection, quality improvements, integration, and processes

Role Requirements:
Qualifications:

  • Bachelor’s degree in Computer Science, Statistics, Informatics, Information Systems, Engineering or another quantitative field / National Diploma in an Information Technology related discipline preferred

Work Experience:

  • The Junior Data Engineer must have experience in a similar environment, working with the relevant tools and techniques

Technical Knowledge and Experience:

The Junior Data Engineer is someone with a strong understanding of data, data structures, and data sources. Required skills include:
  • An application and data engineering background with a solid grounding in SQL is required
  • Knowledge of database management system (DBMS) physical implementation, including tables, joins and SQL querying.
  • Data architecture design and delivery experience preferred
  • Experience with database technologies (e.g., SAP HANA, Teradata, or similar) or Hadoop components, including HDFS, Hive, Spark, Oozie, and Impala, is highly advantageous
  • Experience with object-oriented or functional scripting languages (e.g., Python, Java, Scala, or related)
  • Knowledge and experience of structured data, such as entities, classes, hierarchies, relationships, and metadata.
  • Strong Data Engineering background with a specific focus on staging high quality data
  • Understanding of data warehousing principles (e.g., Kimball and Vault).
  • Experience in agile development
  • Ability to manage data assets in compliance with a strict governance framework

Desirable/preferred skills include:

  • Data warehousing (Kimball and Data Vault patterns are preferred) and dimensional data modelling (e.g., OLAP and MDX experience)
  • Experience in developing data pipelines using ETL tools (e.g., SAP Data Services), automation (e.g. Wherescape), scheduling and test automation (e.g. Robot) is desirable
  • A solid background in SQL, information architecture, and ETL procedures is required. Experience with object-oriented/functional/scripting languages (e.g., Python, Unix shell scripting, Java, Scala) is preferred but not essential
  • Data Management technologies (e.g., Informatica Data Quality (IDQ), Informatica Enterprise Data Catalog (EDC), Axon, EBX)
  • Event-/streaming-based data pipelines (e.g., Kafka or NiFi) are nice to have

Desired Skills:

  • Data
  • Python
  • Scala
  • SAP

Desired Work Experience:

  • 1 to 2 years

Desired Qualification Level:

  • Degree
