Main Responsibilities:
- The performance of the Data Engineer is described and measured by the following responsibilities:
- Define a structured approach to problem solving and deliver against it
- Create role-specific design standards, patterns, and principles
- Assist in planning and managing the team's workload to ensure delivery
- Load large, complex data sets and make data available to other data engineers
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, and redesigning models for greater scalability
- Work with other data engineers and data modelers to design, implement, and manage data vaults, data transformations, and the data pipeline
- Identify, design, and implement vault access layers to enable BI products to leverage the data within data vaults
- Monitor and fine-tune data vaults and data transformations on the Cloudera Hadoop stack
- Use modern development and modelling techniques and tools to implement BI and data management solutions, including data quality, metadata and reference data
- Engage with a wide range of technical stakeholders including data scientists, data analysts, business analysts, other data engineers and solutions architects
- Support data stewards in establishing and enforcing guidelines for data collection, quality improvement, integration, and related processes
Role Requirements:
Qualifications:
- Bachelor’s degree in Computer Science, Statistics, Informatics, Information Systems, Engineering, or another quantitative field, or a National Diploma in an Information Technology-related discipline, preferred
Work Experience:
- The Junior Data Engineer must have experience in a similar environment, working with the tools and techniques listed below
Technical Knowledge and Experience:
- The Junior Data Engineer is someone with a strong understanding of data, data structures and data sources. Required skills include:
- Application and data engineering experience, with a solid grounding in SQL, is required
- Knowledge of database management system (DBMS) physical implementation, including tables, joins, and SQL querying
- Data architecture design and delivery experience preferred
- Experience with database technologies (e.g., SAP HANA, Teradata, or similar) or Hadoop components, including HDFS, Hive, Spark, Oozie, and Impala, is highly advantageous
- Experience with object-oriented/object-functional scripting languages (e.g., Python, Java, Scala, or related)
- Knowledge and experience of structured data, such as entities, classes, hierarchies, relationships, and metadata
- Strong data engineering background with a specific focus on staging high-quality data
- Understanding of data warehousing principles (e.g., Kimball and Data Vault)
- Experience in agile development
- Ability to manage data assets in compliance with a strict governance framework
Desirable/preferred skills include:
- Data warehousing (Kimball and Data Vault patterns preferred) and dimensional data modelling (e.g., OLAP and MDX experience)
- Experience developing data pipelines using ETL tools (e.g., SAP Data Services), automation (e.g., WhereScape), and scheduling and test automation (e.g., Robot)
- A solid background in SQL, information architecture, and ETL procedures is required; experience with object-oriented/functional scripting languages (e.g., Python, Unix shell scripting, Java, Scala) is preferred but not essential
- Data management technologies (e.g., Informatica Data Quality (IDQ), Informatica Enterprise Data Catalog (EDC), Axon, EBX)
- Event/streaming-based data pipelines (e.g., Kafka or NiFi) are nice to have
Desired Skills:
- Data
- Python
- Scala
- SAP
Desired Work Experience:
- 1 to 2 years
Desired Qualification Level:
- Degree