  • Building and maintaining big data pipelines using data platforms.
  • Expertise in data modelling and Oracle SQL.
  • Analysing large and complex data sets.
  • Performing thorough testing and data validation to ensure the accuracy of data transformations (see the sketch after this list).
  • Working with enterprise collaboration tools such as Confluence, JIRA, etc.
  • Knowledge of data formats such as Parquet, Avro, JSON, XML, CSV, etc.
  • Working with data quality tools such as Great Expectations.
  • Developing and working with REST APIs is a bonus.
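For illustration only: a minimal PySpark sketch of the kind of transformation-plus-validation step described above. The S3 paths and column names (order_id, amount) are assumptions, not details of the role.

    # Minimal PySpark sketch: ingest a CSV, transform it, validate, write Parquet.
    # Paths and column names are illustrative assumptions.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("pipeline-sketch").getOrCreate()

    # Ingest raw data (CSV with header) into a DataFrame.
    raw = spark.read.option("header", True).csv("s3://example-bucket/raw/orders.csv")

    # Transform: cast amount to a numeric type and drop rows missing a key.
    clean = (
        raw.withColumn("amount", F.col("amount").cast("double"))
           .filter(F.col("order_id").isNotNull())
    )

    # Validate: fail fast if any amounts did not survive the cast.
    bad_rows = clean.filter(F.col("amount").isNull()).count()
    if bad_rows > 0:
        raise ValueError(f"{bad_rows} rows failed amount validation")

    # Persist the validated output in Parquet, one of the formats listed above.
    clean.write.mode("overwrite").parquet("s3://example-bucket/curated/orders/")

The fail-fast check stands in for a fuller data quality suite (for example, Great Expectations) applied to the same DataFrame.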

Minimum Requirements:

Above-average experience/understanding (in order of importance):

  • Terraform
  • Python 3.x
  • SQL – Oracle/PostgreSQL
  • PySpark
  • Boto3
  • ETL
  • Docker
  • Linux / Unix
  • Big Data
  • PowerShell / Bash
  • Cloud Data Hub (CDH)
  • CDEC Blueprint

Basic experience/understanding of AWS components (in order of importance; a short sketch follows this list):

  • Glue
  • CloudWatch
  • SNS
  • Athena
  • S3
  • Kinesis (Kinesis Data Streams, Kinesis Firehose)
  • Lambda
  • DynamoDB
  • Step Functions
  • Parameter Store
  • Secrets Manager
  • CodeBuild / CodePipeline
  • CloudFormation
  • Business Intelligence (BI) Experience
  • Technical data modelling and schema design (“not drag and drop”)
  • Kafka
  • AWS EMR
  • Redshift
  • Basic experience in Networking and troubleshooting network issues.
  • Knowledge of the Agile Working Model.
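For illustration only: a short Boto3 sketch of the kind of AWS interaction the components above involve. The bucket name, object key, and Glue job name are assumptions.

    # Minimal Boto3 sketch: land a file in S3, then trigger a Glue ETL job against it.
    # Bucket, key, and job names are illustrative assumptions.
    import boto3

    s3 = boto3.client("s3")
    glue = boto3.client("glue")

    # Upload a raw file to the landing area in S3.
    s3.upload_file("orders.csv", "example-data-bucket", "raw/orders.csv")

    # Start a Glue job run, passing the landed object as an argument.
    run = glue.start_job_run(
        JobName="curate-orders",
        Arguments={"--input_path": "s3://example-data-bucket/raw/orders.csv"},
    )
    print("Started Glue job run:", run["JobRunId"])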

Desired Skills:

  • Big Data pipelines
  • AWS
  • Terraform
  • Python
  • Data Engineer
