Data Engineer (Contract)
Requirements:
- Minimum of 5 years' experience
- Matric (Essential)
- National Diploma/BTech/MTech in IT, or BSc/MSc in Computer Science
Technical Skills:
- Containerization and Docker (Intermediate)
- Linux scripting (Intermediate)
- Python (Spark)/R/Java/C#/Bash (Advanced)
- Building and maintaining APIs and webhooks (Intermediate)
- Object Oriented Programming (Advanced)
- SQL programming (Advanced)
- Data Warehouse, Data Lake and Lakehouse (Advanced)
- Git versioning of code (Intermediate)
- Documentation using Confluence/Wiki tools (Intermediate)
- Distributed programming skills on a clustered environment (Advantageous)
- Security and data governance (Advantageous)
- Data architecture methodologies – Data Mesh (Advantageous)
Responsibilities:
- Perform data analysis and normalization to create features for machine learning applications.
- Ingest data into the data environment in batch, real time, or near real time.
- Build metadata-driven data integration solutions.
- Develop and maintain APIs for seamless integration of the modern data platform with other systems.
- Productize machine learning models in a containerized Linux environment to ensure scalability and reliability.
- Utilize distributed programming techniques on a cluster of servers for optimized data processing.
- Work closely with data scientists, system architects, and other stakeholders to design and build a modern data environment.
- Collaborate with teams to advise on optimal data system architecture for data warehousing, data lakes, and lakehouses.
- Conduct root cause analysis on production issues surrounding data pipelining and execution.
- Provide technical leadership for the information management process covering both structured and unstructured data.
- Create comprehensive technical documentation for data systems, APIs, and processes.
- Design, construct, maintain, and troubleshoot the organization’s data architecture.
- Collect and collate data from multiple sources.
- Ensure the accuracy and integrity of stored data.
- Conduct exploratory research and implement new technologies.
- Build reliable data pipelines that deliver useful insights.
Desired Skills:
- Data Warehousing
- Python
- Spark
- Confluence
- R
- C#
- Bash
- Data lakes
Desired Work Experience:
- 5 to 10 years