A provider of cutting-edge Digital Health Tools & Solutions in Pretoria seeks the expertise of a Data Scientist with at least 10 years’ experience in the field of Technology and Information Management. Your role will entail the design, build and management of data pipelines for data structures, encompassing data transformation, data models, schemas, metadata and workload management. You must be able to work with both IT and business in integrating Analytics and Data Science output into business processes and workflows. The successful incumbent will need a suitable Bachelor’s Degree in Computer Science/Informatics or equivalent, with a Data Science-specific qualification. You will also require 5 years’ experience working with Predictive Analytics, Machine Learning and Neural Networks, as well as proficiency with Big Data, SQL, R, Python, Java, ETL processes, data integration and data preparation flows, including helping to move them into production. You must also have advanced skills in wrangling and transforming data to perform meaningful analyses, visualising and interpreting data, effectively communicating results and findings, and applying statistical methods to draw scientific conclusions from data.

Relevant Bachelor’s Degree or equivalent in Computer Science or Informatics.

Specific qualification in Data Science.

Honours Degree will be an advantage.

At least 10 years’ experience in the field of Technology and Information Management.

A minimum of 5 years’ experience working with Data Science technologies, such as Predictive Analytics, Machine Learning and Neural Networks.

Statistical Modelling and Machine Learning technologies.

Applying methods for Big Data to reveal patterns, trends, and associations.

Data management and analytics software and coding.

Popular database programming languages including SQL for relational databases.

Strong experience with advanced Analytics tools for Object-Oriented/Object function scripting using languages such as R, Python and Java.

Strong experience working with large, heterogeneous datasets in building and optimising data pipelines, pipeline architectures, and integrated datasets using traditional data integration technologies, including ETL/ELT and data replication/CDC.

Working with and optimising existing ETL processes, data integration and data preparation flows and helping to move them into production.

Wrangling and transforming data to perform meaningful analyses.

Data visualisation and interpretation and effectively communicating results and findings.

The application of statistical methods to draw scientific conclusions from data.

Basic experience in working with message queuing technologies.

Basic experience working with popular data discovery, analytics and BI software tools like Tableau, Power BI etc., for semantic-layer-based data discovery.



