Data Scientist required for a private health care provider in Johannesburg.

Duties include, but are not limited to:

  • Assist with research on trends in Data Science, specifically their application in the healthcare industry.
  • Collaborate with business to identify requirements as well as opportunities for improving business processes and creating value for the business.
  • Partner with business stakeholders to define approaches to resolving key business problems and focus on the development of new business strategies.
  • Identify expected outcomes of modelling.
  • Assist in developing conceptual designs or models to address business requirements.
  • Collaborate with subject matter experts to select the relevant sources of data and information.
  • Identify available and relevant data by leveraging existing collection processes, and identify new data collection sources such as social media.
  • Partner with the Data Engineering team, where required, to manage data ingestion.
  • Recommend third-party sources of information to extend the company’s data when required.
  • Perform pre-processing of data which includes, but is not limited to:
    Data manipulation
    Transformation
    Normalisation
    Standardisation
    Visualisation
  • Derive new variables/features as applicable to developing specific algorithms or models.
  • Conduct code reviews on existing assets and highlight relevant performance or data-related issues.
  • Use data profiling and visualisation to understand and explain data characteristics that will inform modelling approaches.
  • Perform feature engineering as applicable for building algorithms and models using machine learning techniques.
  • Identify, create and implement the appropriate algorithm to discover patterns.
  • Identify and implement the appropriate data mining/statistics/machine learning techniques.
  • Enhance, find patterns in and build models on large data sets using distributed data processing and analysis methodologies.
  • Apply data mining techniques and perform statistical analysis on large data sets.
  • Develop experimental design approaches to validate findings or test hypotheses.
  • Analyse, interpret and explain results using appropriate statistical tools and techniques, translating findings into clear, actionable and timely insights.
  • Validate analysis using appropriate techniques (applying test data sets, A/B testing, scenario modelling, etc.).
  • Productionise models using standard processes and techniques.
  • Monitor the predicted outcomes of models.
  • Understand business requirements to ensure that models are delivered in an appropriate format.

Business Support

  • Ability to build, analyse and interpret numerical and non-numerical data to determine potential statistical inferences to inform business and clinical decisions.
  • Ability to apply statistical machine learning techniques to predictive modelling.
  • Ability to clean and unify messy and complex data sets for easy access and analysis, including combining structured and unstructured data.
  • Ability to provide detailed explanations (visually and verbally), representing information as charts, diagrams or pictures using tools such as Kibana, Tableau, Power BI, etc.
  • Write programming code based on a prepared design.
  • Understand leading-edge technologies and best practice around Big Data platforms and distributed data processing, e.g. the Hadoop ecosystem (distributed computational power): HDFS, Spark, Kafka.
  • Ability to conceptualise and frame a problem, develop hypotheses, identify objective measures to estimate the accuracy of machine learning/statistical processes, and perform testing and validation through careful experiments.
  • Understanding of data flows, ETL and processing of structured and unstructured data within the data architecture.
  • Comprehensive solution design based on a good understanding of the Big Data Architecture.

Minimum requirements

  • NQF Level 7 – Bachelor’s degree or Advanced Diploma in the area of statistics, computer science, engineering or mathematics.
  • Relevant data science certifications in areas such as Python, Microsoft ML, AWS, Hadoop, big data, machine learning and cloud infrastructure.
  • Certification in SQL and working with large-scale data sets.
  • Project Management qualification or agile certification such as Scrum or PRINCE2.
  • An advanced level of Computer Literacy and proficiency in MS Office applications.
  • A minimum of 4 years’ experience in data science related initiatives or projects.
  • Experience with SQL and working with large-scale data sets.
  • Practical experience applying machine learning techniques.
  • Experience working in agile development teams.
  • Experience in operationalising data science solutions or similar product development.
  • Experience in a high-scale production environment is critical.
  • Experience with Python/Microsoft ML and tools available within the machine learning ecosystem.
  • Proven track record in business process analysis, systems and data analysis.
  • Solution-focused with a strong collaborative mind-set.
  • Excellent organisational skills: organised and structured.
  • Outstanding problem-solving and analytical skills.
  • Knowledge and understanding of the data science process including but not limited to:
    Data profiling
    Feature selection
    Data modelling
    Model evaluation
    Production and implementation
    Monitoring
  • Knowledge of trends and developments in the health care industry.
  • Business and clinical knowledge that will contribute to exposing patterns.

Desired Skills:

  • data science
  • scientist

Desired Work Experience:

  • 2 to 5 years
