Roles and Responsibilities: Data
- Assist in creating and maintaining an optimal data pipeline architecture, creating databases optimised for performance, implementing schema changes, and maintaining data architecture standards across the required databases. Work alongside data scientists to help them make use of the data they collect.
- Assemble large, complex data sets that meet functional / non-functional business requirements and align data architecture with business requirements.
- Assist in enabling and running data migrations across different databases and servers, and define and implement data stores based on system and consumer requirements.
- Assist in performing thorough testing and validation in order to support the accuracy of data transformations and data verification used in machine learning models.
- Perform ad-hoc analyses of data stored in databases and write SQL scripts, stored procedures, functions, and views. Proactively analyse and evaluate the databases in order to identify and recommend improvements and optimisations. Deploy sophisticated analytics programs, machine learning, and statistical methods.
- Analyse complex data elements and systems, data flow, dependencies, and relationships in order to contribute to conceptual, logical, and physical data models.
- Liaise and collaborate with the entire team, providing support to the entire department for its data-centric needs. Collaborate with subject matter experts to select the relevant sources of information and translate the business requirements into data mining and data science outcomes. Present findings and observations to the team for the development of recommendations.
- Utilise data under supervision to discover tasks that can be automated, and identify, design, and implement internal process improvements such as automating manual processes, optimising data delivery, and redesigning infrastructure for greater scalability.
- Design and develop scalable ETL packages and routines that populate databases from the business source systems and create aggregates. Manage large-scale Hadoop data platforms to support the fast-growing data within the business.
- Build analytics tools that utilise the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics. Create data tools that assist analytics and data science team members in building and optimising the organisation into an innovative industry leader.