Responsible for driving, designing and building scalable ETL systems for a big data warehouse, delivering robust and trustworthy data to support high-performing ML algorithms and predictive models, meeting real-time data visualisation requirements across the organisation and enabling self-service analytics
Responsibilities:
Systematic solution design of the ETL and data pipeline in line with business user specifications
- Ensure the highest standards of data quality, accuracy and completeness through regular, in-depth review and testing of work
- Create easily understandable technical documentation that is kept up to date
- Conduct data design, database architecture, metadata and repository creation activities and tasks as required by business stakeholders.
- Translate business needs into long-term architecture solutions.
- Define, design and build dimensional databases.
- Design the ETL pipelines
- Develop data warehousing blueprints, evaluate hardware and software platforms and integrate systems.
- Evaluate the reusability of current data for additional analyses.
- Conduct data cleaning to rid the system of old, unused or duplicate data.
- Review object and data models and the metadata repository to structure the data for better management and quicker access.
- Determine processes to ensure that relevant data application requirements are executed for various business needs
- Utilise relevant templates that outline the requirements for each step of the data modelling journey
- Conduct testing and quality control of databases to ensure accurate and appropriate use of data.
- Initiate and drive improved ways of operating
Develop and implement ETL pipelines aligned to the approved solution design
- Enhance and maintain our existing ETL frameworks in line with agreed design patterns and internal governance standards to improve our EDW product offering and keep it scalable
- Implement the ETL pipeline in a timely manner
- Utilise the most accurate data source and remodel it into a data set that is understandable to the end user.
- Understand data structures in order to deliver data sets that meet the exact requirements of the end-user brief
- Ensure data is precise, benchmarked and validated against financial records
- Utilise consistent data sources that result in one version of the truth
- Deliver standard data marts for reporting and analysis that are well documented and understood by business users.
- Translate metadata into explanatory reports and visuals that are easy for end users to understand.
- Perform data pre-processing, including data manipulation, transformation, normalisation, standardisation, visualisation and the derivation of new variables/features, as applicable to developing specific algorithms or models.
Ensure data governance and data quality assurance standards are upheld
Deal with customers in a customer-centric manner
Effective self-management and teamwork
Requirements:
3-year IT-related degree
Postgraduate qualification (advantageous)
5-10 years’ experience in, and understanding of, designing and developing data warehouses according to the Kimball methodology
Adept at the design and development of ETL processes
SQL development experience, preferably with SAS Data Studio and AWS experience
Ability to ingest/output CSV, JSON and other flat file types and any related data sources
Proficient in Python or R, or willingness to learn
Experience within Retail, Financial Services and Logistics environments
Desired Skills:
- SQL
- Data Engineer
- Kimball methodology
- SAS
- AWS
- ETL
- Hive
- Data engineering
Desired Work Experience:
- 5 to 10 years
Desired Qualification Level:
- Degree