- Develop ETL pipelines, with data transformations built in Azure Databricks using Python and in Azure SQL using T-SQL, and deployed using ARM templates
- Combine and curate data in a central data lake
- Serve data for applications and analytics through a variety of technologies such as SQL Server, Synapse, Cosmos DB and TSI
- Build transformation pipelines into dimensions and facts; a strong knowledge of standard BI concepts is therefore mandatory
- Build stream pipelines that leverage IoT Hub, Event Hub, Databricks streaming and other Azure stream technologies.
- Work in a fluid environment with changing requirements whilst maintaining absolute attention to detail.
Minimum Requirements:
Key Skills:
- Python – Proficient
- PySpark – Proficient
- SQL – Competent
- Solution Architecture – Competent
- API Design – Competent
- Containers – Competent
- CI/CD – Competent
- Azure Cloud – Competent
- Data Stream patterns and technology – Proficient
- Data engineering design patterns – Competent
- Data mining – Beneficial
Formal qualifications:
An undergraduate qualification (Bachelor’s degree or equivalent) in a relevant IM discipline, and/or technical competencies and certification with relevant years of experience in a similar role.
Role-specific knowledge:
- Data Lake
- Data Modeling
- Data Architecture
- Azure Data Environment
Specialist Areas:
Unstructured Data – Applies to /wiki/spaces/DAGDG/pages/[Phone Number Removed] and any products that handle large-scale data such as images.
Strong experience of building large-scale file shipping and pipelines, ideally using Azure services such as AzCopy and Azure Data Lake. Experience of managing unstructured file metadata, conversions, standardisation and related workflows. Experience of building analysis jobs that scale on technologies such as Databricks or Azure Batch.
Safety Knowledge:
Acts as a consistent, outstanding role model for safety practices, with a deep understanding of the importance of safety.
Desired Skills:
