Would you like to join our dynamic Data and Analytics cluster? We have an Azure Architect / Engineer opportunity available.

Remote/Hybrid

Preferred Qualifications:

  • Grade 12
  • Relevant Certification

Experience Required:

  • 5 years' experience
  • Function-related experience within Data and Analytics

Duties / Responsibilities:

  • Well-versed in the following:

  • In-depth understanding of the Azure platform: resources, security, monitoring, and deployments (using ARM templates or PowerShell)
  • Well versed in Kimball dimensional modelling, as well as an in-depth understanding of all data platform resources
  • Rock-solid SQL skills: the required solution is architected using advanced SQL concepts, and these skills will be tested
  • Azure Data Lake Storage

    • Understand container structures and hierarchical folders, and how to optimise them for read and write operations
    • How to read from and output to various file types (JSON, CSV, Parquet, blob) (see the sketch after this list)
    • Configure access for various tools (ADF, Databricks, Synapse, Delta Lake)
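
For illustration, a minimal PySpark sketch of the ADLS read/write patterns above. The storage account, container, and folder names are hypothetical; Spark's partition discovery is what lets a hierarchical folder structure be read as a single dataset.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical storage account and container names, for illustration only.
base = "abfss://raw@examplelake.dfs.core.windows.net"

# Reading the root of a partitioned folder tree (e.g. year=2023/month=01/...)
# returns one dataset; Spark discovers the subfolders and exposes the
# partition keys (year, month) as columns.
df = spark.read.parquet(f"{base}/telemetry/")

# Write back out in a different format, partitioned for efficient reads.
(df.write
   .mode("overwrite")
   .partitionBy("year", "month")
   .json(f"{base}/curated/telemetry_json/"))
```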

  • Azure Data Factory

    • Must have developed ADF pipelines end-to-end using various client frameworks
    • Develop complex data flows
    • Read hierarchical ADLS folder structures as single datasets in CSV and Parquet formats
    • Output to hierarchical ADLS folders in JSON, CSV, and Parquet formats
    • Output to Azure SQL and Synapse dedicated SQL pools
    • Perform incremental loads (the watermark pattern is sketched after this list)
    • Orchestrate pipelines from within other pipelines, as well as Databricks notebooks
    • Use the various trigger types for the relevant scenarios
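
ADF pipelines themselves are authored in the portal or as JSON, but the incremental-load logic behind a typical Copy Activity follows a high-watermark pattern. A minimal PySpark sketch of that pattern, with all table and column names hypothetical:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical control table holding the last successfully loaded watermark.
last_wm = (spark.read.table("etl.watermarks")
                .filter(F.col("source_name") == "sales")
                .first()["last_modified"])

# Pull only rows changed since the previous run; in ADF this would be the
# parameterised source query of a Copy Activity.
changed = (spark.read.table("src.sales")
                .filter(F.col("modified_at") > last_wm))

changed.write.mode("append").saveAsTable("stg.sales")

# Advance the watermark only after the load succeeds (e.g. persist it
# back to etl.watermarks with a Delta MERGE).
new_wm = changed.agg(F.max("modified_at")).first()[0]
```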

  • Azure Databricks:

    • Must understand Databricks cluster concepts, including mounting storage
    • Must have developed notebooks in Python, Scala, and SQL that perform complex transformations, reading from disparate sources and writing to various destinations, including ADLS (JSON, CSV, Parquet), Azure SQL, Synapse dedicated SQL pools, and Delta Lake (see the notebook sketch after this list)
    • Perform incremental loads
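
A minimal Databricks notebook sketch covering the above. The container, secret-scope, and server names are hypothetical, and `spark` and `dbutils` exist only inside a Databricks runtime:

```python
# Databricks notebook cell: `spark` and `dbutils` are provided by the runtime.
# All names below (container, account, scope, server) are hypothetical.
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": dbutils.secrets.get("kv-scope", "sp-id"),
    "fs.azure.account.oauth2.client.secret": dbutils.secrets.get("kv-scope", "sp-secret"),
    "fs.azure.account.oauth2.client.endpoint":
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}

# Mount the ADLS container once per workspace.
if not any(m.mountPoint == "/mnt/raw" for m in dbutils.fs.mounts()):
    dbutils.fs.mount(
        source="abfss://raw@examplelake.dfs.core.windows.net/",
        mount_point="/mnt/raw",
        extra_configs=configs,
    )

# Read, transform, and write to Azure SQL over JDBC.
df = spark.read.parquet("/mnt/raw/telemetry/")
summary = df.groupBy("device_id").count()

(summary.write
    .format("jdbc")
    .option("url", "jdbc:sqlserver://example.database.windows.net;database=dw")
    .option("dbtable", "dbo.device_counts")
    .option("user", dbutils.secrets.get("kv-scope", "sql-user"))
    .option("password", dbutils.secrets.get("kv-scope", "sql-password"))
    .mode("overwrite")
    .save())
```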

  • Delta Lake

    • Understand lakehouse architecture
    • Use DDL to create Delta Lake objects (managed and external tables)
    • Understand and use the bronze, silver, and gold tiers
    • Read from various sources and ingest into managed tables, or reference external tables
    • How to utilise ACID transactions (update, delete, upsert/merge), history, time travel, and vacuum (see the sketch after this list)
    • Perform ingestion (circuit files, result files, and other files)
    • Transform using PySpark and SQL notebooks
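
A compact sketch of the Delta Lake operations listed above, issued as SQL from a PySpark session (Delta support is standard on Databricks); the schema, table, and path names are hypothetical:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("CREATE SCHEMA IF NOT EXISTS bronze")
spark.sql("CREATE SCHEMA IF NOT EXISTS silver")

# DDL: a managed table, and an external table over a (hypothetical) ADLS path.
spark.sql("CREATE TABLE IF NOT EXISTS silver.results (id INT, points INT) USING DELTA")
spark.sql("""
    CREATE TABLE IF NOT EXISTS bronze.results_ext
    USING DELTA
    LOCATION 'abfss://lake@examplelake.dfs.core.windows.net/bronze/results'
""")

# ACID upsert: merge a batch of updates into the silver tier.
spark.createDataFrame([(1, 25)], "id INT, points INT").createOrReplaceTempView("updates")
spark.sql("""
    MERGE INTO silver.results AS t
    USING updates AS s ON t.id = s.id
    WHEN MATCHED THEN UPDATE SET t.points = s.points
    WHEN NOT MATCHED THEN INSERT (id, points) VALUES (s.id, s.points)
""")

# History, time travel, and vacuum.
spark.sql("DESCRIBE HISTORY silver.results").show()
spark.sql("SELECT * FROM silver.results VERSION AS OF 0").show()
spark.sql("VACUUM silver.results")  # drops unreferenced files (7-day default retention)
```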

  • Synapse

    • Provisioning of workspaces
    • Create SQL Pools
    • Create database objects (tables, views, external tables)
    • How to optimise storage using the various distribution architectures (see the sketch after this list)
    • Advanced SQL concepts specific to Synapse
    • Development of Synapse pipelines (same skill set as for ADF)
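
Distribution choices in a dedicated SQL pool are made in plain T-SQL at table-creation time. A sketch submitting the three common options from Python via pyodbc, with hypothetical connection details and table names:

```python
import pyodbc

# Hypothetical connection details for a Synapse dedicated SQL pool.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=example-ws.sql.azuresynapse.net;DATABASE=dwpool;"
    "UID=loader;PWD=<secret>",
    autocommit=True,
)
cur = conn.cursor()

# Large fact table: hash-distribute on the common join key so that
# joins and aggregations stay distribution-local.
cur.execute("""
    CREATE TABLE dbo.FactResults (ResultId INT NOT NULL, DriverId INT, Points INT)
    WITH (DISTRIBUTION = HASH(DriverId), CLUSTERED COLUMNSTORE INDEX)
""")

# Small dimension: replicate a full copy to every distribution to
# avoid data movement at query time.
cur.execute("""
    CREATE TABLE dbo.DimDriver (DriverId INT NOT NULL, Name NVARCHAR(100))
    WITH (DISTRIBUTION = REPLICATE, CLUSTERED COLUMNSTORE INDEX)
""")

# Staging tables typically use round-robin heaps for the fastest loads.
cur.execute("""
    CREATE TABLE dbo.StageResults (ResultId INT, DriverId INT, Points INT)
    WITH (DISTRIBUTION = ROUND_ROBIN, HEAP)
""")
```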

  • DevOps:

    • Source control for all components developed – must know how to integrate dev tools with Azure DevOps Repos
    • Know how to build and use Release / Build Pipelines

  • Able to hit the ground running
  • Able to quickly adapt and fit into the client's culture
  • Quickly grasp new and unfamiliar concepts

Desired Skills:

  • Systems Analysis
  • Complex Problem Solving
  • Programming
  • C#
  • Java
  • SQL
  • HTML
