Our client is looking for a Senior DevOps Engineer to join their engineering team. The role suits a highly experienced, hands-on DevOps professional with deep cloud infrastructure expertise and a passion for building and maintaining scalable, high-availability production environments.

The successful candidate will take ownership of complex multi-cloud infrastructure, lead deployment and monitoring strategies, support mission-critical production systems, and collaborate closely with development, QA, and engineering teams to ensure reliable, secure, and efficient platform operations across global environments.

Key Responsibilities:

  • Design, implement, maintain, and optimise highly available multi-cloud infrastructure environments across AWS and supporting cloud platforms
  • Manage and scale production workloads across multiple AWS regions with a strong focus on uptime, reliability, and security
  • Build, maintain, and improve Infrastructure-as-Code using Terraform across Development, Testing, and Production environments
  • Design and maintain CI/CD pipelines using Jenkins and deployment orchestration tools such as Spinnaker, ArgoCD, or Harness
  • Implement and manage Blue/Green and Red/Black deployment strategies, including rollback and artifact promotion processes
  • Administer and optimise AWS RDS/Aurora MySQL environments, including upgrades, migrations, backups, restores, and performance tuning
  • Manage and monitor messaging systems such as RabbitMQ, including scaling consumers and load balancing using HAProxy and Nginx
  • Monitor infrastructure health using Prometheus, Grafana, ELK Stack, and related monitoring tools
  • Troubleshoot complex production issues, conduct root cause analysis, and lead post-mortem investigations to reduce MTTR
  • Perform advanced Linux administration, Bash scripting, networking troubleshooting, and performance optimisation
  • Support platform deployments and debugging across PHP, Python, and JavaScript-based services
  • Collaborate with software engineers and product teams to ensure smooth deployments and operational excellence
  • Contribute to infrastructure architecture, technical strategy, scalability planning, and cost optimisation initiatives
  • Participate in on-call rotations and act as an escalation point during production incidents
  • Maintain and improve AI/ML infrastructure pipelines, GPU workloads, and distributed processing environments where applicable

Requirements:

  • 5+ years’ hands-on DevOps and cloud infrastructure experience
  • Advanced AWS experience managing production systems across multiple regions
  • Strong expertise with:
    • EC2
    • RDS/Aurora (MySQL)
    • VPC design, routing, peering, and ACLs
    • IAM roles and policies
    • S3 and CloudFront
    • Security Groups

  • Extensive Terraform experience, including:
    • Modular infrastructure design
    • Remote state management
    • Environment separation
    • Infrastructure code reviews and refactoring

  • Strong Jenkins pipeline creation and CI/CD automation experience
  • Experience with deployment orchestration tools such as Spinnaker, ArgoCD, or Harness
  • Experience implementing Blue/Green or Red/Black deployment methodologies
  • Strong MySQL database administration experience, including:
    • Production upgrades and migrations
    • Backup and restore procedures
    • Performance tuning

  • Proven RabbitMQ production support experience
  • Experience with HAProxy and Nginx load balancing
  • Strong monitoring and logging experience using Prometheus, Grafana, ELK Stack, or equivalent
  • Proven production incident response and on-call support experience
  • Advanced Linux administration skills (Ubuntu CLI)
  • Strong Bash scripting and troubleshooting capabilities
  • Solid networking fundamentals
  • Comfortable supporting and debugging:
    • PHP applications
    • Python automation and AI integrations
    • JavaScript-based deployment environments

  • Experience with Docker or containerised environments
  • Exposure to multi-cloud infrastructure environments (AWS and GCP preferred)
  • Experience operating high-availability systems with 24/7 uptime requirements
  • Exposure to AI/ML infrastructure, GPU workloads, or video/media processing systems (advantageous)
  • AWS certifications (advantageous)

Technical & Professional Skills:

  • Advanced AWS cloud infrastructure management
  • Strong Terraform and Infrastructure-as-Code expertise
  • CI/CD pipeline architecture and deployment automation
  • Database administration and performance optimisation
  • Monitoring, observability, and incident response management
  • Linux systems administration and troubleshooting
  • Messaging systems and distributed architecture support
  • Infrastructure scalability and cost optimisation
  • Strong networking and load balancing knowledge
  • Experience supporting AI/ML and high-throughput environments
  • Multi-cloud platform exposure and operational support

Preferred Qualifications:

  • Tertiary qualification in Computer Science, Information Technology, Engineering, or a related field
  • Relevant AWS, DevOps, or cloud certifications
  • Experience working in fast-paced Agile or product-based environments
  • Experience operating large-scale, customer-facing production systems

Key Competencies:

  • Strong analytical and troubleshooting abilities
  • High attention to detail and operational excellence
  • Strong communication and collaboration skills
  • Ability to work effectively under pressure in high-availability environments
  • Strong ownership mentality and accountability
  • Proactive, solution-driven mindset
  • Ability to lead during critical production incidents
  • Passion for automation, scalability, and continuous improvement
  • Strong mentoring and knowledge-sharing approach

Please note: If you have not received feedback within two weeks, please consider your application unsuccessful. Your profile will remain in our database for future opportunities.

For more information, contact:
Reinie Du Preez
Senior Specialist Recruitment Consultant

Desired Skills:

  • devops
  • engineer
  • aws
  • docker
  • kubernetes
