Deliverables:

  • Implement and maintain the infrastructure required for DevOps practices.
  • Enable automated deployment of applications and configurations.
  • Enable automated monitoring and alerting.
  • Enable automated end-to-end testing.
  • Enable continuous release processes, practices and pipelines.
  • Enable change management and audit requirements for release pipelines.
  • Interest in designing, analyzing and troubleshooting large-scale distributed systems.
  • Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
  • Ability to debug and optimize code and automate routine tasks.
  • Scale systems sustainably through mechanisms such as easy-to-use tooling and automation.
  • Practice sustainable incident response and drive root cause analysis.

Competencies Required:

  • Client / stakeholder commitment
  • Drive for results
  • Leads change and innovation
  • Impact and influence
  • Self-awareness and insight
  • Diversity and inclusiveness
  • Collaboration
  • Governance
  • Strong critical, analytical and research skills
  • Desire to teach and mentor others
  • Self-motivated, organized and able to work independently and as part of a team

Technology and Skill Requirements:

  1) Linux
    • Be proficient in shell scripting
    • Have a very good understanding of Linux operating systems and be able to identify OS-level issues and resolve them with minimal downtime
    • Be able to identify running services and their network configuration

  2) WAS
    • Understand the basic operation of the WebSphere Application Server
    • Be able to identify a fault in a particular node
    • Be able to view logs via ssh on a file mount, as well as via Kibana

  3) Queues
    • Have a good understanding of queuing and queuing systems such as IBM MQ

  4) Jenkins
    • Have a very good understanding of Jenkins
    • Be able to find and identify faults with slaves running on remote Docker servers
    • Be able to find slave ssh access key issues

  5) Ansible
    • Have experience with creating and maintaining Ansible jobs

  6) NginX
    • Understand reverse proxies
    • Be able to read the nginx documentation and use it to extend our automated deployments and configuration
    • Be able to pull metrics and identify trends and faults from nginx logs in Kibana
    • Understand the impact of DNS resolution and nginx upstreams

  7) Consul
    • Understand the concept of a central key-value store
    • Understand multi-node single-leader clusters
    • Be able to identify server-client communication faults
    • Understand service registration
    • Understand configuration templates

  8) Docker
    • Have a very good understanding of containerization
    • Understand multi-tenant systems and the implications of load balancing across multiple instances
    • Be able to find faults in container setup and deployments
    • Have a good understanding of volume mounts and layered file systems

  9) Kubernetes
    • Have a good understanding of container orchestration
    • Understand cluster DNS
    • Have experience with Istio service mesh
    • Have a good understanding of namespaces and quotas
    • Understand Kubernetes secrets and mounts
    • Have experience with log tailing and event monitoring
    • Be able to manage an EKS cluster

  10) Networking
    • Know what a CIDR is
    • Have a good understanding of general networking
    • Be able to identify network faults
    • Have a good understanding of firewalls
    • Be able to set up and debug AWS Security Groups
    • Understand AWS VPCs and subnets

  11) Monitoring
    • Be proficient with KQL and the Elasticsearch DSL
    • Be proficient with Prometheus queries and configuration
    • Understand Grafana or similar monitoring and alerting tools
    • Be proficient with CloudWatch metrics and logs
    • Have a good understanding of tracing using tools such as Jaeger

  12) Repositories
    • Have a very good proficiency with Git
    • Be proficient with GitLab administration and GitLab pipelines
    • Understand Docker and Maven registries and repositories such as Nexus and Artifactory

  13) Databases
    • Be proficient with MongoDB and MongoDB Ops Manager
    • Be proficient in SQL
    • Have a good understanding of the PostgreSQL DBMS
    • Have experience with AWS RDS Aurora PostgreSQL

  14) AWS
    • Understand EC2 features, such as instance types, snapshots, ELB, and EBS
    • Be proficient in CloudFormation
    • Understand autoscaling and the cost implications
    • Be proficient with creating and deploying AWS Lambda functions
    • Understand IAM policies, users and roles
    • Have experience with Route53 and a good understanding of DNS in general
    • Understand object storage with S3

  15) Programming Languages
    • Python
    • Java
    • JavaScript
    • Go Template Language

Qualifications:

  • Relevant IT degree/diploma/certification
  • 4+ years of experience as a Site Reliability Engineer or in a similar role as an enabler of DevOps practices.
  • 4+ years of experience as a Software Engineer, Java Developer or Middleware Administrator

Desired Skills:

  • AWS
  • SQL
  • Java
  • JavaScript
  • Python

Desired Work Experience:

  • 2 to 5 years Financial Advisory & Consulting Service
  • 2 to 5 years Software Development

Desired Qualification Level:

  • Degree

About The Employer:

A company in the insurance and financial industry, located in Centurion.
