Senior DevOps Engineer

This position is responsible for implementing, maintaining, enabling and facilitating DevOps practices as well as optimizing the architecture and processes of the product and platforms required to meet business goals and objectives.

Implement and maintain infrastructure required for implementing DevOps

Enable automated deployment of applications and

Enable automated monitoring and

Enable automated end-to-end

Enable continuous release processes, practices and

Enable change management and audit requirements for release

Interest in designing, analyzing and troubleshooting large-scale distributed

Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and

Ability to debug and optimize code and automate routine

Scale systems sustainably through mechanisms such as easy-to-use tooling and automation

Practice sustainable incident response and drive root cause analysis

Competencies Required

Client / stakeholder commitment

Drive for results

Leads change and innovation

Impact and influence

Self-awareness and insight

Diversity and inclusiveness

Collaboration

Governance

Strong critical, analytical and research skills

Desire to teach and mentor others

Self-motivated, organized and able to work independently and as part of a team

Linux

Be proficient in shell scripting

Have a very good understanding of Linux operating systems

Be able to identify OS level issues and resolve them with minimal down-time

Be able to identify services running and their network configuration

WAS

Understand the basic operation of the WebSphere Application Server

Be able to identify a fault in a particular node

Be able to view logs via ssh on file mount, as well as via Kibana

Queues

Have a good understanding of queuing and queuing systems such as IBM MQ

Jenkins

Have a very good understanding of Jenkins

Be able to find and identify faults with slaves running on remote Docker servers

Be able to find slave ssh access key issues

Ansible

Have experience with creating and maintaining Ansible jobs

Nginx

Understand reverse proxies

Be able to read the nginx documentation and use it to extend our automated deployments and configuration

Be able to pull metrics and identify trends and faults from nginx logs in Kibana

Understand the impact of DNS resolution and nginx upstreams

Consul

Understand the concept of a central key-value store

Understand multi-node single-leader clusters

Be able to identify server-client communication faults

Understand service registration

Understand configuration templates

Docker

Have a very good understanding of containerization

Understand multi-tenant systems and the implications of load balancing across multiple instances

Be able to find faults in container setup and deployments

Have a good understanding of volume mounts and layered file systems

Kubernetes

Have a good understanding of container orchestration

Understand cluster DNS

Have experience with Istio service mesh

Have a good understanding of namespaces and quotas

Understand Kubernetes secrets and mounts

Have experience with log tailing and event monitoring

Be able to manage an EKS cluster

Networking

Know what a CIDR is

Have a good understanding of general networking

Be able to identify network faults

Have a good understanding of firewalls

Be able to set up and debug AWS Security Groups

Understand AWS VPCs and subnets

Monitoring

Be proficient with KQL and the Elasticsearch DSL

Be proficient with Prometheus queries and configuration

Understand Grafana or similar monitoring and alerting tools

Be proficient with CloudWatch metrics and logs

Have a good understanding of tracing using tools such as Jaeger

Repositories

Be very proficient with Git

Be proficient with GitLab administration and GitLab pipelines

Understand Docker and Maven registries and repositories such as Nexus and Artifactory

Databases

Be proficient with MongoDB and MongoDB Ops Manager

Be proficient in SQL

Have a good understanding of the PostgreSQL DBMS

Have experience with AWS RDS Aurora PostgreSQL

AWS

Understand EC2 features, such as instance types, snapshots, ELB, and EBS

Be proficient in CloudFormation

Understand autoscaling and its cost implications

Be proficient with creating and deploying AWS Lambda functions

Understand IAM policies, users and roles

Have experience with Route 53 and a good understanding of DNS in general

Understand object storage with S3

Programming Languages

Python

Java

JavaScript

Go Template Language

Qualifications and Experience

Relevant IT degree/diploma/certification

4+ years of experience as a Site Reliability Engineer or in a similar role as an enabler of DevOps practices

4+ years of experience as a Software Engineer, Java Developer or Middleware administrator

Desired Skills:

DevOps Engineering

Python

Java

JavaScript

AWS

Cloud

Git


