Senior DevOps Engineer

This position is responsible for implementing, maintaining, enabling and facilitating DevOps practices as well as optimizing the architecture and processes of the product and platforms required to meet business goals and objectives.

Implement and maintain infrastructure required for implementing DevOps
Enable automated deployment of applications and
Enable automated monitoring and
Enable automated end-to-end
Enable continuous release processes, practices and
Enable change management and audit requirements for release
Interest in designing, analyzing and troubleshooting large-scale distributed
Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and
Ability to debug and optimize code and automate routine
Scale systems sustainably through mechanisms such as easy-to-use tooling and automation
Practice sustainable incident response and drive root cause analysis

Competencies Required

Client / stakeholder commitment
Drive for results
Leads change and innovation
Impact and influence
Self-awareness and insight
Diversity and inclusiveness
Collaboration
Governance
Strong critical, analytical and research skills
Desire to teach and mentor others
Self-motivated, organized and able to work independently and as part of a team
Linux
Be proficient in shell scripting
Have a very good understanding of Linux operating systems
Be able to identify OS level issues and resolve them with minimal down-time
Be able to identify services running and their network configuration
WAS
Understand the basic operation of the WebSphere Application Server
Be able to identify faults in a particular node
Be able to view logs via SSH on a file mount, as well as via Kibana
Queues
Have a good understanding of queuing and queuing systems such as IBM MQ
Jenkins
Have a very good understanding of Jenkins
Be able to find and identify faults with slaves running on remote Docker servers
Be able to find slave SSH access key issues
Ansible
Have experience with creating and maintaining Ansible jobs
NGINX
Understand reverse proxies
Be able to read the nginx documentation and use it to extend our automated deployments and configuration
Be able to pull metrics and identify trends and faults from nginx logs in Kibana
Understand the impact of DNS resolution and nginx upstreams
Consul
Understand the concept of a central key-value store
Understand multi-node single-leader clusters
Be able to identify server-client communication faults
Understand service registration
Understand configuration templates
Docker
Have a very good understanding of containerization
Understand multi-tenant systems and the implications of load balancing across multiple instances
Be able to find faults in container setup and deployments
Have a good understanding of volume mounts and layered file systems
Kubernetes
Have a good understanding of container orchestration
Understand cluster DNS
Have experience with Istio service mesh
Have a good understanding of namespaces and quotas
Understand Kubernetes secrets and mounts
Have experience with log tailing and event monitoring
Be able to manage an EKS cluster
Networking
Know what a CIDR is
Have a good understanding of general networking
Be able to identify network faults
Have a good understanding of firewalls
Be able to set up and debug AWS Security Groups
Understand AWS VPCs and subnets
Monitoring
Be proficient with KQL and the Elasticsearch DSL
Be proficient with Prometheus queries and configuration
Understand Grafana or similar monitoring and alerting tools
Be proficient with CloudWatch metrics and logs
Have a good understanding of tracing using tools such as Jaeger
Repositories
Be highly proficient with Git
Be proficient with GitLab administration and GitLab pipelines
Understand Docker and Maven registries and repositories such as Nexus and Artifactory
Databases
Be proficient with MongoDB and MongoDB Ops Manager
Be proficient in SQL
Have a good understanding of the PostgreSQL DBMS
Have experience with AWS RDS Aurora PostgreSQL
AWS
Understand EC2 features, such as instance types, snapshots, ELB, and EBS
Be proficient in CloudFormation
Understand autoscaling and its cost implications
Be proficient with creating and deploying AWS Lambda functions
Understand IAM policies, users and roles
Have experience with Route 53 and a good understanding of DNS in general
Understand object storage with S3
Programming Languages
Python
Java
JavaScript
Go Template Language

Qualifications and Experience

Relevant IT degree/diploma/certification
4+ years of experience as a Site Reliability Engineer or in a similar role as an enabler of DevOps practices
4+ years of experience as a Software Engineer, Java Developer, or Middleware Administrator

Desired Skills:

DevOps Engineering
Python
Java
JavaScript
AWS
Cloud
Git
