• Analyse and locate root causes, recognize and address systemic factors, and diagnose and mitigate weaknesses before they become disruptive.
  • Respond to alerts and service requests, resolving incidents to ensure system uptime and expected service levels.
  • Perform deep dives into stability issues and provide solutions that promote system stability and optimization.
  • Provide operational support on a rotating, on-call schedule as part of an Operations team.
  • Build and set up new development tools, infrastructure, and cloud services.
  • Work with software developers and software engineers to ensure that development follows established processes and works as intended.
  • Increase coverage of monitoring and alerting capability.
  • Perform trend analysis and real-time analysis that result in production solutions for increased system and functional stability.
  • Investigate, analyse, and document production incidents according to the SLA and urgency.
  • Escalate incidents to 3rd-line support if needed (squad, developers, Infra, Network, DB Admins), and work closely with all necessary parties to solve problems.
  • Create and maintain up-to-date documentation and procedures for your area of support.
  • Ability to effectively interface with technical and nontechnical staff at all organizational levels.
  • Excellent problem solving/analytical skills and knowledge of analytical tools.
  • Log all incidents accurately and document all investigative activities, including all technical means employed to ascertain the nature of the fault and the remedial action taken.
  • Build and maintain monitoring dashboards and alerts to ensure production and system uptime for all systems, both on premises and in the cloud.
  • Review security alerts to determine the relevance and urgency of potential threats and take appropriate action to mitigate risk.
  • Run vulnerability scans and review vulnerability assessment reports to assess, address and report vulnerabilities to the development teams.
  • Manage and configure security monitoring tools (net flows, IDS, correlation rules, etc.) to ensure optimal use and coverage.
  • Monitor security access to identify potential risks and address them with appropriate actions.
  • Define access privileges, control structures and resources to protect systems.
  • Contribute to the development and maintenance of security policies, procedures, standards, and awareness, by providing data insights through analysis.
  • Ensure that systems are safe and secure against cybersecurity threats.

Minimum Requirements:

  • At least one object-oriented and one scripting language (Python, PowerShell, Bash and .NET)
  • Familiarity with Docker, Kubernetes & Helm
  • Knowledge of Cloud network topologies and configuration techniques such as VLANs, VPNs or VNETs
  • Comprehensive knowledge of network protocols and services such as TCP/IP, DNS, and DHCP
  • Online version control systems (Subversion, GitHub, Bitbucket)
  • An understanding of Azure Cloud infrastructure
  • Comfortable working with a small team in a fast-paced environment
  • Configuration management and containerization tools
  • Common data stores, both relational and NoSQL
  • Data integrity, security, and continuity of business

Educational Requirements:

• 3 years of IT Software & System experience, with a focus on IT Operations & DevSecOps

• Good understanding of Unix & Windows operating systems

• Experience with SSL & TLS

• Proven experience in monitoring capabilities

• Strong understanding of middleware technologies

• Cloud Certifications are an advantage.

• Experience migrating on-prem services to cloud.

• Experience working in an Agile environment and with Agile-based tooling

Desired Skills:

  • Python
  • Unix/Linux
  • Cloud network
  • Data integrity
