Support all aspects of Cloud operations on a day-to-day basis and monitor the continuous availability, durability, performance, scale, and uptime of multi-cloud JFrog SaaS services.
Analyse events, perform troubleshooting and incident response, communicate with internal stakeholders, and track events and problems through to resolution.
Build and maintain SOPs and tribal knowledge for the necessary monitoring and automation activities.
Collaborate closely with SRE/Production engineering and Cloud engineering on SaaS improvements.
Requirements:
Minimum 1 year of experience as a Cloud-Ops engineer, NOC engineer, and/or Linux sysadmin.
Working knowledge of public cloud environments (AWS/Azure/GCP) and Kubernetes-orchestrated containerized workloads.
Experience with any monitoring and instrumentation tools (Prometheus/Grafana, New Relic, Elastic, or equivalent) is a plus.
Experience with automation (Jenkins, Python, and shell scripting) is a plus.
Strong written and verbal communication skills.
Strong attention to detail and ability to handle multiple tasks concurrently.
Ability to work independently with minimal supervision.