Responsibilities
Own the reliability, uptime, and availability of the infrastructure.
Develop tools and applications to improve the developer experience.
Eliminate toil.
Ensure the availability, scalability, and performance of services by
implementing best practices in site reliability engineering, including
monitoring of the four golden signals (latency, traffic, errors, saturation).
Collaborate with internal teams and vendors to fix and improve tools and
processes and to increase productivity.
Manage infrastructure using code and configuration management tools like
Terraform, Ansible, or similar technologies.
Implement and maintain security best practices in system and network
configurations.
Requirements
Proficiency in at least one programming language (Go or Python
preferred). This is a must-have requirement.
Extensive experience with networking at layers 3, 4, and 7.
Experience with system design and proven experience with large-scale
distributed systems.
Extensive experience with cloud platforms (AWS, GCP, Azure) and containers
and container orchestration (Docker, Kubernetes), including designing CI/CD
pipelines for them.
Experience with Kubernetes operators and upgrades, including the ability to
write operators.
Extensive experience with AWS or another cloud provider.
Experience using system monitoring tools such as Prometheus, Grafana,
Mimir, Loki, New Relic, and ELK.
In-depth knowledge of some of the following databases: MongoDB, MySQL,
Postgres.
Familiarity with Linux operating systems.
If you have contributed to open source projects, be sure to mention it.
Date Posted: 27/06/2024
Job ID: 83223511