Job Purpose
Analyse, troubleshoot, and design vital services, platforms, and infrastructure with a constant focus on reliability, scalability, resilience, security, and performance. Lead a team of SRE engineers.
Job Responsibilities (JR): 6-8 Areas:
Help build a Site Reliability Engineering culture by sharing the best practices, approaches, documentation, and code with other engineering teams
Apply automation and software to any tasks or parts of the system which are performed manually
Troubleshoot complicated cross-platform issues spanning OS, networking, and databases in a cloud-based SaaS environment, and handle live production incidents
Monitor application performance, take steps to improve overall application performance and stability, and follow through with implementation
Conduct system analysis and configuration management, and develop improvements for system software performance, availability, and reliability
Actionable (4-6):
Design, write, ship, and motivate the creation of software and systems to increase observability, product reliability and organizational efficiency
Maintain and monitor the deployment and orchestration of servers, Docker containers, databases, and general backend infrastructure
Develop runbooks/Standard Operating Procedures for recurring production issues while also working on a permanent fix.
Perform incident analysis on a regular basis to prevent incidents and find long-term solutions for them.
Educational Qualification:
Total years of experience: 8-13
Education:
B.Tech in Computer Science or related discipline preferred.
Skills:
Experience in monitoring and analyzing infrastructure performance using standard performance monitoring tools
Demonstrable experience in containerization (Docker) and orchestration (Kubernetes)
Experience with Infrastructure as Code (Terraform, CloudFormation, Ansible)
Knowledge and proven hands-on experience in large-scale databases and distributed technologies, such as Kafka and Confluent Platform Kafka
Basic programming and scripting skills
Industry: Other
Job Type: Permanent Job
Date Posted: 06/11/2024
Job ID: 99375367