Position Title: Senior Site Reliability Engineer (SRE)
- Experience: 7-11 years
- Location: Pune, India (Hybrid)
Job Overview:
We are looking for a highly skilled Senior Site Reliability Engineer (SRE) to join our team in Pune. You will be responsible for maintaining the stability, availability, and reliability of critical systems by implementing DevOps and SRE best practices. Your expertise will ensure continuous monitoring, automation, and proactive incident management, allowing us to deliver a seamless experience to users and clients.
Key Responsibilities:
- System Reliability & Performance: Ensure high availability, scalability, and performance of infrastructure and applications.
- Monitoring & Incident Management: Implement and manage monitoring tools (Prometheus, Grafana, etc.) to detect issues early and proactively resolve them.
- Automation: Automate operational tasks, including infrastructure provisioning, CI/CD pipelines, and incident resolution.
- Incident Response: Lead incident response and post-mortem investigations to prevent future outages and improve reliability.
- Collaboration: Work closely with development and operations teams to ensure seamless deployments and reliable services.
- Capacity Planning: Analyze system performance and design solutions for improving service scalability.
- Continuous Improvement: Identify and implement performance optimization strategies for infrastructure on an ongoing basis.
Required Skills:
- DevOps Expertise: Strong experience in DevOps methodologies and tools, including CI/CD pipelines and infrastructure as code.
- Monitoring Tools: Hands-on experience with monitoring tools like Prometheus, Grafana, Datadog, or similar platforms.
- Cloud Platforms: Proficiency with cloud services (AWS, Azure, GCP) for deploying and managing applications.
- Automation & Scripting: Expertise in automation tools and scripting languages (e.g., Bash, Python, Ansible, Terraform).
- Problem Solving: Strong troubleshooting skills to diagnose, analyze, and resolve complex technical issues.
- Collaboration: Experience working in cross-functional teams and excellent communication skills to work effectively with developers and stakeholders.
Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field.
- 7-11 years of experience in SRE, DevOps, or Systems Engineering.
- Proven experience managing complex systems with high availability.
- Knowledge of containerization technologies like Docker and Kubernetes is a plus.
- Experience with incident management and on-call rotations.
Why Join Us:
- Innovative Environment: Work with cutting-edge technology and tools.
- Growth Opportunities: Be part of a team where your ideas and contributions are valued, and where you can grow in your career.
- Work-Life Balance: We offer a flexible and supportive work environment with opportunities for remote work and professional growth.
You can send your CV directly to [Confidential Information].
Note: Please mention your full name, years of experience, and location preference in the subject line.