SRE II - Observability & Reliability

Movius

Full time

Bengaluru / Bangalore, India

Job Description

Job Summary

We are seeking a Senior Software Engineer to join our Site Reliability Engineering (SRE) team, with a focus on Observability and Reliability. As a key member of the team, you will play a critical role in ensuring the performance, stability, and availability of our applications and systems, with particular emphasis on Application Performance Management, Observability, and Reliability of the platform.

The Senior Software Engineer will be responsible for the design, implementation, and maintenance of our observability and reliability infrastructure, with a primary focus on the ELK stack (Elasticsearch, Logstash, and Kibana). The role involves configuring, fine-tuning, and automating alerts, integrating Elastic solutions with other tools and applications, generating reports, and optimizing the observability and monitoring systems.

Key Duties & Responsibilities

1. Collaborate with cross-functional teams to define and implement observability and reliability standards and best practices.

2. Design, deploy, and maintain the ELK stack for log aggregation, monitoring, and analysis.

3. Develop and maintain alerts and monitoring systems, ensuring early detection of issues and rapid incident response.

4. Create, customize, and maintain dashboards in Kibana for different stakeholders.

5. Collaborate with software development teams to identify performance bottlenecks and recommend solutions.

6. Automate manual tasks and workflows to streamline observability and reliability processes.

7. Conduct regular system and application performance analysis and optimization, and contribute to automation and tooling, capacity planning, security practices and compliance adherence, documentation and knowledge sharing, and disaster recovery and backup.

8. Generate and deliver detailed reports on system performance and reliability metrics.

9. Stay up to date with industry trends and best practices in observability and reliability engineering.

Qualifications/Skills/Abilities

Minimum Requirements

Formal Education

Bachelor's degree in Computer Science, Information Technology, or a related field (or equivalent experience).

Experience (type & duration)

5+ years of experience in Site Reliability Engineering, Observability & Reliability, DevOps.

Skills

Proficiency in configuring and maintaining the ELK stack (Elasticsearch, Logstash, Kibana) is mandatory.
Strong scripting and automation skills, with expertise in Python, Bash, or similar languages.
Experience designing data structures with Elasticsearch indices.
Experience writing data ingestion pipelines using Logstash.
Experience with infrastructure as code (IaC) and configuration management tools (e.g., Ansible, Terraform).
Hands-on experience with cloud platforms (AWS preferred) and containerization technologies (e.g., Docker, Kubernetes).
Telecom domain expertise is good to have but not mandatory.
Strong problem-solving skills and the ability to troubleshoot complex issues in a production environment.
Excellent communication and collaboration skills.

Accreditation/certifications/licenses

Relevant certifications (e.g., Elastic Certified Engineer) are a plus.

More Info

Industry: Other

Job Type: Permanent Job

Date Posted: 08/10/2024

Job ID: 95391095

About Company

Movius

Job Source: movius.ai

Last Updated: 20-10-2024 07:52:29 PM
