athenahealth

Senior Site Reliability Engineer - AWS

Job Description

We are looking for a Senior Site Reliability Engineer to join our Cloud Infrastructure Engineering division. Cloud Infrastructure Engineering ensures the continuous availability of the technologies and systems that are the foundation of athenahealth's services. We are directly responsible for thousands of servers and petabytes of storage, and we handle thousands of web requests per second, all while sustaining growth at a meteoric rate. We enable an operating system for the medical office that abstracts away administrative complexity, leaving doctors free to practice medicine.

What we are looking for:

You're a seasoned engineer with a passion for identifying and resolving reliability and scalability challenges. You are a curious team player, someone who loves to explore, learn, and make things better. You are excited to uncover inefficiencies in business processes, creative in finding ways to automate solutions, and relentless in your pursuit of greatness. You're a nimble learner capable of quickly absorbing complex solutions and an excellent communicator who can help evangelize engineering excellence.

The Team:

We are a group of Site Reliability Engineers who are passionate about reliability, automation, and scalability. We use an agile-based framework to execute our work, ensuring we are always focused on the most important and impactful needs of the business. We support systems in both private and public cloud and make data-driven decisions about which one best suits the needs of the business. We are relentless in automating away manual, repetitive work so we can focus on projects that help move the business forward.

Primary Responsibilities

  • Deploying, automating, maintaining, and managing AWS production systems.
  • Ensuring that AWS production systems are reliable, secure, and scalable.
  • Resolving problems across multiple platforms and application domains using system troubleshooting and problem-solving techniques.
  • Providing primary operational support and engineering for all Cloud and Enterprise deployments.
  • Monitoring system performance and identifying downtime along with its underlying causes.
  • Creating and developing cost-effective systems within an account.

Secondary Responsibilities

  • Working closely with developers, testers, and system administrators.
  • Introducing processes, tools, and methodologies to balance needs throughout the SDLC and/or pipeline management and data flow.
  • Integrating security measures in the development lifecycle.

Typical Qualifications

  • 7+ years of experience building, scaling, and supporting highly available systems and services.
  • Expertise in the delivery, maintenance, and support of Linux systems and infrastructure.
  • Experience building AWS platforms.
  • Extensive AWS experience: working familiarity with commonly used AWS services (compute/EC2, networking, content delivery, containers/ECS/EKS, storage/S3, CloudFormation, serverless computing/Lambda, load balancing, AMIs, operations management best practices, etc.) is required.
  • Expertise in configuration management tools like Puppet. Experience with Infrastructure-as-Code, Linux, and API integration. Familiarity with Terraform desired.
  • Proficiency in at least one scripting or programming language (Ansible, Python, Go, Ruby, Shell, etc.).
  • Experience implementing solutions using SRE and DevOps principles, continuous integration and continuous delivery, and source code management/version control (Bitbucket/GitHub).
  • Familiarity with telemetry, observability, and current monitoring and visualization tools (e.g., Prometheus, Alertmanager, Grafana, or similar) desired.
  • Expertise in promoting and driving system visibility to aid in the rapid detection and resolution of issues.

Behaviors & Abilities Required:

  • Ability to learn and adapt in a fast-paced environment.
  • Ability to work collaboratively on a cross-functional team with a wide range of experience levels.
  • Ability to prioritize both individual time and the time of the team.
  • Strong negotiation and problem-solving skills.
  • Ability to keep projects on track and provide regular progress updates.
  • Ability to context-switch when required and manage multiple projects simultaneously.
  • Participation in rotational on-call duty and/or shift work.

Date Posted: 10/06/2024

Job ID: 81339983

Last Updated: 10-06-2024 01:18:01 PM