Senior Site Reliability Engineer

Company name confidential
  • Posted 24 days ago
  • Over 100 applicants
3-6 Years
INR 9.1 - 15.2 LPA

ITES/BPO/Call Center

Job Description

HIRING FOR TATA CONSULTANCY SERVICES (TCS)

Responsibilities:
  • You will create infrastructure as code (IaC) and automate manual processes using tools like Bash.
  • You will automate the deployment of applications and services to staging and production environments. This includes building CI and CD pipelines, containerization and orchestration of workloads, configuration management, etc.
  • You will build auto-scaling systems that scale up or down based on user demands.
  • You will build observability into systems, making it easier to find and resolve issues before they blow up in production.
  • You will implement ways to improve system performance and optimize cloud costs.
  • You will meticulously create RCAs, runbooks, and checklists and follow them diligently.
  • You will own the reliability of systems that are live in production.
  • Auto-scaling eKYC machine learning workloads to handle 2 million API requests/day.
  • Migrating 1.3 TB of primary data from self-hosted MySQL to GCP Cloud SQL.
  • Building a control plane for multi-cluster Kubernetes setup.
  • Implementing GitOps for continuous deployment of microservices.
  • Migrating background jobs from VMs to Kubernetes using KEDA.

Requirements:
  • Understanding of basic Bash scripting and computer networking (SSH, TCP, HTTP).
  • Experience with using a programming language (we primarily use Go) to build a basic REST API.
  • Experience with using Git as a version control system.
  • Experience with any of the cloud providers (AWS, GCP, etc.) to deploy a three-tier web app.
  • A high-level understanding of system components (databases, caches, reverse proxies, CDNs) and how and where they fit in the big picture.
  • Experience in creating CI and CD pipelines to build and deploy at least a simple REST API application to dev/prod environments.
  • Ability to take code from local to prod by implementing Continuous Integration and Delivery principles.
  • Exposure to building, scaling, and deploying software following the 12-factor app methodology (https://12factor.net/principles).
  • Experience in working with microservices and using container orchestration tools like Kubernetes/Nomad.
  • Experience with using observability tools and setting up monitoring and alerting for microservices using Prometheus, Grafana, Loki, ELK stack, Datadog, and the like.
  • Experience implementing everything as code, from infra to policies, security, and configuration, using relevant tools such as Terraform, OPA, and Ansible.
  • Experience with building cloud-agnostic, homogeneous deployment solutions.

Date Posted: 18/06/2024

Job ID: 82076495

Last Updated: 18-02-2025 09:38:54 AM