Site Reliability Engineer

AlgoTale

5 months ago

Experience: 0-2 years

Full-time

India

Job Description

Overview

As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our infrastructure and applications. You will work closely with our development and operations teams to build and maintain the necessary tools and systems to support our growing platform.

Key Responsibilities

Design, build, and maintain infrastructure and tools for optimal operation, monitoring, and reliability of our systems.
Collaborate with development teams to improve and support our continuous integration and delivery processes.
Develop automation tools for provisioning, configuration, and deployment.
Monitor system performance and troubleshoot issues as they arise.
Participate in the on-call rotation and incident response, resolving production issues promptly.
Implement best practices for security, reliability, and fault tolerance.
Conduct capacity planning and performance analysis to support growth and scalability.
Provide technical guidance and support to cross-functional teams.
Participate in the design and implementation of disaster recovery and backup processes.
Contribute to the documentation and dissemination of best practices.

Required Qualifications

Bachelor's degree in Computer Science, Engineering, or a related field.
Proven experience in infrastructure management and operations.
Proficiency in one or more scripting languages such as Python, Ruby, or Bash.
Experience with automation tools like Ansible, Chef, or Puppet.
Deep understanding of cloud technologies and providers like AWS, Azure, or GCP.
Strong knowledge of monitoring systems such as Nagios, Zabbix, or Prometheus.
Demonstrated ability in incident response and on-call support.
Understanding of networking basics and protocols.
Excellent collaboration and communication skills.
Experience in implementing and maintaining security best practices.
Knowledge of containerization and orchestration tools such as Docker and Kubernetes.
Ability to work in a fast-paced, dynamic environment and prioritize tasks effectively.
Experience applying infrastructure-as-code principles with tools such as Terraform or CloudFormation.
Familiarity with version control systems such as Git.
Strong problem-solving and troubleshooting skills.

Skills: infrastructure management, scripting languages, troubleshooting, automation tools, cloud technologies, incident response, collaboration, DevOps