Key Responsibilities:
- Maintain and enhance the reliability, availability, and performance of large-scale distributed systems.
- Automate deployment, monitoring, and management of production systems.
- Implement and manage CI/CD pipelines for software delivery.
- Collaborate with software engineers to design, build, and manage scalable and resilient infrastructure.
- Troubleshoot complex system issues, identify root causes, and implement long-term solutions.
- Monitor system performance and optimize configurations for better performance and cost efficiency.
- Implement security best practices and ensure compliance with industry standards.
Required Skills:
- Proficiency in cloud platforms (AWS, Google Cloud, or Azure) and containerization technologies like Docker and Kubernetes.
- Strong scripting and automation skills using Python, Bash, or similar languages.
- Experience with infrastructure as code (IaC) tools such as Terraform or Ansible.
- Deep understanding of monitoring and logging tools (Prometheus, Grafana, ELK Stack).
- Knowledge of database management (SQL/NoSQL) and networking fundamentals.
- Experience with CI/CD tools like Jenkins, GitLab CI, or CircleCI.
- Strong problem-solving skills and experience in troubleshooting large-scale systems.
Education and Experience:
- A degree in Computer Science, Engineering, or a related field from a recognized institution.
- Ideally, 3 to 6 years of experience in a similar role at a product company.
Date Posted: 20/10/2024
Job ID: 97157381