Designation: Lead DevOps Engineer
Location: Hyderabad
Work Mode: Office
Reporting to: Associate Director - DevOps
About the Role:
As a Lead DevOps Engineer at Foundation AI, you will take on a spectrum of responsibilities, from
technical expertise in version control and coding to leadership in CI/CD implementation. Moreover,
your effective troubleshooting abilities and strong communication skills are integral to ensuring
seamless operations in a dynamic, cloud-native environment with a commitment to best practices and
efficient collaboration.
Responsibilities:
Work Location Commitment: As a Lead DevOps Engineer, you'll be expected to work from
our office in Hyderabad. This reflects our preference for in-person collaboration and a
commitment to team cohesion.
Rich Industry Experience: You should have 7-10 years of experience in
DevOps and Site Reliability Engineering (SRE) and should have worked for product-based
companies (Startup/Scaleup). This extensive experience underscores your ability to navigate
complex DevOps challenges effectively.
Mastery of Version Control: A critical aspect of your role involves demonstrating an in-depth
mastery of version control systems. Your proficiency in this area ensures the proper
management of code repositories and versioning.
Operating System Expertise: Your command over operating systems is particularly vital,
with a strong emphasis on Linux. This expertise ensures a solid foundation for managing and
optimizing system-level operations.
DevOps Methodology: Your role will require you to not only apply DevOps concepts but also
effectively implement best practices. This includes streamlining processes and fostering a
culture of collaboration and continuous improvement.
CI/CD Leadership: You will be at the forefront of CI/CD (Continuous Integration and
Continuous Deployment) efforts. This leadership position involves overseeing the automation
of software delivery pipelines, enabling rapid and reliable releases.
Efficient Troubleshooting: Troubleshooting is a core aspect of your responsibilities. You'll
need to swiftly and efficiently diagnose and resolve issues that arise in the development and
production environments, minimizing downtime.
Effective Communication and Collaboration: Exceptional communication and collaboration
skills are essential. You'll work closely with cross-functional teams, bridging the gap between
development and operations, and ensuring smooth coordination.
Cloud-Native Proficiency: Proficiency in cloud-native applications is crucial. You'll be tasked
with architecting, deploying, and managing applications in cloud environments, harnessing
the benefits of scalability and resilience.
Understanding Distributed Computing: A solid grasp of distributed computing principles is
fundamental. It enables you to design and implement systems that can handle complex,
distributed workloads effectively.
Coding Prowess: Your coding skills, particularly in Bash Shell Scripting and Python, will play
a pivotal role. These skills empower you to automate tasks and develop tools to enhance
system reliability and efficiency.
Technical Guidance and Support: Provide technical guidance to the team, helping to
resolve complex technical issues and production problems.
Role:
Manage code deployment, configuration, and monitoring.
Ensure service availability, latency, change management, and capacity management.
Utilize SLAs, SLIs, and SLOs to define system reliability.
Collect and share data with development teams to improve code quality.
Focus on monitoring and logging for proactive issue resolution.
Implement automation to reduce manual work and toil.
Balance operations and development work.
Experience with Infrastructure as Code (e.g., Terraform).
Proficiency in Shell scripting and Linux OS.
Database configuration experience.
Exposure to ELK stack.
Ability to enhance and maintain CI/CD pipelines (e.g., Jenkins).
Proficiency in managing Git repositories.
Knowledge of AWS services and access provisioning.
Optimize product scalability and availability.
Deploy and maintain monitoring tools for resources.
Collaborate with support and engineering teams to meet SLAs.
Utilize cloud services for efficient deployments.
Act as a configuration manager and maintain proper documentation.
Optimize cloud resources and reduce spending.
AWS Certification is a plus.
Knowledge of Airflow, Helm Charts, AWS SageMaker, and MLOps is a plus.
Qualifications:
Experience of 7-10 years.
Master's degree in Computer Science is preferred.
Experience developing engineering applications for a large organization.
Demonstrated project development and leadership skills.
Understanding of best practices regarding system security measures.
Proficient in coding and scripting.
Detailed knowledge of web and application servers.
Understanding of AWS and Azure services.
Experience building CI/CD pipelines.
Familiarity with Linux and Windows operating systems.
Experience with containerized platforms and container orchestration.
Education: A BTech degree in Computer Science or equivalent experience relevant to the functional
area.
Date Posted: 20/06/2024
Job ID: 82381251