DevOps Architect with 12-16 years of experience
Expert in setting up K8s clusters for large-scale infrastructure
Expert in, or at least familiar with, the setup and administration of Ansible, Prometheus, OpenTelemetry, Logstash, Kafka, and Elasticsearch (candidates unfamiliar with a particular tool should be able to learn it quickly)
Hands-on experience with infrastructure, security, and monitoring for enterprise applications, and knowledge of which options are appropriate for different scenarios.
Hands-on experience setting up CI/CD pipelines.
Must have extensive experience deploying microservices and web applications on the Kubernetes platform.
Should be capable of designing CI/CD and release management processes.
Must be familiar with security and DevOps best practices on the K8s platform.
Good understanding of Docker and orchestration tools.
Ability to evaluate DevOps tools and technologies and guide decisions on their adoption.
Must have exposure to Python or shell scripting and be familiar with the Linux OS.
Must have exposure to observability tools.
Ability to analyze logs for errors and exceptions and to drill down into errors at the application level.
Should be familiar with monitoring tools such as Splunk, Kibana, Grafana, and Prometheus.
General operational exposure, including good troubleshooting skills and an understanding of system capacity, bottlenecks, and the basics of memory, CPU, OS, storage, and networks.
Strong verbal and written communication skills are mandatory.
Excellent analytical and problem-solving skills are mandatory.
Good knowledge of Agile or Scrum methodologies
Should be self-motivated and able to lead a DevOps team.
Good aptitude and attitude; flexible to upskill and cross-train.
Willing to provide onsite and night-shift overlap.
Must be able to lead and guide the team on technical challenges.
Manage a team of 5+ engineers and keep high-level track of their work and deliverables.
Ability to apply and share DevOps culture, industry trends, and developments to improve software delivery practices at scale
Develop scripts for provisioning cloud resources.
Assist in operational enablement in different environments.
Assist use-case teams in deploying artifacts in cloud environments.
Automate the creation of CI/CD pipelines for build and deployment from Dev to UAT and then to production
Create and customize Docker images for deployment on Kubernetes clusters.
Work with infrastructure, security, and networking teams to resolve firewall and port issues in the cloud.
Monitor daily operations and service restoration, and debug job failures.
Assist use-case teams in troubleshooting failures.
Identify manual processes and activities and automate them using shell, Python, etc.
Continuous monitoring, troubleshooting, and debugging of issues in the ecosystem.
Prepare a knowledge base and documentation on environment configuration, deployment, etc.
Contribute to improving the efficiency of the assignment through quality improvements and innovative suggestions.
Contribute to developing a knowledge base in collaboration with other team members.