Job Location: Pune, Hybrid Mode
Joining: Immediate
Responsibilities:
- Establish and advocate the Resiliency Program strategy, ensuring alignment with IT teams.
- Develop and implement push-button DR automation for applications and business capabilities.
- Design and implement policies and procedures for effective disaster recovery planning.
- Manage the development and maintenance of recovery plans to ensure they remain current and compliant.
- Oversee the test lifecycle, facilitating testing with infrastructure and application teams, in line with the DR strategy.
- Engage with infrastructure and application teams to gather requirements and discuss disaster recovery strategies.
- Possess a solid understanding of IT infrastructure, applications, and operational environment structures.
- Work effectively as a collaborative team member in supporting DR activities.
- Influence and collaborate with internal and external resources to achieve DR objectives.
- Make decisions on complex issues and resolve problems that arise in the DR process.
- Manage multiple assignments and priorities in a fast-paced environment using strong organizational and time-management skills.
- Take initiative in determining objectives and approaches to assignments.
- Provide project leadership when required, directing and maintaining the Resiliency program.
- Apply knowledge of Crisis Management, Business Continuity, and Event Management best practices.
- Be available to support test exercises or unplanned outages, even outside of regular business hours.
- Demonstrate strong verbal and written communication skills, reporting progress and status to stakeholders.
Minimum Qualifications:
- Bachelor's degree in Computer Science, Mathematics, or a related technical field.
- 5 years of professional experience in technical engineering.
- 5 years of automation experience.
- 3 years of professional experience with cloud computing (IaC, Microsoft Azure, AWS).
Primary Skills (Must Have):
- Build automated solutions for various infrastructure components using Ansible.
- Develop custom Ansible modules/roles as per business requirements.
- Ansible AWX: Configure Job Templates, workflows, and schedules.
- Manage dynamic/static inventory and notifications in Ansible.
- Strong knowledge of Linux and Windows platforms, shell scripting (e.g., Bash), and PowerShell scripting.
- Ability to analyze logs, metrics, and events to identify and resolve errors.
- Integration with external systems such as CyberArk and ServiceNow.
- Strong knowledge of Python and Python scripting.
- Expertise in REST APIs, Ansible collections, Ansible Galaxy.
- Source control management knowledge with tools like Git, GitHub, or Bitbucket.
- Collaborate with infrastructure teams to identify new scope or enhance existing automations.
- Good understanding of disaster recovery concepts.
- Familiarity with YAML, JSON, and Jinja templates.
- Excellent communication and documentation skills, leveraging the Atlassian stack (Jira, Confluence).
Secondary Skills (Good to Have):
- Expertise in VMware automation, including Code Stream pipelines and Service Broker.
- Integrate automation with any DR tool.
- Knowledge of cloud services such as AWS or Azure.