Job Responsibilities:
- Develop complex, globally distributed infrastructure monitoring, management, and automation solutions to manage our global network.
- Lead the design of, write, and build tools to improve the reliability, availability, and scalability of Datacenter/Cloud Networks, Property Networks, and Corporate Networks.
- Serve as technical lead for the development of complex global distributed infrastructure monitoring, management, and automation solutions to manage our global network.
- Serve as technical lead for the design of new monitoring tools and smart alerts that help discover failures or issues before our customers do.
- Collaborate with other Network teams to develop network SRE solutions with a focus on production integration
- Conduct network analysis and configuration management, and develop improvements for system software performance, availability, and reliability.
- Provide program management assistance and contribute input to help manage project schedules, risks, and costs.
- Manage Network SRE products and solutions, including the design, low-level engineering, and delivery of new hardware systems for Marriott applications across the network.
- Define and implement an operational Recovery Time Objective (RTO) and Recovery Point Objective (RPO) strategy for all Network Infrastructure areas.
- Establish management-level relationships and partner with all Business disciplines and other MI teams to define Network SRE services, meet service level requirements, and serve as an escalation point to resolve service delivery and operational issues.
- Develop, document, and manage the requirements gathering process and provide detailed design and business processes to support the requirements throughout the project life cycle
- Drive accountability with strategic sourcing partners, vendors, telco/ISPs, etc., launching and managing Performance Improvement initiatives where appropriate.
- Create functional strategies and specific objectives for the sub-function and contribute to the development of budgets, policies, and procedures to support the functional Network SRE tools, systems, and infrastructure.
- Perform network troubleshooting and upgrades. Coordinate with local teams and vendors, solve problems, and restore services as needed.
- Foster an environment of continuous improvement and structured processes and procedures that support a zero-fault culture.
Maintaining Goals:
- Submits reports in a timely manner, ensuring delivery deadlines are met.
- Promotes the documenting of project progress accurately.
- Provides input and assistance to other teams regarding projects.
Demonstrating and Applying Discipline Knowledge:
- Provides technical expertise and support to persons inside and outside of the department.
- Demonstrates knowledge of job-relevant issues, products, systems, and processes.
- Demonstrates knowledge of function-specific procedures.
- Keeps up-to-date technically and applies new knowledge to job.
- Uses computers and computer systems (including hardware and software) to enter data and/or process information.
Delivering on the Needs of Key Stakeholders:
- Understands and meets the needs of key stakeholders.
- Develops specific goals and plans to prioritize, organize, and accomplish work.
- Determines the priorities, schedules, plans, and resources necessary to complete projects on schedule.
- Collaborates with internal partners and stakeholders to support business/initiative strategies
- Communicates concepts in a clear and persuasive manner that is easy to understand.
- Generates and provides accurate and timely results in the form of reports, presentations, etc.
- Demonstrates an understanding of business priorities
Skill and Experience:
- 4-6 years of experience in collecting, processing, and monitoring telemetry data with a focus on analyzing, troubleshooting, and driving continuous improvements in mission-critical networks.
- 6+ years of experience with network and application monitoring tools and related products
- Experience installing, configuring, and troubleshooting network and application monitoring tools (NetScout, ThousandEyes, SolarWinds/Broadcom DX NetOps, BigPanda, AI/ML-based network performance monitoring tools, or other similar tools)
- Experience in developing, documenting, and managing the requirements gathering process and providing detailed design and implementation plans to support the requirements throughout the project life cycle
- Field experience and knowledge of foundational data networking and IP technologies (ARP, TCP/IP, UDP, DHCP, DNS, NAT, and others)
- Experience with common routing and switching platforms (Cisco, Juniper, HP/Aruba, etc.)
- Experience with one or more Cloud Computing platforms (e.g. Amazon AWS, Microsoft Azure, Google Compute Engine)
- Demonstrated experience in delivering written documents detailing network solutions and diagrams
- Knowledge of and experience in NetFlow-related configuration practices
- Expertise in administering devices and policies in network tools
- Technical knowledge of common routing protocols (e.g., OSPF, BGP)
- Experience with Agile methodologies, including daily stand-up meetings, sprint planning sessions, and user story preparation
- Advanced Degree (e.g., MS, PhD) in Computer Science or other technical discipline or MBA, preferably with a focus on technology
- Experience managing monitoring tools in the hospitality industry is a plus
- Experience in leveraging public APIs for developing automation scripts
- Team player with the ability to collaborate and work with cross-functional teams in multiple time zones
- Experience in researching emerging technologies, trends, standards, and products and synthesizing them into clear technology roadmaps and strategies
- Strong knowledge of emerging tools, applications, and systems for attaining best-in-class network observability across the enterprise
- Excellent problem-solving skills, both when working independently and when leading outcomes for cross-functional teams
- Excellent understanding of change management and of testing requirements and techniques to ensure high availability and business readiness of platforms
- Strong attention to detail with an ability to operate effectively across multiple priorities
- Ability to perform independently, as a member of a team, and across cross-functional initiatives
- Proven track record of driving transformation in network technologies, tools, and processes through a data-driven continuous improvement methodology
- Demonstrated experience in improving reliability, performance, and agility of complex enterprise networks
- Strong understanding of network infrastructure automation, instrumentation, and monitoring platforms and the emerging technologies in this area
- Strong influencing skills and an ability to overcome barriers while driving change
- Excellent verbal and written communication skills for a wide range of audiences including executives, business stakeholders, and IT teams
Education and Certifications:
- Undergraduate degree in an engineering or computer science discipline and/or equivalent experience/certification
Work location: Hyderabad, India.
Work mode: Hybrid