This job is with Amazon, an inclusive employer and a member of myGwork, the largest global platform for the LGBTQ+ business community. Please do not contact the recruiter directly.
Description
Amazon Music
Imagine being a part of an agile team where your ideas have the potential to reach millions. Picture working on cutting-edge consumer-facing products, where every single team member is a critical voice in the decision-making process. Envision being able to leverage the resources of a Fortune-500 company within the atmosphere of a start-up. Welcome to Amazon Music, where ideas are born and come to life.
The Region Flexibility team is focused on transforming how Music serves customer traffic by enabling seamless expansion into new AWS regions. This initiative involves scaling beyond primary regions such as DUB, IAD, and PDX to accommodate growth and regulatory changes and to minimize risks from region-specific outages. The team will develop core components that simplify the process of migrating services into new regions, enabling the use of distant availability zones while balancing latency and cost. This includes experimenting with latency-sensitive and latency-insensitive workloads, finding optimal placements for business logic and data, and reducing the complexity of managing cross-region systems.
Additionally, the team supports the creation of multi-region marketplaces, allowing services to scale smoothly as demand grows in key markets like the US and when new localization laws are introduced in smaller markets. The focus will be on abstracting complex networking and infrastructure management tasks, improving telemetry and observability, and centralizing decisions around capacity placement and data migrations. Collaboration with various teams is essential to ensure that solutions align with existing architectural needs while driving towards a more flexible and resilient future for SDO services. Through these efforts, the team aims to maintain high performance and availability for customers globally while optimizing costs and enhancing the overall developer experience.
Come innovate with the Amazon Music team!
This is a highly impactful and visible role where you will serve as a key contact for critical Music stakeholders involved in regional expansion efforts. Primary responsibilities include owning the deployment and migration processes for new AWS regions, developing monitoring solutions to ensure optimal service performance, onboarding new validations into the system, performing deep dives, automating manual tasks, and addressing unique technical challenges associated with cross-region operations.
The ideal candidate must be detail-oriented, possess superior verbal and written communication skills, and demonstrate strong organizational abilities. You should be adept at juggling and prioritizing multiple tasks while working independently and maintaining professionalism under pressure. You must be proactive in identifying potential problems before they occur and implementing solutions to detect and prevent outages. Additionally, the ability to make sound judgments, enhance the customer experience, and collaborate effectively with team members is essential for success in this role.
Key job responsibilities
- Diagnose and resolve complex technical issues in enterprise-level applications, performing deep-dive analysis on logs and metrics.
- Maintain the health of systems, including Linux/UNIX servers and distributed applications, ensuring consistent uptime and availability.
- Develop scripts and automation tools to streamline routine support tasks, improving efficiency and reducing manual intervention.
- Participate in on-call rotations, responding to and resolving critical system issues to minimize downtime and impact on services.
- Work closely with software developers to troubleshoot escalated issues, optimize code, and implement fixes that improve system stability.
- Address customer support tickets promptly, providing clear communication and effective solutions to technical problems.
- Identify areas for improvement, driving initiatives to enhance the reliability, scalability, and performance of systems and processes.
- Set up and manage monitoring tools to proactively detect issues and ensure the system meets performance and availability targets.
- Create and maintain documentation, including standard operating procedures, troubleshooting guides, and knowledge-sharing resources.
- Provide guidance and mentorship to junior engineers, sharing best practices and helping to build a stronger support team.
About The Team
The Region Flexibility team is at the forefront of Amazon's next phase of growth, building solutions that will define how Music operates in new regions for the next decade. Your work will have a direct impact on the scalability and resilience of Amazon's services, ensuring that we can continue to delight customers globally. If you're passionate about solving complex problems in distributed systems and want to be a part of a team shaping the future of cloud infrastructure, we'd love to hear from you.
Basic Qualifications
- 4+ years of software development, or 4+ years of technical support experience
- Experience troubleshooting and debugging technical systems
- Experience in Unix
- Experience scripting in modern programming languages
- B.E. or B.S. in Computer Science or a related field
- Proven ability to troubleshoot and identify root causes in complex enterprise-level applications
- Experience in agile/scrum or related collaborative workflows
Preferred Qualifications
- Advanced knowledge of UNIX/Linux operating systems and tools
- Experience analyzing and troubleshooting RESTful web API calls
- Strong knowledge of programming, operating system, and data structure concepts
- Development experience in Java, with understanding of XML/SOAP, web services, and web application development
- Proven track record in working on large-scale, enterprise-level n-tier applications
- Demonstrated experience with Perl, Python, or shell scripting, and with web technologies
- Ability to understand technical specifications, distributed systems architecture, and to deep dive into service/application logs
- Demonstrated skill and passion for problem-solving and operational excellence
- Knowledge of AWS products and technologies
- Comfortable communicating cross-functionally and across management levels in formal and informal settings
- Shows creativity and initiative to improve productivity and develop defect reduction strategies using automation or creative processes