- The Performance, Resiliency and Scalability Engineering team is essential to the success of our distributed microservices. As a Software Engineer for Performance, Resiliency and Scalability on this team, you will work on complex systems running on-prem, relational databases, and large, complex datasets.
- You will focus on optimizing overall product performance and reliability.
- You will focus on defining and enhancing a chaos testing framework to understand and optimize application performance & resiliency as data volume grows.
- We're looking for a true self-starter and problem solver: someone who enjoys digging deeper and needs to understand systems and how they're wired. A strong background in SRE and Performance & Resiliency engineering would be ideal for this role.
- This person will have experience with performance & reliability analysis and the ability to develop their own analysis tools (using open-source technology) and performance testing tools.
This is an individual contributor position and reports to the Director of Performance & Resiliency Engineering.
How will you make an impact in this role?
You will enhance the confidence and safety of deploying changes across the fleet of applications at American Express by validating system resiliency. You will also enable teams to better understand and prepare for sudden spikes in traffic and other load scenarios, at both the application level and the system level.
- Write code to build common tools, frameworks & infrastructure with a focus on application performance & resiliency.
- Implement API-based mock frameworks to generate production-like load in lower test environments.
- Implement shift-left automated Performance and Resiliency testing as part of PR analysis to prevent defects from reaching production.
- Work on initiatives and strategic projects to ensure Amex products are highly scalable and perform well under sophisticated workloads.
- Develop, execute and analyze performance tests with complex workloads and large simulated test datasets for distributed systems running in Kubernetes environments.
- Develop tools or use open-source frameworks to automate performance & resiliency testing, application monitoring, result analysis, and issue reporting
- Define performance & resiliency test strategy, publish metrics and execute performance-related requirements
- Set up the test environments and test data required to execute performance & resiliency tests
- Identify and analyze performance bottlenecks in the product and work with engineering to resolve them
- Own performance & resiliency engineering planning, design, implementation and execution
- Prepare and present performance benchmarks, best practices, test results, comparisons and analysis
- Provide day-to-day technical guidance and hands-on help to other engineers on the team
- Analyze performance results to identify performance bottlenecks and suggest optimizations for our systems
Minimum Qualifications
- 5+ years of experience in software development roles in Performance Engineering, Resiliency Engineering or SRE
- Development experience with object-oriented programming languages & frameworks like Java & Spring Boot
- Hands-on experience with resiliency testing tools like Chaos Monkey, Gremlin, LitmusChaos
- Strong analytical and debugging skills
- Hands-on experience testing distributed microservices and event-driven architectures
- Hands-on experience performance testing time-series and SQL datastores
- Ability to implement industry-standard performance testing tools & mock frameworks (JMeter, Gatling, jMock, k6)
- Strong experience in bottleneck analysis, profiling, and distributed tracing tools
- Hands-on experience with system monitoring & profiling tools and frameworks (Grafana, InfluxDB, Prometheus, Dynatrace, Splunk, etc.)
- Experience developing & testing APIs and/or SDKs
- Comfortable using Linux, Docker, Kubernetes (any flavor of Kubernetes) & Git
Benefits include:
- Competitive base salaries
- Bonus incentives
- Support for financial well-being and retirement
- Comprehensive medical, dental, vision, life insurance, and disability benefits (depending on location)
- Flexible working model with hybrid, onsite or virtual arrangements depending on role and business need
- Generous paid parental leave policies (depending on your location)
- Free access to global on-site wellness centers staffed with nurses and doctors (depending on location)
- Free and confidential counseling support through our Healthy Minds program
- Career development and training opportunities