This job is with Oracle, an inclusive employer and a member of myGwork, the largest global platform for the LGBTQ+ business community. Please do not contact the recruiter directly.
The ideal candidate will be a detail-oriented professional with a robust technical background, a proven track record in Site Reliability Engineering, and a passion for improving service reliability and performance. You should thrive in a fast-paced environment, be adept at collaborating with diverse teams, and have a proactive approach to problem-solving and continuous improvement.
My Oracle Support (MOS) is Oracle's Enterprise Support solution. The MOS Development (Dev) team is responsible for creating and maintaining the My Oracle Support application. This encompasses both the customer-facing and employee-facing portals, used by external customers and internal support engineers, respectively. Additionally, the MOS Dev team oversees the entire ecosystem that facilitates customer support and interactions.
We collaborate closely with Global Customer Support to understand their business needs and integrate these requirements into MOS releases.
Who are we looking for
As a Site Reliability Engineer (SRE) for My Oracle Support, you will be pivotal in the development, implementation, and maintenance of our support solution. We seek an individual who excels in:
- Development and Deployment: Collaborating with a skilled team of engineers to design, deploy, and enhance the My Oracle Support application.
- Technical Expertise: Demonstrating proficiency in Linux-based systems, with a strong ability to troubleshoot and resolve complex issues.
- Reliability and Performance: Ensuring the reliability, performance, and security of the My Oracle Support solution through proactive management and optimization.
- Collaborative Problem Solving: Working closely with other professionals to maintain high standards and address any technical challenges effectively.
- Shift Flexibility: Ability to work as part of a global 24x7x365 DevOps team, including non-standard shifts, holidays, and weekends on a rotational basis. Primary standard shift is US daytime.
- Communication: Excellence in verbal and written communication. Capable of effectively communicating with all levels of technical and management staff during critical events.
- Technical Expertise: Strong technical background with the ability to troubleshoot issues affecting large-scale service architectures and application stacks.
- Scripting/Programming: Proficiency in one or more scripting/programming languages such as Python, Bash, Java, Ruby, or Go.
- Agile Development: Experience as a developer in an agile environment, using tools such as Jira and Git.
- Linux Proficiency: Hands-on experience with Linux environments, including troubleshooting, prototyping, and testing changes via systemd and sysctl.
- Cloud Experience: Background in designing, implementing, and troubleshooting within cloud environments such as Oracle Cloud Infrastructure (OCI), AWS, Azure, or GCP.
- Cloud Native Technologies: Solid understanding of cloud-native technologies including Prometheus, Kubernetes, Helm, and container runtimes.
- Database Knowledge: Working knowledge of databases such as Oracle Database or MySQL, with the ability to read and design SQL queries.
- Configuration Management: Experience with configuration management and orchestration tools such as Puppet, Chef, or Ansible.
- Interpersonal Skills: Good interpersonal skills with the ability to present ideas clearly in both business-friendly and user-friendly language. Team-oriented.
- Self-Motivation: Highly self-motivated with keen attention to detail.
- Analytical Skills: Proven analytical and problem-solving skills with a strong customer service orientation.
- Task Management: Ability to effectively prioritize and execute tasks in a high-pressure environment.
Your role will be essential in maintaining the high-quality support that My Oracle Support offers to both internal and external users.
Career Level - IC3
Key Responsibilities
- Understand and Manage Support Solutions: Gain a comprehensive understanding of the end-to-end configuration, technical dependencies, and behavior of Oracle's Enterprise support services.
- Maximize Service Availability: Strive to maximize service availability by enhancing the service during non-crisis periods and minimizing impact during crises. Focus on hardening the service to extend the time between service-impacting events.
- Identify Hardening Opportunities: Identify and address opportunities to improve service reliability, including enhancing monitoring coverage and recognizing actionable events that require intervention.
- Enhance SOPs: Develop and refine Standard Operating Procedures (SOPs) by creating documented responses to alerts. Automate these responses and integrate them with actionable events for streamlined incident management.
- Drive Major Incident Response: Actively participate in Major Incident bridges during critical service-impacting events to lead and coordinate effective service mitigation efforts.
- Post-Mortems and Critical Repairs: Engage in Post-Mortems and Critical Repair Items following service-impacting events to prevent recurrence and ensure continuous improvement.
- Monitor and Improve: Understand and communicate the scale, capacity, security, and performance attributes and requirements of the service stack. Continuously work on improving telemetry, automation, and overall service reliability.
- Troubleshooting and Issue Resolution: Act as the ultimate escalation point for complex or critical issues, utilizing deep knowledge of service topology and dependencies to troubleshoot and define mitigations.
- Automation and Orchestration: Demonstrate a strong understanding of automation and orchestration principles to improve service availability, reduce time to mitigate issues, and enhance development velocity.
- Drive Continuous Improvement: Develop tools, drive down incident counts, reduce event severity, and minimize time to mitigate. Foster a Site Up culture and continuously review and enhance systems and methods to improve custo
- Technological Analysis: Contribute to the analysis and enhancement of MOS applications and internal tools, identifying and implementing durable solutions to complex challenges.
- Collaborate with Development Teams: Partner with development teams to define and implement improvements in the support service architecture, ensuring that enhancements are aligned with overall goals.
- Articulate Technical Characteristics: Clearly communicate the technical characteristics of services and technology areas, guiding development teams in engineering and integrating advanced capabilities.
- Communication and Problem Solving: Employ excellent communication, technical analysis, and problem-solving skills to methodically address and resolve issues. Communicate clearly and professionally with internal stakeholders during high-priority situations, in both written and spoken forms.
- Team Development: Support the training and development of junior team members, sharing knowledge and best practices to foster growth within the team.
Qualifications
- Educational Background: Bachelor's degree in Computer Science, Information Technology, or a related field. Relevant work experience may be considered in place of a degree.
- Experience: Proven experience as a System Engineer, Software Engineer, or in a similar role, preferably with a focus on complex enterprise software solutions. Understanding of enterprise cloud solutions and the ability to delve into complex services.
- Communication Skills: Excellent communication skills, analytical thinking, problem-solving capabilities, and attention to detail.
- Technical Skills: Proficiency in Linux-based systems, including administration, scripting, and troubleshooting.
- Judgment and Independence: Ability to handle varied and complex tasks independently, demonstrating sound judgment in decision-making.
- Monitoring and Performance: Knowledge of system monitoring tools, performance tuning, and capacity planning.
- Problem-Solving: Strong problem-solving abilities with a proven track record of analyzing and resolving complex technical issues.
As a world leader in cloud solutions, Oracle uses tomorrow's technology to tackle today's problems. True innovation starts with diverse perspectives and various abilities and backgrounds.
When everyone's voice is heard, we're inspired to go beyond what's been done before. It's why we're committed to expanding our inclusive workforce that promotes diverse insights and perspectives.
We've partnered with industry leaders in almost every sector and continue to thrive after 40+ years of change by operating with integrity.
Oracle careers open the door to global opportunities where work-life balance flourishes. We offer a highly competitive suite of employee benefits designed on the principles of parity and consistency. We put our people first with flexible medical, life insurance and retirement options. We also encourage employees to give back to their communities through our volunteer programs.
We're committed to including people with disabilities at all stages of the employment process. If you require accessibility assistance or accommodation for a disability at any point, let us know by calling +1 888 404 2494, option one.
Disclaimer
Oracle is an Equal Employment Opportunity Employer*. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.
* Which includes being a United States Affirmative Action Employer