Principal Site Reliability Engineer
Dell Technologies customers rely on our products and services to drive progress, so we take the service we provide extremely seriously. Service Delivery is all about making sure our technical solutions help clients address their priorities, challenges and initiatives. As trusted advisors, we build in-depth knowledge of what each client wants to achieve. Then we make sure the services delivered by Dell Technologies deliver on all our promises. We also work closely with Sales and Global Services colleagues to develop strategic account growth plans, and to identify and pursue sales opportunities.
Join us as a Principal Site Reliability Engineer on our DTMS team in Bangalore to do the best work of your career and make a profound social impact.
What You'll Achieve
You will be expected to work in a positive and collaborative fashion with fellow team members, senior engineering and architecture staff, vendors, and customers. In this role, you will assist with process maturation, development, and technical standards creation, and drive operational excellence through consistent delivery and best practices. The role also covers the core fundamentals of day-to-day operations, based on the ITIL v3 methodology.
You will:
- Participate in 24x7x365 shift coverage providing infrastructure support for Red Hat OpenShift, applying specialist skills in NVIDIA, containers, and Kubernetes.
- Perform daily system monitoring, verifying the integrity and availability of all hardware, resources, systems, and key processes, reviewing system and application logs, and verifying completion of scheduled jobs.
- Provide support per request from various constituencies. Investigate and troubleshoot issues.
- Repair and recover from hardware or software failures. Coordinate and communicate with stakeholders of the impacted infrastructure.
- Perform preventative maintenance (and upgrades, as required) on devices and related peripherals to meet IT specifications.
Take the first step towards your dream career
Every Dell Technologies team member brings something unique to the table. Here's what we are looking for with this role:
Essential Requirements:
- 8+ years of strong experience with Red Hat OpenShift and Go programming.
- Experience in container and Kubernetes administration.
- Hands-on knowledge of NVIDIA AI Enterprise and NVIDIA GPU & Network Operations.
- Knowledge of NVIDIA Base Command Manager and Cluster Manager.
- Knowledge of network administration with the NVIDIA ONYX switch system.
Desirable Requirements:
- Knowledge of observability and log collection (Prometheus and Grafana).
- OpenShift Administrator certification preferred.
Who We Are
We believe that each of us has the power to make an impact. That's why we put our team members at the center of everything we do. If you're looking for an opportunity to grow your career with some of the best minds and most advanced tech in the industry, we're looking for you.
Dell Technologies is a unique family of businesses that helps individuals and organizations transform how they work, live and play. Join us to build a future that works for everyone because Progress Takes All of Us.
Application closing date: 1st July 2024
Dell Technologies is committed to the principle of equal employment opportunity for all employees and to providing employees with a work environment free of discrimination and harassment. Read the full Equal Employment Opportunity Policy here.
Job ID: R244513
Dell's Flexible & Hybrid Work Culture
At Dell Technologies, we believe our best work is done when flexibility is offered.
We know that freedom and flexibility are crucial to all our employees, no matter where they are located. Our flexible and hybrid work style gives team members the freedom to ideate, be innovative, and drive results their way. To learn more about our work culture, please visit our locations page.