Infra Architect - AI GPU

Jio

Exp: 15-17 Years

Full time

Pune, India

Job Description

Skills:
Machine Learning, Architecture Design, GPU, Data Centre, Artificial Intelligence (AI), Certified Kubernetes Administrator

Job Summary

We are seeking an experienced AI/GPU Infrastructure Architect with a total of 15 years of experience to design and optimize GPU-based infrastructure for machine learning and artificial intelligence applications. The ideal candidate will have a deep understanding of AI workloads, GPU architecture, and infrastructure design, along with hands-on experience deploying scalable and efficient solutions.

Key Responsibilities

Architecture Design: Develop and implement architecture for AI and GPU infrastructure, ensuring scalability, reliability, and performance.
Infrastructure Optimization: Optimize GPU resource allocation and management to enhance performance for AI workloads.
Collaboration: Work closely with data scientists, software engineers, and IT teams to understand requirements and translate them into architectural solutions.
Performance Monitoring: Set up monitoring and benchmarking tools to assess system performance and make recommendations for improvement.
Research and Development: Stay updated with the latest advancements in AI and GPU technologies and assess their applicability to our infrastructure.
Documentation: Create and maintain architectural documentation, including design specifications, best practices, and deployment guides.
Security and Compliance: Ensure that infrastructure designs meet security standards and compliance requirements.
Training and Support: Provide guidance and training to teams on infrastructure usage and best practices.

Required Skills

Proven experience as an infrastructure architect in Data Centre Infrastructure (DC Rack Planning, Compute, Storage, Network), specifically with AI and GPU technologies.
Strong understanding of GPU architecture and parallel processing concepts.
Experience designing innovative, high-performance storage solutions for AI workloads.
Knowledge of designing high-performance network solutions for AI workloads using technologies such as InfiniBand, RoCE, and GPUDirect.
Experience with distributed computing and microservices architecture.
Proficiency in cloud platforms (e.g., AWS, Azure, Google Cloud) and containerization technologies (e.g., Docker, Kubernetes).
Familiarity with AI frameworks (e.g., TensorFlow, PyTorch) and ML libraries.
Experience with system performance tuning and optimization techniques.
Knowledge of security best practices related to AI and infrastructure management.
Excellent problem-solving skills and ability to work in a collaborative environment.
Strong communication skills, both verbal and written.

Qualifications

Bachelor's degree in Engineering / MCA
Certification in relevant technologies (e.g., Azure Certified Solutions Architect, NVIDIA certifications, Certified Kubernetes Administrator).

More Info

Industry: Other

Function: Technology

Job Type: Permanent Job

Skills Required

Containerization Technologies, Cloud Platforms, System Performance Tuning, Architecture Design, ML Libraries, Certified Kubernetes Administrator, Artificial Intelligence (AI), Microservices Architecture, Security Best Practices, AI Frameworks, Distributed Computing, Machine Learning, GPU, Data Centre

Date Posted: 21/11/2024

Job ID: 101084465

About Company

Jio

Job Source: www.expertia.ai
