
HCLTech

LLM Ops Engineer


Job Description


Position Summary

The LLMOps (Large Language Model Operations) Engineer will play a pivotal role in building and maintaining the infrastructure and pipelines for our cutting-edge Generative AI applications, establishing efficient and scalable systems for LLM research, evaluation, training, and fine-tuning. The engineer will be responsible for managing and optimizing large language models (LLMs) across various platforms. This position is uniquely tailored for those who excel in crafting pipelines, cloud infrastructure, environments, and workflows. Your expertise in automating and streamlining the ML lifecycle will be instrumental in ensuring the efficiency, scalability, and reliability of our Generative AI models and associated platform. The LLMOps Engineer's expertise will ensure the smooth deployment, maintenance, and performance of these AI platforms and powerful large language models.

You will follow Site Reliability Engineering & MLOps principles and will be encouraged to contribute your own best practices and ideas to our ways of working.

Reporting to the Head of Cloud Native Operations, you will be an experienced thought leader, comfortable engaging senior managers and technologists. You will engage with clients, display technical leadership, and guide the creation of efficient and complex products and solutions.

Key Responsibilities

Technical & Architectural Leadership

• Contribute to the technical delivery of projects, ensuring a high quality of work that adheres to best practices, brings innovative approaches, and meets client expectations. Project types include, but are not limited to, the following:
  o Solution architecture, proofs of concept (PoCs), MVPs, and the design, development, and implementation of ML/LLM pipelines for generative AI models, encompassing data ingestion, pre-processing, training, deployment, and monitoring (a minimal pipeline sketch follows this list).
  o Automation of ML tasks across the model lifecycle.

• Contribute to HCL thought leadership across the Cloud Native domain with an expert understanding of advanced AI solutions using Large Language Model (LLM) and Natural Language Processing (NLP) techniques and partner technologies.
• Collaborate with cross-functional teams to integrate LLM and NLP technologies into existing systems.
• Ensure the highest levels of security and compliance are maintained in all ML and LLM operations.
• Stay abreast of the latest developments in ML and LLM technologies and methodologies, integrating these innovations to enhance operational efficiency and model effectiveness.
• Collaborate with global peers from partner ecosystems on joint technical projects. This partner ecosystem includes Google, Microsoft, AWS, IBM, Red Hat, Intel, Cisco, and Dell/VMware, among others.
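
As a rough illustration of the ML/LLM pipeline work referenced above, the sketch below chains ingestion, pre-processing, training, evaluation, and a deployment gate into one small script. It is a minimal sketch only: the stage names, the JSONL data path, and the placeholder train/evaluate/deploy functions are assumptions for illustration, not HCL's actual tooling.

```python
"""Minimal sketch of an LLM pipeline skeleton (illustrative only; stage names,
the data path, and the placeholder model artifact are assumptions)."""
import json
import logging
from pathlib import Path

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("llm-pipeline")

def ingest(source: Path) -> list[dict]:
    """Load raw JSONL records (one prompt/response pair per line)."""
    if not source.exists():  # keep the sketch runnable without real data
        return [{"prompt": "hello", "response": "world"}]
    return [json.loads(line) for line in source.read_text().splitlines()]

def preprocess(records: list[dict]) -> list[dict]:
    """Drop empty rows and normalise whitespace before training."""
    return [
        {k: " ".join(v.split()) for k, v in r.items()}
        for r in records
        if r.get("prompt") and r.get("response")
    ]

def train(records: list[dict]) -> dict:
    """Placeholder for a fine-tuning job; returns a fake model artifact."""
    log.info("training on %d records", len(records))
    return {"name": "demo-llm", "version": "0.1", "train_size": len(records)}

def evaluate(model: dict) -> float:
    """Placeholder evaluation returning a dummy quality score."""
    return 0.9 if model["train_size"] > 0 else 0.0

def deploy(model: dict, score: float, threshold: float = 0.8) -> None:
    """Gate deployment on the evaluation score, as a real pipeline would."""
    if score < threshold:
        log.warning("score %.2f below threshold; not deploying", score)
        return
    log.info("deploying %s v%s", model["name"], model["version"])

if __name__ == "__main__":
    data = preprocess(ingest(Path("data/train.jsonl")))
    model = train(data)
    deploy(model, evaluate(model))
```

In practice each stage would be a separate job in an orchestrator (e.g., a workflow engine on Kubernetes) rather than functions in one process; the point here is only the shape of the lifecycle the role automates.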

Service Delivery

• Provide a technical, hands-on contribution. Create scalable infrastructure to support enterprise loads (distributed GPU compute, foundation models, orchestration across multiple cloud vendors, etc.), ensuring reliable and efficient platform operations.
• Apply data science, machine learning, deep learning, and natural language processing methods to analyse, process, and improve the model's data and performance.
• Create and optimize prompts and queries for retrieval augmented generation (RAG) and prompt engineering techniques to enhance the model's capabilities and user experience with respect to operations and associated platforms (a minimal RAG sketch follows this list).
• Provide client-facing influence and guidance, engaging in consultative client discussions and performing a Trusted Advisor role.
• Provide effective support to HCL Sales and Delivery teams.
• Support sales pursuits and enable HCL revenue growth.
• Define the modernization strategy for client platforms and associated IT practices, create solution architecture, and provide oversight of the client journey.
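
To make the retrieval augmented generation responsibility above more concrete, here is a toy sketch that retrieves the most relevant snippets by keyword overlap and assembles an augmented prompt. The corpus, the overlap-based scoring, and the prompt template are illustrative assumptions standing in for a real vector store and prompt library.

```python
"""Toy retrieval augmented generation (RAG) sketch: retrieve relevant snippets
by keyword overlap, then build the augmented prompt an LLM would receive.
The corpus and scoring are stand-ins for a real embedding-based retriever."""

CORPUS = [
    "Argo CD continuously syncs Kubernetes manifests from Git.",
    "Prometheus scrapes metrics endpoints and fires alerts via Alertmanager.",
    "LLM fine-tuning adapts a pretrained model to domain-specific data.",
]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query (embedding stand-in)."""
    q_words = set(query.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the augmented prompt from retrieved context."""
    context_block = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {query}\nAnswer:"
    )

if __name__ == "__main__":
    question = "How does Argo CD deploy to Kubernetes?"
    prompt = build_prompt(question, retrieve(question, CORPUS))
    print(prompt)  # in production this prompt is sent to the LLM endpoint
```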

Innovation & Initiative

• Always maintain hands-on technical credibility, stay at the front of the industry, and be prepared to show and lead the way forward for others.
• Engage in technical innovation and support HCL's position as an industry leader.
• Actively contribute to HCL sponsorship of leading industry bodies such as the CNCF and the Linux Foundation.
• Contribute to thought leadership by writing whitepapers and blogs and by speaking at industry events.
• Be a trusted, knowledgeable internal innovator driving success across our global workforce.

Client Relationships

• Advise on best practices related to platform and operations engineering and cloud native operations, run client briefings and workshops, and engage technical leaders in a strategic dialogue.
• Develop and maintain strong relationships with client stakeholders.
• Perform a Trusted Advisor role.
• Contribute to technical projects with a strong focus on technical excellence and on-time delivery.

Mandatory Skills & Experience

• Expertise in designing and optimizing machine-learning operations, with a preference for LLMOps.
• Proficiency in data science, machine learning, Python, SQL, and Linux/Unix shell scripting.
• Experience with Large Language Models and Natural Language Processing (NLP), including researching, training, and fine-tuning LLMs; able to fine-tune Transformer models for optimal performance in NLP tasks, if required.
• Ability to implement and maintain automated testing and deployment processes for machine learning models within LLMOps.
• Ability to implement version control, CI/CD pipelines, and containerization techniques to streamline ML and LLM workflows.
• Ability to develop and maintain robust monitoring and alerting systems for generative AI models, ensuring proactive identification and resolution of issues (a minimal monitoring sketch follows this list).
• Research or engineering experience in deep learning with one or more of the following: generative models, segmentation, object detection, classification, model optimisation.
• Experience implementing RAG frameworks as part of production-ready products.
• Experience setting up infrastructure for modern technologies such as Kubernetes, serverless, containers, and microservices.
• Experience in scripting/programming to automate deployments and testing, using tools such as Terraform and Ansible and scripting languages such as Python, Bash, and YAML.
• Experience with open-source and enterprise CI/CD tool sets such as Argo CD and Jenkins (others such as Jenkins X, Circle CI, Tekton, Travis, and Concourse an advantage).
• Experience with the GitHub/DevOps lifecycle.
• Experience with observability solutions (Prometheus, EFK/ELK stacks, Grafana, Dynatrace, AppDynamics).
• Experience with at least one cloud, for example Azure, AWS, or GCP.
• Significant experience with microservices-based, container-based, or similar modern approaches to applications and workloads.
• Exemplary verbal and written communication skills (English). Able to interact and influence at the highest level, you will be a confident presenter and speaker, able to command the respect of your audience.
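
To illustrate the monitoring and alerting expectation above, here is a minimal sketch that exports LLM-serving metrics with the Prometheus Python client. The metric names and the simulated inference loop are assumptions for illustration; a real setup would pair these metrics with Alertmanager rules and Grafana dashboards.

```python
"""Minimal sketch of LLM-serving metrics exported for Prometheus scraping.
Metric names and the simulated inference loop are illustrative assumptions."""
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

# Count every generation request and how many of them fail.
REQUESTS = Counter("llm_requests_total", "Total LLM generation requests")
FAILURES = Counter("llm_request_failures_total", "Failed LLM generation requests")
# Track end-to-end latency so alert rules can fire on slow percentiles.
LATENCY = Histogram("llm_request_latency_seconds", "LLM request latency in seconds")

def handle_request(prompt: str) -> str:
    """Pretend to call a model endpoint while recording metrics."""
    REQUESTS.inc()
    start = time.perf_counter()
    try:
        time.sleep(random.uniform(0.05, 0.3))  # stand-in for model inference
        return f"response to: {prompt}"
    except Exception:
        FAILURES.inc()
        raise
    finally:
        LATENCY.observe(time.perf_counter() - start)

if __name__ == "__main__":
    start_http_server(8000)  # metrics exposed at http://localhost:8000/metrics
    while True:              # simulate steady traffic for the scraper to observe
        handle_request("health-check prompt")
```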

Desired Skills & Experience

• Bachelor-level technical degree or equivalent experience; Computer Science, Data Science, or Engineering background preferred; Master's degree desired.
• Experience in LLMOps or related areas such as DevOps, data engineering, or ML infrastructure.
• Hands-on experience deploying and managing machine learning and large language model pipelines on cloud platforms (e.g., AWS, Azure) for ML workloads.
• Familiarity with data science, machine learning, deep learning, and natural language processing concepts, tools, and libraries such as Python, TensorFlow, PyTorch, and NLTK.
• Experience using retrieval augmented generation and prompt engineering techniques to improve model quality and diversity and to improve operational efficiency. Proven experience in developing and fine-tuning Large Language Models (LLMs); a minimal fine-tuning sketch follows this list.
• Stays up to date with the latest advancements in Generative AI, conducts research, and explores innovative techniques to improve model quality and efficiency.
• The ideal candidate will already be working within a System Integrator, Consulting, or Enterprise organisation, with 8+ years of experience in a technical role within the Cloud domain.
• Deep understanding of core practices including SRE, Agile, Scrum, XP, and Domain-Driven Design.
• Familiarity with the CNCF open-source community.
• Enjoys working in a fast-paced and dynamic environment using the latest technologies.
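
As a rough pointer to the fine-tuning experience described above, the sketch below fine-tunes a small causal language model with the Hugging Face transformers and datasets libraries. The model choice (distilgpt2), the two-sentence toy dataset, and the hyperparameters are illustrative assumptions, not a recommended configuration.

```python
"""Minimal sketch of fine-tuning a small causal LM with Hugging Face libraries.
Model, toy data, and hyperparameters are assumptions for illustration only."""
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

MODEL_NAME = "distilgpt2"  # small model so the sketch runs on modest hardware

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 family has no pad token
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

# Toy in-memory dataset standing in for curated domain data.
raw = Dataset.from_dict({"text": [
    "Kubernetes schedules containers across a cluster of nodes.",
    "Terraform declares cloud infrastructure as code.",
]})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=64)

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)  # causal LM labels

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=2, logging_steps=1),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
trainer.save_model("out/fine-tuned-demo")
```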

About Us

HCL Cloud Native & AI Labs is the global Centre of Excellence guiding the application of advanced technologies and leading the way for clients and HCL employees worldwide.

Our clients are the world's largest enterprises; they engage with Labs for strategic advice, accelerated engineering, and industry thought leadership to guide their modernization and transformation outcomes.

We are the industry leader in Cloud-enabled transformation and work with the most advanced technologies, often leading the way for others. We sponsor industry bodies such as the Cloud Native Computing Foundation (CNCF) and contribute to successfully deploying emerging technologies.

We deliver advanced engineering projects, enable industry collaboration, and work on enhancements to Kubernetes and other open-source projects, driving change in the technology industry.

More Info

Industry: Other

Function: Technology

Job Type: Permanent Job


Date Posted: 20/10/2024

Job ID: 97054381

Last Updated: 23-11-2024 07:21:00 PM
Location: Noida