Description
We are seeking a Machine Learning Platform Engineer with 10-15 years of experience to join our team in India. The ideal candidate will design, develop, and maintain a scalable, efficient ML platform that serves the needs of our organization. The candidate should have a deep understanding of ML platforms, frameworks, and tools, and be able to manage the entire lifecycle of ML models, from data preparation to deployment. The candidate should also be a self-starter and problem solver with a strong work ethic.
Responsibilities
- Evaluate and select appropriate cloud services for each stage of the ML lifecycle.
- Design and implement the overall architecture of the MLOps platform.
- Set up automated pipelines for data preparation, model training, and deployment.
- Implement version control for code, data, and models.
- Ensure the platform is scalable, secure, and compliant with relevant regulations.
- Provide tools and interfaces for data scientists to easily leverage the platform.
- Continuously optimize the platform for performance and cost-efficiency.
This role bridges the gap between data science and operations, enabling the organization to develop, deploy, and maintain machine learning models efficiently at scale.
Skills and Qualifications
- 10+ years of professional experience building applications using cloud services, including prior experience building machine learning platforms on cloud services.
- Cloud expertise: Deep knowledge of cloud platforms like AWS, Google Cloud Platform, or Azure, including their machine learning and data services (Azure preferred).
- DevOps skills: Experience with CI/CD pipelines, infrastructure as code, and containerization technologies like Docker and Kubernetes.
- Machine learning knowledge: Understanding of ML workflows, model training, and deployment processes.
- Data engineering: Familiarity with data pipelines, ETL processes, and data storage solutions.
- Software engineering: Strong programming skills, particularly in languages commonly used in ML like Python.
- System design: Ability to architect scalable, reliable systems that integrate various services.
- Automation: Expertise in automating workflows and processes across the ML lifecycle.
- Security and compliance: Knowledge of best practices for securing ML pipelines and ensuring regulatory compliance.
- Monitoring and logging: Experience setting up monitoring and logging for ML systems.
- Collaboration: Ability to work with data scientists, software engineers, and other stakeholders.