Exp: 1+ Years
Location: Remote (India)
Work Timings: 10:30 AM to 7:30 PM; Mon - Fri
Pay: 5L - 10L/year
Key Responsibilities
- Develop and execute comprehensive test plans, strategies, and scripts to validate LLM functionality, accuracy, and performance.
- Design and implement test cases specifically targeting LLM performance, response accuracy, ethical compliance, and bias detection.
- Leverage Python testing frameworks (e.g., pytest, unittest) to create, maintain, and enhance automated test scripts for functional and performance testing of LLMs.
- Build and maintain CI/CD pipelines to streamline and automate the testing, deployment, and monitoring of LLMs.
- Document and report issues found, and work with developers to reproduce, troubleshoot, and resolve issues effectively.
- Continuously improve testing strategies by staying up to date with the latest testing methodologies, tools, and industry trends related to LLMs, CI/CD, and AI.
- Work closely with cross-functional teams, including developers, data scientists, and product managers, to ensure all issues are documented and addressed promptly.
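As a rough illustration of the kind of automated check the responsibilities above describe, here is a minimal pytest-style sketch of a factual-accuracy test. The `query_llm` function is a hypothetical stand-in for a real model client (e.g. an OpenAI or Hugging Face call) and is stubbed here so the example is self-contained:

```python
# Minimal sketch of an automated LLM accuracy test, pytest-style.
# `query_llm` is a HYPOTHETICAL placeholder for a real model client;
# in practice it would call the model API under test.

def query_llm(prompt: str) -> str:
    """Stubbed LLM client so this sketch runs standalone."""
    canned = {
        "What is the capital of France?": "The capital of France is Paris.",
        "What is 2 + 2?": "2 + 2 equals 4.",
    }
    return canned.get(prompt, "I don't know.")


# Each case pairs a prompt with a fact the response must contain.
ACCURACY_CASES = [
    ("What is the capital of France?", "Paris"),
    ("What is 2 + 2?", "4"),
]


def test_llm_factual_accuracy():
    # Simple containment check: the expected fact must appear
    # in the model's response for the case to pass.
    for prompt, expected in ACCURACY_CASES:
        response = query_llm(prompt)
        assert expected in response, (
            f"Expected {expected!r} in response: {response!r}"
        )
```

In a real suite, tests like this would typically be parametrized with `pytest.mark.parametrize` and run automatically from the CI/CD pipeline on each change.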
Required Skills & Qualifications
- Minimum of 1 year of hands-on experience in Quality Assurance, with specific experience testing LLMs.
- Strong programming skills in Python, with a good understanding of object-oriented programming.
- Experience with Python testing frameworks, including pytest, unittest, or equivalent.
- Experience in automating test cases for LLMs and managing automation frameworks.
- Proficiency in CI/CD tools and processes, with experience using tools such as Jenkins, GitLab CI, or similar.
- Strong analytical and problem-solving skills, with an ability to understand complex LLM functionality and detect intricate issues.
- A keen eye for detail and a commitment to high standards of quality in AI testing.
- Excellent verbal and written communication skills to effectively document test results and collaborate with team members.
Preferred Qualifications
- Familiarity with OpenAI, Hugging Face, or similar libraries and platforms for language models.
- Knowledge of specialized AI testing tools or platforms used for testing language models.
- Experience with load testing, response time analysis, and performance tuning of LLMs.