Role / Job Title: Data Engineer
Function / Department: Data & Analytics
Job Purpose:
The data engineer will work with our data scientists, who are building generative AI (GenAI) solutions for text, audio, image, and tabular data. They will be responsible for storing, retrieving, and augmenting the large volumes of structured and unstructured data that these GenAI solutions use.
Job & Responsibilities:
- Build data engineering pipelines, with a focus on unstructured data
- Conduct requirements gathering and project scoping sessions with subject matter experts, business users, and executive stakeholders to discover and define business data needs in GenAI.
- Design, build, and optimize the data architecture and the extract, transform, and load (ETL) pipelines so that data is accessible to data scientists and to the products they build.
- Work across the end-to-end data lifecycle, from data ingestion through data transformation to the data consumption layer. Be well versed in APIs and their use
- Drive the highest standards in data reliability, data integrity, and data governance, enabling accurate, consistent, and trustworthy data sets
- Demonstrate experience with big data infrastructure, including MapReduce, Hive, HDFS, YARN, HBase, MongoDB, DynamoDB, etc.
- Create technical design documentation for projects and pipelines
- Debug code effectively when issues arise, and use Git for code versioning
Key Success Metrics:
- Deliver all deliverables and track them effectively
- Proficiency in data storytelling and presentation techniques
- Proficiency in machine learning and predictive analytics.
Education Qualification:
Graduation: Bachelor of Science (B.Sc) / Bachelor of Technology (B.Tech) / Bachelor of Computer Applications (BCA)
Post-Graduation: Master of Science (M.Sc) / Master of Technology (M.Tech) / Master of Computer Applications (MCA)
Experience Range: 2 to 5 years