Bachelor's or Master's degree in Computer Science, Engineering, Mathematics, or a related field.
2-3 years of industry experience as a Data Scientist or in a similar role.
Proficiency in programming languages such as Python and in related libraries such as TensorFlow or PyTorch for model development and deployment.
Proven experience working with large language models (GPT, BERT, T5, etc.), including sound prompt engineering practices.
Experience using frameworks such as LangChain, LlamaIndex, or Haystack; vector stores such as Pinecone, Cassandra, or Qdrant; and applying innovative LLM techniques such as RAG, CoT, RLHF, fine-tuning, and NER.
Strong background in data analysis, experimental design, and hypothesis testing.
Understanding of natural language processing (NLP) concepts, techniques, and applications (e.g., text generation, sentiment analysis, named entity recognition).
Solid understanding of machine learning fundamentals, deep learning architectures, and evaluation metrics.
Preferred qualifications
Knowledge of text-to-SQL (text2SQL).
Knowledge of cloud platforms (AWS) and experience deploying models in production environments.
Knowledge of evaluating model performance using appropriate metrics and benchmarks.
Expertise in pre-processing and cleaning large datasets for training models.
Ability to iterate and improve models based on evaluation results.
Enthusiasm for staying up-to-date with the latest advancements in AI, NLP, and large language models.
Experience in the payments, banking, or accounting sector is a significant plus.
Familiarity with continuous integration and deployment (CI/CD) pipelines for ML models.
Knowledge of Docker, Kubernetes, and other containerization technologies.
Responsibilities
Lead the research, design, and development of LLM architectures and techniques tailored for finance-related NLP/LLM tasks.
Conduct analysis and experiments to refine trained Large Language Models (LLMs), such as BERT, GPT, and other Transformer-based models, for understanding and generating finance-related language.
Apply statistical skills to analyze financial patterns and devise innovative strategies for integrating quantitative insights into Natural Language Processing (NLP) solutions.
Keep up to date with developments in LLM research and methodologies, incorporating advanced techniques into our solutions.
Define clear objectives and success criteria for machine learning projects.
Acquire, clean, and prepare relevant datasets for training and evaluation.
Choose appropriate machine learning algorithms and architectures based on the problem and data characteristics.
Document processes, results, and methodologies to ensure clarity and facilitate maintenance.
Contribute to the development and maintenance of tools, frameworks, and libraries to support LLM-related projects.
Collaborate with other departments, such as marketing or support, to understand needs and provide integrated solutions.
Ensure ethical implementation of models and compliance with data privacy regulations.