Company: A Large Global Organization
Key Skills: AWS Cloud, Data Engineering, Snowflake, PySpark, Docker, SQL, Python
Roles and Responsibilities:
- Develop solutions that deliver high-quality, personalized recommendations to our customers across different channels
- Work with the data science team to ensure seamless integration and support of machine learning models
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of sources using SQL and AWS big data technologies
- Develop end-to-end (Data/Dev/MLOps) pipelines based on an in-depth understanding of cloud platforms, the AI/ML lifecycle, and business problems to ensure solutions are delivered efficiently and sustainably
- Collaborate with other members of the team to ensure high-quality deliverables
- Learn and implement the latest design patterns in software engineering
Skills Required:
- Tech Stack: Python or NodeJS, PySpark, microservices, Docker, serverless frameworks, and Databricks
- Hands-on experience building ETL workflows/data pipelines (a minimal sketch follows this list)
- Experience with relational databases and SQL (NoSQL experience is a plus)
- Experience with Cloud technologies (AWS or Azure)
- Experience designing and building APIs for high transaction volumes
- Experience building data and CI/CD/MLOps pipelines
- Familiarity with Airflow and MLflow
- Familiarity with automated unit/integration test frameworks
- Experience working with AdTech or MarTech technologies is an added advantage
- Knowledge of machine learning algorithms, concepts, and implementation is a plus
- Good written and spoken communication skills; a team player
- Strong analytical thought process and the ability to interpret findings
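For illustration, a minimal sketch of the kind of PySpark ETL workflow referenced above; the S3 paths and column names are hypothetical assumptions, not details of the actual role:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: read raw JSON events from a landing zone (hypothetical path).
raw = spark.read.json("s3://example-landing-zone/orders/")

# Transform: drop malformed rows, normalize types, derive revenue and a partition key.
orders = (
    raw.filter(F.col("order_id").isNotNull())
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .withColumn("revenue", F.col("quantity") * F.col("unit_price"))
)

# Load: write partitioned Parquet to a curated zone for downstream analytics.
orders.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-curated-zone/orders/"
)
```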
Data Management:
- Experience with both structured and unstructured data, and with Hadoop, Apache Spark, or similar technologies
- Good understanding of data modeling, data warehousing, and data catalog concepts and tools
- Experience with Data Lake architectures, and with combining structured and unstructured data into unified representations
- Able to identify, join, explore, and examine data from multiple disparate sources and formats
- Ability to reduce large quantities of unstructured or formless data to a form in which it can be analyzed
- Ability to deal with data imperfections such as missing values, outliers, and inconsistent formatting (see the sketch after this list)
- Ability to manipulate large datasets (millions of rows, thousands of variables)
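To make the data-imperfection bullet concrete, a short PySpark cleanup sketch; the source path, column names, and thresholds are illustrative assumptions:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("cleanup_sketch").getOrCreate()

# Hypothetical curated dataset of customer records.
df = spark.read.parquet("s3://example-curated-zone/customers/")

# Inconsistent formatting: trim whitespace and normalize case on a key field.
df = df.withColumn("email", F.lower(F.trim(F.col("email"))))

# Missing values: fill a numeric gap with a default, drop rows missing the key.
df = df.fillna({"lifetime_value": 0.0}).dropna(subset=["customer_id"])

# Outliers: cap lifetime_value at its approximate 99th percentile.
p99 = df.approxQuantile("lifetime_value", [0.99], 0.01)[0]
df = df.withColumn("lifetime_value", F.least(F.col("lifetime_value"), F.lit(p99)))
```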
Software Development:
- Ability to write code in languages such as Python or NodeJS, plus PySpark and shell scripts on Linux
- Familiarity with software development methodologies such as Agile/Scrum
- Love to learn new technologies, keep abreast of the latest developments in cloud architecture, and drive the organization to adopt emerging best practices
Architecture and Infrastructure:
- Architectural design experience on AWS
- Architectural design for applications with high transaction volumes
- Experience delivering software with AWS EC2, S3, EMR/Glue, Lambda, Data Pipeline, CloudFormation, Redshift, etc. (a minimal example follows this list)
- Good knowledge of UNIX/Linux systems
- Experience designing and building large scale enterprise systems
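As a minimal example of serverless delivery on AWS in the spirit of the list above, a hypothetical Lambda handler that archives incoming events to S3; the bucket name and key layout are assumptions:

```python
import json
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    """Minimal Lambda sketch: persist an incoming event to S3 (hypothetical bucket)."""
    key = f"events/{event['id']}.json"
    s3.put_object(
        Bucket="example-event-archive",
        Key=key,
        Body=json.dumps(event).encode("utf-8"),
    )
    return {"statusCode": 200, "body": key}
```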
Education: Bachelor's degree in Engineering or a related field with 10+ years of relevant experience