Job Description
- AWS Data Engineer with a minimum of 5 to 7 years of experience.
- Collaborate with business analysts to understand and gather requirements for existing or new ETL pipelines.
- Connect with stakeholders daily to discuss project progress and updates.
- Work within an Agile process to deliver projects in a timely and efficient manner.
- Design and develop Airflow DAGs to schedule and manage ETL workflows.
- Transform SQL queries into Spark SQL code for ETL pipelines.
- Develop custom Python functions to handle data quality and validation.
- Write PySpark scripts to process data and perform transformations.
- Ensure data accuracy and completeness by creating automated tests and implementing data validation processes.
- Run Spark jobs on AWS EMR clusters using Airflow DAGs.
- Monitor and troubleshoot ETL pipelines to ensure smooth operation.
- Implement best practices for data engineering, including data modeling, data warehousing, and data pipeline architecture.
- Collaborate with other members of the data engineering team to improve processes and implement new technologies.
- Stay up to date with emerging trends and technologies in data engineering and suggest ways to improve the team's efficiency and effectiveness.
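To illustrate the orchestration responsibilities above (scheduling ETL with Airflow and running Spark jobs on EMR), here is a minimal sketch of an Airflow DAG that submits a PySpark step to an existing EMR cluster and waits for it to finish. The cluster ID, S3 script path, and DAG name are illustrative placeholders, not details from this posting; the operators come from the `apache-airflow-providers-amazon` package.

```python
# Hypothetical sketch: submit a PySpark step to a running EMR cluster and
# block until it completes. Cluster ID and script path are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.emr import EmrAddStepsOperator
from airflow.providers.amazon.aws.sensors.emr import EmrStepSensor

SPARK_STEP = [
    {
        "Name": "nightly_etl",
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            # Assumed script location; replace with your pipeline's entry point.
            "Args": ["spark-submit", "s3://my-bucket/etl/transform.py"],
        },
    }
]

with DAG(
    dag_id="nightly_etl_on_emr",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    add_step = EmrAddStepsOperator(
        task_id="submit_spark_step",
        job_flow_id="j-XXXXXXXXXXXXX",  # placeholder EMR cluster ID
        steps=SPARK_STEP,
    )
    wait_for_step = EmrStepSensor(
        task_id="wait_for_spark_step",
        job_flow_id="j-XXXXXXXXXXXXX",  # same placeholder cluster ID
        step_id="{{ task_instance.xcom_pull(task_ids='submit_spark_step')[0] }}",
    )
    add_step >> wait_for_step
```

Treat this as declarative pipeline configuration rather than a drop-in solution; real deployments would also set AWS connection IDs, retries, and alerting.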
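The data-quality responsibility above (custom Python functions for validation) can be sketched as a small, framework-free check. The record shape and rule set here are assumptions for illustration only; in practice such a function would run inside a PySpark job or an Airflow task.

```python
# Minimal data-quality sketch: split records into valid rows and rows
# that are missing required fields. Field names are illustrative.
def validate_records(records, required_fields):
    """Return (valid, invalid) lists; a record is invalid if any
    required field is absent, None, or an empty string."""
    valid, invalid = [], []
    for rec in records:
        missing = [f for f in required_fields if rec.get(f) in (None, "")]
        (invalid if missing else valid).append(rec)
    return valid, invalid
```

A quick usage example: `validate_records([{"id": 1, "name": "a"}, {"id": 2, "name": ""}], ["id", "name"])` keeps the first record and flags the second for its empty `name`.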
Skills: AWS, CI/CD pipelines, SQL, Python, Lambda, data engineering, data pipelines, Airflow