We are seeking a talented Data Engineer to join our team. You will play a key role in designing, building, and maintaining data pipelines across a variety of technologies, with a focus on the Microsoft Azure cloud platform.
Responsibilities:
- Design, develop, and implement data pipelines using Azure Data Factory (ADF) or other orchestration tools.
- Write efficient SQL queries to extract, transform, and load (ETL) data from various sources into Azure Synapse Analytics.
- Utilize PySpark and Python for complex data processing tasks on large datasets within Azure Databricks.
- Collaborate with data analysts to understand data requirements and ensure data quality.
- Implement data governance practices to ensure data security and compliance.
- Monitor and maintain data pipelines for optimal performance and troubleshoot any issues.
- Develop and maintain unit tests for data pipeline code.
- Work collaboratively with other engineers and data professionals in an Agile development environment.
- Develop modern data warehouse solutions using the Azure stack (Azure Data Lake, Azure Data Factory, Azure Databricks).
- Write complex Spark (Scala or Python) code and T-SQL queries.
- Manage code using a source control system such as Git.
- Design solutions using Azure data services.
- Manage governance within the team: track the productivity and quality of team members' work and report overall progress to the customer periodically.
- Provide technical leadership to the team, offering guidance and support and resolving blockers.
- Define the work breakdown structure and estimates for large work items.
Preferred Skills and Experience:
- Good knowledge of PySpark and working knowledge of Python.
- Full-stack Azure data engineering skills (Azure Data Factory, Databricks, and Synapse Analytics).
- Experience handling large datasets.
- Exposure to DevOps basics.
- Exposure to release engineering basics.