We are looking for a skilled Data Engineer to join our team. As a Data Engineer, you will be responsible for designing, building, and maintaining scalable data pipelines and infrastructure to support data-driven initiatives. You will work closely with cross-functional teams to ensure data quality, availability, and accessibility for various analytics and business intelligence needs.
Responsibilities:
- Design and implement robust data pipelines and ETL processes using Python, PySpark, and AWS Glue.
- Develop and maintain data lake solutions, ensuring scalability, security, and performance.
- Implement and manage data lineage and metadata management processes.
- Collaborate with data scientists, analysts, and stakeholders to understand data requirements and translate them into technical solutions.
- Optimize data storage and retrieval for efficiency and cost-effectiveness using technologies such as Snowflake.
- Ensure data quality through data validation, testing, and monitoring.
- Implement data governance policies and procedures to ensure compliance and security.
- Perform data analysis and troubleshooting to resolve data-related issues.
- Design and develop data visualizations and dashboards to enable data-driven decision-making.
- Stay updated with industry trends and best practices in data engineering and analytics.
Requirements:
- Proven work experience as a Data Engineer or in a similar role.
- Strong proficiency in Python programming and experience with PySpark for big data processing.
- Hands-on experience with AWS services including Glue, S3, EMR, and Lambda.
- Solid understanding of data lake architecture and design principles.
- Experience with data lineage tools and metadata management.
- Familiarity with Snowflake or similar cloud data warehouse platforms.
- Proficiency in SQL and database technologies (relational and non-relational databases).
- Experience with data visualization tools such as Tableau, Power BI, or equivalent.
- Strong analytical and problem-solving skills.
- Excellent communication and teamwork skills.
Preferred Skills:
- Master's degree in Computer Science, Data Engineering, Data Science, or a related field.
- Certification in AWS or related cloud platforms.
- Experience with DevOps practices and CI/CD pipelines.
- Knowledge of machine learning concepts and frameworks.
- Experience with Agile/Scrum methodologies.