About The Role: We are looking for a passionate and driven Data Engineer with 1-3 years of experience to join our dynamic team. The ideal candidate will build and optimize data pipelines, design data solutions, and ensure efficient data flow across multiple systems. This is a great opportunity for someone excited about working with large datasets and technologies such as Python, Apache Spark, and modern data warehouses like Redshift, Snowflake, or BigQuery.
Primary Responsibilities:
Data Pipeline Development:
Design, build, and maintain scalable ETL/ELT pipelines to integrate data from various sources into data warehouses (see the sketch after this list).
Optimize data pipelines for performance, reliability, and scalability.
Ensure data quality and consistency across systems.
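For a concrete flavor of this work, here is a minimal ETL sketch in Python using pandas and SQLAlchemy; the file path, table name, column names, and connection string are illustrative placeholders, not details of this role.

import pandas as pd
from sqlalchemy import create_engine

def extract(path: str) -> pd.DataFrame:
    # Extract: read a raw file drop (path is a placeholder)
    return pd.read_csv(path)

def transform(df: pd.DataFrame) -> pd.DataFrame:
    # Transform: deduplicate, enforce types, stamp load time
    df = df.drop_duplicates(subset=["order_id"])
    df["order_date"] = pd.to_datetime(df["order_date"])
    df["loaded_at"] = pd.Timestamp.now(tz="UTC")
    return df

def load(df: pd.DataFrame, table: str, conn_str: str) -> None:
    # Load: append into a warehouse staging table
    engine = create_engine(conn_str)
    df.to_sql(table, engine, if_exists="append", index=False)

if __name__ == "__main__":
    load(transform(extract("orders.csv")), "stg_orders", "postgresql://user:pass@host/db")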
Data Warehousing:
Work with modern data warehouses such as Amazon Redshift, Snowflake, or Google BigQuery to store and manage large datasets.
Design and implement data models that support efficient querying and analytics.
Develop efficient query logic and manage the performance of large datasets.
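As an illustration of querying a dimensional model, here is a minimal sketch using the official Snowflake Python connector (snowflake-connector-python); the account, credentials, and fact/dimension table names are assumptions for the example.

import snowflake.connector

# All connection details below are placeholders
conn = snowflake.connector.connect(
    account="my_account",
    user="etl_user",
    password="...",
    warehouse="ANALYTICS_WH",
    database="ANALYTICS",
    schema="MARTS",
)

# A star-schema query: aggregate a fact table against a date dimension --
# the kind of access pattern an efficient data model is designed for
query = """
    SELECT d.year_month, SUM(f.revenue) AS revenue
    FROM fact_orders f
    JOIN dim_date d ON f.date_key = d.date_key
    GROUP BY d.year_month
    ORDER BY d.year_month
"""
try:
    for year_month, revenue in conn.cursor().execute(query):
        print(year_month, revenue)
finally:
    conn.close()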
Big Data Processing:
Leverage Apache Spark to process large datasets in distributed environments.
Develop and optimize batch and streaming data workflows using Spark.
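A minimal PySpark batch sketch in that spirit; the S3 paths and column names are assumptions for illustration.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("hourly_event_counts").getOrCreate()

# Read one day of raw events (paths are placeholders)
events = spark.read.parquet("s3://example-bucket/events/dt=2024-11-22/")

# Batch aggregation: events per user per hour
hourly = (
    events
    .withColumn("hour", F.date_trunc("hour", F.col("event_ts")))
    .groupBy("user_id", "hour")
    .count()
)

hourly.write.mode("overwrite").parquet("s3://example-bucket/marts/hourly_event_counts/")
spark.stop()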
Data Integration and Collaboration:
Collaborate with Data Scientists, Analysts, and Software Engineers to provide data for analytics and machine learning.
Integrate and clean datasets to ensure seamless data flow between applications.
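For example, integrating two hypothetical application exports with pandas, normalizing the join key before merging:

import pandas as pd

# Hypothetical exports from two upstream applications
users = pd.read_csv("crm_users.csv")
events = pd.read_json("app_events.json", lines=True)

# Clean: normalize the join key on both sides
users["email"] = users["email"].str.strip().str.lower()
events["email"] = events["email"].str.strip().str.lower()

# Integrate: one analytics-ready dataset for downstream teams
merged = events.merge(users, on="email", how="left", validate="many_to_one")
merged.to_parquet("analytics_events.parquet", index=False)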
Database and SQL Proficiency:
Write advanced SQL queries to manipulate, transform, and analyze data.
Ensure efficient query performance and database optimization.
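A self-contained sketch of the kind of query meant by "advanced SQL" (a window function), runnable against Python's built-in sqlite3; the table and data are invented for the example.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INT, customer_id INT, amount REAL, order_date TEXT);
    INSERT INTO orders VALUES
        (1, 10, 50.0, '2024-11-01'),
        (2, 10, 75.0, '2024-11-03'),
        (3, 20, 20.0, '2024-11-02');
""")

# Window function: running revenue per customer, ordered by date
query = """
    SELECT customer_id, order_date, amount,
           SUM(amount) OVER (PARTITION BY customer_id ORDER BY order_date) AS running_total
    FROM orders
"""
for row in conn.execute(query):
    print(row)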
Automation and Scripting:
Automate data integration, pipeline monitoring, and operational workflows using Python.
Troubleshoot data issues, identify root causes, and apply long-term fixes.
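A minimal sketch of the automation side: wrapping a pipeline step in retries with logging. The task body and retry settings are placeholders, not a prescribed setup.

import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline_monitor")

def run_with_retries(task, attempts=3, backoff_seconds=30):
    # Retry a flaky pipeline step with linear backoff, logging each failure
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception:
            log.exception("attempt %d/%d failed", attempt, attempts)
            if attempt == attempts:
                raise
            time.sleep(backoff_seconds * attempt)

def load_daily_snapshot():
    log.info("loading daily snapshot...")  # placeholder for a real load step

if __name__ == "__main__":
    run_with_retries(load_daily_snapshot)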
Qualifications:
1-3 years of experience working in cloud environments such as AWS, Azure, or GCP.
1-3 years of experience in a Cloud Data Engineering role or a related field.
Strong programming skills in Python for data manipulation, scripting, and automation.
Hands-on experience with Apache Spark for big data processing.
Experience with modern data warehouse technologies such as Redshift, Snowflake, or BigQuery.
Proficiency in SQL and experience writing complex queries for data analysis and transformation.
Experience in building, maintaining, and troubleshooting ETL/ELT pipelines.
Understanding of data modeling principles and best practices.
Nice-to-Have Qualifications:
Knowledge of workflow orchestration tools like Airflow or transformation frameworks like dbt (see the DAG sketch after this list).
Familiarity with data visualization tools (e.g., Tableau, Power BI, or Looker).
Understanding of data governance and security best practices.
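As referenced above, a minimal Airflow DAG sketch (assuming Airflow 2.4+; the DAG id, schedule, and task bodies are illustrative):

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("extract step")  # placeholder task body

def transform():
    print("transform step")  # placeholder task body

def load():
    print("load step")  # placeholder task body

with DAG(
    dag_id="daily_etl",
    start_date=datetime(2024, 11, 1),
    schedule="@daily",
    catchup=False,
):
    extract_t = PythonOperator(task_id="extract", python_callable=extract)
    transform_t = PythonOperator(task_id="transform", python_callable=transform)
    load_t = PythonOperator(task_id="load", python_callable=load)
    extract_t >> transform_t >> load_t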
Date Posted: 22/11/2024
Job ID: 101160095