Position- GCP Data Engineer
Location- Gurgaon (Hybrid)
The GCP Data Engineer will be responsible for designing, developing, and maintaining data pipelines and data infrastructure on Google Cloud Platform (GCP). This role requires expertise in data engineering best practices, cloud architecture, and big data technologies. The ideal candidate will work closely with data scientists, analysts, and other stakeholders to ensure the availability, reliability, and efficiency of data systems, enabling data-driven decision-making across the organization.
Key Responsibilities- Data Pipeline Development
- Design, develop, and maintain scalable and efficient ETL/ELT pipelines on GCP.
- Implement data ingestion processes from various data sources (e.g., APIs, databases, file systems).
- Ensure data quality, integrity, and reliability throughout the data lifecycle.
Cloud Architecture
- Design and implement data architecture on GCP using services such as BigQuery, Dataflow, Pub/Sub, Cloud Storage, and Cloud Composer.
- Optimize and manage data storage and retrieval processes to ensure high performance and cost efficiency.
- Ensure data infrastructure is secure, scalable, and aligned with industry best practices.
Big Data Processing
- Develop and manage large-scale data processing workflows using Apache Beam, Dataflow, and other big data technologies.
- Implement real-time data streaming solutions using Pub/Sub and Dataflow (an illustrative pipeline sketch follows this list).
- Optimize data processing jobs for performance and cost.
Collaboration and Communication
- Work closely with data scientists, analysts, and other stakeholders to understand data requirements and deliver solutions that meet business needs.
- Communicate technical concepts effectively to both technical and non-technical stakeholders.
- Participate in agile development processes, including sprint planning, stand-ups, and retrospectives.
Data Management and Governance
- Implement and maintain data governance practices, including data cataloging, metadata management, and data lineage.
- Ensure compliance with data security and privacy regulations.
- Monitor and manage data quality and consistency.
Troubleshooting and Support
- Debug and resolve technical issues related to data pipelines and infrastructure.
- Provide support and maintenance for existing data solutions.
- Continuously monitor and improve data pipeline performance and reliability.
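For illustration only (not a requirement of the role), the sketch below shows the kind of pipeline described above: a minimal Apache Beam job, runnable on Dataflow, that reads JSON events from a Pub/Sub subscription and appends them to a BigQuery table. All names (project, region, subscription, table) are placeholders, not systems referenced in this posting.

    # Minimal streaming sketch: Pub/Sub -> parse JSON -> BigQuery.
    # Placeholder names throughout; adjust options for a real deployment.
    import json

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    def run():
        options = PipelineOptions(
            streaming=True,           # Pub/Sub sources require streaming mode
            project="example-project",
            region="asia-south1",
            runner="DataflowRunner",  # use "DirectRunner" for local testing
        )
        with beam.Pipeline(options=options) as pipeline:
            (
                pipeline
                | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
                    subscription="projects/example-project/subscriptions/events-sub"
                )
                | "ParseJson" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
                | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                    table="example-project:analytics.events",
                    write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                    create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
                )
            )

    if __name__ == "__main__":
        run()

The same pattern extends to batch ETL/ELT work by swapping the Pub/Sub source for reads from Cloud Storage or BigQuery.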
Qualifications- Education: Bachelor's degree in Computer Science, Information Technology, Data Science, or a related field.
Experience:
- 5-7 years of experience in data engineering.
- Proven experience with GCP data services and tools.
Technical Skills:
- Proficiency in GCP services (e.g., BigQuery, Dataflow, Pub/Sub, Cloud Storage, Cloud Composer).
- Strong programming skills in languages such as Python.
- Familiarity with big data technologies and frameworks (e.g., Apache Beam, Hadoop, Spark).
- Knowledge of containerization and orchestration tools (e.g., Docker, Kubernetes) is a plus.
Key Competencies- Strong problem-solving skills and attention to detail.
- Excellent communication and teamwork skills.
- Ability to work in a fast-paced, dynamic environment.
- Self-motivated and able to work independently as well as part of a team.
- Continuous learning mindset and a passion for staying up to date with emerging technologies.