Lead Data Engineer
Experience: 8+ Years
About Us: We are seeking a talented and experienced Data & AI Engineer with strong Azure cloud competencies to join our dynamic team.
Role Overview: Deliver successful projects in customer environments, bringing use cases into production, executing machine learning projects, and leading large migrations, in order to deliver on the value proposition.
Key Responsibilities:
- Architecture and Design: Establish the architecture and target design for data engineering and machine learning projects.
- Requirement Analysis, Planning, and Estimation: Conduct current-state inventory analysis, review and formalize requirements, estimate effort and resource needs, and develop project and execution plans.
- Advisory Services and Best Practices: Provide troubleshooting, performance tuning, cost optimization, operational runbooks, and mentoring.
- Large Migrations: Assist customers with large migrations to Databricks from Hadoop ecosystems, data warehouses (Teradata, Netezza), ETL engines (Informatica, DataStage, Ab Initio), SAS, SQL, and cloud-based data platforms like Redshift, Snowflake, and EMR.
- Design, Build, and Optimize Data Pipelines: Implement best-in-class Databricks solutions with flexibility for future iterations.
- Production Readiness: Assist with production readiness for customers, including exception handling, production cutover, capture analysis, alert scheduling, and monitoring.
- Machine Learning (ML) Model Review, Tuning, ML Operations, and Optimization: Build and review ML models, ensure ML best practices, manage the model lifecycle, work with ML frameworks, and deploy models in production (see the sketch after this list).
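As a rough, non-authoritative illustration of the model lifecycle work described above, the sketch below logs and registers a model with MLflow on Databricks. The experiment path, model name, and scikit-learn classifier are hypothetical stand-ins, not part of this role's actual stack.

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical experiment path and model name, for illustration only.
mlflow.set_experiment("/Shared/churn-demo")

# Synthetic data as a stand-in for a real feature table.
X, y = make_classification(n_samples=1_000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run():
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)
    mlflow.log_metric("test_accuracy", model.score(X_test, y_test))
    # Log the trained model and register it so it can be reviewed and
    # promoted toward production through the model registry.
    mlflow.sklearn.log_model(model, "model", registered_model_name="churn_classifier")
```

Registering the model is what enables the review, tuning, and production-deployment workflow mentioned above: downstream jobs can load a specific registered version rather than an ad-hoc artifact.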
Must Have:
- Hands-on experience with distributed computing frameworks such as Databricks and the Spark ecosystem (Spark Core, PySpark, Spark Streaming, Spark SQL).
- Willingness to work with product teams to optimize product features/functions.
- Experience with batch workloads and real-time streaming at high data volumes and frequencies (see the streaming sketch after this list).
- Performance optimization on Spark workloads.
- Environment setup, user management, authentication, and cluster management on Databricks.
- Professional curiosity and the ability to learn new technologies and tasks independently.
- Good understanding of SQL and a strong grasp of relational and analytical database management theory and practice.
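As a minimal sketch of the batch-plus-streaming work listed above, assuming PySpark Structured Streaming with a Delta Lake sink on Databricks; all paths and the schema are hypothetical:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events-stream").getOrCreate()

# Hypothetical landing path and schema; on Databricks, Auto Loader or
# Kafka would typically replace this plain file source.
events = (
    spark.readStream
    .format("json")
    .schema("event_id STRING, event_type STRING, amount DOUBLE, event_ts TIMESTAMP")
    .load("/mnt/raw/events/")
)

# Aggregate per event type over 5-minute windows, tolerating late data
# up to 10 minutes via the watermark.
agg = (
    events
    .withWatermark("event_ts", "10 minutes")
    .groupBy(F.window("event_ts", "5 minutes"), "event_type")
    .agg(F.count("*").alias("events"), F.sum("amount").alias("total_amount"))
)

# Write incrementally to a Delta table; the checkpoint makes the query
# restartable, and the trigger bounds micro-batch frequency.
query = (
    agg.writeStream
    .outputMode("append")
    .format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/events_agg")
    .trigger(processingTime="1 minute")
    .start("/mnt/curated/events_agg")
)
```

Tuning a query like this typically revolves around partition counts, state-store size, watermark choice, and trigger interval, which is the kind of Spark performance optimization the role calls for.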
Key Skills:
- Proficiency in Python, SQL, and PySpark.
- Experience with Big Data Ecosystem (Hadoop, Hive, Sqoop, HDFS, HBase).
- Expertise in Spark Ecosystem (Spark Core, Spark Streaming, Spark SQL) / Databricks.
- Strong knowledge of Azure (ADF, ADB, Logic Apps, Azure SQL Database, Azure Key Vault, ADLS, Synapse).
- Familiarity with AWS (Lambda, AWS Glue, S3, Redshift).
- Strong understanding of data modeling and ETL methodology (see the ETL sketch below).
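As a small, hypothetical example of the data modeling and ETL practice expected here (PySpark with Delta Lake; table paths, column names, and the star-schema layout are illustrative assumptions):

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Hypothetical raw (bronze) source table.
orders_raw = spark.read.format("delta").load("/mnt/bronze/orders")

# Cleanse and conform: deduplicate, standardize types, and derive a
# date key of the kind a star-schema fact table joins to a date dimension.
orders_clean = (
    orders_raw
    .dropDuplicates(["order_id"])
    .withColumn("order_amount", F.col("order_amount").cast("decimal(18,2)"))
    .withColumn("order_date_key", F.date_format("order_ts", "yyyyMMdd").cast("int"))
    .filter(F.col("order_amount").isNotNull())
)

# Write the conformed fact table, partitioned by date key for pruning.
(
    orders_clean.write
    .format("delta")
    .mode("overwrite")
    .partitionBy("order_date_key")
    .save("/mnt/silver/fact_orders")
)
```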
If you are a seasoned data engineer looking for a challenging and rewarding opportunity, we would love to hear from you. Apply now and share your CV at [Confidential Information] or via WhatsApp at 9109436045.