Job description
Role & responsibilities:
- 8+ years of overall work experience.
- 4+ years of experience in Spark, Databricks, Hadoop, and data and ML engineering.
- 4+ years of experience designing architectures using AWS cloud services and Databricks.
- Architect, design, and build big data platforms (data lake / data warehouse / lakehouse) using Databricks services, integrating with wider AWS cloud services.
- Knowledge of and experience with infrastructure as code and CI/CD pipelines to build and deploy the data platform tech stack and solutions.
- Hands-on Spark experience supporting and developing data engineering (ETL/ELT) and machine learning (ML) solutions in Python, Scala, or R.
- Strong distributed-systems fundamentals and experience optimising Spark distributed computing.
- Experience setting up batch and streaming data pipelines using Databricks DLT, jobs, and streams.
- Understand the concepts and principles of data modelling, databases, and tables, and be able to produce, maintain, and update data models across multiple subject areas.
- Design, build, and test medium- to large-scale data pipelines (ETL/ELT) fed from multiple systems, using a range of storage technologies and/or access methods; implement data quality validation to create repeatable and reusable pipelines.
- Experience designing metadata repositories, understanding the range of metadata tools and technologies available to implement them, and working with metadata.
- Understand build automation and implement automated pipelines to build, test, and deploy changes to higher environments.
- Define and execute test cases and scripts, and understand the role of testing in delivery.
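To illustrate the data quality validation mentioned above, here is a minimal, framework-agnostic sketch in plain Python. The column names and rules are hypothetical; in practice this logic would typically live in a Spark or DLT pipeline (e.g. as DLT expectations) rather than pure Python:

```python
# Minimal row-level data quality check: each rule maps a column name to a
# predicate; rows failing any rule are quarantined instead of loaded.
# Column names and rules here are hypothetical examples.

def validate_rows(rows, rules):
    """Split rows into (valid, rejected) lists according to per-column predicates.

    Rejected entries keep the list of failed column names for auditing.
    """
    valid, rejected = [], []
    for row in rows:
        failures = [col for col, check in rules.items()
                    if not check(row.get(col))]
        if failures:
            rejected.append((row, failures))
        else:
            valid.append(row)
    return valid, rejected

rules = {
    "id": lambda v: isinstance(v, int) and v > 0,
    "email": lambda v: isinstance(v, str) and "@" in v,
}

rows = [
    {"id": 1, "email": "a@example.com"},   # passes both rules
    {"id": -5, "email": "bad"},            # fails both rules
]

valid, rejected = validate_rows(rows, rules)
print(len(valid), len(rejected))  # → 1 1
```

Separating the rule definitions from the validation loop keeps the checks reusable across pipelines, which is what makes such a validator "repeatable and reusable" in the sense the bullet above describes.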
Preferred candidate profile:
- Big data technologies: Databricks, Spark, Hadoop, EMR, or Hortonworks.
- Solid hands-on experience with Python, SQL, Spark (including Spark SQL and Spark Streaming), Hive, and Presto.
- Experience with the various Databricks components and APIs: notebooks, jobs, DLT, interactive and job clusters, SQL warehouses, policies, secrets, DBFS, Hive metastore, Glue metastore, Unity Catalog, and MLflow.
- Knowledge of and experience with AWS Lambda, VPC, S3, EC2, API Gateway, IAM (users, roles, and policies), Cognito, Application Load Balancer, Glue, Redshift, Redshift Spectrum, Athena, and Kinesis.
- Experience with source control tools such as Git, Bitbucket, or AWS CodeCommit, and automation tools such as Jenkins, AWS CodeBuild, and CodeDeploy.
- Hands-on experience with Terraform and the Databricks API to automate the infrastructure stack.
- Experience implementing CI/CD and MLOps pipelines using Git, GitHub Actions, or Jenkins.
- Experience delivering project artifacts such as design documents, test cases, traceability matrices, and low-level design documents.
- Build reference architectures, how-tos, and demo applications for customers.
- Willingness to complete relevant certifications.
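The Terraform-and-API automation mentioned above ultimately drives the Databricks REST API. As a hedged illustration, here is a small Python sketch that composes a Jobs API 2.1 job-creation payload; the host, token, notebook path, and cluster settings are placeholders, and the request itself is built but not sent:

```python
import json
from urllib import request

# Hypothetical workspace settings -- placeholders, not a real endpoint or token.
DATABRICKS_HOST = "https://example.cloud.databricks.com"
DATABRICKS_TOKEN = "dapi-PLACEHOLDER"

def job_payload(name, notebook_path, spark_version, node_type, workers):
    """Build a Databricks Jobs API 2.1 job-creation payload (single notebook task)."""
    return {
        "name": name,
        "tasks": [{
            "task_key": "main",
            "notebook_task": {"notebook_path": notebook_path},
            "new_cluster": {
                "spark_version": spark_version,
                "node_type_id": node_type,
                "num_workers": workers,
            },
        }],
    }

def create_job(payload):
    """POST the payload to /api/2.1/jobs/create (requires a real workspace)."""
    req = request.Request(
        f"{DATABRICKS_HOST}/api/2.1/jobs/create",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {DATABRICKS_TOKEN}",
            "Content-Type": "application/json",
        },
    )
    return request.urlopen(req)

payload = job_payload("nightly-etl", "/Repos/team/etl",
                      "13.3.x-scala2.12", "i3.xlarge", 2)
print(payload["tasks"][0]["task_key"])  # → main
```

In practice the Terraform Databricks provider wraps calls like this behind declarative resources (e.g. a jobs resource), so the same payload shape appears there as HCL attributes rather than hand-written JSON.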
We are looking for an experienced Data Architect to join our specialist team to work on our data engineering, data science, and geospatial projects and products. You will use advanced data engineering, Databricks, cloud services, infrastructure as code, the Linux stack, data quality tooling, and machine learning to build data platform architectures and solutions that follow our data architecture standards and principles.
To succeed in this Data Architect position, you should have strong knowledge of the cloud and the Databricks platform, strong analytical skills, and the ability to combine data from different sources and develop data pipelines using the latest libraries and data platform standards.
If you are detail-oriented, with excellent organizational skills and experience in this field, we'd like to hear from you.
What you can expect from us
We appreciate that individual growth is important, and we support every aspect of your personal development.
- Hands-on experience with AWS, Azure, and GCP.
- Support for any certification you pursue.
- Opportunities to build leadership skills.
- Provident Fund.
Perks and benefits:
Medical insurance coverage for self and family.
Onsite travel for customer engagements.
Reimbursement for certifications relevant to the project.
Role:
Data Engineer
Industry Type:
IT Services & Consulting
Department:
Engineering - Software & QA
Employment Type:
Full Time, Permanent
Role Category:
Software Development
Education
UG:
Any Graduate