Role: Senior Software Engineer/Manager (specific to Data Engineering and software development roles)
Experience Range: 7 to 10 years
Job Description:
- 6-7+ years of experience in Hadoop or any cloud big data components (specific to the Data Engineering role), with hands-on Hadoop ecosystem experience (Airflow, Oozie, Hive, HDFS, Sqoop, Pig, MapReduce)
- 4+ years of experience in Spark (Spark batch, Spark Streaming, MLlib, etc.); candidates should be proficient with the Apache Spark framework
- 6-7+ years of experience with the Python programming language
- 4+ years of experience with PySpark data transformation (JSON, CSV, RDBMS, streaming sources) pipeline design, development, and deployment on Kubernetes/on-premises platforms (not cloud based); a minimal sketch of such a pipeline follows this list
- 2+ years of experience in designing and implementing data workflows with Apache Airflow; see the DAG sketch after this list
- Experience with Kafka or equivalent cloud big data components (specific to the Data Engineering role)
- Exposure to Oracle, MySQL, SQL Server, DB2, Teradata, PostgreSQL, and Spark SQL
- Unix/shell scripting experience
- Experience with cloud technologies; GCP preferred
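For illustration, here is a minimal sketch of the kind of PySpark transformation pipeline described above: reading JSON and CSV sources, joining and aggregating them, and writing partitioned output. The paths, column names, and schema are hypothetical placeholders, not part of this JD.

```python
# Minimal PySpark transformation sketch. All paths, column names, and
# schema details are hypothetical placeholders for illustration only.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("example-transform-pipeline")  # hypothetical app name
    .getOrCreate()
)

# Read a JSON source and a CSV source (hypothetical paths).
events = spark.read.json("/data/raw/events/")
customers = spark.read.csv("/data/raw/customers/", header=True, inferSchema=True)

# Join, filter, and aggregate: a typical transformation step.
daily_counts = (
    events.join(customers, on="customer_id", how="inner")
    .where(F.col("event_type") == "purchase")
    .groupBy(F.to_date("event_ts").alias("event_date"))
    .agg(F.count("*").alias("purchase_count"))
)

# Write the result as Parquet, partitioned by date (hypothetical path).
(
    daily_counts.write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("/data/curated/daily_purchase_counts/")
)

spark.stop()
```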
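And a minimal Apache Airflow DAG sketch of the workflow-design skill referenced above. The DAG id, schedule, and commands are hypothetical placeholders; the structure (tasks plus an explicit dependency) is the point.

```python
# Minimal Airflow 2.x DAG sketch. The DAG id, schedule, paths, and
# commands are hypothetical placeholders for illustration only.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_purchase_pipeline",   # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                  # Airflow 2.4+ argument name
    catchup=False,
) as dag:
    # Submit the PySpark job from the previous sketch (hypothetical path).
    transform = BashOperator(
        task_id="run_spark_transform",
        bash_command="spark-submit /opt/jobs/transform.py",
    )

    # Simple downstream validation step (hypothetical script).
    validate = BashOperator(
        task_id="validate_output",
        bash_command="python /opt/jobs/validate.py",
    )

    # Run validation only after the transform succeeds.
    transform >> validate
```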
Additional Requirements:
- Exposure to large-scale enterprise data
- Experience in application support and maintenance of Spark applications
- Experience in optimizing and tuning Spark performance to handle medium- and large-scale data volumes, including performance tuning techniques for large-scale data processing; a configuration sketch follows this list
- Experience working with Continuous Integration/Continuous Deployment (CI/CD) tools
- Experience working on projects implementing solutions through the software development life cycle (SDLC)
- Adherence to clean coding principles: candidates should be capable of producing code that is free of bugs and can be easily understood and maintained by other developers
- Strong teamwork abilities: developers in this role collaborate closely with data scientists and other backend developers, so candidates should exhibit excellent communication and collaboration skills
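As a reference point for the tuning expectation above, the sketch below shows the kind of Spark configuration knobs commonly adjusted for large data volumes (shuffle parallelism, adaptive execution, executor sizing). The specific values are hypothetical; appropriate settings depend on cluster size and workload.

```python
# Illustrative Spark tuning knobs; the values are hypothetical examples.
# Real settings depend on cluster resources, data volume, and skew.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("tuned-batch-job")                       # hypothetical app name
    .config("spark.sql.shuffle.partitions", "400")    # size shuffles to the data, not the 200 default
    .config("spark.sql.adaptive.enabled", "true")     # let AQE coalesce and skew-split partitions at runtime
    .config("spark.executor.memory", "8g")            # per-executor heap
    .config("spark.executor.cores", "4")              # concurrent tasks per executor
    .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")  # faster serialization
    .getOrCreate()
)

# Typical in-code moves alongside the config above: repartition on the
# join key before a wide join, and cache a DataFrame that is reused.
df = spark.read.parquet("/data/large_table/")         # hypothetical path
df = df.repartition(400, "join_key").cache()
```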
Good To Have:
- NoSQL, Druid, Elasticsearch, Google BigQuery
Minimum Qualifications:
- Bachelor's Degree in Engineering, Computer Science, CIS, or related field (or equivalent work experience in a related field)
Matrix:
Skill Set:
- Hadoop/MapReduce (HDFS, Sqoop, Hive, Pig, MapReduce, Oozie) - At least 5 years - Mandate
- Spark, PySpark - At least 3 years - Mandate
- Python - Mandate
- Apache Airflow - Mandate
- Kafka/Spark Streaming - Good To Have
- Databases, SQL (mention names as per JD) - At least 5 years - Mandate
- Exposure to Kubernetes/on-premises - Mandate
- Experience in handling large-scale enterprise data - Mandate
- ETL - Mandate
- Unix/Shell Scripting - Mandate
- MLlib - Good To Have
- GCP - Mandate
- Druid - Good To Have
- Elasticsearch - Good To Have
- Exposure to working with CI/CD pipelines - Mandate
- Any cloud experience (mention cloud) - Mandate