Role: Senior Software Engineer/Manager (specific to Data Engineering and software development roles)
Experience Range: 7 to 10 years
Job Description:
- 6-7+ years of experience in Hadoop or any cloud big data components (specific to the Data Engineering role), with hands-on Hadoop ecosystem experience (Airflow, Oozie, Hive, HDFS, Sqoop, Pig, MapReduce)
- 4+ years of experience in Spark (Spark batch, Spark Streaming, MLlib, etc.); candidates should be proficient with the Apache Spark framework
- 6-7+ years of experience with the Python programming language
- 4+ years of experience with PySpark data transformation (JSON, CSV, RDBMS, streaming sources) pipeline design, development, and deployment on Kubernetes/on-premises platforms (not cloud based); a minimal sketch of such a pipeline follows this list
- 2+ years of experience in designing and implementing data workflows with Apache Airflow; see the DAG sketch after this list
- Experience with Kafka or equivalent cloud big data components (specific to the Data Engineering role)
- Exposure to Oracle, MySQL, SQL Server, DB2, Teradata, PostgreSQL, and Spark SQL
- Unix/shell scripting experience
- Experience with cloud technologies; GCP preferred
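For illustration, here is a minimal sketch of the kind of PySpark transformation pipeline described above: reading JSON and CSV sources, joining and aggregating them, and writing partitioned output. The paths, column names, and schema are hypothetical placeholders, not part of this JD.

```python
# Minimal PySpark transformation sketch. All paths, column names, and
# schema details are hypothetical placeholders for illustration only.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("example-transform-pipeline")  # hypothetical app name
    .getOrCreate()
)

# Read a JSON source and a CSV source (hypothetical paths).
events = spark.read.json("/data/raw/events/")
customers = spark.read.csv("/data/raw/customers/", header=True, inferSchema=True)

# Join, filter, and aggregate: a typical transformation step.
daily_counts = (
    events.join(customers, on="customer_id", how="inner")
    .where(F.col("event_type") == "purchase")
    .groupBy(F.to_date("event_ts").alias("event_date"))
    .agg(F.count("*").alias("purchase_count"))
)

# Write the result as Parquet, partitioned by date (hypothetical path).
(
    daily_counts.write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("/data/curated/daily_purchase_counts/")
)

spark.stop()
```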
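And a minimal Apache Airflow DAG sketch of the workflow-design skill referenced above. The DAG id, schedule, and commands are hypothetical placeholders; the structure (tasks plus an explicit dependency) is the point.

```python
# Minimal Airflow 2.x DAG sketch. The DAG id, schedule, paths, and
# commands are hypothetical placeholders for illustration only.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_purchase_pipeline",   # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                  # Airflow 2.4+ argument name
    catchup=False,
) as dag:
    # Submit the PySpark job from the previous sketch (hypothetical path).
    transform = BashOperator(
        task_id="run_spark_transform",
        bash_command="spark-submit /opt/jobs/transform.py",
    )

    # Simple downstream validation step (hypothetical script).
    validate = BashOperator(
        task_id="validate_output",
        bash_command="python /opt/jobs/validate.py",
    )

    # Run validation only after the transform succeeds.
    transform >> validate
```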
Additional Requirements:
- Exposure to large-scale enterprise data
- Experience in application support and maintenance of Spark applications
- Experience in optimizing and tuning Spark performance to handle medium- and large-scale data volumes, including performance tuning techniques for large-scale data processing; a configuration sketch follows this list
- Experience working with Continuous Integration/Continuous Deployment (CI/CD) tools
- Experience working on projects implementing solutions through the software development life cycle (SDLC)
- Adherence to clean coding principles: candidates should be capable of producing code that is free of bugs and can be easily understood and maintained by other developers
- Strong teamwork abilities: developers in this role collaborate closely with data scientists and other backend developers, so candidates should exhibit excellent communication and collaboration skills
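As a reference point for the tuning expectation above, the sketch below shows the kind of Spark configuration knobs commonly adjusted for large data volumes (shuffle parallelism, adaptive execution, executor sizing). The specific values are hypothetical; appropriate settings depend on cluster size and workload.

```python
# Illustrative Spark tuning knobs; the values are hypothetical examples.
# Real settings depend on cluster resources, data volume, and skew.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("tuned-batch-job")                       # hypothetical app name
    .config("spark.sql.shuffle.partitions", "400")    # size shuffles to the data, not the 200 default
    .config("spark.sql.adaptive.enabled", "true")     # let AQE coalesce and skew-split partitions at runtime
    .config("spark.executor.memory", "8g")            # per-executor heap
    .config("spark.executor.cores", "4")              # concurrent tasks per executor
    .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")  # faster serialization
    .getOrCreate()
)

# Typical in-code moves alongside the config above: repartition on the
# join key before a wide join, and cache a DataFrame that is reused.
df = spark.read.parquet("/data/large_table/")         # hypothetical path
df = df.repartition(400, "join_key").cache()
```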
Good To Have:
- NoSQL, Druid, Elasticsearch, Google BigQuery
Minimum Qualifications:
- Bachelor's Degree in Engineering, Computer Science, CIS, or related field (or equivalent work experience in a related field)
Matrix:
Skill Set:
- Hadoop/MapReduce (HDFS, Sqoop, Hive, Pig, MapReduce, Oozie) - At least 5 years - Mandate
- Spark, PySpark - At least 3 years - Mandate
- Python - Mandate
- Apache Airflow - Mandate
- Kafka/Spark Streaming - Good To Have
- Databases, SQL (mention names as per JD) - At least 5 years - Mandate
- Exposure to Kubernetes/on-premises - Mandate
- Experience in handling large-scale enterprise data - Mandate
- ETL - Mandate
- Unix/Shell Scripting - Mandate
- MLlib - Good To Have
- GCP - Mandate
- Druid - Good To Have
- Elasticsearch - Good To Have
- Exposure to working with CI/CD pipelines - Mandate
- Any cloud experience (mention cloud) - Mandate