Title: Senior Data Engineer
Location: Financial District, Hyderabad - Work from Office
Type: Full-time role
We are looking for someone who can:
- Support large distributed data pipelines that span Scala/Python, PySpark, AWS, Airflow, Databricks, and Snowflake. Some knowledge of Airflow is good to have, in order to understand how DAGs work. The candidate doesn't have to be an expert in any of these, but should know how these products work and be able to learn along the way.
- Quickly triage issues in the data pipeline ecosystem: navigate through the Airflow and Databricks logs, look at other metrics, and follow the pre-written runbook to decide what to do. Depending on the SLA of the issue and how long it has been open, the runbook may say to send an email to someone, fix the issue in a certain way, or escalate it to somebody else.
- Write some PySpark or Spark Scala code during version upgrades, such as an Airflow or Databricks upgrade. Ensure all existing pipelines work as is, or identify what to change if there is a backward-compatibility break. The candidate doesn't have to be an expert coder, but should be able to understand the code and learn.
- Do interesting PoCs as and when required
Knowledge: Airflow, Databricks, Snowflake, Spark, ability to triage & troubleshoot, ability to fix stuff. Doesn't have to be a super coder.
The role calls for a senior engineer because of the experience they bring in working with data pipelines and the common issues they have come across.