Education: Top tier-one college
We are growing rapidly and seeking a strong Data Engineer to be a key member of
the Data and Business Intelligence organization with a focus on deep data engineering
projects. You will be joining as one of the few initial data engineers as part of the data
platform team in our Bengaluru office. You will have an opportunity to help define our
technical strategy and data engineering team culture in India.
You will design and build data platforms and services, and manage our cloud data
infrastructure, which fuels strategic business decisions across products.
A successful candidate will be a self-starter who drives excellence, is ready to jump into
a variety of big data technologies and frameworks, and is able to coordinate and
collaborate with, as well as mentor, other engineers on the team.
What You'll Be Doing
- Build highly scalable, available, fault-tolerant distributed data processing systems
(batch and streaming) that process hundreds of terabytes of data ingested every
day, alongside a petabyte-scale data warehouse and Elasticsearch cluster
- Build quality data solutions and refine existing diverse datasets into simplified
models that encourage self-service
- Build data pipelines that optimize for data quality and are resilient to poor-quality
data sources
- Own the data mapping, business logic, transformations and data quality
- Perform low-level systems debugging, performance measurement, and optimization
on large production clusters
- Participate in architecture discussions, influence product roadmap, and take
ownership and responsibility over new projects
- Maintain and support existing platforms and evolve them to newer technology
stacks and architectures
We're excited if you have
- Proficiency in Python and PySpark
- Deep understanding of Apache Spark, including Spark tuning, creating RDDs, and
building DataFrames; ability to create Java/Scala Spark jobs for data
transformation and aggregation
- Experience with big data technologies such as HDFS, YARN, MapReduce, Hive, Kafka,
Spark, Airflow, Presto, etc.
- Experience building distributed environments using any of Kafka, Spark, Hive,
Hadoop, etc.
- Good understanding of the architecture and functioning of distributed database
systems
- Experience working with various file formats, such as Parquet and Avro, for large
volumes of data
- Experience with one or more NoSQL databases
- Experience with AWS and/or GCP
- 5+ years of professional experience as a data or software engineer
Date Posted: 16/11/2024
Job ID: 100526635