- We are looking for a top-level Data Engineer 3 to join us at PayPal on our Data Foundational Services (DFS) team
- The Data Engineer 3 will take a development role in designing and developing backend data pipelines using GCP (BigQuery, Bigtable and Dataproc), Big Data (HDFS, Hive, HBase, Kafka), Python programming, Spark and Oracle
- You will collaborate with functional, architecture and product teams to deliver a high-quality product with zero noise in live and zero customer impact
- You will also continuously evaluate our platforms for improvements and work on technological advancement, performance and quality improvement
Your way to impact
You will contribute to designing, coding, and testing the component, pushing it to live, owning it, and improving its performance in live.
Your day to day
Building highly scalable, high-throughput backend data pipelines in GCP using BigQuery and Dataproc
Writing and maintaining pipelines in a Python framework
Writing efficient, optimized code in Python or Java to ingest data feeds from multiple SoR systems
Writing Spark jobs that read data from the backend payments SoR feed into our data pipeline
Building batch jobs with the Spring Framework that process payments data
Building REST services with the Spring Framework that read data from the SoR, apply business logic, and expose the output to clients through a REST API (API development knowledge is a plus)
Scheduling jobs using UC4 and Airflow
What you need to bring
5+ years of experience in the IT industry; experience in the data technology space is preferred.
Proficiency in a programming language such as Python
Working experience with MPP systems and strong SQL programming skills
Knowledge of data warehousing concepts
Working knowledge of big data, cloud databases, and streaming integrations
Good hands-on experience with a cloud platform such as GCP or AWS
Excellent written and oral communication skills
Strong analytical skills including the ability to define problems, collect data, establish facts, and draw valid conclusions.
Expertise in database programming and performance tuning techniques
Familiarity with data movement techniques and best practices for handling large volumes of data.
Experience with data warehousing architecture and data modeling best practices
Experience with file systems, server architectures, and distributed systems
Strong communication skills and willingness to take initiative to contribute beyond basic responsibilities.
Working experience in an Agile methodology is highly preferred.
Knowledge of Hadoop, Spark, HBase and Hive is highly preferred.
Knowledge of and working experience with cloud platforms is highly preferred.
API development knowledge using Spring Boot is an added advantage