Responsible for setting up a scalable data warehouse and building data pipelines to integrate all of Klub's data from various sources.
Set up data-as-a-service to expose the required data through APIs.
Develop a good understanding of how finance data works.
Standardise and optimise design thinking across the technology team.
Collaborate with stakeholders across engineering teams on short- and long-term architecture decisions.
Build robust data models that support the reporting requirements of the business, ops, and leadership teams.
Participate in peer reviews and provide code/design feedback.
Own the problem and deliver it to success.
Requirements
Prior experience with backend and data engineering systems.
At least 2.5 years of working experience in distributed systems or data engineering.
Deep understanding of the Python tech stack, including libraries and frameworks such as Flask, SciPy, NumPy, and pytest.
Good understanding of Apache Airflow or similar orchestration tools.
Good knowledge of data warehouse technologies such as Apache Hive.
Good knowledge of Apache Spark (PySpark) or similar.
Good knowledge of building analytics services on top of the data for various reporting and BI needs.
Good knowledge of data pipeline/ETL tools.
Good knowledge of Trino, GraphQL, or similar query engine technologies.
Deep understanding of dimensional data modelling concepts.
Familiarity with RDBMS (MySQL/PostgreSQL) and NoSQL (MongoDB/DynamoDB) databases, as well as caching (Redis or similar).
Proficiency in writing SQL queries.
Good knowledge of Kafka.
Ability to write clean, maintainable code.
Nice to have
Built a data warehouse from scratch and set up scalable data infrastructure.
Prior experience in fintech.
Prior experience in data modelling.
Prior experience working with open-source data engineering tools.