- This position helps support batch and real-time data pipelines, built on various data analytics processing frameworks, that serve data science practices for the Marketing and Finance business units
- This position supports the integration of data from various sources, performs extract, transform, load (ETL) data conversions, and facilitates data cleansing and enrichment
- This position performs full systems life cycle management activities, such as analysis, technical requirements, design, coding, testing, and implementation of systems and applications software
- This position participates in and contributes to synthesizing disparate data sources to support reusable and reproducible data assets
RESPONSIBILITIES
- Supervises and supports data engineering projects and builds solutions by leveraging a strong foundational knowledge of software/application development. This is a hands-on role.
- Develops and delivers data engineering documentation.
- Gathers requirements, defines the scope, and performs the integration of data for data engineering projects.
- Recommends analytic reporting products/tools and supports the adoption of emerging technology.
- Performs data engineering maintenance and support.
- Provides the implementation strategy and executes backup, recovery, and technology solutions to perform analysis.
- Uses ETL tools to pull data from various sources and load the transformed data into a database or business intelligence platform.
- Codes in programming languages used for statistical analysis and modeling, such as Python/Spark.
REQUIRED QUALIFICATIONS
- Literate in the programming languages used for statistical modeling and analysis, data warehousing and Cloud solutions, and building data pipelines.
- Proficient in developing notebooks in Databricks using Python, Spark, and Spark SQL.
- Strong understanding of a cloud services platform (e.g., GCP, Azure, or AWS) and all the data life cycle stages. Azure is preferred.
- Proficient in using Azure Data Factory and other Azure features such as Logic Apps.
- Knowledge of Delta Lake, Lakehouse, and Unity Catalog concepts is preferred.
- Strong understanding of cloud-based data lake systems and data warehousing solutions.
- Has used Agile concepts for development, including Kanban and Scrum.
- Strong understanding of the data interconnections between the organization's operational and business functions.
- Strong understanding of the data life cycle stages: data collection, transformation, analysis, secure data storage, and data accessibility.
- Strong understanding of the data environment to ensure that it can scale for the following demands: data throughput, increasing data pipeline throughput, analysis of large amounts of data, real-time predictions, insights and customer feedback, data security, data regulations, and compliance.
- Strong knowledge of algorithms and data structures, as well as data filtering and data optimization.
- Strong understanding of analytic reporting technologies and environments (e.g., Power BI, Looker, Qlik).
- Understands distributed systems and the underlying business problem being addressed, and guides team members on how their work will assist by performing data analysis and presenting findings to stakeholders.
- Bachelor's degree in MIS, mathematics, statistics, or computer science, international equivalent, or equivalent job experience.
REQUIRED SKILLS
3 years of experience with Databricks, Apache Spark, Python, and SQL
PREFERRED SKILLS
Delta Lake, Unity Catalog, R, Scala, Azure Logic Apps, cloud services platform (e.g., GCP, Azure, or AWS), and Agile concepts.