HCLTech

Data Engineer - PySpark, Scala

Job Description

Responsibilities -

  1. Perform assigned tasks expertly under supervision.
  2. Develop and maintain end-to-end data pipelines using cloud-native solutions to extract, load, and transform data from disparate data sources into a cloud data warehouse (see the sketch after this list).
  3. Format and distribute custom data extracts through various means (e.g., custom SFTPs, RESTful APIs, and other bulk data transfer methods) and optimize data storage options based on business requirements.
  4. Help design and develop database structures and functions, schemas, and database testing protocols.
  5. Contribute to defining company data assets (data models), custom client workflows, and standardized data quality protocols.
  6. Troubleshoot database issues and tune queries, independently and collaboratively, to improve data retrieval times across various systems (e.g., via SQL).
  7. Collaborate with technical and non-technical stakeholders, including IT, Data Science, and team members across a diverse array of business units.
  8. Work closely with the IT team as needed to facilitate, troubleshoot, or develop database connectivity between internal and external resources (e.g., on-premises systems, Azure Data Lakes, Data Warehouses, and Data Hubs).
  9. Help implement and enforce the enterprise reference architecture, and ensure that data infrastructure design reflects enterprise business rules as well as data governance and security guidelines.
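
As a rough illustration of responsibility 2, here is a minimal PySpark sketch of an extract-load-transform pipeline. All paths, container names, and column names are hypothetical, and Parquet stands in for whatever the actual warehouse target is; this is a sketch of the pattern, not a prescribed implementation.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("elt-sketch").getOrCreate()

    # Extract: read raw CSVs from a landing zone (hypothetical ADLS path).
    raw = (
        spark.read.option("header", "true")
        .option("inferSchema", "true")
        .csv("abfss://landing@example.dfs.core.windows.net/orders/")
    )

    # Transform: deduplicate, type the columns, drop unusable rows.
    clean = (
        raw.dropDuplicates(["order_id"])
        .withColumn("order_date", F.to_date("order_date", "yyyy-MM-dd"))
        .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
        .filter(F.col("order_id").isNotNull())
    )

    # Load: write partitioned Parquet to a warehouse staging area; a Delta,
    # Synapse, or Snowflake writer would slot in here depending on the target.
    (
        clean.write.mode("overwrite")
        .partitionBy("order_date")
        .parquet("abfss://warehouse@example.dfs.core.windows.net/staging/orders/")
    )

    spark.stop()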

Criteria -

  1. Experience with the Azure stack (Data Lake/Blob Storage, Azure Data Factory or equivalent, Databricks) and production-level experience with on-premises Microsoft SQL Server required.
  2. Experience with ETL/ELT, taking data from various sources and formats and ingesting it into a cloud-native data warehouse, required.
  3. Experience with Python and PySpark, or with Scala, as well as standard analytic libraries/packages (e.g., pandas, NumPy, dplyr, data.table, stringr, Slick, and/or Kafka) and related distributed computing frameworks required.
  4. Strong verbal and written communication skills required.
  5. Experience with DataRobot, Domino Data Labs, Salesforce MC, and Veeva CRM preferred.
  6. Familiarity with the Snowflake data warehouse preferred.

Preferred - Immediate joiners

More Info

Industry: Other

Function: Technology

Job Type: Permanent

Date Posted: 25/11/2024

Job ID: 101417515

Last Updated: 25/11/2024 06:39:05 PM