Developing ETL Pipelines: Designing, developing, and maintaining scalable and adaptable data pipelines using Python or PySpark to migrate data smoothly from diverse data sources. Hosting these ETL pipelines on AWS EC2, AWS Glue, or AWS EMR and loading the data into cloud storage and database services such as Google BigQuery, AWS S3, Redshift, RDS, and Delta Lake. This includes managing significant data migrations and ensuring seamless transitions between systems.
Implementing Data Quality Check Framework: Establishing and executing data quality checks and validation pipelines using tools such as Python, PySpark, Athena, BigQuery, S3, and Delta Lake to uphold the integrity and accuracy of our datasets.
Creating Mechanisms for Generating ETL Migration Status Reports: Devising a framework to generate concise summary reports detailing data migration progress and to promptly alert stakeholders to any failures within ETL pipelines. This ensures swift resolution of data discrepancies arising from pipeline failures. Implementing this using standard SMTP, Python, and services such as AWS SNS, AWS SES, AWS S3, and Delta Lake.
Data Transformations and Processing: Implementing various data encryption and decryption techniques using Python and PySpark libraries, in addition to generating insightful reports and analyses derived from processed data to aid in informed business decision-making.
Development of APIs: Building APIs using frameworks such as Flask or Django, incorporating diverse authentication and authorization techniques to safeguard the exchange of data. Hosting these APIs on an EC2 server using services such as Gearman, or writing the API logic in AWS Lambda and exposing it through the cloud provider's API Gateway service.
Code Versioning and Deployment: Leveraging GitHub extensively for robust code versioning, deployment of the latest code iterations, seamless transitioning between different code versions, and merging various branches to streamline development and code release processes.
Automation: Designing and implementing automation solutions to streamline manual tasks effectively.
Job Types: Full-time, Permanent
Pay: 700,000.00 - 1,000,000.00 per year
Benefits:
- Health insurance
- Provident Fund
Schedule: Fixed shift
Experience:
- total work: 4 years (Preferred)
Work Location: In person
Application Deadline: 25/06/2024