About The Company:
Ara's client is a leading IT solutions provider offering Applications, Business Process Outsourcing (BPO), and Infrastructure services globally through a combination of technology know-how and domain and process expertise. The accolades it has been garnering reflect its sustained focus on delivering quality solutions across verticals that meet the demanding requirements of its customers: it was recently felicitated by the Economic Times as the most distinguished digital company of 2015, and it has been ranked 29th among the 100 largest financial technology vendors by American Banker and BAI in the FinTech Forward Top 100 rankings.
The Role:
SSE - AWS Databricks and PySpark
Key Responsibilities:
- Develop and maintain data pipelines using AWS Databricks and Big Data technologies (see the sketch after this list for the kind of pipeline involved).
- Perform development tasks using Python/Scala, Spark SQL, and DataFrames.
- Work with Databricks, Data Lake, and SQL to manage and analyze large datasets.
- Optimize performance, troubleshoot, and debug Spark applications.
- Utilize AWS services such as S3, Redshift, EC2, and Lambda for data processing and storage.
- Use DevOps tools like Kubernetes for deployment and scaling.
- Develop SQL with a focus on query optimization and tuning in Redshift.
- Develop notebooks (e.g., Jupyter, Databricks, Zeppelin) for data science and analysis.
- Write scripts in Python and other programming languages for data-related tasks.
- Collaborate within an Agile Scrum environment to deliver projects.
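
For context, the sketch below illustrates the kind of pipeline work described above: ingesting raw data from S3 into a DataFrame, transforming it with DataFrame operations and Spark SQL, and writing curated output back to the data lake. This is a minimal, illustrative example only, not the client's codebase; the bucket names, paths, and column names are all hypothetical.

    # Illustrative only: buckets, paths, and schema are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # On Databricks a SparkSession named `spark` already exists;
    # getOrCreate() returns it rather than building a new one.
    spark = SparkSession.builder.appName("orders-daily-pipeline").getOrCreate()

    # Ingest raw JSON events from S3 into a DataFrame.
    orders = spark.read.json("s3://example-raw-bucket/orders/2024-01-01/")

    # Typical DataFrame transformations: filter, derive columns, aggregate.
    daily_revenue = (
        orders
        .filter(F.col("status") == "COMPLETED")
        .withColumn("order_date", F.to_date("created_at"))
        .groupBy("order_date", "region")
        .agg(
            F.sum("amount").alias("revenue"),
            F.countDistinct("customer_id").alias("unique_customers"),
        )
    )

    # Spark SQL over the same data: register a temp view and query it.
    orders.createOrReplaceTempView("orders")
    top_regions = spark.sql("""
        SELECT region, SUM(amount) AS revenue
        FROM orders
        WHERE status = 'COMPLETED'
        GROUP BY region
        ORDER BY revenue DESC
        LIMIT 10
    """)
    top_regions.show()

    # Persist curated output back to the lake as partitioned Parquet;
    # on Databricks this would more commonly be a Delta table.
    daily_revenue.write.mode("overwrite").partitionBy("order_date").parquet(
        "s3://example-curated-bucket/daily_revenue/"
    )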
Skills Required:
- Strong knowledge and hands-on experience with AWS Databricks.
- Proficiency in Python/Scala programming, particularly for data processing tasks.
- Expertise in Spark SQL, DataFrames, and Spark application development.
- Experience with Big Data pipeline development and deployment.
- Familiarity with AWS services, including S3, Redshift, EC2, and Lambda.
- Knowledge of DevOps practices, with experience in Kubernetes for container orchestration.
- Proficiency in SQL development, including optimization and tuning techniques in Redshift.
- Hands-on experience with notebooks like Jupyter, Databricks, or Zeppelin.
- Experience with PySpark for large-scale data processing.
- Proficiency in scripting, primarily in Python; experience with additional scripting languages is a plus.
Qualifications & Experience:
- 5+ years of experience with AWS Databricks
- Bachelor of Engineering (computer science background preferred)