Choosing Capgemini means choosing a company where you will be empowered to shape your career in the way you'd like, where you'll be supported and inspired by a collaborative community of colleagues around the world, and where you'll be able to reimagine what's possible. Join us and help the world's leading organizations unlock the value of technology and build a more sustainable, more inclusive world.
- Setting up and managing EMR clusters for processing large-scale data using frameworks such as Apache Hadoop, Apache Spark, and Apache Hive.
- Configuring EMR clusters based on specific requirements, including choosing the appropriate instance types, storage configurations, and software settings.
- Implementing and optimizing data processing workflows on EMR clusters, leveraging distributed computing frameworks for tasks such as data cleansing, transformation, and analysis.
- Writing scripts and code to interact with EMR clusters, often using languages like Python, Java, or Scala, to develop and execute data processing jobs.
- Integrating EMR with other AWS services, such as Amazon S3 for storage, AWS Glue for ETL (Extract, Transform, Load), AWS Lambda, and other complementary services, to create end-to-end data pipelines.
- Optimizing cluster performance by fine-tuning configurations, adjusting resource allocation, and implementing best practices for efficient data processing.
- Implementing monitoring solutions to track cluster performance and troubleshoot issues, ensuring the reliability and availability of the big data processing environment.
- Implementing security measures to protect data within EMR clusters, including configuring access controls and encryption and ensuring compliance with security policies.
- Implementing automation for cluster provisioning, scaling, and decommissioning to streamline operations and improve efficiency.
- Overall, an AWS EMR role requires a combination of cloud computing knowledge, big data processing expertise, scripting/coding skills, and a good understanding of data engineering principles.
- AWS certifications, such as the AWS Certified Solutions Architect - Associate or other relevant certifications demonstrating expertise in cloud computing, are often preferred.
- Good communication skills to interact with cross-functional teams, understand business requirements, and effectively convey technical information.
- Ability to collaborate with data engineers, data scientists, and other stakeholders in a team-oriented environment.
Capgemini is a global business and technology transformation partner, helping organizations to accelerate their dual transition to a digital and sustainable world, while creating tangible impact for enterprises and society. It is a responsible and diverse group of 340,000 team members in more than 50 countries. With its strong heritage of over 55 years, Capgemini is trusted by its clients to unlock the value of technology to address the entire breadth of their business needs. It delivers end-to-end services and solutions leveraging strengths from strategy and design to engineering, all fuelled by its market-leading capabilities in AI, cloud and data, combined with its deep industry expertise and partner ecosystem. The Group reported 2023 global revenues of €22.5 billion.
Date Posted: 14/11/2024
Job ID: 100305591
Capgemini was founded by Serge Kampf in 1967 as an enterprise management and data processing company, under the name Société pour la Gestion de l'Entreprise et le Traitement de l'Information (Sogeti). In 1974, Sogeti acquired Gemini Computer Systems, a US company based in New York. In 1975, having made two major acquisitions, CAP (Centre d'Analyse et de Programmation) and Gemini Computer Systems, and following resolution of a dispute with the similarly named CAP UK over the international use of the name 'CAP', Sogeti renamed itself CAP Gemini Sogeti.