Skills:
Statistical Analysis, Machine Learning, Big Data, HDFS, Hive, Impala, Java
Job Description
- Architectural Design: Design and develop robust, scalable, and high-performance data solutions leveraging Cloudera Distribution including Apache Hadoop (CDH) or Cloudera Data Platform (CDP).
- Platform Implementation: Lead the implementation and configuration of Cloudera platform components such as HDFS, Hive, HBase, Impala, Kafka, Spark, and Sentry, ensuring optimal performance and reliability.
- Integration: Integrate Cloudera-based solutions with existing IT infrastructure and applications, ensuring seamless data flow and compatibility.
- Data Management: Design and implement data management strategies including data ingestion, storage, processing, and governance.
- Performance Optimization: Perform performance tuning, monitoring, and troubleshooting of Cloudera clusters to ensure efficient data processing and analytics.
- Security and Compliance: Implement and enforce security measures, data privacy standards, and compliance requirements within the Cloudera environment.
- Documentation and Reporting: Create technical documentation and architecture diagrams, and provide regular reporting on system performance and project status.
- Collaboration: Work closely with cross-functional teams including developers, data scientists, system administrators, and business stakeholders to deliver integrated solutions that meet business requirements.
- Research and Innovation: Stay updated with Cloudera and big data industry trends, evaluate new technologies, and recommend enhancements to improve system architecture and performance.
Desired Profile
- Proven experience (typically 6+ years) as a Cloudera Architect or in a similar role, with hands-on experience in designing, deploying, and implementing Cloudera-based solutions.
- In-depth knowledge of Cloudera ecosystem components and tools such as Cloudera Manager, Navigator, and Hue.
- Strong understanding of big data concepts, data warehousing, and ETL processes.
- Proficiency in programming and scripting languages such as Java, Python, Scala, or similar.
- Strong understanding of distributed computing principles and data processing frameworks (e.g., Spark, Flink).
- Experience with cloud platforms (e.g., AWS, Azure, Google Cloud) and their integration with Cloudera is a plus.
- Excellent analytical and problem-solving skills, with the ability to troubleshoot complex issues and drive resolutions.
- Strong communication skills, with the ability to collaborate effectively with technical and non-technical stakeholders.
Good To Have
- Cloudera Certified Professional (CCP)
- Apache Hadoop Developer Certification
- Other relevant certifications