Design, deploy, and maintain highly available and scalable data infrastructure on Azure OpenAI, databases, and event-driven services.
Monitor and optimize the performance of AI workloads.
Collaborate with cross-functional teams, including data engineers, data scientists, and developers, to provide technical guidance and support in implementing best practices.
Ensure data governance policies and practices are followed to maintain data integrity, security, and compliance.
Troubleshoot and resolve issues related to data infrastructure, working closely with operations and development teams.
Implement automation and monitoring tools to streamline operations and improve system reliability.
Plan and execute disaster recovery procedures and backup strategies for data platforms.
Stay up to date with industry trends and emerging technologies related to data management, analytics, and cloud computing.
Proven experience as an SRE or in a similar role, with a focus on data infrastructure and analytics.
Strong expertise in managing and optimizing Azure OpenAI or event-driven applications in Azure.
In-depth knowledge of data governance principles, data security, and compliance requirements.
Experience with performance optimization techniques for large-scale data processing and analytics workloads.
Experience managing Azure cloud services, including compute, storage, networking, and security.
Familiarity with AI services, particularly OpenAI, for implementing machine learning and natural language processing solutions.
Proficiency in Terraform for infrastructure as code management and automation.
Knowledge of at least one database technology (SQL or NoSQL) for data storage and management is required.
Proficiency in scripting and automation using languages such as Python, PowerShell, or Bash.
Familiarity with cloud platforms, preferably Microsoft Azure, and related services (Azure Data Factory, Azure Data Lake Analytics, etc.).
Solid understanding of containerization technologies, such as Docker and Kubernetes.
Strong problem-solving skills and the ability to troubleshoot complex issues in a distributed data environment.
Excellent communication and collaboration skills to work effectively with cross-functional teams.