You will design, deploy, and continuously improve our cloud infrastructure, focusing on reliability, scalability, and security.
Provide input into long-range platform requirements and operational guidelines, with a focus on automation and continuous improvement.
Perform incident management duties: investigate incidents, follow up on them, and deliver structural fixes.
Raise the standard of our engineering excellence by implementing best practices for coding, testing, and deployment.
Provide organization-wide IT support.
Guide other teams and departments in implementing security standards across the company, and conduct risk management activities.
Help implement compliance policies and procedures, including security patch management for our entire IT asset inventory and our business continuity plan (e.g., backups, disaster recovery).
6+ years of experience in system engineering, DevOps, SRE, or similar positions.
A deep understanding of deployment and monitoring strategies for reliable, highly scalable, and robust cloud infrastructure.
Expertise in cloud and container technologies such as AWS, Azure, Kubernetes, and Docker.
Strong knowledge of automation and infrastructure-as-code technologies such as Terraform, PowerShell, Puppet, and Ansible.
Experience working with complex CI/CD pipelines for production releases in a microservices-based architecture.
An organized self-starter with an ability to follow through on tasks under minimal supervision.
Passionate about improving skills and learning new technologies.
Experience in at least one modern programming language such as Python, Java, or JavaScript (Node.js).