The Role:
We are looking for a Senior SRE with 5+ years of experience to work primarily with our
application development team. An ideal candidate would have extensive experience
building cloud infrastructure on Google Cloud with Terraform and have strong
experience running workloads that scale on Google Kubernetes Engine (GKE).
Join the efforts of our Site Reliability Engineering team, establish best practices, and
help shape the SRE culture
Help our teams build and improve our Google Cloud infrastructure
Work closely with software engineers to deploy and properly scale applications, and
to choose the right tool for the job, whether the solution should be serverless or
containerized in a Kubernetes service
Secure and instrument the Kubernetes clusters, containers, and cloud resources
in use
Enjoy a fast-paced environment that challenges you to adapt to change swiftly
Focus on solving problems with simple, concise, maintainable, and transparent
techniques
Implement revenue-producing and cost-saving features, and discover and
contribute your own
Work independently, and learn from and mentor other team members
Apply statistical measurements to your systems to separate the signal from the noise
- What you bring to the table:
5+ years of experience as a Site Reliability Engineer/Cloud Engineer/Software
Engineer
Extensive experience using cloud resources and building infrastructure on Google
Cloud Platform using Terraform
Experience configuring and deploying containerized workloads on Kubernetes,
securing and monitoring them, and troubleshooting the issues they produce
Experience building and troubleshooting containers
Experience with BigQuery, Bigtable, Pub/Sub, etc.
Experience establishing security practices and meeting standards in support of
the application development team
Fluent in Python or Go
Excellent grasp of CS fundamentals, the Linux operating system, and common
GNU/Linux tools
BS/MS in CS/EE or equivalent experience
Experience working closely with software engineering teams to accelerate
output
Experience contributing to or writing and enforcing SOC2 policies
Experience with distributed systems and highly parallel processes
Experience building CI/CD pipelines for and with containers
Experience deploying gRPC services and making them available securely for public
consumption
Experience with Datadog, Codefresh, or Jenkins
- Mandatory Skills and Experiences:
Expert in Google Cloud Platform (GCP)
Kubernetes - GKE
Infrastructure as Code (IaC) with GCP
Python or Go
Linux operating system and common GNU/Linux tools
GCP CI/CD pipelines, especially for containerized applications
Secrets Management with GCP
Job Types: Full-time, Permanent
Work Location: In person