Role Description:
Technology is at our core. And innovation is everywhere. But our company is more than datasets, lines of code or A/B tests. We're the thrill of the first night in a new place. The excitement of the next morning. The friends you make. The journeys you take. The sights you see. And the food you sample. Through our products, partners and people, we can empower everyone to experience the world.
We're a truly global e-commerce company, with business operations in nearly every country and city on the planet. And we want to make it easy for everyone, anywhere in the world, to pay for their travel or do business with our platform whenever and however it's convenient for them.
Platform Engineering
Platform Engineering consists of four tracks: Data Centres, Fleet Management Tooling, Virtualisation and Container Compute. The Platform Engineering super track owns collections of primary Tier 0 services. We strive for operational excellence and do not compromise on availability or performance on behalf of our customers.
Private Cloud
The Private Cloud team builds and operates our next-generation virtualisation platform and is responsible for its adoption. We use OpenStack for VMs, Terraform for IaC, and Harness for continuous deployment.
About being an SRE at Booking.com
The core premise of SRE at Booking.com is treating operational issues as software problems. We code our way out of operational problems, addressing availability, scalability, latency, and efficiency challenges within the vast infrastructure here at Booking.
You will impact millions of people all over the globe with your creative solutions
You will be working in one of the biggest e-commerce companies in the world
You will solve interesting problems at scale by writing and deploying code across tens of thousands of servers
You will have the opportunity to collaborate with many of the world's leading SREs
You will be free to launch your own ideas and solutions within our complex production environment
Here are some of the tools and technologies we use to achieve this: Python, Go, Puppet, Kubernetes, Elasticsearch, Prometheus, HAProxy, Cassandra, Kafka, etc.
Years of experience: 4-8 years
What you'll be doing:
Design, develop and implement systems software that improves the stability, scalability, availability and latency of the Booking.com products;
Take ownership of one or more services and have the freedom to do what is best for our business and customers;
Solve problems occurring with our highly available production systems and build solutions and automation to prevent them from happening again;
Build effective monitoring to track the health of your systems, and jump in to handle outages;
Build and run capacity tests to handle the growth of your systems;
Plan for reliability by designing systems to work across our multinational data centers;
Develop tools to help the product development teams successfully deploy thousands of change sets every day.
What you'll bring:
Solid experience in at least one programming language;
Experience with building, operating and maintaining scalable distributed systems, and with operations automation;
Experience with Infrastructure as Code technologies;
Knowledge of cloud computing fundamentals;
Solid foundation in Linux administration and troubleshooting;
Understanding of service level agreements (SLAs) and service level objectives (SLOs);
Additional experience in OpenStack, Kubernetes, Networking, Security or Storage is desirable;
Experience with monitoring and observability technologies such as Prometheus, Graphite, Grafana, Kibana and Elasticsearch is a plus;
Good interpersonal skills;
Proficient command of the English language, both written and spoken.