Manage Linux infrastructure and web technologies
Deliver end-to-end automation of deployment, monitoring, and infrastructure management
Work to improve the reliability and performance of the next generation of distributed systems and containerized deployments
Diagnose and troubleshoot complex distributed systems handling millions of queries per second
Own monitoring and alerting for production systems, including improvements and changes
Own implementations of new technologies while ensuring proper testing and documentation.
Work on delivery and engineering projects to develop the platform and technologies, striving to automate where possible.
Continuously improve the team, tools, and processes, and support regular agile releases of applications and architectural improvements.
Be available to join the on-call rotation for production issues when required
Design and implement continuous integration (CI) and continuous deployment (CD) for technology platforms and hosted applications.
Troubleshoot system issues, plan for future capacity, and monitor system performance.
8+ years of experience managing Linux infrastructure and web technologies.
Experience managing an estate of 1000+ hosts, including general system administration, networking, backup and restore, monitoring, and troubleshooting.
4 years of experience with scripting and programming languages (Bash/Python/Go) and automating tasks with Puppet/Ansible and Red Hat Satellite. Experience with custom RPM generation.
Strong analytical and troubleshooting skills. You will have resolved complex systems issues in your last role and have a solid understanding of the tools needed to do so.
Excellent communication skills (listening, speaking, and explaining concepts with or without examples).
Calm under pressure and able to work to tight deadlines. You will have brought critical production systems back to life.
2 years of experience in network administration
Experience with Nagios/Check_MK monitoring tools.