Site Reliability Engineers I (SRE I) are specialists in treating operations as a software problem. They
focus on the reliability of systems and services, addressing availability, performance, scalability,
latency, observability and efficiency.
The role of an SRE I is to produce quality technical solutions to problems outlined by the Engineering
Manager and Product Owner. They write good-quality test and production code with little supervision.
They have the knowledge to use the right design pattern at the right time and follow best practices with
minimal support. An SRE I works on concrete topics like any other team member, but delivery is
slower than that of more experienced SREs because of less experience and expertise in some
technologies and less commercial awareness. The primary differences between an SRE I and an SRE
II are the expected velocity and the amount of effort required to produce a
stand-alone solution.
Because the required technical skills and commercial knowledge can vary from one area to another, an SRE
I can wear several hats: member of a business service owner team, owner of a piece of infrastructure,
and/or consultant to product development teams on topics within the scope of Site Reliability Engineering.
Years of experience: 1-3 years
Key Responsibilities
Building software applications
- Has sufficient knowledge to build software applications by using relevant development
languages and applying knowledge of systems, services and tools appropriate for the business
area
- Has sufficient knowledge to refactor and simplify code by introducing design patterns when
necessary
- Has sufficient knowledge to ensure the quality of the application by following standard testing
techniques and methods that adhere to the test strategy
- Has basic knowledge to write readable and reusable code by applying standard patterns and
using standard libraries
- Has basic knowledge to maintain data security, integrity and quality by effectively following
company standards and best practices
Software Systems Design
- Has basic knowledge to evaluate possible architecture solutions by taking into account cost,
business requirements, technology requirements and emerging technologies
- Has basic knowledge to describe the implications of changing an existing system or adding a
new system to a specific area, by having a broad, high-level understanding of the infrastructure
and architecture of our systems
- Has basic knowledge to help grow the business and/or accelerate software development by
applying engineering techniques (e.g. prototyping, spiking and vendor evaluation) and
standards
- Has basic knowledge to meet business needs by designing solutions that meet current
requirements and are adaptable for future enhancements
End to End System Ownership
- Has basic knowledge to own a service end to end by actively monitoring application health and
performance, setting and monitoring relevant metrics, and acting accordingly when they are violated
- Has basic knowledge to reduce business continuity risks and bus factor by applying
state-of-the-art practices and tools, and writing the appropriate documentation such as
runbooks and OpDocs
- Has basic knowledge to reduce risk and obtain customer feedback by using continuous delivery
and experimentation frameworks
- Has sufficient knowledge to independently manage an application or service by working
through deployment and operations in production
- Has basic knowledge to maintain data security, integrity and quality by effectively following
company standards and best practices
Technical Incident Management
- Has basic knowledge to address and resolve live production issues by mitigating the customer
impact within SLA
- Has basic knowledge to improve the overall reliability of systems by producing long term
solutions through root cause analysis
- Has basic knowledge to keep track of incidents by contributing to postmortem processes and
logging live issues.
Automation and toil reduction
- Has basic knowledge to ensure that infrastructure stays current by reducing technical debt,
searching for bottlenecks and preparing for scaling
- Has basic knowledge to reduce cost of operations and maintenance by leveraging new
technologies and automation, and partnering with vendors to ensure we stay current
- Has basic knowledge to reduce human labour by writing small software features that address
availability, scalability, latency and efficiency
Monitoring and Alerting improvements
- Has sufficient knowledge to review and verify performance of production systems and network
infrastructure by continuously monitoring appropriate observability metrics, business KPIs and
capacity planning
- Has basic knowledge to improve application reliability by partnering with development teams to
advise on setting appropriate observability metrics
Critical Thinking
- Has sufficient knowledge to systematically identify patterns and underlying issues in complex
situations, and to find solutions by applying logical and analytical thinking.
- Has sufficient knowledge to constructively evaluate and develop ideas, plans and solutions by
reviewing them, objectively taking into account external knowledge, initiating SMART
improvements and articulating their rationale.
Continuous Quality and Process Improvement
- Has basic knowledge to identify opportunities for process, system and structural improvements
(e.g. performance gains) by examining and evaluating current process flows, methods and
standards.
- Has basic knowledge to design and implement relevant improvements by defining adapted/new
process flows, standards, and practices that enable business performance.
Effective Communication
- Has sufficient knowledge to deliver clear, well-structured, and meaningful information to a
target audience by using suitable communication mediums and language tailored to the
audience
- Has sufficient knowledge to achieve mutually agreeable solutions by staying adaptable,
communicating ideas in clear coherent language and practising active listening
- Has sufficient knowledge to ask relevant (follow-up) questions to properly engage with the
speaker and really understand what they are saying, by applying listening and reflection
techniques
Architectural Guidance
- Has basic knowledge to advise product teams towards a technical solution that meets the
functional, non-functional and architectural requirements by challenging the rationale for an
application design and providing context in the wider architectural landscape
Pre-Employment Screening
If your application is successful, your personal data may be used for a pre-employment screening check by a third party as permitted by applicable law. Depending on the vacancy and applicable law, a pre-employment screening may include employment history, education and other information (such as media information) that may be necessary for determining your qualifications and suitability for the position.