DevOps 2.0: An Insight To Site Reliability Engineering (SRE)

 

 

NovelVista

Last updated 20/07/2021


How often do you focus on adopting a new fashion trend and keeping up with it forever?

We bet that hasn’t happened even once. Before one fashion trend has even faded, a
new one comes on board and you are all hyped about it. Isn’t that right?

The same thing happens with technology as well. Once a new one comes on board, it
becomes the most trending one! Take Site Reliability Engineering (SRE), the
much-adored bridge between development and operations nowadays. By now, there must be
a lot of questions in your mind.

Some of them may be:
“What is Site Reliability Engineering?”

“What are the SRE principles?”

“Is site reliability engineer a good job?”

“What is the role of a Site Reliability Engineer?”

“What are the similarities between SRE and DevOps?”

In this blog, we are going to answer all of the questions mentioned above. If you have any
more questions, you can always post them in the comment section.

Do you know how much a Site Reliability Engineer gets paid? It starts from $136,836
per year. Can you believe that?

Source: Indeed

But why are Site Reliability Engineers in such high demand? Let’s start with the definition.

What Is Site Reliability Engineering (SRE)?

Site Reliability Engineering builds a bridge between the Development and
Operations departments. It is a discipline that incorporates aspects of software engineering
and applies them to infrastructure and operations problems. Its main goal is to create
scalable and highly reliable software systems.

According to Benjamin Treynor, founder of Google's Site Reliability Team, SRE is "what
happens when a software engineer is tasked with what used to be called operations."

So, where did the concept of SRE come from? To answer that, we have to go back to
the year 2003. In that year, Benjamin Treynor was in charge of a production team whose
end goal was to make Google websites more available, so that they were always able to
provide service.

Being a software engineer, Benjamin ran the team the way he would have wanted to work
himself if he were a Site Reliability Engineer. He had the team spend about half of their
time on operations work so that they could understand the problems in production and
contribute to development in a better way. The team Benjamin Treynor managed is
Google’s SRE team now.

You might ask now: we already have DevOps dealing with both development and
operations, so why do we need SRE? Is there any similarity between the two? Let’s
look into the principles and key aspects of both to find out!

What is the relationship between SRE and DevOps?

From our previous blog, ITIL Vs DevOps, you already know about DevOps, right?
DevOps is a set of practices for building a culture of collaboration between the
development and operations teams.

DevOps aims to achieve these 5 key points:

  1. Reduce organizational silos
  2. Accept failure as normal
  3. Implement gradual changes
  4. Leverage tooling and automation
  5. Measure everything

The SRE principles line up with each of the points above. Let’s see how SRE achieves them!

1. Reduce organizational silos:

  • SRE shares ownership with developers to create shared responsibility
  • SREs use the same tools that developers use, and vice versa

2. Accept failure as normal:

  • SREs embrace risk
  • SRE quantifies failure and availability in a prescriptive manner using Service Level Indicators (SLIs) and Service Level Objectives (SLOs)
  • SRE mandates blameless postmortems
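
To make the SLI and SLO idea a little more concrete, here is a minimal Python sketch. It assumes availability is measured as the share of successful requests in a measurement window. The function names, the request counts, and the 99.9% target are all hypothetical values chosen for illustration, not figures from Google or from this blog.

```python
def availability_sli(successful_requests: int, total_requests: int) -> float:
    """Return the availability SLI as a fraction between 0 and 1."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return successful_requests / total_requests


def meets_slo(sli: float, slo_target: float) -> bool:
    """True when the measured SLI is at or above the SLO target."""
    return sli >= slo_target


# Hypothetical numbers for one measurement window.
SLO_TARGET = 0.999  # assumed objective: 99.9% of requests succeed
sli = availability_sli(successful_requests=999_500, total_requests=1_000_000)
print(f"SLI = {sli:.4%}, SLO met: {meets_slo(sli, SLO_TARGET)}")
```

The SLI is what you measure, and the SLO is the target you agree to hold it against. Real systems often use other indicators as well, such as latency, but the comparison works the same way.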

3. Implement gradual changes:

  • SRE allows developers and product owners to move faster by reducing
    the cost of failure
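
One common way to keep the cost of failure low is a staged rollout, where a change reaches a small share of traffic first and expands only while it stays healthy. The sketch below is just an illustration under stated assumptions: the stage fractions, the `tolerance` value, and the function name `next_stage` are hypothetical and not taken from any particular tool.

```python
# Assumed rollout stages: fraction of traffic that receives the new version.
ROLLOUT_STAGES = [0.01, 0.10, 0.50, 1.00]


def next_stage(current_fraction: float,
               canary_error_rate: float,
               baseline_error_rate: float,
               tolerance: float = 0.001) -> float:
    """Return the next traffic fraction, or 0.0 to signal a rollback.

    The change is rolled back when the new version's error rate exceeds
    the old version's error rate by more than `tolerance`.
    """
    if canary_error_rate > baseline_error_rate + tolerance:
        return 0.0  # roll back: only a small share of users was affected
    for stage in ROLLOUT_STAGES:
        if stage > current_fraction:
            return stage
    return 1.0  # already fully rolled out


print(next_stage(0.01, canary_error_rate=0.002, baseline_error_rate=0.002))
print(next_stage(0.10, canary_error_rate=0.010, baseline_error_rate=0.002))
```

Because a bad change is caught while it serves only a fraction of users, each individual release carries less risk, which is what lets teams ship more often.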

4. Leverage tooling and automation:

  • SREs have the charter to automate menial tasks away

5. Measure everything:

  • SRE defines prescriptive ways to measure values
  • SRE fundamentally believes that systems operation is a software problem

Hopefully that clears up the confusion. Now, let’s see what a Site Reliability Engineer
has to take care of.
 
What is the role of an SRE?

We have already given you a brief idea of the job role of a Site Reliability Engineer.
Take a look at the following points, and you will find out the details:

  • Site reliability engineers communicate with other engineers, product owners, and
    customers to agree on targets and measures. This helps them to ensure
    system availability. Once everyone has agreed upon a system’s uptime and
    availability targets, it is easy to tell when action is needed.
  • They introduce error budgets in order to measure risk and to balance availability
    against feature development. When reliability targets are realistic, a team has the
    flexibility to deliver updates and improvements to a system.
  • SRE believes in reducing toil. That means automating tasks that would otherwise
    require a human operator to work manually.
  • A site reliability engineer should have an in-depth understanding of the systems and
    their connectivity.
  • Site reliability engineers are tasked with discovering problems early to reduce
    the cost of failure.
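
The error budget mentioned above can be worked out with simple arithmetic. The following Python sketch assumes a request-based availability SLO. The function names, the 99.9% target, and the request counts are hypothetical and only serve to show the calculation.

```python
def error_budget(slo_target: float, total_requests: int) -> float:
    """Number of requests that may fail in the window without breaking the SLO."""
    return (1.0 - slo_target) * total_requests


def budget_remaining(slo_target: float, total_requests: int,
                     failed_requests: int) -> float:
    """Fraction of the error budget still unspent (negative when overspent)."""
    budget = error_budget(slo_target, total_requests)
    if budget <= 0:
        raise ValueError("an SLO of 100% leaves no error budget")
    return 1.0 - failed_requests / budget


# Hypothetical window: 1,000,000 requests against an assumed 99.9% SLO.
remaining = budget_remaining(0.999, 1_000_000, failed_requests=250)
print(f"Error budget remaining: {remaining:.0%}")

# One possible policy (an assumption, not a rule from this article):
# keep shipping features while budget is left, otherwise focus on reliability.
print("Ship new features" if remaining > 0 else "Focus on reliability work")
```

With a 99.9% target, roughly 1,000 of the 1,000,000 requests are allowed to fail, so 250 failures use up about a quarter of the budget. The unspent part is the room the team has for risky changes.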

Conclusion:

Remember how the Hand of the King used to handle everything on the king’s behalf? Back
then, kings used to choose the most intelligent person on the council to be the Hand,
because the Hand made sure everything was running smoothly and worked out the strategy
in collaboration with the king.

A Site Reliability Engineer is much the same: the one on whom the entire project depends.

Do you think you’d be interested in choosing this career path? All you need to do is take up
SRE training and apply for an SRE certification! So what do you think? Ready to do the same?

About Author

NovelVista Learning Solutions is a professionally managed training organization with specialization in certification courses. The core management team consists of highly qualified professionals with vast industry experience. NovelVista is an Accredited Training Organization (ATO) to conduct all levels of ITIL Courses. We also conduct training on DevOps, AWS Solution Architect associate, Prince2, MSP, CSM, Cloud Computing, Apache Hadoop, Six Sigma, ISO 20000/27000 & Agile Methodologies.

 
 
