November 18, 2021 08:31 pm GMT

DevOps & SRE Words Matter: How Our Language has Evolved

As the tech world changes, language changes with it. New technologies always bring new terms and descriptions so that people can talk about them clearly. For example, the emergence of the cloud introduced language to describe the changing relationship between servers and clients. Product providers also shape how their products are described, for example by calling services "cloud-native."

On other occasions, language changes through deliberate effort to influence behavior. Thought leaders will often invent alternative words to describe existing ideas in order to effect cultural change. Even a slight change in diction can massively affect one's engagement, attitude, and even worldview. In this blog, we'll look at how language colours how we perceive our environments, and we'll break down three examples of how language has evolved in tech.

How language affects and shifts world perspectives

We all have associations with language. Because of our past experiences and culture, different types of messages will trigger different emotional responses. The language we use thus influences the way we think. Whether our associations are positive or negative can impact things such as:

  • Whether we dread something or get excited by it
  • How important we perceive something to be
  • If we perceive something to be collaborative or combative...
  • Innovative or legacy
  • Bleeding-edge or mainstream
  • Safe or provocative

Postmortem vs. Retrospective

Both of these terms refer to a document that summarizes a past incident and the steps that were taken to resolve it. Postmortem was originally a medical term dating back to the 1820s. The metaphorical usage of examining other things after their death has been widely used in many industries, including tech.

In recent years, many organizations have begun differentiating the idea of a retrospective from a postmortem as the cultural mindset shifts toward ongoing learning from events and failures. The two practices are commonly considered to have some small differences, such as the timing and content of the documents. However, just as important as these differences are the psychological effects of the terminology being used, especially when these reviews may be conducted in a high-pressure environment. Here are some of the reasons we're using "retrospective" instead of "postmortem" at Blameless.

The negativity of postmortems: death has a negative association in most people's minds. As responders attend to incidents, the negative connotation lingers. Engineers may feel worried about the consequences of an incident, and the idea of death surrounding this process may encourage feelings of guilt and fear. By removing negative associations, people will be more eager to review and look back at what actually occurred and take the time to revisit it as a team.

The finality of postmortems: at Blameless, we don't see failure as the end. We see it as an opportunity to learn and grow, a starting point for positive change. Postmortems are very final; no examination happens post-postmortem. A retrospective implies that you're looking back at something that just happened or occurred a while ago, and that could still have a purpose in the future.

The wide scope of retrospectives: a postmortem is defined by the single moment of failure and works backwards to determine the causes. A retrospective is concerned with more than just the direct causes of failure. Instead, it seeks to tell the complete story of the service, systems, and people, up to and beyond the incident.

We want our incident retrospectives to be documents that we are proud to contribute to, that serve as hubs of learning and impetus for change going forward. We believe the word "retrospective" conveys this intent much better than "postmortem."

Root Cause Analysis vs. Contributing Factors Analysis

When determining why something went wrong, there are several competing schools of thought. Root cause analysis, or RCA, is a popular tool for uncovering the reason for failure. The idea of a root cause as being the primary factor causing failure dates back to the early 1900s, with root cause analysis emerging as a concept in engineering companies in the 1930s. It is commonly attributed to Sakichi Toyoda, founder of Toyota Industries, who developed the Five Whys technique to find root causes.

Contributing factors analysis is a more recent term that has been growing in popularity. It also seeks to understand the causes of an incident, but with a different mindset. That mindset is reflected in the language itself as much as any specific practice. Let's look at some examples of these differences, and why we at Blameless feel contributing factors analysis is more useful.

The singularity of RCAs: the most obvious difference is that a root cause analysis refers to a singular root cause, whereas contributing factors emphasizes multiple factors. This is more important than it may seem. If you set out looking for a singular cause, you'll resist branching out to other impactful areas. For example, if you only look for an engineering cause, you'll disregard factors arising from product design or team culture.

The hierarchy of RCAs: the idea of a root cause is that it is the source from which other causes grow and branch off. Understanding which causes are more significant for the incident is necessary to properly prioritize follow-up items, but it isn't the full story. You also have to consider how these changes will affect the team and system as a whole. Thinking about each factor's contribution without trying to determine which is the root keeps you more open-minded.

The neutrality of contribution: when considering the cause of an incident, you'll be inclined to find failures, mistakes, and other negative things. Instead, you can think about every factor that contributed to the story of the incident, including things that went well, like helpful playbooks and good communication. The totality of this factor analysis gives you a more complete picture of how to respond to incidents going forward.

Blameless advocates SRE as a holistic practice, one that incorporates learning from all available sources. The Contributing Factors Analysis brings in as many sources as possible to best understand incidents.
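To make the contrast with a single root cause concrete, here is a minimal sketch of how contributing factors might be recorded. It is only an illustration: the `Factor` class, the `group_by_area` helper, the area names, and the sample incident are all hypothetical and not part of any Blameless product or API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Factor:
    """One contributing factor (hypothetical structure for illustration)."""
    description: str
    area: str     # e.g. "engineering", "product design", "team culture"
    helped: bool  # True for things that went well, False for things that hurt


def group_by_area(factors):
    """Group factors by area so no single area is treated as 'the root'."""
    grouped = defaultdict(list)
    for factor in factors:
        grouped[factor.area].append(factor)
    return dict(grouped)


# Invented sample data: a mix of negative and positive factors.
factors = [
    Factor("Config change shipped without review", "engineering", False),
    Factor("Feature had no documented rollback path", "product design", False),
    Factor("Failover playbook was up to date", "engineering", True),
    Factor("Clear communication in the incident channel", "team culture", True),
]

by_area = group_by_area(factors)
went_well = [f for f in factors if f.helped]
```

Keeping the factors in a flat list, rather than a tree hanging off one root, mirrors the idea that each factor is considered on its own contribution, and the `helped` flag leaves room for the things that went well.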

Disaster Recovery vs. Incident Response

The overall process initiated by something going wrong has gone by different names over the years. The attitudes people have towards this process have changed alongside the evolution of language and terminology. At first, organizations typically referred to it as disaster recovery. This terminology dates back to the 1970s, when it focused on how systems would recover if natural (or other) disasters wiped out infrastructure and its ability to operate.

As IT systems became more virtual, outages started to be caused by a much wider range of technical problems, not just natural disasters. Organizations moved to referring to this process as incident response to reflect the range of problems and the new processes and tools. The processes themselves also evolved along with the technology. Let's look at how these terms reflect the attitudes of each era, and why we now use incident response.

The singularity of recovery: incident response, sometimes referred to as incident management, is much more than just restoring the environment to its previous state. After services are back online, you still need to gather information from the incident itself and build a retrospective, develop action items to carry the learning forward, and review the effectiveness of the response steps and procedures. Recovery is really only the first step towards resolution, and the word doesn't convey how you can get the most learning and improvement from each incident.

The severity of disasters: people see disasters as major catastrophic events. Setting up policies and procedures to trigger only in the event of a disaster is a very high bar. However, your incident response process should work just as efficiently for all incidents. In other words, not all incidents are Sev 1, so knowing the right steps to take for each incident is equally important. We believe there's learning in every incident, and so every incident is worth responding to properly.

The inevitability of incidents: disasters are also thought of as something to avoid at all costs. Any effort spent on reducing the chances of a disaster would be justified, given how severe disasters can be to both customers and engineering teams. A goal of zero disasters is reasonable. However, we know that 100% reliability is impossible. By recognizing the inevitability of incidents, you embrace them and avoid overspending on infrastructure and other resources in trying to prevent them. Using the term "incidents" rather than "disasters" helps team members understand their true inevitability and impact.
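As a rough sketch of what "the right steps for each incident" could look like, the example below maps severity levels to response steps. The severity levels beyond Sev 1, the step names, and the `response_plan` function are assumptions made up for illustration; real severity definitions vary by organization.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical severity scale; lower number means more severe."""
    SEV1 = 1
    SEV2 = 2
    SEV3 = 3


# Hypothetical response steps per severity, for illustration only.
RESPONSE_STEPS = {
    Severity.SEV1: ["page on-call", "open incident channel", "notify customers"],
    Severity.SEV2: ["page on-call", "open incident channel"],
    Severity.SEV3: ["create ticket"],
}


def response_plan(severity):
    """Return the steps for a severity; every incident ends with a retrospective."""
    steps = list(RESPONSE_STEPS[severity])
    # Even low-severity incidents get a retrospective, since each one
    # is treated as a learning opportunity.
    steps.append("write a retrospective")
    return steps


sev1_plan = response_plan(Severity.SEV1)
sev3_plan = response_plan(Severity.SEV3)
```

The point of the sketch is that the process is defined for every severity level, not only for the most catastrophic one, and that the learning step is shared by all of them.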


Original Link: https://dev.to/blameless/devops-sre-words-matter-how-our-language-has-evolved-dp5
