August 20, 2020 07:18 pm GMT

What is a Kubernetes Operator and why it matters for SRE

Originally published on Failure is Inevitable.

Kubernetes is an open-source platform for managing containerized workloads and services, including their deployment and configuration. Originally developed at Google, it reached version 1.0 in 2015 and is now maintained by the Cloud Native Computing Foundation. Since its release, it has become a worldwide phenomenon. The majority of cloud native companies use it, SaaS vendors offer commercial prebuilt versions, and there's even an annual convention!

What has made Kubernetes become such a fundamental service? A major factor is its automation capabilities. Kubernetes can automatically make changes to the configuration of deployed containers or even deploy new containers based on metrics it tracks or requests made by engineers. Having Kubernetes handle these processes saves time, eliminates toil, and increases consistency.

If these benefits sound familiar, it might be because they overlap with the philosophies of SRE. But how do you incorporate the automation of Kubernetes into your SRE practices? In this blog post, we'll explain the Kubernetes Operator, the Kubernetes function at the heart of customized automation, and discuss how it can evolve your SRE solution.

What the Kubernetes Operator can do

In "Kubernetes Operators: Automating the Container Orchestration Platform," authors Jason Dobies and Joshua Wood describe an Operator as "an automated Site Reliability Engineer for its application." Given an SRE's multifaceted experience and diverse workload, this is a bold statement. So what exactly can the Operator do?

Kubernetes Operators complete sophisticated tasks

The Operator can complete complex tasks in order to achieve the desired changes in the application's output. It can automatically handle such tasks as:

  • Deploying applications
  • Updating applications to new versions
  • Reconfiguring application settings
  • Scaling applications up and down depending on usage
  • Failure handling
  • Setting up monitoring infrastructure

Without Kubernetes Operators, engineers would need to complete these tasks. Automating them saves time and toil, and makes the procedures and results consistent.

Kubernetes Operators control custom resources and applications

Kubernetes allows you to define custom resources tailored to specific applications. A custom resource is a data object, stored in the Kubernetes API, that describes the application's state. Imagine you have an application that produces new server instances based on usage. You could define a custom resource that records the RAM and disk space of each new instance. A custom resource can also declare a target that the application is trying to match. The Kubernetes Operator then controls the application to reach that target: if the application is spinning up servers that have insufficient RAM or disk space, the Operator can reconfigure the settings to match the desired amount.
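
To make the idea of a target and a current state concrete, here is a minimal sketch in Python. It models a custom resource as a plain dictionary whose "spec" holds the desired state and whose "status" holds the observed state, which is the usual shape of a Kubernetes object. The API group `example.com/v1`, the kind `ServerPool`, the field names, and the `find_drift` helper are all hypothetical and exist only for illustration.

```python
# Hypothetical custom resource, shaped like a Kubernetes object:
# "spec" holds the desired state, "status" holds the observed state.
server_pool = {
    "apiVersion": "example.com/v1",  # assumed API group, for illustration
    "kind": "ServerPool",            # assumed kind, for illustration
    "metadata": {"name": "web-servers"},
    "spec": {"ramGiB": 8, "diskGiB": 100},
    "status": {"ramGiB": 4, "diskGiB": 100},
}


def find_drift(resource):
    """Return {field: (observed, desired)} for every field that differs."""
    spec = resource["spec"]
    status = resource.get("status", {})
    return {
        field: (status.get(field), desired)
        for field, desired in spec.items()
        if status.get(field) != desired
    }


print(find_drift(server_pool))  # {'ramGiB': (4, 8)}
```

Here the servers report 4 GiB of RAM while the target is 8 GiB, so RAM is the only field an Operator would need to act on.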

Kubernetes Operators make stateful decisions

The Kubernetes Operator is able to modify the configuration and usage of an application based on the application's output. This is determined by the custom resources defined for that application. Custom resources showing the desired state and custom resources showing the current state form a loop. The Operator observes the current state and then takes actions that will make the application produce the desired state. After the actions are executed, the current state is reevaluated and the loop begins again.

For example, a custom resource could define the desirable state of a new server instance as some amount of load capability based on its physical resources. The Operator would then adjust the configuration until new instances reached these standards.
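
The observe, act, and reevaluate loop described above can be sketched in a few lines of Python. This is a simplified simulation, not how a production Operator is built: real Operators typically react to change events from the Kubernetes API rather than running a fixed loop. The `Instance` class, the 2 GiB step size, and the function names are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Stand-in for the managed application (hypothetical)."""
    ram_gib: int


def reconcile(desired_ram_gib, instance):
    """One pass: observe, compare, act. Returns True when states match."""
    gap = desired_ram_gib - instance.ram_gib
    if gap == 0:
        return True
    # Act: change RAM by at most 2 GiB per pass, in the direction of the gap.
    instance.ram_gib += max(-2, min(2, gap))
    return False


def run(desired_ram_gib, instance, max_passes=10):
    """Repeat reconcile passes until converged; return the passes used."""
    for passes in range(1, max_passes + 1):
        if reconcile(desired_ram_gib, instance):
            return passes
    raise RuntimeError("did not converge")


server = Instance(ram_gib=2)
print(run(8, server), server.ram_gib)  # 4 8
```

Three passes each add 2 GiB, and the fourth pass observes that the current state matches the desired state and stops.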

Kubernetes Operators and SRE

If you're using Kubernetes, you'll find that building and implementing Operators aligns with your SRE goals.

Operator monitoring, SLIs, and SLOs

When developing the custom resource for your application, you need to choose which signals from the application's output will be monitored by the resource and which targets the Operator will steer the application toward. This is similar to creating SLIs and SLOs.

The process of determining the metrics with the greatest impact is similar for Operators and SLIs. In the Kubernetes Operators textbook, Dobies and Wood suggest looking first at the four golden signals (a concept from Google's SRE book) to determine what the Operator should monitor. These are:

  • Latency
  • Traffic
  • Errors
  • Saturation

Creating Operators for your applications will help you understand what SLIs and SLOs should be set for them. Likewise, setting SLIs and SLOs can help you understand what your Operators should monitor.
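
As a rough illustration of how the four golden signals could feed an Operator's decisions, the Python sketch below groups them into one record and reports which ones exceed a limit. The field names, units, and limit values are assumptions chosen for the example, not recommendations from the book.

```python
from dataclasses import dataclass


@dataclass
class GoldenSignals:
    latency_ms: float   # time taken to serve a request
    traffic_rps: float  # demand, in requests per second
    error_rate: float   # fraction of requests that fail (0.0 to 1.0)
    saturation: float   # fraction of capacity in use (0.0 to 1.0)


# Assumed upper limits, for illustration only.
LIMITS = GoldenSignals(
    latency_ms=300.0, traffic_rps=1000.0, error_rate=0.01, saturation=0.8
)


def breached(observed, limits=LIMITS):
    """Return the names of signals whose observed value exceeds its limit."""
    names = ("latency_ms", "traffic_rps", "error_rate", "saturation")
    return [n for n in names if getattr(observed, n) > getattr(limits, n)]


sample = GoldenSignals(
    latency_ms=120.0, traffic_rps=450.0, error_rate=0.002, saturation=0.93
)
print(breached(sample))  # ['saturation']
```

The same list of breached signals is a natural starting point for both an SLI definition and the conditions an Operator watches.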

You might notice that when servers are overloaded, your customers are unhappy with the application's availability.

You can set a custom resource to monitor the available disk space. At 5% remaining capacity, the Operator will spin up new server instances, giving your customers better service. Your SLI will be based on availability and will monitor disk space. Your SLO might dictate that you need to achieve 99.9% availability to keep your customers happy, which informs the Operator's intervention points.
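
The arithmetic behind this example can be sketched as follows. The 99.9% target and the 5% disk threshold come from the text above; measuring availability as the share of successful requests is an assumption (other definitions are possible), and the function names and sample numbers are hypothetical.

```python
SLO_TARGET = 0.999          # 99.9% availability, from the example above
DISK_FREE_THRESHOLD = 0.05  # intervene at 5% remaining capacity


def availability(successful, total):
    """SLI (assumed definition): fraction of requests served successfully."""
    return successful / total if total else 1.0


def error_budget_remaining(successful, total, target=SLO_TARGET):
    """Failures still allowed before the SLO is missed (negative if missed)."""
    allowed_failures = total * (1 - target)
    actual_failures = total - successful
    return allowed_failures - actual_failures


def should_scale_out(disk_free, disk_total):
    """True when free disk space is at or below the intervention point."""
    return disk_free / disk_total <= DISK_FREE_THRESHOLD


# One million requests with 400 failures.
print(availability(999_600, 1_000_000) >= SLO_TARGET)     # True
print(round(error_budget_remaining(999_600, 1_000_000)))  # 600
print(should_scale_out(disk_free=4, disk_total=100))      # True
```

With a 99.9% target, one million requests allow about 1,000 failures, so 400 failures leave roughly 600 in the error budget while the low disk space already calls for new instances.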

Automating SRE application deployment

Your SRE practice will involve applications being deployed on a regular basis for each new instance of a service. For example, you may want to deploy a monitoring application every time you implement a new area of system architecture. Kubernetes Operators can expedite and automate this process. For monitoring, the Prometheus Operator, originally developed by CoreOS, was one of the first Operators. It automatically deploys and manages instances of the open-source monitoring software Prometheus on the targeted clusters.

SRE tools represent an investment in reliability. The time spent implementing them is paid for by the time they save. Creating Operators is a similar investment. By creating Operators, you save time on each deployment. Furthermore, deployments are consistent and reliable. Your SRE practices have less overhead and can scale with your organization.

Operators and incident management

Operators can be set up to make adjustments to handle failure. If the application's custom resource varies from the desired result, the Operator will make changes to compensate until the desired state is achieved. The cause of the variation is irrelevant to the Operator. It only operates based on the current and desired states. You will still need to work through an incident retrospective to bubble up contributing factors.

When developing your incident response plan, the behavior of your Operators can be a valuable resource. If you know that the Operator will automatically try to correct the behavior, you can incorporate that into your expectations and procedures. For example, if you have an incident response plan for oversaturated servers, your Operator could spin up new server instances or reconfigure load balancing. Your response plan would take this into account, saving you some troubleshooting steps and allowing you to focus on the originating issue. By combining Operators and automated runbooks, you can minimize the amount of manual escalation and resolve many incidents without human intervention. Because automation is a core goal of SRE, this is another way that Kubernetes Operators fit into your reliability strategy.
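
One way to picture a response plan that accounts for the Operator is a handler that lets automated remediation run first and escalates to a human only if it fails. The Python sketch below simulates this for oversaturated servers. The `Pool` class, the 0.8 saturation limit, and the three-attempt cap are assumptions for illustration.

```python
def handle_saturation(read_saturation, remediate, limit=0.8, max_attempts=3):
    """Try automated remediation first; escalate only if it does not help.

    Returns "healthy", "remediated", or "escalate".
    """
    if read_saturation() <= limit:
        return "healthy"
    for _ in range(max_attempts):
        remediate()  # for example, add a server instance
        if read_saturation() <= limit:
            return "remediated"
    return "escalate"  # hand off to a human responder


class Pool:
    """Simulated server pool (hypothetical) with a fixed load."""

    def __init__(self, load, servers):
        self.load = load
        self.servers = servers

    def saturation(self):
        return self.load / self.servers

    def add_server(self):
        self.servers += 1


pool = Pool(load=9.0, servers=10)
print(handle_saturation(pool.saturation, pool.add_server))  # remediated
print(pool.servers)  # 12
```

The handler never asks why the pool became saturated, which mirrors the point above: the cause still has to be examined in the retrospective.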

As you shift your services to a container-based model and Kubernetes becomes more fundamental to your DevOps practices, it's important to incorporate Operators into your reliability strategy. Operators allow you to extend Kubernetes with custom resources and responses, allowing for more automation and less toil.

Original Link: https://dev.to/blameless/what-is-a-kubernetes-operator-and-why-it-matters-for-sre-2lkg
