Introduction

When working with an application built on Kubernetes, developers will often need to schedule additional pods to handle times of peak traffic or increased load processing. By default, scheduling these additional pods is a manual step; the developer must change the number of desired replicas in the deployment object to account for the increased traffic, then change it back when the additional pods are no longer needed. This dependency on manual intervention can be less than ideal in many scenarios. For example, your workload could hit peak hours in the middle of the night when no one is awake to scale the pods, or your website could get an unexpected increase in traffic when a manual response would not be quick enough to deal with the load. In these situations, the most efficient and least error-prone approach is to automate your cluster’s scaling with the Horizontal Pod Autoscaler (HPA).
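
For reference, the manual approach described above usually comes down to a single imperative command that you run and later undo by hand. A minimal sketch, assuming a deployment named web:

  • kubectl scale deployment web --replicas=5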

By using information from the Metrics Server, the HPA will detect increased resource usage and respond by scaling your workload for you. This is especially useful with microservice architectures, and will give your Kubernetes cluster the ability to scale your deployment based on metrics such as CPU utilization. When combined with DigitalOcean Kubernetes (DOKS), a managed Kubernetes offering that provides developers with a platform for deploying containerized applications, using HPA can create an automated infrastructure that quickly adjusts to changes in traffic and load.

Note: When considering whether to use autoscaling for your workload, keep in mind that autoscaling works best for stateless applications, especially ones that are capable of having multiple instances of the application running and accepting traffic in parallel. This parallelism is important because the main objective of autoscaling is to dynamically distribute an application’s workload across multiple instances in your Kubernetes cluster to ensure your application has the resources necessary to service the traffic in a timely and stable manner without overloading any single instance.

An example of a workload that does not exhibit this parallelism is database autoscaling. Setting up autoscaling for a database would be vastly more complex, as you would need to account for race conditions, issues with data integrity, data synchronization, and constant additions and removals of database cluster members. For reasons like these, we do not recommend using this tutorial’s autoscaling strategy for databases.

In this tutorial, you will set up a sample Nginx deployment on DOKS that can autoscale horizontally to account for increased CPU load. You will accomplish this by deploying Metrics Server into your cluster to gather pod metrics for HPA to use when determining when to scale.

Prerequisites

Before you begin this guide you’ll need the following:

  • A DigitalOcean Kubernetes cluster with your connection configured as the kubectl default. Instructions on how to configure kubectl are shown under the Connect to your Cluster step when you create your cluster. To create a Kubernetes cluster on DigitalOcean, see Kubernetes Quickstart.

  • The Helm package manager installed on your local machine, and Tiller installed on your cluster. To do this, complete Steps 1 and 2 of the How To Install Software on Kubernetes Clusters with the Helm Package Manager tutorial.

Step 1 — Creating a Test Deployment

In order to show the effect of the HPA, you will first deploy an application that you will use to autoscale. This tutorial uses a standard Nginx Docker image as a deployment because it is fully capable of operating in parallel, is widely used within Kubernetes with such tools as the Nginx Ingress Controller, and is lightweight to set up. This Nginx deployment will serve a static Welcome to Nginx! page that comes standard in the base image. If you already have a deployment you would like to scale, feel free to use that deployment and skip this step.

Create the sample deployment using the Nginx base image by issuing the following command. You can replace the name web if you would like to give your deployment a different name:

  • kubectl create deployment web --image=nginx:latest

The --image=nginx:latest flag will create the deployment from the latest version of the Nginx base image.
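
If you prefer to keep the deployment as a declarative manifest rather than creating it imperatively, a roughly equivalent sketch (using the same web name and the app: web label that kubectl create deployment generates) could be applied directly from stdin:

kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx:latest
EOF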

After a few seconds, your web pod will spin up. To see this pod, run the following command, which will show you the pods running in the current namespace:

  • kubectl get pods

This will give you output similar to the following:

Output
NAME                   READY   STATUS    RESTARTS   AGE
web-84d7787df5-btf9h   1/1     Running   0          11s

Take note that there is only one pod originally deployed. Once autoscaling triggers, more pods will spin up automatically.

You now have a basic deployment up and running within the cluster. This is the deployment you are going to configure for autoscaling. Your next step is to configure this deployment to define its resource requests and limits.

Step 2 — Setting CPU Limits and Requests on Your Deployment

In this step, you are going to set requests and limits on CPU usage for your deployment. Limits in Kubernetes are set on the deployment to describe the maximum amount of the resource (either CPU or Memory) that the pod can use. Requests are set on the deployment to describe how much of that resource is needed on a node in order for that node to be considered as a valid node for scheduling. For example, if your webserver had a memory request set at 1GB, only nodes with at least 1GB of free memory would be considered for scheduling. For autoscaling, it is necessary to set these limits and requests because the HPA will need to have this information when making scaling and scheduling decisions.

To set the requests and limits, you will need to make changes to the deployment you just created. This tutorial will use the following kubectl edit command to modify the API object configuration stored in the cluster. The kubectl edit command will open the editor defined by your KUBE_EDITOR or EDITOR environment variables, or fall back to vi for Linux or notepad for Windows by default.
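
For example, to use nano for just this one edit (nano here is only an illustration; any editor you have installed works), you could set the variable inline:

  • KUBE_EDITOR="nano" kubectl edit deployment web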

Edit your deployment:

  • kubectl edit deployment web

You will see the configuration for the deployment. You can now set the resource limits and requests for your deployment’s CPU usage. These limits set the baseline for how much of each resource a pod of this deployment can use individually. Setting this will give the HPA a frame of reference to know whether a pod is being overworked. For example, if you expect your pod to have an upper limit of 100 millicores of CPU and the pod is currently using 95 millicores, HPA will know that it is at 95% capacity. Without providing that limit of 100 millicores, the HPA can’t decipher the pod’s full capacity.

We can set the limits and requests in the resources section:

Deployment Configuration File
. . .
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: web
    spec:
      containers:
      - image: nginx:latest
        imagePullPolicy: Always
        name: nginx
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 1
. . .

For this tutorial, you will be setting requests for CPU to 100m and memory to 250Mi. These values are meant for demonstration purposes; every workload is different, so these values may not make sense for other workloads. As a general rule, these values should be set to the maximum that a pod of this workload should be expected to use. Monitoring the application and gathering resource usage data on how it performs during low and peak times is recommended to help determine these values. These values can also be tweaked and changed at any time, so you can always come back and optimize your deployment later.

Go ahead and insert the following highlighted lines under the resources section of your Nginx container:

Deployment Configuration File
. . .
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: web
    spec:
      containers:
      - image: nginx:latest
        imagePullPolicy: Always
        name: nginx
        resources:
          limits:
            cpu: 300m
          requests:
            cpu: 100m
            memory: 250Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 1
. . .

Once you’ve inserted these lines, save and quit the file. If there is an issue with the syntax, kubectl will reopen the file for you with an error posted for more information.
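
As an alternative to editing the object interactively, the same requests and limits could be applied with a single non-interactive command; a sketch using the values from this tutorial:

  • kubectl set resources deployment web --limits=cpu=300m --requests=cpu=100m,memory=250Mi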

Now that you have your limits and requests set, you need to ensure that your metrics are being gathered so that the HPA can monitor and correctly adhere to these limits. In order to do this, you will set up a service to gather the CPU metrics. For this tutorial, you will use the Metrics Server project for collecting these metrics, which you will install with a Helm chart.

Step 3 — Installing Metrics Server

Next, you will install the Kubernetes Metrics Server. This is the component that scrapes pod metrics, gathering the data that the HPA will use to decide whether autoscaling is necessary.

To install the Metrics Server using Helm, run the following command:

  • helm install stable/metrics-server --name metrics-server

This will install the latest stable version of Metrics Server. The --name flag names this release metrics-server.
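
If you want to confirm that the release was created before moving on, you can check it with:

  • helm status metrics-server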

Once you wait for this pod to initialize, try to use the kubectl top pod command to display your pod’s metrics:

  • kubectl top pod

This command is meant to give a pod-level view of resource usage in your cluster, but because of the way that DOKS handles DNS, this command will return an error at this point:

Output
Error: Metrics not available for pod
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)

This error occurs because DOKS nodes do not create a DNS record for themselves, and since Metrics Server contacts nodes through their hostnames, the hostnames do not resolve properly. To fix this problem, change how the Metrics Server communicates with nodes by adding runtime flags to the Metrics Server container using the following command:
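
If you want to see the internal IPs that the Metrics Server will use instead, they appear in the INTERNAL-IP column of the node overview:

  • kubectl get nodes -o wide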

  • kubectl edit deployment metrics-server

You will be adding a flag under the command section.

metrics-server Configuration File
. . .
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: metrics-server
        release: metrics-server
    spec:
      affinity: {}
      containers:
      - command:
        - /metrics-server
        - --cert-dir=/tmp
        - --logtostderr
        - --secure-port=8443
        image: gcr.io/google_containers/metrics-server-amd64:v0.3.4
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /healthz
. . .

The flag you are adding is --kubelet-preferred-address-types=InternalIP. This flag tells the metrics server to contact nodes using their internalIP as opposed to their hostname. You can use this flag as a workaround to communicate with the nodes via internal IP addresses.

Also, add the --metric-resolution flag to change the default rate at which the Metrics Server scrapes metrics. For this tutorial, we will set Metrics Server to collect data points every 60s, but if you would like more metrics data, you could ask for the Metrics Server to scrape metrics every 10s or 20s. This will give you more data points of resource usage per period of time. Feel free to fine-tune this resolution to meet your needs.

Add the following highlighted lines to the file:

metrics-server Configuration File
. . .
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: metrics-server
        release: metrics-server
    spec:
      affinity: {}
      containers:
      - command:
        - /metrics-server
        - --cert-dir=/tmp
        - --logtostderr
        - --secure-port=8443
        - --metric-resolution=60s
        - --kubelet-preferred-address-types=InternalIP
        image: gcr.io/google_containers/metrics-server-amd64:v0.3.4
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /healthz
. . .

After the flag is added, save and exit your editor.
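
As a side note, these flags could also have been supplied at install time instead of edited in afterwards. This is only a sketch, assuming the chart exposes an args list in its values (as the stable/metrics-server chart did) and using a hypothetical metrics-server-values.yaml file:

cat > metrics-server-values.yaml <<'EOF'
args:
  - --kubelet-preferred-address-types=InternalIP
  - --metric-resolution=60s
EOF
helm install stable/metrics-server --name metrics-server -f metrics-server-values.yaml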

To verify your Metrics Server is running, use kubectl top pod after a few minutes. As before, this command will give us resource usage on a pod level. This time, a working Metrics Server will allow you to see metrics on each pod:

  • kubectl top pod

This will give the following output, with your Metrics Server pod running:

Output
NAME                             CPU(cores)   MEMORY(bytes)
metrics-server-db745fcd5-v8gv6   3m           12Mi
web-555db5bf6b-f7btr             0m           2Mi

You now have a functional Metrics Server and are able to view and monitor resource usage of pods within your cluster. Next, you are going to configure the HPA to monitor this data and react to periods of high CPU usage.

Step 4 — Creating and Validating the Horizontal Pod Autoscaler

Lastly, it’s time to create the Horizontal Pod Autoscaler (HPA) for your deployment. The HPA is the actual Kubernetes object that routinely checks the CPU usage data collected from your Metrics Server and scales your deployment based on the thresholds you set in Step 2.

Create the HPA using the kubectl autoscale command:

  • kubectl autoscale deployment web --max=4 --cpu-percent=80

This command creates the HPA for your web deployment. It also uses the --max flag to set the max replicas that web can be scaled to, which in this case you set as 4.

The --cpu-percent flag tells the HPA at what percentage of the CPU request you set in Step 2 you want the autoscale to trigger. This also uses the requests to help schedule the scaled-up pods to a node that can accommodate the initial resource allocation. In this example, since the CPU request you set on your deployment in Step 2 was 100 millicores (100m), this command would trigger an autoscale once the pods hit 80m in average CPU usage. This would allow the deployment to autoscale prior to maxing out its CPU resources.
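
If you would rather manage the autoscaler declaratively, the command above corresponds roughly to the following autoscaling/v1 manifest (a sketch; kubectl autoscale leaves minReplicas at the default of 1 when --min is not given):

kubectl apply -f - <<'EOF'
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 1
  maxReplicas: 4
  targetCPUUtilizationPercentage: 80
EOF

Either way, you can inspect the autoscaler's current state at any time with kubectl get hpa web.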

Now that your deployment can automatically scale, it’s time to put this to the test.

To validate, you are going to generate a load that will put your cluster over your threshold and then watch the autoscaler take over. To start, open up a second terminal to watch the currently scheduled pods and refresh the list of pods every 2 seconds. To accomplish this, use the watch command in this second terminal:

  • watch "kubectl top pods"

The watch command issues the command given as its arguments continuously, displaying the output in your terminal. The duration between repetitions can be further configured with the -n flag. For the purposes of this tutorial, the default two seconds setting will suffice.
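
For example, to refresh every five seconds instead of the default two, you could run:

  • watch -n 5 "kubectl top pods"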

The terminal will now display the output of kubectl top pods initially and then every 2 seconds it will refresh the output that that command generates, which will look similar to this:

Output
Every 2.0s: kubectl top pods

NAME                              CPU(cores)   MEMORY(bytes)
metrics-server-6fd5457684-7kqtz   3m           15Mi
web-7476bb659d-q5bjv              0m           2Mi

Take note of the number of pods currently deployed for web.

Switch back to your original terminal. You will now open a terminal inside your current web pod using kubectl exec and create an artificial load. You can accomplish this by going into the pod and installing the stress CLI tool.

Enter your pod using kubectl exec, replacing the highlighted pod name with the name of your web pod:

  • kubectl exec -it web-f765fd676-s9729 /bin/bash

This command is very similar in concept to using ssh to log in to another machine. /bin/bash establishes a bash shell in your pod.

Next, from the bash shell inside your pod, update the repository metadata and install the stress package.

  • apt update; apt-get install -y stress

Note: For CentOS-based containers, this would be:

  • yum install -y stress

Next, generate some CPU load on your pod using the stress command and let it run:

  • stress -c 3

Now, go back to your watch command in the second terminal. Wait a few minutes for the Metrics Server to gather CPU data that is above the HPA’s defined threshold. Note that metrics are gathered at the rate you set with the --metric-resolution flag when configuring the Metrics Server, so it may take a minute or so for the usage metrics to update.

After about two minutes, you will see additional web pods spin up:

Output
Every 2.0s: kubectl top pods

NAME                              CPU(cores)   MEMORY(bytes)
metrics-server-db745fcd5-v8gv6    6m           16Mi
web-555db5bf6b-ck98q              0m           2Mi
web-555db5bf6b-f7btr              494m         21Mi
web-555db5bf6b-h5cbx              0m           1Mi
web-555db5bf6b-pvh9f              0m           2Mi

You can now see that the HPA scheduled new pods based off the CPU load gathered by Metrics Server. When you are satisfied with this validation, use CTRL+C to stop the stress command in your first terminal, then exit your pod’s bash shell.
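
If you also want to see how the autoscaler itself recorded this event, its current target utilization, replica count, and recent scaling decisions can be inspected with:

  • kubectl describe hpa web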

Conclusion

In this article you created a deployment that will autoscale based on CPU load. You added CPU resource limits and requests to your deployment, installed and configured Metrics Server in your cluster through the use of Helm, and created an HPA to make scaling decisions.

This was a demonstration deployment of both Metrics Server and HPA. Now you can tweak the configuration to fit your particular use cases. Be sure to poke around the Kubernetes HPA docs for help and info on requests and limitations. Also, check out the Metrics Server project to see all the tunable settings that may apply to your use case.

If you would like to do more with Kubernetes, head over to our Kubernetes Community page or explore our Managed Kubernetes service.

Original article: https://www.digitalocean.com/community/tutorials/how-to-autoscale-your-workloads-on-digitalocean-kubernetes
