minikube v1.20.0版本的一个bug

img{512x368}

本文永久链接 – https://tonybai.com/2021/05/14/a-bug-of-minikube-1-20

近期在研究dapr(分布式应用运行时),这是一个很朴素却很棒的想法,目前大厂,如阿里鹅厂都有大牛在研究该项目,甚至是利用dapr落地了部分应用。关于dapr,后续我也会用单独的文章详细说说。

dapr不仅支持k8s部署,还支持本地部署,并可以对接多个世界知名的公有云厂商的服务,比如:aws、azure、阿里云等。为了体验dapr对云原生应用的支持,我选择了将其部署于k8s中,同时我选择使用minikube来构建本地k8s开发环境。而本文要说的就是将dapr安装到minikube时遇到的问题。

1. 安装minikube

Kubernetes在4月份发布了最新的1.21版本,但目前minikube的最新版依然为1.20版本

minikube是k8s项目自己维护的一个k8s本地开发环境项目,它与k8s的api接口兼容,我们可以快速搭建一个minikube来进行k8s学习和实践。minikube官网上有关于它的安装、使用和维护的详尽资料。

我这里在一个ubuntu 18.04的腾讯云主机上(1 vcpu, 2g mem)上安装minikube v1.20,minikube是一个单体二进制文件,我们先将这个文件下载到本地:

# curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 60.9M  100 60.9M    0     0  7764k      0  0:00:08  0:00:08 --:--:-- 11.5M
# install minikube-linux-amd64 /usr/local/bin/minikube

验证是否下载ok:

# minikube version
minikube version: v1.20.0
commit: c61663e942ec43b20e8e70839dcca52e44cd85ae

接下来我们就利用minikube启动一个k8s cluster用作本地开发环境。由于minikube默认的最低安装要求为2核cpu,而我的虚机仅为1核,我们需要为minikube传递一些命令行参数以让其在单核CPU上也能顺利地启动一个k8s cluster。另外minikube会从gcr.io这个国内被限制访问的站点下载一些控制平面的容器镜像,为了能让此过程顺利进行下去,我们还需要告诉minikube从哪个gcr.io的mirror站点下载容器镜像:

# minikube start --extra-config=kubeadm.ignore-preflight-errors=NumCPU --force --cpus 1 --memory=1024mb --image-mirror-country='cn'
  minikube v1.20.0 on Ubuntu 18.04 (amd64)
  minikube skips various validations when --force is supplied; this may lead to unexpected behavior
  Automatically selected the docker driver. Other choices: ssh, none
  Requested cpu count 1 is less than the minimum allowed of 2
   has less than 2 CPUs available, but Kubernetes requires at least 2 to be available

  Your cgroup does not allow setting memory.
    ▪ More information: https://docs.docker.com/engine/install/linux-postinstall/#your-kernel-does-not-support-cgroup-swap-limit-capabilities

  Requested memory allocation 1024MiB is less than the usable minimum of 1800MB
  Requested memory allocation (1024MB) is less than the recommended minimum 1900MB. Deployments may fail.

  The requested memory allocation of 1024MiB does not leave room for system overhead (total system memory: 1833MiB). You may face stability issues.
  Suggestion: Start minikube with less memory allocated: 'minikube start --memory=1833mb'

  The "docker" driver should not be used with root privileges.
  If you are running minikube within a VM, consider using --driver=none:

https://minikube.sigs.k8s.io/docs/reference/drivers/none/

  Using image repository registry.cn-hangzhou.aliyuncs.com/google_containers
  Starting control plane node minikube in cluster minikube
  Pulling base image ...
    > registry.cn-hangzhou.aliyun...: 20.48 MiB / 358.10 MiB  5.72% 2.89 MiB p/
> registry.cn-hangzhou.aliyun...: 358.10 MiB / 358.10 MiB  100.00% 3.50 MiB
    > registry.cn-hangzhou.aliyun...: 358.10 MiB / 358.10 MiB  100.00% 3.50 MiB
    > registry.cn-hangzhou.aliyun...: 358.10 MiB / 358.10 MiB  100.00% 3.50 MiB
    > registry.cn-hangzhou.aliyun...: 358.10 MiB / 358.10 MiB  100.00% 6.83 MiB
  Creating docker container (CPUs=1, Memory=1024MB) ...
  Preparing Kubernetes v1.20.2 on Docker 20.10.6 ...
    ▪ kubeadm.ignore-preflight-errors=NumCPU
    ▪ Generating certificates and keys ...
    ▪ Booting up control plane ...
    ▪ Configuring RBAC rules ...
  Verifying Kubernetes components...
    ▪ Using image registry.cn-hangzhou.aliyuncs.com/google_containers/k8s-minikube/storage-provisioner:v5 (global image repository)
  Enabled addons: default-storageclass, storage-provisioner

  /usr/local/bin/kubectl is version 1.17.9, which may have incompatibilites with Kubernetes 1.20.2.
    ▪ Want kubectl v1.20.2? Try 'minikube kubectl -- get pods -A'
  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default

查看启动的k8s集群状态:

# minikube status
minikube
type: Control Plane
host: Running
kubelet: Running
apiserver: Running
kubeconfig: Configured

我们看到minikube似乎成功启动了一个k8s cluster。

2. pod storage-provisioner处于ErrImagePull状态

在后续使用helm安装redis作为state store组件(components)时,发现安装后的redis处于下面的状态:

# kubectl get pod
NAME               READY   STATUS    RESTARTS   AGE
redis-master-0     0/1     Pending   0          7m48s
redis-replicas-0   0/1     Pending   0          7m48s

通过kubectl describe命令详细查看redis-master-0这个pod:

# kubectl describe pod redis-master-0
Name:           redis-master-0
Namespace:      default
Priority:       0
Node:           <none>
Labels:         app.kubernetes.io/component=master
                app.kubernetes.io/instance=redis
                app.kubernetes.io/managed-by=Helm
                app.kubernetes.io/name=redis
                controller-revision-hash=redis-master-694655df77
                helm.sh/chart=redis-14.1.1
                statefulset.kubernetes.io/pod-name=redis-master-0
Annotations:    checksum/configmap: 0898a3adcb5d0cdd6cc60108d941d105cc240250ba6c7f84ed8b5337f1edd470
                checksum/health: 1b44d34c6c39698be89b2127b9fcec4395a221cff84aeab4fbd93ff4a636c210
                checksum/scripts: 465f195e1bffa9700282b017abc50056099e107d7ce8927fb2b97eb348907484
                checksum/secret: cd7ff82a84f998f50b11463c299c1200585036defc7cbbd9c141cc992ad80963
Status:         Pending
IP:
IPs:            <none>
Controlled By:  StatefulSet/redis-master
Containers:
  redis:
    Image:      docker.io/bitnami/redis:6.2.3-debian-10-r0
    Port:       6379/TCP
    Host Port:  0/TCP
    Command:
      /bin/bash
    Args:
      -c
      /opt/bitnami/scripts/start-scripts/start-master.sh
    Liveness:   exec [sh -c /health/ping_liveness_local.sh 5] delay=5s timeout=6s period=5s #success=1 #failure=5
    Readiness:  exec [sh -c /health/ping_readiness_local.sh 1] delay=5s timeout=2s period=5s #success=1 #failure=5
    Environment:
      BITNAMI_DEBUG:           false
      REDIS_REPLICATION_MODE:  master
      ALLOW_EMPTY_PASSWORD:    no
      REDIS_PASSWORD:          <set to the key 'redis-password' in secret 'redis'>  Optional: false
      REDIS_TLS_ENABLED:       no
      REDIS_PORT:              6379
    Mounts:
      /data from redis-data (rw)
      /health from health (rw)
      /opt/bitnami/redis/etc/ from redis-tmp-conf (rw)
      /opt/bitnami/redis/mounted-etc from config (rw)
      /opt/bitnami/scripts/start-scripts from start-scripts (rw)
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from redis-token-rtxk2 (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  redis-data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  redis-data-redis-master-0
    ReadOnly:   false
  start-scripts:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      redis-scripts
    Optional:  false
  health:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      redis-health
    Optional:  false
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      redis-configuration
    Optional:  false
  redis-tmp-conf:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  tmp:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  redis-token-rtxk2:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  redis-token-rtxk2
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  18s (x6 over 5m7s)  default-scheduler  0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.

我们发现是该pod的PersistentVolumeClaims没有得到满足,没有绑定到适当PV(persistent volume)上。查看pvc的状态,也都是pending:

# kubectl get pvc
NAME                          STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
redis-data-redis-master-0     Pending                                      standard       35m
redis-data-redis-replicas-0   Pending                                      standard       35m

详细查看其中一个pvc的状态:

# kubectl describe  pvc redis-data-redis-master-0
Name:          redis-data-redis-master-0
Namespace:     default
StorageClass:  standard
Status:        Pending
Volume:
Labels:        app.kubernetes.io/component=master
               app.kubernetes.io/instance=redis
               app.kubernetes.io/name=redis
Annotations:   volume.beta.kubernetes.io/storage-provisioner: k8s.io/minikube-hostpath
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Mounted By:    redis-master-0
Events:
  Type    Reason                Age                  From                         Message
  ----    ------                ----                 ----                         -------
  Normal  ExternalProvisioning  55s (x143 over 35m)  persistentvolume-controller  waiting for a volume to be created, either by external provisioner "k8s.io/minikube-hostpath" or manually created by system administrator

我们看到该pvc在等待绑定一个volume,而k8s cluster当前在default命名空间中没有任何pv资源。问题究竟出在哪里?

我们回到minikube自身上来,在minikube文档中,负责自动创建HostPath类型pv的是storage-provisioner插件:

img{512x368}

图:minikube插件使能情况

我们看到storage-provisioner插件的状态为enabled,那么为什么该插件没能为redis提供需要的pv资源呢?我顺便查看了一下当前k8s cluster的控制平面组件的运行情况:

# kubectl get po -n kube-system
NAMESPACE     NAME                                    READY   STATUS             RESTARTS   AGE
kube-system   coredns-54d67798b7-n6vw4                1/1     Running            0          20h
kube-system   etcd-minikube                           1/1     Running            0          20h
kube-system   kube-apiserver-minikube                 1/1     Running            0          20h
kube-system   kube-controller-manager-minikube        1/1     Running            0          20h
kube-system   kube-proxy-rtvvj                        1/1     Running            0          20h
kube-system   kube-scheduler-minikube                 1/1     Running            0          20h
kube-system   storage-provisioner                     0/1     ImagePullBackOff   0          20h

我们惊奇的发现:storage-provisioner这个pod居然处于ImagePullBackOff状态,即下载镜像有误!

3. 发现真相

还记得在minikube start命令的输出信息的末尾,我们看到这样一行内容:

Using image registry.cn-hangzhou.aliyuncs.com/google_containers/k8s-minikube/storage-provisioner:v5 (global image repository)

也就是说我们从registry.cn-hangzhou.aliyuncs.com下载storage-provisioner:v5有错误!我手动在本地执行了一下下面命令:

# docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/k8s-minikube/storage-provisioner:v5

Error response from daemon: pull access denied for registry.cn-hangzhou.aliyuncs.com/google_containers/k8s-minikube/storage-provisioner, repository does not exist or may require 'docker login': denied: requested access to the resource is denied

居然真的无法下载成功!

究竟是什么地方出现问题了呢?从提示来看,要么是该镜像不存在,要么是docker login被拒绝,由于registry.cn-hangzhou.aliyuncs.com是公共仓库,因此不存在docker login的问题,那么就剩下一个原因了:镜像不存在!

于是我在minikube官方的issue试着搜索了一下有关registry.cn-hangzhou.aliyuncs.com作为mirror的问题,还真让我捕捉到了蛛丝马迹。

在https://github.com/kubernetes/minikube/pull/10770这PR中,有人提及当–image-mirror-country使用cn时,minikube使用了错误的storage-provisioner镜像,镜像的地址不应该是registry.cn-hangzhou.aliyuncs.com/google_containers/k8s-minikube/storage-provisioner:v5,而应该是registry.cn-hangzhou.aliyuncs.com/google_containers/storage-provisioner:v5。

我在本地试了一下registry.cn-hangzhou.aliyuncs.com/google_containers/k8s-minikube/storage-provisioner:v5,的确可以下载成功:

# docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/storage-provisioner:v5
v5: Pulling from google_containers/storage-provisioner
Digest: sha256:18eb69d1418e854ad5a19e399310e52808a8321e4c441c1dddad8977a0d7a944
Status: Image is up to date for registry.cn-hangzhou.aliyuncs.com/google_containers/storage-provisioner:v5
registry.cn-hangzhou.aliyuncs.com/google_containers/storage-provisioner:v5

4. 解决问题

发现问题真相:当–image-mirror-country使用cn时,minikube使用了错误的storage-provisioner镜像。那我们如何修正这个问题呢?

我们查看一下storage-provisioner pod的imagePullPolicy:

# kubectl get pod storage-provisioner  -n kube-system -o yaml
... ...
spec:
  containers:
  - command:
    - /storage-provisioner
    image: registry.cn-hangzhou.aliyuncs.com/google_containers/k8s-minikube/storage-provisioner:v5
    imagePullPolicy: IfNotPresent
    name: storage-provisioner

我们发现storage-provisioner的imagePullPolicy为ifNotPresent,这意味着如果本地有storage-provisioner:v5这个镜像的话,minikube不会再去远端下载该image。这样我们可以先将storage-provisioner:v5下载到本地并重新tag为registry.cn-hangzhou.aliyuncs.com/google_containers/k8s-minikube/storage-provisioner:v5。

下面我们就来操作一下:

# docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/storage-provisioner:v5 registry.cn-hangzhou.aliyuncs.com/google_containers/k8s-minikube/storage-provisioner:v5

一旦有了image,通过minikube addons子命令重新enable对应pod,可以重启storage-provisioner pod,让其进入正常状态:

# minikube addons enable storage-provisioner

    ▪ Using image registry.cn-hangzhou.aliyuncs.com/google_containers/k8s-minikube/storage-provisioner:v5 (global image repository)
  The 'storage-provisioner' addon is enabled

# kubectl get po -n kube-system
NAME                               READY   STATUS    RESTARTS   AGE
coredns-54d67798b7-n6vw4           1/1     Running   0          25h
etcd-minikube                      1/1     Running   0          25h
kube-apiserver-minikube            1/1     Running   0          25h
kube-controller-manager-minikube   1/1     Running   0          25h
kube-proxy-rtvvj                   1/1     Running   0          25h
kube-scheduler-minikube            1/1     Running   0          25h
storage-provisioner                1/1     Running   0          69m

当storgae-provisioner恢复正常后,之前安装的dapr state component组件redis也自动恢复正常了:

# kubectl get pod
NAME               READY   STATUS    RESTARTS   AGE
redis-master-0     1/1     Running   0          18h
redis-replicas-0   1/1     Running   1          18h
redis-replicas-1   1/1     Running   0          16h
redis-replicas-2   1/1     Running   0          16h

“Gopher部落”知识星球正式转正(从试运营星球变成了正式星球)!“gopher部落”旨在打造一个精品Go学习和进阶社群!高品质首发Go技术文章,“三天”首发阅读权,每年两期Go语言发展现状分析,每天提前1小时阅读到新鲜的Gopher日报,网课、技术专栏、图书内容前瞻,六小时内必答保证等满足你关于Go语言生态的所有需求!部落目前虽小,但持续力很强。在2021年上半年,部落将策划两个专题系列分享,并且是部落独享哦:

  • Go技术书籍的书摘和读书体会系列
  • Go与eBPF系列

欢迎大家加入!

Go技术专栏“改善Go语⾔编程质量的50个有效实践”正在慕课网火热热销中!本专栏主要满足广大gopher关于Go语言进阶的需求,围绕如何写出地道且高质量Go代码给出50条有效实践建议,上线后收到一致好评!欢迎大家订
阅!

img{512x368}

我的网课“Kubernetes实战:高可用集群搭建、配置、运维与应用”在慕课网热卖中,欢迎小伙伴们订阅学习!

img{512x368}

我爱发短信:企业级短信平台定制开发专家 https://tonybai.com/。smspush : 可部署在企业内部的定制化短信平台,三网覆盖,不惧大并发接入,可定制扩展; 短信内容你来定,不再受约束, 接口丰富,支持长短信,签名可选。2020年4月8日,中国三大电信运营商联合发布《5G消息白皮书》,51短信平台也会全新升级到“51商用消息平台”,全面支持5G RCS消息。

著名云主机服务厂商DigitalOcean发布最新的主机计划,入门级Droplet配置升级为:1 core CPU、1G内存、25G高速SSD,价格5$/月。有使用DigitalOcean需求的朋友,可以打开这个链接地址:https://m.do.co/c/bff6eed92687 开启你的DO主机之路。

Gopher Daily(Gopher每日新闻)归档仓库 – https://github.com/bigwhite/gopherdaily

我的联系方式:

  • 微博:https://weibo.com/bigwhite20xx
  • 微信公众号:iamtonybai
  • 博客:tonybai.com
  • github: https://github.com/bigwhite
  • “Gopher部落”知识星球:https://public.zsxq.com/groups/51284458844544

微信赞赏:
img{512x368}

商务合作方式:撰稿、出书、培训、在线课程、合伙创业、咨询、广告合作。

Go标准库http与fasthttp服务端性能比较

本文永久链接 – https://tonybai.com/2021/04/25/server-side-performance-nethttp-vs-fasthttp

1. 背景

Go初学者学习Go时,在编写了经典的“hello, world”程序之后,可能会迫不及待的体验一下Go强大的标准库,比如:用几行代码写一个像下面示例这样拥有完整功能的web server:

// 来自https://tip.golang.org/pkg/net/http/#example_ListenAndServe
package main

import (
    "io"
    "log"
    "net/http"
)

func main() {
    helloHandler := func(w http.ResponseWriter, req *http.Request) {
        io.WriteString(w, "Hello, world!\n")
    }
    http.HandleFunc("/hello", helloHandler)
    log.Fatal(http.ListenAndServe(":8080", nil))
}

go net/http包是一个比较均衡的通用实现,能满足大多数gopher 90%以上场景的需要,并且具有如下优点:

  • 标准库包,无需引入任何第三方依赖;
  • 对http规范的满足度较好;
  • 无需做任何优化,即可获得相对较高的性能;
  • 支持HTTP代理;
  • 支持HTTPS;
  • 无缝支持HTTP/2。

不过也正是因为http包的“均衡”通用实现,在一些对性能要求严格的领域,net/http的性能可能无法胜任,也没有太多的调优空间。这时我们会将眼光转移到其他第三方的http服务端框架实现上。

而在第三方http服务端框架中,一个“行如其名”的框架fasthttp被提及和采纳的较多,fasthttp官网宣称其性能是net/http的十倍(基于go test benchmark的测试结果)。

fasthttp采用了许多性能优化上的最佳实践,尤其是在内存对象的重用上,大量使用sync.Pool以降低对Go GC的压力。

那么在真实环境中,到底fasthttp能比net/http快多少呢?恰好手里有两台性能还不错的服务器可用,在本文中我们就在这个真实环境下看看他们的实际性能。

2. 性能测试

我们分别用net/http和fasthttp实现两个几乎“零业务”的被测程序:

  • nethttp:
// github.com/bigwhite/experiments/blob/master/http-benchmark/nethttp/main.go
package main

import (
    _ "expvar"
    "log"
    "net/http"
    _ "net/http/pprof"
    "runtime"
    "time"
)

func main() {
    go func() {
        for {
            log.Println("当前routine数量:", runtime.NumGoroutine())
            time.Sleep(time.Second)
        }
    }()

    http.Handle("/", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        w.Write([]byte("Hello, Go!"))
    }))

    log.Fatal(http.ListenAndServe(":8080", nil))
}
  • fasthttp:
// github.com/bigwhite/experiments/blob/master/http-benchmark/fasthttp/main.go

package main

import (
    "fmt"
    "log"
    "net/http"
    "runtime"
    "time"

    _ "expvar"

    _ "net/http/pprof"

    "github.com/valyala/fasthttp"
)

type HelloGoHandler struct {
}

func fastHTTPHandler(ctx *fasthttp.RequestCtx) {
    fmt.Fprintln(ctx, "Hello, Go!")
}

func main() {
    go func() {
        http.ListenAndServe(":6060", nil)
    }()

    go func() {
        for {
            log.Println("当前routine数量:", runtime.NumGoroutine())
            time.Sleep(time.Second)
        }
    }()

    s := &fasthttp.Server{
        Handler: fastHTTPHandler,
    }
    s.ListenAndServe(":8081")
}

对被测目标实施压力测试的客户端,我们基于hey这个http压测工具进行,为了方便调整压力水平,我们将hey“包裹”在下面这个shell脚本中(仅适于在linux上运行):

// github.com/bigwhite/experiments/blob/master/http-benchmark/client/http_client_load.sh

# ./http_client_load.sh 3 10000 10 GET http://10.10.195.181:8080
echo "$0 task_num count_per_hey conn_per_hey method url"
task_num=$1
count_per_hey=$2
conn_per_hey=$3
method=$4
url=$5

start=$(date +%s%N)
for((i=1; i<=$task_num; i++)); do {
    tm=$(date +%T.%N)
        echo "$tm: task $i start"
    hey -n $count_per_hey -c $conn_per_hey -m $method $url > hey_$i.log
    tm=$(date +%T.%N)
        echo "$tm: task $i done"
} & done
wait
end=$(date +%s%N)

count=$(( $task_num * $count_per_hey ))
runtime_ns=$(( $end - $start ))
runtime=`echo "scale=2; $runtime_ns / 1000000000" | bc`
echo "runtime: "$runtime
speed=`echo "scale=2; $count / $runtime" | bc`
echo "speed: "$speed

该脚本的执行示例如下:

bash http_client_load.sh 8 1000000 200 GET http://10.10.195.134:8080
http_client_load.sh task_num count_per_hey conn_per_hey method url
16:58:09.146948690: task 1 start
16:58:09.147235080: task 2 start
16:58:09.147290430: task 3 start
16:58:09.147740230: task 4 start
16:58:09.147896010: task 5 start
16:58:09.148314900: task 6 start
16:58:09.148446030: task 7 start
16:58:09.148930840: task 8 start
16:58:45.001080740: task 3 done
16:58:45.241903500: task 8 done
16:58:45.261501940: task 1 done
16:58:50.032383770: task 4 done
16:58:50.985076450: task 7 done
16:58:51.269099430: task 5 done
16:58:52.008164010: task 6 done
16:58:52.166402430: task 2 done
runtime: 43.02
speed: 185960.01

从传入的参数来看,该脚本并行启动了8个task(一个task启动一个hey),每个task向http://10.10.195.134:8080建立200个并发连接,并发送100w http GET请求。

我们使用两台服务器分别放置被测目标程序和压力工具脚本:

  • 目标程序所在服务器:10.10.195.181(物理机,Intel x86-64 CPU,40核,128G内存, CentOs 7.6)
$ cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core) 

$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                40
On-line CPU(s) list:   0-39
Thread(s) per core:    2
Core(s) per socket:    10
座:                 2
NUMA 节点:         2
厂商 ID:           GenuineIntel
CPU 系列:          6
型号:              85
型号名称:        Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz
步进:              4
CPU MHz:             800.000
CPU max MHz:           2201.0000
CPU min MHz:           800.0000
BogoMIPS:            4400.00
虚拟化:           VT-x
L1d 缓存:          32K
L1i 缓存:          32K
L2 缓存:           1024K
L3 缓存:           14080K
NUMA 节点0 CPU:    0-9,20-29
NUMA 节点1 CPU:    10-19,30-39
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 cdp_l3 intel_pt ssbd mba ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke spec_ctrl intel_stibp flush_l1d

  • 压力工具所在服务器:10.10.195.133(物理机,鲲鹏arm64 cpu,96核,80G内存, CentOs 7.9)
# cat /etc/redhat-release
CentOS Linux release 7.9.2009 (AltArch)

# lscpu
Architecture:          aarch64
Byte Order:            Little Endian
CPU(s):                96
On-line CPU(s) list:   0-95
Thread(s) per core:    1
Core(s) per socket:    48
座:                 2
NUMA 节点:         4
型号:              0
CPU max MHz:           2600.0000
CPU min MHz:           200.0000
BogoMIPS:            200.00
L1d 缓存:          64K
L1i 缓存:          64K
L2 缓存:           512K
L3 缓存:           49152K
NUMA 节点0 CPU:    0-23
NUMA 节点1 CPU:    24-47
NUMA 节点2 CPU:    48-71
NUMA 节点3 CPU:    72-95
Flags:                 fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm

我用dstat监控被测目标所在主机资源占用情况(dstat -tcdngym),尤其是cpu负荷;通过expvarmon监控memstats,由于没有业务,内存占用很少;通过go tool pprof查看目标程序中对各类资源消耗情况的排名。

下面是多次测试后制作的一个数据表格:


图:测试数据

3. 对结果的简要分析

受特定场景、测试工具及脚本精确性以及压力测试环境的影响,上面的测试结果有一定局限,但却真实反映了被测目标的性能趋势。我们看到在给予同样压力的情况下,fasthttp并没有10倍于net http的性能,甚至在这样一个特定的场景下,两倍于net/http的性能都没有达到:我们看到在目标主机cpu资源消耗接近70%的几个用例中,fasthttp的性能仅比net/http高出30%~70%左右。

那么为什么fasthttp的性能未及预期呢?要回答这个问题,那就要看看net/http和fasthttp各自的实现原理了!我们先来看看net/http的工作原理示意图:


图:nethttp工作原理示意图

http包作为server端的原理很简单,那就是accept到一个连接(conn)之后,将这个conn甩给一个worker goroutine去处理,后者一直存在,直到该conn的生命周期结束:即连接关闭。

下面是fasthttp的工作原理示意图:


图:fasthttp工作原理示意图

而fasthttp设计了一套机制,目的是尽量复用goroutine,而不是每次都创建新的goroutine。fasthttp的Server accept一个conn之后,会尝试从workerpool中的ready切片中取出一个channel,该channel与某个worker goroutine一一对应。一旦取出channel,就会将accept到的conn写到该channel里,而channel另一端的worker goroutine就会处理该conn上的数据读写。当处理完该conn后,该worker goroutine不会退出,而是会将自己对应的那个channel重新放回workerpool中的ready切片中,等待这下一次被取出

fasthttp的goroutine复用策略初衷很好,但在这里的测试场景下效果不明显,从测试结果便可看得出来,在相同的客户端并发和压力下,net/http使用的goroutine数量与fasthttp相差无几。这是由测试模型导致的:在我们这个测试中,每个task中的hey都会向被测目标发起固定数量的长连接(keep-alive),然后在每条连接上发起“饱和”请求。这样fasthttp workerpool中的goroutine一旦接收到某个conn就只能在该conn上的通讯结束后才能重新放回,而该conn直到测试结束才会close,因此这样的场景相当于让fasthttp“退化”成了net/http的模型,也染上了net/http的“缺陷”:goroutine的数量一旦多起来,go runtime自身调度所带来的消耗便不可忽视甚至超过了业务处理所消耗的资源占比。下面分别是fasthttp在200长连接、8000长连接以及16000长连接下的cpu profile的结果:

200长连接:

(pprof) top -cum
Showing nodes accounting for 88.17s, 55.35% of 159.30s total
Dropped 150 nodes (cum <= 0.80s)
Showing top 10 nodes out of 60
      flat  flat%   sum%        cum   cum%
     0.46s  0.29%  0.29%    101.46s 63.69%  github.com/valyala/fasthttp.(*Server).serveConn
         0     0%  0.29%    101.46s 63.69%  github.com/valyala/fasthttp.(*workerPool).getCh.func1
         0     0%  0.29%    101.46s 63.69%  github.com/valyala/fasthttp.(*workerPool).workerFunc
     0.04s 0.025%  0.31%     89.46s 56.16%  internal/poll.ignoringEINTRIO (inline)
    87.38s 54.85% 55.17%     89.27s 56.04%  syscall.Syscall
     0.12s 0.075% 55.24%     60.39s 37.91%  bufio.(*Writer).Flush
         0     0% 55.24%     60.22s 37.80%  net.(*conn).Write
     0.08s  0.05% 55.29%     60.21s 37.80%  net.(*netFD).Write
     0.09s 0.056% 55.35%     60.12s 37.74%  internal/poll.(*FD).Write
         0     0% 55.35%     59.86s 37.58%  syscall.Write (inline)
(pprof) 

8000长连接:

(pprof) top -cum
Showing nodes accounting for 108.51s, 54.46% of 199.23s total
Dropped 204 nodes (cum <= 1s)
Showing top 10 nodes out of 66
      flat  flat%   sum%        cum   cum%
         0     0%     0%    119.11s 59.79%  github.com/valyala/fasthttp.(*workerPool).getCh.func1
         0     0%     0%    119.11s 59.79%  github.com/valyala/fasthttp.(*workerPool).workerFunc
     0.69s  0.35%  0.35%    119.05s 59.76%  github.com/valyala/fasthttp.(*Server).serveConn
     0.04s  0.02%  0.37%    104.22s 52.31%  internal/poll.ignoringEINTRIO (inline)
   101.58s 50.99% 51.35%    103.95s 52.18%  syscall.Syscall
     0.10s  0.05% 51.40%     79.95s 40.13%  runtime.mcall
     0.06s  0.03% 51.43%     79.85s 40.08%  runtime.park_m
     0.23s  0.12% 51.55%     79.30s 39.80%  runtime.schedule
     5.67s  2.85% 54.39%     77.47s 38.88%  runtime.findrunnable
     0.14s  0.07% 54.46%     68.96s 34.61%  bufio.(*Writer).Flush

16000长连接:

(pprof) top -cum
Showing nodes accounting for 239.60s, 87.07% of 275.17s total
Dropped 190 nodes (cum <= 1.38s)
Showing top 10 nodes out of 46
      flat  flat%   sum%        cum   cum%
     0.04s 0.015% 0.015%    153.38s 55.74%  runtime.mcall
     0.01s 0.0036% 0.018%    153.34s 55.73%  runtime.park_m
     0.12s 0.044% 0.062%       153s 55.60%  runtime.schedule
     0.66s  0.24%   0.3%    152.66s 55.48%  runtime.findrunnable
     0.15s 0.055%  0.36%    127.53s 46.35%  runtime.netpoll
   127.04s 46.17% 46.52%    127.04s 46.17%  runtime.epollwait
         0     0% 46.52%       121s 43.97%  github.com/valyala/fasthttp.(*workerPool).getCh.func1
         0     0% 46.52%       121s 43.97%  github.com/valyala/fasthttp.(*workerPool).workerFunc
     0.41s  0.15% 46.67%    120.18s 43.67%  github.com/valyala/fasthttp.(*Server).serveConn
   111.17s 40.40% 87.07%    111.99s 40.70%  syscall.Syscall
(pprof)

通过上述profile的比对,我们发现当长连接数量增多时(即workerpool中goroutine数量增多时),go runtime调度的占比会逐渐提升,在16000连接时,runtime调度的各个函数已经排名前4了。

4. 优化途径

从上面的测试结果,我们看到fasthttp的模型不太适合这种连接连上后进行持续“饱和”请求的场景,更适合短连接或长连接但没有持续饱和请求,在后面这样的场景下,它的goroutine复用模型才能更好的得以发挥。

但即便“退化”为了net/http模型,fasthttp的性能依然要比net/http略好,这是为什么呢?这些性能提升主要是fasthttp在内存分配层面的优化trick的结果,比如大量使用sync.Pool,比如避免在[]byte和string互转等。

那么,在持续“饱和”请求的场景下,如何让fasthttp workerpool中goroutine的数量不会因conn的增多而线性增长呢?fasthttp官方没有给出答案,但一条可以考虑的路径是使用os的多路复用(linux上的实现为epoll),即go runtime netpoll使用的那套机制。在多路复用的机制下,这样可以让每个workerpool中的goroutine处理同时处理多个连接,这样我们可以根据业务规模选择workerpool池的大小,而不是像目前这样几乎是任意增长goroutine的数量。当然,在用户层面引入epoll也可能会带来系统调用占比的增多以及响应延迟增大等问题。至于该路径是否可行,还是要看具体实现和测试结果。

注:fasthttp.Server中的Concurrency可以用来限制workerpool中并发处理的goroutine的个数,但由于每个goroutine只处理一个连接,当Concurrency设置过小时,后续的连接可能就会被fasthttp拒绝服务。因此fasthttp的默认Concurrency为:

const DefaultConcurrency = 256 * 1024

本文涉及的源码可以在这里 github.com/bigwhite/experiments/blob/master/http-benchmark 下载。


“Gopher部落”知识星球正式转正(从试运营星球变成了正式星球)!“gopher部落”旨在打造一个精品Go学习和进阶社群!高品质首发Go技术文章,“三天”首发阅读权,每年两期Go语言发展现状分析,每天提前1小时阅读到新鲜的Gopher日报,网课、技术专栏、图书内容前瞻,六小时内必答保证等满足你关于Go语言生态的所有需求!部落目前虽小,但持续力很强。在2021年上半年,部落将策划两个专题系列分享,并且是部落独享哦:

  • Go技术书籍的书摘和读书体会系列
  • Go与eBPF系列

欢迎大家加入!

Go技术专栏“改善Go语⾔编程质量的50个有效实践”正在慕课网火热热销中!本专栏主要满足广大gopher关于Go语言进阶的需求,围绕如何写出地道且高质量Go代码给出50条有效实践建议,上线后收到一致好评!欢迎大家订
阅!

img{512x368}

我的网课“Kubernetes实战:高可用集群搭建、配置、运维与应用”在慕课网热卖中,欢迎小伙伴们订阅学习!

img{512x368}

我爱发短信:企业级短信平台定制开发专家 https://tonybai.com/。smspush : 可部署在企业内部的定制化短信平台,三网覆盖,不惧大并发接入,可定制扩展; 短信内容你来定,不再受约束, 接口丰富,支持长短信,签名可选。2020年4月8日,中国三大电信运营商联合发布《5G消息白皮书》,51短信平台也会全新升级到“51商用消息平台”,全面支持5G RCS消息。

著名云主机服务厂商DigitalOcean发布最新的主机计划,入门级Droplet配置升级为:1 core CPU、1G内存、25G高速SSD,价格5$/月。有使用DigitalOcean需求的朋友,可以打开这个链接地址:https://m.do.co/c/bff6eed92687 开启你的DO主机之路。

Gopher Daily(Gopher每日新闻)归档仓库 – https://github.com/bigwhite/gopherdaily

我的联系方式:

  • 微博:https://weibo.com/bigwhite20xx
  • 微信公众号:iamtonybai
  • 博客:tonybai.com
  • github: https://github.com/bigwhite
  • “Gopher部落”知识星球:https://public.zsxq.com/groups/51284458844544

微信赞赏:
img{512x368}

商务合作方式:撰稿、出书、培训、在线课程、合伙创业、咨询、广告合作。

如发现本站页面被黑,比如:挂载广告、挖矿等恶意代码,请朋友们及时联系我。十分感谢! Go语言第一课 Go语言精进之路1 Go语言精进之路2 Go语言编程指南
商务合作请联系bigwhite.cn AT aliyun.com

欢迎使用邮件订阅我的博客

输入邮箱订阅本站,只要有新文章发布,就会第一时间发送邮件通知你哦!

这里是 Tony Bai的个人Blog,欢迎访问、订阅和留言! 订阅Feed请点击上面图片

如果您觉得这里的文章对您有帮助,请扫描上方二维码进行捐赠 ,加油后的Tony Bai将会为您呈现更多精彩的文章,谢谢!

如果您希望通过微信捐赠,请用微信客户端扫描下方赞赏码:

如果您希望通过比特币或以太币捐赠,可以扫描下方二维码:

比特币:

以太币:

如果您喜欢通过微信浏览本站内容,可以扫描下方二维码,订阅本站官方微信订阅号“iamtonybai”;点击二维码,可直达本人官方微博主页^_^:
本站Powered by Digital Ocean VPS。
选择Digital Ocean VPS主机,即可获得10美元现金充值,可 免费使用两个月哟! 著名主机提供商Linode 10$优惠码:linode10,在 这里注册即可免费获 得。阿里云推荐码: 1WFZ0V立享9折!


View Tony Bai's profile on LinkedIn
DigitalOcean Referral Badge

文章

评论

  • 正在加载...

分类

标签

归档



View My Stats