Gopher | Tony Bai

Posts tagged Gopher

Building High-Performance Object Storage with minio, Part 1: The Prototype

March 16, 2020

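I recently joined a project that needs to store large volumes of unstructured data such as images, short videos, and audio. I looked first in the Go community for an open source project that could meet this need, and that is how minio came to my attention.

Figure: minio logo

I actually first came across minio three years ago and even downloaded it to play with (study) for a while, but back then it was far less mature than it is today (our requirements were simple at the time, so I chose weedfs, which I knew better). Today minio has attracted wide attention on GitHub, with more than 20k stars. It is used not only in the Go community but widely in other language communities too. Let me say, without taking any responsibility for it: in object storage, minio has much the same air of dominance that Kafka (from the Java stack) has in message queues :).

At GopherChina 2019, an engineer from Tantan gave a talk titled "Practice of a MINIO-based object storage solution at Tantan". Whether Tantan currently runs minio in production is unknown to me, but the talk is further evidence of minio's influence in the object storage space.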

img{512x368}

Figure: A Tantan engineer presenting minio practice at GopherChina 2019

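minio comes from a team with many years of experience building network file systems; its founding team came from the original GlusterFS team. The design of minio, the team's second venture, draws heavily on the lessons learned from GlusterFS:

Simple deployment: a single binary is all you need, and it supports a wide range of platforms (thanks to Go).
Massive storage capacity: minio can be scaled out by zone (existing zones are unaffected), and a single object can be up to 5TB.
Compatible with the Amazon S3 API, with developer needs and experience fully in mind.
Low redundancy with high tolerance of disk failure: the standard, and maximum, data redundancy factor is 2 (that is, storing a 1M object consumes 2M of disk space), yet data can still be read when any n/2 disks are lost (n is the number of disks in an Erasure Coding Set). Recovery is performed per object rather than per storage volume.
Excellent read and write performance.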
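The redundancy arithmetic in the list above can be sketched in a few lines of Go. This is only an illustration under the assumption that half of the drives in a set hold parity (which is what a redundancy factor of 2 implies); SetProfile and ProfileFor are hypothetical names, not part of minio.

```go
package main

import (
	"errors"
	"fmt"
)

// SetProfile summarizes one erasure coding set, assuming half of the
// drives hold parity (the redundancy factor of 2 described in the text).
type SetProfile struct {
	Drives         int     // n, the number of drives in the set
	TolerableLoss  int     // drives that can be lost while reads still succeed
	RawPerUserByte float64 // raw bytes consumed per byte of user data
}

// ProfileFor is a hypothetical helper for illustration; it is not a minio API.
func ProfileFor(n int) (SetProfile, error) {
	if n < 4 || n%2 != 0 {
		return SetProfile{}, errors.New("need an even number of drives, at least 4")
	}
	parity := n / 2
	data := n - parity
	return SetProfile{
		Drives:         n,
		TolerableLoss:  parity,
		RawPerUserByte: float64(n) / float64(data),
	}, nil
}

func main() {
	for _, n := range []int{4, 8, 16} {
		p, err := ProfileFor(n)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%2d drives: reads survive %d lost drives, %.0fx raw space\n",
			p.Drives, p.TolerableLoss, p.RawPerUserByte)
	}
}
```

Whatever the set size, the raw-space factor stays at 2 while the number of tolerable drive losses grows with n.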

img{512x368}

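Figure: Benchmark data from the minio technical white paper

Given these strengths of minio, I plan to build the project's object storage for unstructured data on top of it. This article covers the prototype design and the setup of an initial minio validation environment.

I. Prototype design

Object storage designs for unstructured data built on minio all look much alike. The diagram below is the prototype we sketched for our requirements: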

img{512x368}

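Figure: Prototype design

Using minio's distributed mode, we combine multiple disks on multiple hosts into one logical storage pool, and run minio server on different hosts to provide a highly available object store.
Data is written to minio through a separate upload service (which talks to the minio cluster via the minio SDK).
Buckets are created with minio's mc tool and their policy is set to "download", so that external users can fetch objects from minio directly. No intermediate layer other than the load balancer sits in between.
A job or scheduled task uses mc to maintain the data in minio, for example by periodically deleting data older than 7 days (if the default expiry is set to 7 days).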

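II. minio server startup modes

minio supports several server startup modes:

img{512x368}

Figure: minio server startup modes

In standalone mode, all the disks minio server manages are local to the host. This mode is generally used only for verification and learning in lab and test environments. Standalone mode is further divided into non-erasure code mode and erasure code mode.

In non-erasure code mode, minio server is started with a single local disk directory as its argument, for example: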

$minio server data

Endpoint:  http://10.10.126.88:9000  http://127.0.0.1:9000
AccessKey: minioadmin
SecretKey: minioadmin

Browser Access:
   http://10.10.126.88:9000  http://127.0.0.1:9000           

Command-line Access: https://docs.min.io/docs/minio-client-quickstart-guide
   $ mc config host add myminio http://10.10.126.88:9000 minioadmin minioadmin

... ...

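In this mode, minio stores each object directly under data, with no replicas and no erasure coding. Both the server instance and the disk are therefore single points of failure with no high-availability guarantee; a damaged disk means lost data.

Still with a single minio server, erasure code mode means passing multiple local disk arguments to the instance. As soon as it sees more than one disk argument, minio server enables erasure code mode automatically. Erasure coding places requirements on the number of disks, and the instance fails to start if they are not met: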

$minio server data1 data2
ERROR Invalid command line arguments: Incorrect number of endpoints provided [data1 data2]
      > Please provide an even number of endpoints greater or equal to 4
      HINT:
        For more information, please refer to https://docs.min.io/docs/minio-erasure-code-quickstart-guide

With erasure coding enabled, minio server requires at least 4 endpoints (in standalone mode, directories on local disks). It then automatically divides the disk drives into one or more erasure coding sets; each set may contain 4, 6, 8, 10, 12, 14, or 16 drives. minio server works out the number of sets and the number of drives per set from the number of drives it is given. In the example below we pass four endpoints (disk drives) to minio server:

$minio server data1 data2 data3 data4

Formatting 1 zone, 1 set(s), 4 drives per set.
WARNING: Host local has more than 2 drives of set. A host failure will result in data becoming unavailable.
Status:         4 Online, 0 Offline.
Endpoint:  http://10.10.126.88:9000  http://127.0.0.1:9000
AccessKey: minioadmin
SecretKey: minioadmin

Browser Access:
   http://10.10.126.88:9000  http://127.0.0.1:9000           

Command-line Access: https://docs.min.io/docs/minio-client-quickstart-guide
   $ mc config host add myminio http://10.10.126.88:9000 minioadmin minioadmin

... ...

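The log shows that minio server placed these drives into a single erasure coding set. It also printed the line WARNING: Host local has more than 2 drives of set. A host failure will result in data becoming unavailable. In other words, minio server is warning us that more than two drives of this erasure coding set live on the local host, so if that host goes down the data becomes unavailable. (Each set has 4 drives, and under erasure coding the set tolerates at most 4/2=2 failed disks.)

Next, a piece of syntactic sugar for starting minio server, the "ellipsis" syntax: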

$minio server data{1...18}

Formatting 1 zone, 3 set(s), 6 drives per set.
WARNING: Host local has more than 3 drives of set. A host failure will result in data becoming unavailable.
WARNING: Host local has more than 3 drives of set. A host failure will result in data becoming unavailable.
WARNING: Host local has more than 3 drives of set. A host failure will result in data becoming unavailable.
Status:         18 Online, 0 Offline.
Endpoint:  http://10.10.126.88:9000  http://127.0.0.1:9000
AccessKey: minioadmin
SecretKey: minioadmin

Browser Access:
   http://10.10.126.88:9000  http://127.0.0.1:9000           

Command-line Access: https://docs.min.io/docs/minio-client-quickstart-guide
   $ mc config host add myminio http://10.10.126.88:9000 minioadmin minioadmin

... ...

minio server data{1...18} is equivalent to minio server data1 data2 data3 data4 data5 data6 data7 data8 data9 data10 data11 data12 data13 data14 data15 data16 data17 data18. minio server expands the ellipsis itself. With 18 disk drives, minio server created 3 erasure coding sets of 6 drives each. It also printed one WARNING per set: more than three of each set's drives are on the same host.
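The set layouts seen in the logs (4 drives giving 1 set of 4, 18 drives giving 3 sets of 6) are consistent with a simple rule: pick the largest allowed set size that divides the drive count evenly. The sketch below implements only that assumed rule; PlanSets is a hypothetical name, and minio's real algorithm may weigh other factors (for example how drives are spread over hosts).

```go
package main

import "fmt"

// allowedSetSizes lists the per-set drive counts named in the article,
// largest first.
var allowedSetSizes = []int{16, 14, 12, 10, 8, 6, 4}

// PlanSets returns the number of sets and the drives per set.
// Assumption: choose the largest allowed set size that divides the total.
func PlanSets(totalDrives int) (sets, drivesPerSet int, err error) {
	if totalDrives >= 4 {
		for _, size := range allowedSetSizes {
			if totalDrives%size == 0 {
				return totalDrives / size, size, nil
			}
		}
	}
	return 0, 0, fmt.Errorf("%d drives cannot be split into sets of 4 to 16 drives", totalDrives)
}

func main() {
	for _, n := range []int{4, 18, 2} {
		sets, per, err := PlanSets(n)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%d drives: %d set(s), %d drives per set\n", n, sets, per)
	}
}
```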

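These warnings go away in distributed mode. As the name suggests, in distributed mode the minio server instances and the disk drives they manage are spread across multiple hosts. This removes the single point of failure in the server instance, and data is spread over different disks on different hosts, which gives high availability and better overall disaster tolerance. Because it handles disks on multiple hosts, distributed mode always enables erasure coding sets.

In distributed mode, the remote endpoints passed to minio server are written as HTTP URLs: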

export MINIO_ACCESS_KEY=<ACCESS_KEY>
export MINIO_SECRET_KEY=<SECRET_KEY>
$minio server http://host{1...4}:9000/minio/data{1...4}

The minio server command above corresponds to 4 hosts, each running one minio server instance, and each instance managing 16 disk drives (local and remote). It is equivalent to:

$minio server http://host1:9000/minio/data1 http://host1:9000/minio/data2 http://host1:9000/minio/data3 http://host1:9000/minio/data4 http://host2:9000/minio/data1 http://host2:9000/minio/data2 http://host2:9000/minio/data3 http://host2:9000/minio/data4 http://host3:9000/minio/data1 http://host3:9000/minio/data2 http://host3:9000/minio/data3 http://host3:9000/minio/data4 http://host4:9000/minio/data1 http://host4:9000/minio/data2 http://host4:9000/minio/data3 http://host4:9000/minio/data4

minio again divides these disk drives into erasure coding sets automatically. Each endpoint takes the form http://address/disk-drive-path. Note that this command must be run on each of host1, host2, host3, and host4.
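To make the expansion concrete, here is a minimal sketch of how a pattern with several {a...b} ranges could be expanded into the endpoint list shown above. It handles only plain decimal ranges, and Expand is a hypothetical helper written for this article's examples, not minio's own parser.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// ellipsis matches one {start...end} range of non-negative integers.
var ellipsis = regexp.MustCompile(`\{(\d+)\.\.\.(\d+)\}`)

// Expand replaces every {a...b} range in pattern, leftmost range first,
// and returns all resulting strings in order.
func Expand(pattern string) ([]string, error) {
	loc := ellipsis.FindStringSubmatchIndex(pattern)
	if loc == nil {
		return []string{pattern}, nil // nothing left to expand
	}
	start, err := strconv.Atoi(pattern[loc[2]:loc[3]])
	if err != nil {
		return nil, err
	}
	end, err := strconv.Atoi(pattern[loc[4]:loc[5]])
	if err != nil {
		return nil, err
	}
	if start > end {
		return nil, fmt.Errorf("invalid range {%d...%d}", start, end)
	}
	var out []string
	for i := start; i <= end; i++ {
		// Substitute one value, then expand any remaining ranges.
		expanded, err := Expand(pattern[:loc[0]] + strconv.Itoa(i) + pattern[loc[1]:])
		if err != nil {
			return nil, err
		}
		out = append(out, expanded...)
	}
	return out, nil
}

func main() {
	endpoints, err := Expand("http://host{1...4}:9000/minio/data{1...4}")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(endpoints), "endpoints")
	for _, e := range endpoints {
		fmt.Println(e)
	}
}
```

Because the leftmost range is expanded first, the output lists all of host1's drives, then host2's, and so on, matching the order of the long command above.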

minio also has the concept of a zone, as in this example:

$minio server data{1...8} data{9...16}

Formatting 1 zone, 1 set(s), 8 drives per set.
WARNING: Host local has more than 4 drives of set. A host failure will result in data becoming unavailable.
Formatting 2 zone, 1 set(s), 8 drives per set.
WARNING: Host local has more than 4 drives of set. A host failure will result in data becoming unavailable.
Status:         16 Online, 0 Offline.
Endpoint:  http://10.10.126.88:9000  http://127.0.0.1:9000
AccessKey: minioadmin
SecretKey: minioadmin

Browser Access:
   http://10.10.126.88:9000  http://127.0.0.1:9000           

Command-line Access: https://docs.min.io/docs/minio-client-quickstart-guide
   $ mc config host add myminio http://10.10.126.88:9000 minioadmin minioadmin

... ...

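We passed minio server two groups of arguments in "ellipsis" syntax. minio treats each group as a "zone", so with two groups it created two zones. Within each zone it created one erasure coding set of 8 disk drives. For an incoming write request, minio server first looks for the zone with more free space and then picks a set and disk drives within that zone.

Without the "ellipsis" syntax, minio server puts all the disk drives it is given into a single zone.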
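The "zone with more free space first" rule can be modelled very simply. The sketch below is an assumption-laden illustration of that one sentence: Zone and PickZone are hypothetical, and minio's actual placement logic may be more involved than a plain maximum.

```go
package main

import "fmt"

// Zone is a hypothetical summary of one zone's remaining capacity.
type Zone struct {
	Name      string
	FreeBytes uint64
}

// PickZone returns the index of the zone with the most free space,
// or -1 when there are no zones. Ties go to the earliest zone.
func PickZone(zones []Zone) int {
	best := -1
	for i, z := range zones {
		if best == -1 || z.FreeBytes > zones[best].FreeBytes {
			best = i
		}
	}
	return best
}

func main() {
	zones := []Zone{
		{Name: "zone-1 (data1..data8)", FreeBytes: 200 << 30},
		{Name: "zone-2 (data9..data16)", FreeBytes: 750 << 30},
	}
	fmt.Println("write goes to:", zones[PickZone(zones)].Name)
}
```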

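III. Setting up and configuring the prototype validation environment

1. Deploying a distributed minio cluster on a single machine

Our validation environment uses the smallest distributed minio layout: a single machine, one zone, one erasure coding set, 4 disk drives. The deployment looks like this:

img{512x368}

Figure: A distributed minio cluster deployed on a single machine

We did not use the "ellipsis" syntax, since this layout is awkward to simulate with it on one machine. The cluster is started with the following script: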

# cat startup_minio.sh
#!/bin/bash

export MINIO_ACCESS_KEY="minio"
export MINIO_SECRET_KEY="minio123"

for i in {01..04}; do
    nohup minio server --address ":90${i}" http://127.0.0.1:9001/root/minio-install/data1 http://127.0.0.1:9002/root/minio-install/data2  http://127.0.0.1:9003/root/minio-install/data3 http://127.0.0.1:9004/root/minio-install/data4 > "/root/minio-install/90${i}.log" 2>&1 &
done

Start the minio cluster and check its status:

# bash startup_minio.sh

# ps -ef|grep minio

root      1218     1 11 21:58 pts/5    00:00:01 minio server --address :9001 http://127.0.0.1:9001/root/minio-install/data1 http://127.0.0.1:9002/root/minio-install/data2 http://127.0.0.1:9003/root/minio-install/data3 http://127.0.0.1:9004/root/minio-install/data4
root      1219     1 11 21:58 pts/5    00:00:01 minio server --address :9002 http://127.0.0.1:9001/root/minio-install/data1 http://127.0.0.1:9002/root/minio-install/data2 http://127.0.0.1:9003/root/minio-install/data3 http://127.0.0.1:9004/root/minio-install/data4
root      1220     1  3 21:58 pts/5    00:00:00 minio server --address :9003 http://127.0.0.1:9001/root/minio-install/data1 http://127.0.0.1:9002/root/minio-install/data2 http://127.0.0.1:9003/root/minio-install/data3 http://127.0.0.1:9004/root/minio-install/data4
root      1221     1 11 21:58 pts/5    00:00:01 minio server --address :9004 http://127.0.0.1:9001/root/minio-install/data1 http://127.0.0.1:9002/root/minio-install/data2 http://127.0.0.1:9003/root/minio-install/data3 http://127.0.0.1:9004/root/minio-install/data4

root@instance-cspzrq3u:~/minio-install# ls
9001.log  9002.log  9003.log  9004.log  data1  data2  data3  data4  startup_minio.sh
root@instance-cspzrq3u:~/minio-install# tail -100f 9001.log

Formatting 1 zone, 1 set(s), 4 drives per set.
Attempting encryption of all config, IAM users and policies on MinIO backend
Status:         4 Online, 0 Offline.
Endpoint:  http://192.168.16.4:9001  http://172.17.0.1:9001  http://172.18.0.1:9001  http://127.0.0.1:9001       

Browser Access:
   http://192.168.16.4:9001  http://172.17.0.1:9001  http://172.18.0.1:9001  http://127.0.0.1:9001       

.... ...

2. Configuring and managing with mc

minio provides the mc command-line tool for managing minio server. First we create an mc configuration for the local minio server (:9001):

# mc config host add myminio http://localhost:9001 minio minio123
Added `myminio` successfully.

Here we used mc to add a so-called "host" pointing at the minio server (:9001) created above. The command simply writes the following configuration to ~/.mc/config.json:

# cat ~/.mc/config.json
{
    "version": "9",
    "hosts": {
        "myminio": {
            "url": "http://localhost:9001",
            "accessKey": "minio",
            "secretKey": "minio123",
            "api": "s3v4",
            "lookup": "auto"
        }
    }
}

Next, we use mc to create three buckets in the minio cluster:

root@instance-cspzrq3u:~# mc mb myminio/image
Bucket created successfully `myminio/image`.
root@instance-cspzrq3u:~# mc mb myminio/video
Bucket created successfully `myminio/video`.
root@instance-cspzrq3u:~# mc mb myminio/audio
Bucket created successfully `myminio/audio`.
root@instance-cspzrq3u:~# mc ls myminio
[2020-03-16 15:19:55 CST]      0B audio/
[2020-03-16 15:19:48 CST]      0B image/
[2020-03-16 15:19:52 CST]      0B video/

A newly created bucket has an access policy of none by default, meaning no external access:

root@instance-cspzrq3u:~# mc policy get myminio/image
Access permission for `myminio/image` is `none`

Per our design, these three buckets need to be readable from outside. Taking the image bucket as an example:

root@instance-cspzrq3u:~# mc policy set download myminio/image
Access permission for `myminio/image` is set to `download`
root@instance-cspzrq3u:~# mc policy get myminio/image
Access permission for `myminio/image` is `download`

3. Load balancer setup

We put an nginx in front of the minio cluster. Below is the nginx configuration file created for minio (/etc/nginx/conf.d/minio.conf):

# /etc/nginx/conf.d/minio.conf

 upstream minio_cluster {
    server localhost:9001;
    server localhost:9002;
    server localhost:9003;
    server localhost:9004;
 }

server {
 listen 9000;
 server_name myminio.tonybai.com;

 # To allow special characters in headers
 ignore_invalid_headers off;
 # Allow any size file to be uploaded.
 # Set to a value such as 1000m; to restrict file size to a specific value
 client_max_body_size 0;
 # To disable buffering
 proxy_buffering off;

location / {

   proxy_set_header X-Real-IP $remote_addr;
   proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
   proxy_set_header X-Forwarded-Proto $scheme;
   proxy_set_header Host $http_host;

   proxy_connect_timeout 300;
   # Default is HTTP/1, keepalive is only enabled in HTTP/1.1
   proxy_http_version 1.1;
   proxy_set_header Connection "";
   chunked_transfer_encoding off;

   proxy_pass http://minio_cluster;
}

location /image/ {
   proxy_set_header X-Real-IP $remote_addr;
   proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
   proxy_set_header X-Forwarded-Proto $scheme;
   proxy_set_header Host $http_host;

   proxy_connect_timeout 300;
   # Default is HTTP/1, keepalive is only enabled in HTTP/1.1
   proxy_http_version 1.1;
   proxy_set_header Connection "";
   chunked_transfer_encoding off;
   client_max_body_size 1000m;
   proxy_buffering off;

   proxy_pass http://minio_cluster;
 }
}

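Reload nginx (nginx -s reload).

Open http://myminio.tonybai.com:9000/ in a browser; after logging in you will see the following page:

img{512x368}

Figure: The minio web UI in a browser

Select the "image" bucket on the left and click the "+" button at the bottom right to upload an image, gopher-daily-logo.png. After uploading, log out, then open http://myminio.tonybai.com:9000/image/gopher-daily-logo.png to view the image. You can also download it with wget: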

$wget -c http://myminio.tonybai.com:9000/image/gopher-daily-logo.png
--2020-03-16 15:40:20--  http://myminio.tonybai.com:9000/image/gopher-daily-logo.png
Resolving myminio.tonybai.com (myminio.tonybai.com)... 106.12.69.83
Connecting to myminio.tonybai.com (myminio.tonybai.com)|106.12.69.83|:9000... connected.
HTTP request sent, awaiting response... 200 OK
Length: 59736 (58K) [image/png]
Saving to: 'gopher-daily-logo.png'

gopher-daily-logo.png        100%[============================================>]  58.34K   253KB/s  in 0.2s

2020-03-16 15:40:20 (253 KB/s) - 'gopher-daily-logo.png' saved [59736/59736]

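4. Purging objects

In our requirements, objects in a bucket live for 7 days. A scheduler or job can use mc to purge expired objects, for example by running the following command every 5 minutes:

$mc rm --recursive --force --older-than 7d myminio/image/

This command recursively deletes objects in the image bucket that were created more than 7 days ago. rm supports combining various conditions; see the mc rm manual for details.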
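The selection rule behind such a purge (keep anything modified within the last 7 days, remove the rest) is easy to get backwards, so here is a small sketch of it. Object, Expired, and sampleRun are hypothetical names used only for illustration; a real job would list objects through mc or the minio SDK rather than from an in-memory slice.

```go
package main

import (
	"fmt"
	"time"
)

// Object is a hypothetical, minimal view of a stored object.
type Object struct {
	Key          string
	LastModified time.Time
}

// Expired returns the keys of objects last modified more than maxAge
// before now, i.e. the objects a 7-day retention purge should remove.
func Expired(objects []Object, now time.Time, maxAge time.Duration) []string {
	cutoff := now.Add(-maxAge)
	var keys []string
	for _, o := range objects {
		if o.LastModified.Before(cutoff) {
			keys = append(keys, o.Key)
		}
	}
	return keys
}

// sampleRun applies a 7-day retention to two made-up objects.
func sampleRun() []string {
	now := time.Date(2020, 3, 16, 15, 40, 0, 0, time.UTC)
	objects := []Object{
		{Key: "old.png", LastModified: now.Add(-8 * 24 * time.Hour)},
		{Key: "fresh.png", LastModified: now.Add(-2 * time.Hour)},
	}
	return Expired(objects, now, 7*24*time.Hour)
}

func main() {
	fmt.Println("to delete:", sampleRun())
}
```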

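IV. Summary

That completes the first step of building high-performance object storage with minio: the prototype is up and running. I expect to have more to share about minio as I use it more deeply.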


Gopher Daily archive repository: https://github.com/bigwhite/gopherdaily

Contact:

Weibo: https://weibo.com/bigwhite20xx
WeChat official account: iamtonybai
Blog: tonybai.com
github: https://github.com/bigwhite


Notable Changes in Go 1.14

March 8, 2020

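Perhaps thanks to the release of Go 1.14 on February 26, 2020, Go re-entered the top 10 of the TIOBE programming language index in March 2020, whereas a year earlier it ranked only 18th. The way Go and other mainstream languages bounce up and down the list has drawn plenty of skepticism about its authority :), but for Go to enter the top 10 at this moment is still a good result for gophers and the Go community. To some extent it shows that, after ten years of steady work, Go has firmly secured its place among the world's mainstream programming languages.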

img{512x368}

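Figure: TIOBE programming language index, March 2020: Go back in the top 10

From the announcement of the Go 1 compatibility promise up to this Go 1.14 release, Go has made no incompatible changes to its syntax or core libraries. This leads many fans of other mainstream languages to find Go "boring". But keeping that promise is exactly the result of hard work by the Go team, so every Go release deserves gophers' respect: each is the best version the Go team can deliver.

Let's walk through the changes in Go 1.14 and see which ones deserve our attention.

I. Language specification

Compared with other mainstream languages, Go's language specification changes extremely rarely (gophers are used to this pace :)), so when a change does land it naturally draws a lot of attention :). To be clear up front: as long as the Go version is still 1.x, such a change is backward-compatible.

The language change in Go 1.14 is that embedded interfaces may have overlapping method sets. The simple idea behind it is illustrated by the following code (from here):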

type I interface { f(); String() string }
type J interface { g(); String() string }

type IJ interface { I; J }  // (1)
type IJ interface { f(); g(); String() string }  // (2)

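Interfaces I and J both include the method String() string in their method sets. If we want to define a new interface IJ whose method set is Union(I, J), then in Go 1.13 and earlier we could only use form (2), writing out every method prototype again in IJ, and could not use the concise embedding form (1).

Go 1.14 solves this by allowing embedded interfaces with overlapping method sets: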

// go1.14-examples/overlapping_interface.go
package foo

type I interface {
    f()
    String() string
}
type J interface {
    g()
    String() string
}

type IJ interface {
    I
    J
}

Building with go 1.13.6:

$go build overlapping_interface.go
# command-line-arguments
./overlapping_interface.go:14:2: duplicate method String

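But building with go 1.14:

$go build overlapping_interface.go

// everything is fine, no errors

However, support for overlapping interfaces is limited to interface definitions. If you embed interfaces in a struct definition, as below: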

// go1.14-examples/overlapping_interface1.go
package main

type I interface {
    f()
    String() string
}

type implOfI struct{}

func (implOfI) f() {}
func (implOfI) String() string {
    return "implOfI"
}

type J interface {
    g()
    String() string
}

type implOfJ struct{}

func (implOfJ) g() {}
func (implOfJ) String() string {
    return "implOfJ"
}

type Foo struct {
    I
    J
}

func main() {
    f := Foo{
        I: implOfI{},
        J: implOfJ{},
    }
    println(f.String())
}

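Although the Go compiler does not directly flag that the two interfaces I and J embedded in struct Foo have overlapping methods, it still reports the following error when Foo is used: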

$ go run overlapping_interface1.go
# command-line-arguments
./overlapping_interface1.go:37:11: ambiguous selector f.String

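The go compiler does not seem to check "up front" whether the method sets of interfaces embedded in a struct overlap; the check is deferred to the point where a method is selected for a struct value, as in the example above. If we give Foo its own String method, overriding the embedded ones, the overlap between I and J no longer matters: there is no ambiguity, and the compiler can correctly decide which String method to run for a Foo value: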

// go1.14-examples/overlapping_interface2.go

.... ....

func (Foo) String() string {
        return "Foo"
}

func main() {
        f := Foo{
                I: implOfI{},
                J: implOfJ{},
        }
        println(f.String())
}

Running this code:

$go run overlapping_interface2.go
Foo

II. Go runtime

1. Asynchronous preemptive scheduling

In the article "A Brief Analysis of Goroutine Scheduling Examples", I gave this example:

// go1.14-examples/preemption_scheduler.go
package main

import (
    "fmt"
    "runtime"
    "time"
)

func deadloop() {
    for {
    }
}

func main() {
    runtime.GOMAXPROCS(1)
    go deadloop()
    for {
        time.Sleep(time.Second * 1)
        fmt.Println("I got scheduled!")
    }
}

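With only one P, the goroutine running deadloop holds on to that P indefinitely, so the code in the main goroutine never gets scheduled (with GOMAXPROCS=1) and we never see I got scheduled! printed. This is because preemption in Go 1.13 and earlier is "cooperative": preemption checks can only be inserted at function calls, and deadloop gives the compiler no opportunity to insert one. As a result the GC can wait too long for all goroutines to stop, causing GC latency, and in some special cases this can even deadlock during STW (stop the world).

Go 1.14 adopts asynchronous preemption based on OS signals, so the goroutine running deadloop can now be preempted too:

// Run the code above with the Go 1.14 toolchain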

$go run preemption_scheduler.go
I got scheduled!
I got scheduled!
I got scheduled!

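However, signals can arrive at any point during execution. Wherever the Go runtime is in control, it handles these signals properly. But if you make system calls directly on Unix/Linux/Mac through the syscall package or golang.org/x/sys/unix, then a signal delivered to the process while a system call is in progress makes that call fail and return an EINTR error. This applies especially to slow system calls, including reads and writes on certain file types (pipes, terminal devices, network devices) and inter-process communication. In such cases we have to handle the EINTR error ourselves. The most common way is to retry; for system calls that can safely be reissued, retrying after an EINTR error is safe. If you do not call the syscall package yourself, asynchronous preemption has almost no effect on your existing code.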

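Asynchronous preemption in Go 1.14 is not yet supported on windows/arm, darwin/arm, js/wasm, and plan9/*; the Go team plans to address these in Go 1.15.

2. Further defer performance improvements

In Go 1.13, defer performance improved by a theoretical 30%. Let's reuse the same example to see how much further defer has improved in go 1.14 over go 1.13, and also compare code with and without defer: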

// go1.14-examples/defer_benchmark_test.go
package defer_test

import "testing"

func sum(max int) int {
    total := 0
    for i := 0; i < max; i++ {
        total += i
    }

    return total
}

func foo() {
    defer func() {
        sum(10)
    }()

    sum(100)
}

func Bar() {
    sum(100)
    sum(10)
}

func BenchmarkDefer(b *testing.B) {
    for i := 0; i < b.N; i++ {
        foo()
    }
}
func BenchmarkWithoutDefer(b *testing.B) {
    for i := 0; i < b.N; i++ {
        Bar()
    }
}

We run the benchmark above with Go 1.13 and Go 1.14 respectively:

Go 1.13:

$go test -bench . defer_benchmark_test.go
goos: darwin
goarch: amd64
BenchmarkDefer-8              17873574            66.7 ns/op
BenchmarkWithoutDefer-8       26935401            43.7 ns/op
PASS
ok      command-line-arguments    2.491s

Go 1.14:

$go test -bench . defer_benchmark_test.go
goos: darwin
goarch: amd64
BenchmarkDefer-8              26179819            45.1 ns/op
BenchmarkWithoutDefer-8       26116602            43.5 ns/op
PASS
ok      command-line-arguments    2.418s

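We can see that defer in Go 1.14 is substantially faster than in Go 1.13 (45.1 ns/op versus 66.7 ns/op in this run) and is now nearly on par with not using defer at all (43.5 ns/op), which is why the Go team encourages using defer freely even on performance-sensitive code paths.

img{512x368}

Figure: defer performance across Go versions (from https://twitter.com/janiszt/status/1215601972281253888)

3. Reimplementation of the internal timer

Because go timer performance had long been unsatisfactory, Go 1.14 almost completely reimplemented timers in the runtime. The implementation follows the approach Dmitry Vyukov proposed several years ago: assign timers to each P to reduce lock contention; remove the timer thread to reduce context-switch overhead; and use netpoll timeouts to implement the timer mechanism.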

// $GOROOT/src/runtime/time.go

type timer struct {
        // If this timer is on a heap, which P's heap it is on.
        // puintptr rather than *p to match uintptr in the versions
        // of this struct defined in other packages.
        pp puintptr

}

// addtimer adds a timer to the current P.
// This should only be called with a newly created timer.
// That avoids the risk of changing the when field of a timer in some P's heap,
// which could cause the heap to become unsorted.

func addtimer(t *timer) {
        // when must never be negative; otherwise runtimer will overflow
        // during its delta calculation and never expire other runtime timers.
        if t.when < 0 {
                t.when = maxWhen
        }
        if t.status != timerNoStatus {
                badTimer()
        }
        t.status = timerWaiting

        addInitializedTimer(t)
}

// addInitializedTimer adds an initialized timer to the current P.
func addInitializedTimer(t *timer) {
        when := t.when

        pp := getg().m.p.ptr()
        lock(&pp.timersLock)
        ok := cleantimers(pp) && doaddtimer(pp, t)
        unlock(&pp.timersLock)
        if !ok {
                badTimer()
        }

        wakeNetPoller(when)
}
... ...

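So if your program makes heavy use of time.After or time.Tick, or of SetDeadline when handling network connections, rebuilding with Go 1.14 gives your application the timer performance improvement for free.

img{512x368}

Figure: Benchmark data after switching to the new timer implementation

III. Go modules are production ready

The best news about go modules in Go 1.14 is that they are now production ready, meaning the way modules work and the go tool's commands, flags, and behavior have stabilized. I have followed and used go modules since they were introduced in Go 1.11; in particular, the module proxy support added in Go 1.13 means gophers in mainland China no longer have to struggle to fetch modules under paths like golang.org/x/xxx.

The main go module changes in Go 1.14 are:

a) vendor handling in module-aware mode: if the go version in go.mod is go 1.14 or higher and the top-level directory of the repo contains a vendor directory, the go toolchain defaults to using packages from vendor (i.e. -mod=vendor) rather than from the module cache (under $GOPATH/pkg/mod). In this mode the go tool also verifies that vendor/modules.txt is consistent with go.mod, and reports an error if it is not.

Under these conditions, to build from the module cache anyway you must pass -mod=mod to the go toolchain explicitly, for example: go build -mod=mod ./...

b) The new GOINSECURE setting lifts the requirement that modules be fetched over https; and when https is used, the server certificate is no longer validated.

c) In module-aware mode, if there is no go.mod or the go toolchain cannot find one, you must explicitly pass the list of go source files to process; otherwise the go tool asks you to provide a go.mod. For example, to compile hello.go in a directory without go.mod we need go build hello.go (hello.go must be given explicitly after the command). Running go build . gives an error like this: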

$go build .
go: cannot find main module, but found .git/config in /Users/tonybai
    to create a module there, run:
    cd .. && go mod init

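In other words, without a go.mod the go toolchain's functionality is limited.

d) go modules now support subversion repositories, although subversion use is probably rather niche these days.

For a systematic, complete understanding of current go module behavior, I recommend reading the module sections of the Go command documentation and the official go module wiki.

IV. Compiler

With -race or -msan, the Go 1.14 compiler enables -d=checkptr by default, which checks that uses of unsafe.Pointer are valid. It checks two things:

When converting unsafe.Pointer to *T, the alignment required by T must not be stricter than that of the original address

For example: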

// go1.14-examples/compiler_checkptr1.go
package main

import (
    "fmt"
    "unsafe"
)

func main() {
    var byteArray = [10]byte{'a', 'b', 'c'}
    var p *int64 = (*int64)(unsafe.Pointer(&byteArray[1]))
    fmt.Println(*p)
}

Running the code with -race:

$go run -race compiler_checkptr1.go
fatal error: checkptr: unsafe pointer conversion

goroutine 1 [running]:
runtime.throw(0x11646fd, 0x23)
    /Users/tonybai/.bin/go1.14/src/runtime/panic.go:1112 +0x72 fp=0xc00004cee8 sp=0xc00004ceb8 pc=0x106d152
runtime.checkptrAlignment(0xc00004cf5f, 0x1136880, 0x1)
    /Users/tonybai/.bin/go1.14/src/runtime/checkptr.go:13 +0xd0 fp=0xc00004cf18 sp=0xc00004cee8 pc=0x1043b70
main.main()
    /Users/tonybai/go/src/github.com/bigwhite/experiments/go1.14-examples/compiler_checkptr1.go:10 +0x70 fp=0xc00004cf88 sp=0xc00004cf18 pc=0x11283b0
runtime.main()
    /Users/tonybai/.bin/go1.14/src/runtime/proc.go:203 +0x212 fp=0xc00004cfe0 sp=0xc00004cf88 pc=0x106f7a2
runtime.goexit()
    /Users/tonybai/.bin/go1.14/src/runtime/asm_amd64.s:1373 +0x1 fp=0xc00004cfe8 sp=0xc00004cfe0 pc=0x109b801
exit status 2

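checkptr detected that the converted int64 type has a stricter alignment requirement than the original address (the address of a byte variable). int64 has an alignment of 8, while the address of a byte variable is only guaranteed an alignment of 1.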
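One way to sidestep this class of error, when the goal is simply to decode an integer from a byte buffer, is to avoid the pointer conversion and use encoding/binary instead. The sketch below is an alternative technique rather than anything the release notes prescribe; Int64At is a hypothetical helper, and little-endian byte order is an assumption chosen to mirror what the unsafe cast would read on amd64.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Int64At decodes 8 little-endian bytes starting at offset off.
// No alignment requirement applies because no pointer is reinterpreted.
func Int64At(buf []byte, off int) (int64, error) {
	if off < 0 || off+8 > len(buf) {
		return 0, fmt.Errorf("need 8 bytes at offset %d, buffer has %d", off, len(buf))
	}
	return int64(binary.LittleEndian.Uint64(buf[off : off+8])), nil
}

func main() {
	byteArray := [10]byte{'a', 'b', 'c'}
	v, err := Int64At(byteArray[:], 1)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(v)
}
```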

After pointer arithmetic, the resulting unsafe.Pointer must still point into the original Go heap object

// go1.14-examples/compiler_checkptr2.go
package main

import (
    "unsafe"
)

func main() {
    var n = 5
    b := make([]byte, n)
    end := unsafe.Pointer(uintptr(unsafe.Pointer(&b[0])) + uintptr(n+10))
    _ = end
}

Running the code:

$go run  -race compiler_checkptr2.go
fatal error: checkptr: unsafe pointer arithmetic

goroutine 1 [running]:
runtime.throw(0x10b618b, 0x23)
    /Users/tonybai/.bin/go1.14/src/runtime/panic.go:1112 +0x72 fp=0xc00003e720 sp=0xc00003e6f0 pc=0x1067192
runtime.checkptrArithmetic(0xc0000180b7, 0xc00003e770, 0x1, 0x1)
    /Users/tonybai/.bin/go1.14/src/runtime/checkptr.go:41 +0xb5 fp=0xc00003e750 sp=0xc00003e720 pc=0x1043055
main.main()
    /Users/tonybai/go/src/github.com/bigwhite/experiments/go1.14-examples/compiler_checkptr2.go:10 +0x8d fp=0xc00003e788 sp=0xc00003e750 pc=0x1096ced
runtime.main()
    /Users/tonybai/.bin/go1.14/src/runtime/proc.go:203 +0x212 fp=0xc00003e7e0 sp=0xc00003e788 pc=0x10697e2
runtime.goexit()
    /Users/tonybai/.bin/go1.14/src/runtime/asm_amd64.s:1373 +0x1 fp=0xc00003e7e8 sp=0xc00003e7e0 pc=0x1092581
exit status 2

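checkptr detected that the resulting unsafe.Pointer is outside the bounds of the original heap object b, and reported an error.

That said, the Go standard library does not yet fully pass checkptr, because some library code clearly violates the unsafe.Pointer rules.

Go 1.13 introduced a new escape analysis; in Go 1.14 we can use -m=2 to see a detailed log of the analysis, for example: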

$go run  -gcflags '-m=2' compiler_checkptr2.go
# command-line-arguments
./compiler_checkptr2.go:7:6: can inline main as: func() { var n int; n = 5; b := make([]byte, n); end := unsafe.Pointer(uintptr(unsafe.Pointer(&b[0])) + uintptr(n + 100)); _ = end }
./compiler_checkptr2.go:9:11: make([]byte, n) escapes to heap:
./compiler_checkptr2.go:9:11:   flow: {heap} = &{storage for make([]byte, n)}:
./compiler_checkptr2.go:9:11:     from make([]byte, n) (non-constant size) at ./compiler_checkptr2.go:9:11
./compiler_checkptr2.go:9:11: make([]byte, n) escapes to heap

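V. Standard library

In every Go release the standard library sees the most changes. Here I pick one that may affect how we write unit tests: the T and B types in the testing package both gained a Cleanup method. Let's look at what Cleanup does through code: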

// go1.14-examples/testing_cleanup_test.go
package main

import "testing"

func TestCase1(t *testing.T) {

    t.Run("A=1", func(t *testing.T) {
        t.Logf("subtest1 in testcase1")

    })
    t.Run("A=2", func(t *testing.T) {
        t.Logf("subtest2 in testcase1")
    })
    t.Cleanup(func() {
        t.Logf("cleanup1 in testcase1")
    })
    t.Cleanup(func() {
        t.Logf("cleanup2 in testcase1")
    })
}

func TestCase2(t *testing.T) {
    t.Cleanup(func() {
        t.Logf("cleanup1 in testcase2")
    })
    t.Cleanup(func() {
        t.Logf("cleanup2 in testcase2")
    })
}

Running the tests:

$go test -v testing_cleanup_test.go
=== RUN   TestCase1
=== RUN   TestCase1/A=1
    TestCase1/A=1: testing_cleanup_test.go:8: subtest1 in testcase1
=== RUN   TestCase1/A=2
    TestCase1/A=2: testing_cleanup_test.go:12: subtest2 in testcase1
    TestCase1: testing_cleanup_test.go:18: cleanup2 in testcase1
    TestCase1: testing_cleanup_test.go:15: cleanup1 in testcase1
--- PASS: TestCase1 (0.00s)
    --- PASS: TestCase1/A=1 (0.00s)
    --- PASS: TestCase1/A=2 (0.00s)
=== RUN   TestCase2
    TestCase2: testing_cleanup_test.go:27: cleanup2 in testcase2
    TestCase2: testing_cleanup_test.go:24: cleanup1 in testcase2
--- PASS: TestCase2 (0.00s)
PASS
ok      command-line-arguments    0.005s

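We can see that:

Cleanup functions run after the test and all of its subtests have completed.
Like defer, Cleanup functions run in last-in, first-out order: the one registered first runs last (see cleanup1 and cleanup2 in each case above).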
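The ordering rule can be modelled outside the testing package with a tiny stack of functions. This is a toy model of the last-in, first-out behaviour only; cleanupStack and Order are hypothetical names, and nothing here reflects how testing.T is actually implemented.

```go
package main

import "fmt"

// cleanupStack is a toy model of the ordering rule only.
type cleanupStack struct {
	funcs []func()
}

// Cleanup registers f to run later, like t.Cleanup does for a test.
func (c *cleanupStack) Cleanup(f func()) {
	c.funcs = append(c.funcs, f)
}

// runAll runs the registered functions last-in, first-out.
func (c *cleanupStack) runAll() {
	for i := len(c.funcs) - 1; i >= 0; i-- {
		c.funcs[i]()
	}
	c.funcs = nil
}

// Order registers n numbered functions and returns the order they ran in.
func Order(n int) []int {
	var ran []int
	var c cleanupStack
	for i := 1; i <= n; i++ {
		i := i // capture the current value for the closure
		c.Cleanup(func() { ran = append(ran, i) })
	}
	c.runAll()
	return ran
}

func main() {
	fmt.Println(Order(2))
}
```

With two registrations the second one runs first, which matches the cleanup2-before-cleanup1 lines in the test output above.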

Before Cleanup existed, we often did something like this:

// go1.14-examples/old_testing_cleanup_test.go
package main

import "testing"

func setup(t *testing.T) func() {
    t.Logf("setup before test")
    return func() {
        t.Logf("teardown/cleanup after test")
    }
}

func TestCase1(t *testing.T) {
    f := setup(t)
    defer f()
    t.Logf("test the testcase")
}

Running the test:

$go test -v old_testing_cleanup_test.go
=== RUN   TestCase1
    TestCase1: old_testing_cleanup_test.go:6: setup before test
    TestCase1: old_testing_cleanup_test.go:15: test the testcase
    TestCase1: old_testing_cleanup_test.go:8: teardown/cleanup after test
--- PASS: TestCase1 (0.00s)
PASS
ok      command-line-arguments    0.005s

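With Cleanup, we no longer need to write a separate setup function that returns a cleanup function as above.

Go 1.14 also upgrades Unicode support from Unicode 11 to Unicode 12, adding 554 new characters.

VI. Miscellaneous

Outstanding portability is one of Go's best-known traits, and Go has always been a pioneer in supporting new platforms. Go 1.14 adds experimental support for 64-bit RISC-V on linux (GOOS=linux, GOARCH=riscv64).

Rust already provides basic tool-level support for fuzz testing through cargo-fuzz. Go 1.14 has made progress here too, and Go is working towards making fuzz testing a first-class citizen of go test.

VII. Summary

The detailed Go 1.14 release notes can be found here. The set of issues for the whole release milestone is here.

Note that Go 1.14 can currently crash on certain Linux kernel versions; the root cause is a known bug in those kernels. This issue has the details. The affected Linux kernel versions are 5.2.x, 5.3.0-5.3.14, and 5.4.0-5.4.1.
The code for this post can be downloaded here.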
