DiDi Cloud A100 40GB + TensorFlow 1.15.2 + Ubuntu 18.04 Performance Test

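Author: Tony  Category: AI  Published: 2020-09-16 23:33
Today I got access to a closed-beta A100 instance on DiDi Cloud (promo code: 8888) and ran the TensorFlow benchmarks on it. The results are recorded below.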

Runtime environment

Platform: DiDi Cloud
OS: Ubuntu 18.04
GPU: A100-SXM4-40GB
Python version: 3.6
TensorFlow version: 1.15.2 (NVIDIA build)

 

Test method

All tests use the tf_cnn_benchmarks.py script from the TensorFlow benchmarks repository:
https://github.com/tensorflow/benchmarks
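
Every run below uses the same command pattern and ends with a "total images/sec" line, so the commands and the final figure are easy to script. The following is a minimal sketch, not part of the original test setup: the helper names build_command and parse_total_images_per_sec are hypothetical, and only the flags that appear in this post are used.

```python
import re

# Hypothetical helper: assembles the command line used throughout this post.
def build_command(model, batch_size, use_fp16=False, num_gpus=1):
    """Return the tf_cnn_benchmarks.py command as a list of arguments."""
    cmd = [
        "python", "tf_cnn_benchmarks.py",
        f"--num_gpus={num_gpus}",
        f"--batch_size={batch_size}",
        f"--model={model}",
    ]
    if use_fp16:
        cmd.append("--use_fp16")
    return cmd

# Matches the summary line printed at the end of each run.
TOTAL_RE = re.compile(r"^total images/sec:\s*([0-9.]+)\s*$", re.MULTILINE)

# Hypothetical helper: pulls the final throughput figure out of the log text.
def parse_total_images_per_sec(output):
    """Extract the 'total images/sec' value from benchmark output."""
    match = TOTAL_RE.search(output)
    if match is None:
        raise ValueError("no 'total images/sec' line found")
    return float(match.group(1))

# Tail of the ResNet50 v1.5 FP32 log recorded in this post.
sample_output = """\
100     images/sec: 606.7 +/- 0.4 (jitter = 2.8)        7.644
----------------------------------------------------------------
total images/sec: 606.23
----------------------------------------------------------------
"""

command = build_command("resnet50_v1.5", batch_size=64)
print(" ".join(command))
print(parse_total_images_per_sec(sample_output))
```

To launch a run for real, the list returned by build_command could be passed to subprocess.run from the directory that contains tf_cnn_benchmarks.py, with stdout captured and handed to the parser.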

 

ResNet50 v1.5 BS64

python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=64 --model=resnet50_v1.5
Step    Img/sec total_loss
1       images/sec: 602.4 +/- 0.0 (jitter = 0.0)        7.847
10      images/sec: 606.8 +/- 1.2 (jitter = 5.4)        8.053
20      images/sec: 606.3 +/- 0.8 (jitter = 4.4)        8.102
30      images/sec: 605.8 +/- 0.8 (jitter = 3.8)        8.117
40      images/sec: 606.2 +/- 0.7 (jitter = 3.8)        7.893
50      images/sec: 606.1 +/- 0.5 (jitter = 3.0)        7.919
60      images/sec: 606.2 +/- 0.5 (jitter = 2.9)        8.104
70      images/sec: 606.6 +/- 0.5 (jitter = 2.9)        7.985
80      images/sec: 606.6 +/- 0.4 (jitter = 2.8)        7.805
90      images/sec: 606.6 +/- 0.4 (jitter = 2.8)        7.973
100     images/sec: 606.7 +/- 0.4 (jitter = 2.8)        7.644
----------------------------------------------------------------
total images/sec: 606.23
----------------------------------------------------------------

--use_fp16

python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=64 --model=resnet50_v1.5 --use_fp16
Step    Img/sec total_loss
1       images/sec: 1327.1 +/- 0.0 (jitter = 0.0)       7.972
10      images/sec: 1321.2 +/- 5.7 (jitter = 27.6)      7.885
20      images/sec: 1323.5 +/- 4.4 (jitter = 25.9)      8.073
30      images/sec: 1323.6 +/- 3.7 (jitter = 27.3)      7.934
40      images/sec: 1322.1 +/- 3.3 (jitter = 32.9)      8.102
50      images/sec: 1321.4 +/- 3.0 (jitter = 27.7)      7.876
60      images/sec: 1322.2 +/- 2.8 (jitter = 32.3)      7.883
70      images/sec: 1322.3 +/- 2.5 (jitter = 32.6)      7.962
80      images/sec: 1324.0 +/- 2.4 (jitter = 32.2)      8.049
90      images/sec: 1324.2 +/- 2.2 (jitter = 31.2)      7.909
100     images/sec: 1325.1 +/- 2.1 (jitter = 29.6)      7.874
----------------------------------------------------------------
total images/sec: 1322.76
----------------------------------------------------------------

 

ResNet50 BS64

python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=64 --model=resnet50
Step    Img/sec total_loss
1       images/sec: 653.5 +/- 0.0 (jitter = 0.0)        8.219
10      images/sec: 646.2 +/- 2.0 (jitter = 6.0)        7.879
20      images/sec: 646.1 +/- 1.4 (jitter = 7.2)        7.909
30      images/sec: 646.0 +/- 1.2 (jitter = 6.0)        7.820
40      images/sec: 646.2 +/- 1.0 (jitter = 6.3)        8.006
50      images/sec: 646.0 +/- 1.0 (jitter = 8.6)        7.769
60      images/sec: 646.0 +/- 0.9 (jitter = 8.6)        8.114
70      images/sec: 645.7 +/- 0.9 (jitter = 9.5)        7.811
80      images/sec: 645.8 +/- 0.8 (jitter = 9.5)        7.979
90      images/sec: 645.8 +/- 0.8 (jitter = 8.0)        8.095
100     images/sec: 645.8 +/- 0.7 (jitter = 6.4)        8.038
----------------------------------------------------------------
total images/sec: 645.26
----------------------------------------------------------------

--use_fp16

python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=64 --model=resnet50 --use_fp16
Step    Img/sec total_loss
1       images/sec: 1300.1 +/- 0.0 (jitter = 0.0)       8.101
10      images/sec: 1310.1 +/- 7.5 (jitter = 7.4)       7.758
20      images/sec: 1309.7 +/- 8.0 (jitter = 42.3)      7.912
30      images/sec: 1315.0 +/- 5.9 (jitter = 32.1)      7.776
40      images/sec: 1315.5 +/- 4.7 (jitter = 28.2)      7.918
50      images/sec: 1317.5 +/- 3.9 (jitter = 27.7)      7.895
60      images/sec: 1316.5 +/- 3.4 (jitter = 18.6)      7.711
70      images/sec: 1317.3 +/- 3.1 (jitter = 16.1)      8.008
80      images/sec: 1316.9 +/- 2.8 (jitter = 11.4)      7.777
90      images/sec: 1317.7 +/- 2.6 (jitter = 11.8)      7.808
100     images/sec: 1317.1 +/- 2.4 (jitter = 9.9)       8.036
----------------------------------------------------------------
total images/sec: 1315.11
----------------------------------------------------------------

 

AlexNet BS512

python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=512 --model=alexnet
Step    Img/sec total_loss
1       images/sec: 8294.2 +/- 0.0 (jitter = 0.0)       nan
10      images/sec: 8290.2 +/- 1.6 (jitter = 5.3)       nan
20      images/sec: 8290.6 +/- 1.0 (jitter = 3.7)       nan
30      images/sec: 8290.8 +/- 0.7 (jitter = 2.8)       nan
40      images/sec: 8291.3 +/- 0.6 (jitter = 2.7)       nan
50      images/sec: 8289.8 +/- 1.4 (jitter = 2.9)       nan
60      images/sec: 8290.2 +/- 1.2 (jitter = 2.9)       nan
70      images/sec: 8290.4 +/- 1.3 (jitter = 3.6)       nan
80      images/sec: 8291.1 +/- 1.1 (jitter = 3.5)       nan
90      images/sec: 8291.9 +/- 1.0 (jitter = 4.4)       nan
100     images/sec: 8291.9 +/- 1.1 (jitter = 5.2)       nan
----------------------------------------------------------------
total images/sec: 8282.46
----------------------------------------------------------------
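
In the FP32 AlexNet log above, the total_loss column reads nan at every step, while the FP16 run that follows reports finite values. The post does not record the cause, so the FP32 throughput figure should be read with that in mind. When post-processing logs like these, a small check can flag such runs. The sketch below is an illustration only; parse_steps and has_non_finite_loss are hypothetical names, and the regular expression assumes the per-step line layout shown in this post.

```python
import math
import re

# Matches per-step lines such as:
# "10      images/sec: 8290.2 +/- 1.6 (jitter = 5.3)       nan"
STEP_RE = re.compile(
    r"^(\d+)\s+images/sec:\s*([0-9.]+)\s*\+/-\s*([0-9.]+)\s*"
    r"\(jitter = ([0-9.]+)\)\s+(\S+)\s*$"
)

def parse_steps(output):
    """Return (step, images_per_sec, total_loss) tuples from benchmark output."""
    rows = []
    for line in output.splitlines():
        match = STEP_RE.match(line.strip())
        if match:
            step, rate, _uncertainty, _jitter, loss = match.groups()
            # float("nan") parses to a NaN value, which isfinite() rejects.
            rows.append((int(step), float(rate), float(loss)))
    return rows

def has_non_finite_loss(rows):
    """True if any parsed step reports a NaN or infinite loss."""
    return any(not math.isfinite(loss) for _, _, loss in rows)

# First two steps of the AlexNet FP32 and FP16 logs from this post.
fp32_lines = (
    "1       images/sec: 8294.2 +/- 0.0 (jitter = 0.0)       nan\n"
    "10      images/sec: 8290.2 +/- 1.6 (jitter = 5.3)       nan"
)
fp16_lines = (
    "1       images/sec: 10618.6 +/- 0.0 (jitter = 0.0)      7.250\n"
    "10      images/sec: 10607.7 +/- 4.4 (jitter = 16.3)     7.251"
)

print(has_non_finite_loss(parse_steps(fp32_lines)))  # True
print(has_non_finite_loss(parse_steps(fp16_lines)))  # False
```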

--use_fp16

python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=512 --model=alexnet --use_fp16
Step    Img/sec total_loss
1       images/sec: 10618.6 +/- 0.0 (jitter = 0.0)      7.250
10      images/sec: 10607.7 +/- 4.4 (jitter = 16.3)     7.251
20      images/sec: 10602.5 +/- 3.0 (jitter = 13.1)     7.251
30      images/sec: 10604.1 +/- 2.3 (jitter = 11.2)     7.251
40      images/sec: 10601.0 +/- 2.5 (jitter = 13.4)     7.251
50      images/sec: 10601.7 +/- 2.5 (jitter = 13.8)     7.251
60      images/sec: 10603.0 +/- 2.2 (jitter = 14.0)     7.250
70      images/sec: 10605.1 +/- 2.1 (jitter = 12.5)     7.251
80      images/sec: 10605.4 +/- 1.9 (jitter = 12.2)     7.251
90      images/sec: 10605.4 +/- 1.7 (jitter = 12.1)     7.251
100     images/sec: 10605.8 +/- 1.7 (jitter = 12.3)     7.251
----------------------------------------------------------------
total images/sec: 10587.67
----------------------------------------------------------------

 

Inception v3 BS64

python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=64 --model=inception3
Step    Img/sec total_loss
1       images/sec: 436.8 +/- 0.0 (jitter = 0.0)        7.276
10      images/sec: 437.9 +/- 1.2 (jitter = 0.8)        7.337
20      images/sec: 437.8 +/- 1.0 (jitter = 2.2)        7.269
30      images/sec: 437.9 +/- 0.8 (jitter = 2.2)        7.422
40      images/sec: 437.9 +/- 0.6 (jitter = 3.5)        7.299
50      images/sec: 438.6 +/- 0.6 (jitter = 4.1)        7.277
60      images/sec: 439.2 +/- 0.5 (jitter = 3.7)        7.363
70      images/sec: 439.5 +/- 0.5 (jitter = 4.8)        7.347
80      images/sec: 440.3 +/- 0.5 (jitter = 5.3)        7.410
90      images/sec: 440.3 +/- 0.5 (jitter = 5.2)        7.325
100     images/sec: 440.3 +/- 0.4 (jitter = 5.0)        7.346
----------------------------------------------------------------
total images/sec: 440.01
----------------------------------------------------------------

 

--use_fp16

python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=64 --model=inception3 --use_fp16
Step    Img/sec total_loss
1       images/sec: 901.5 +/- 0.0 (jitter = 0.0)        7.305
10      images/sec: 945.5 +/- 7.0 (jitter = 5.0)        7.354
20      images/sec: 945.6 +/- 4.9 (jitter = 7.1)        7.330
30      images/sec: 945.3 +/- 3.9 (jitter = 6.9)        7.382
40      images/sec: 946.3 +/- 3.2 (jitter = 7.3)        7.278
50      images/sec: 946.6 +/- 2.8 (jitter = 7.5)        7.373
60      images/sec: 946.3 +/- 2.5 (jitter = 7.6)        7.299
70      images/sec: 946.8 +/- 2.3 (jitter = 7.5)        7.323
80      images/sec: 946.5 +/- 2.1 (jitter = 7.6)        7.317
90      images/sec: 946.6 +/- 2.0 (jitter = 7.6)        7.357
100     images/sec: 947.2 +/- 1.8 (jitter = 7.3)        7.327
----------------------------------------------------------------
total images/sec: 946.03
----------------------------------------------------------------

 

VGG16 BS64

python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=64 --model=vgg16
Step    Img/sec total_loss
1       images/sec: 442.1 +/- 0.0 (jitter = 0.0)        7.321
10      images/sec: 442.4 +/- 0.1 (jitter = 0.4)        7.315
20      images/sec: 442.4 +/- 0.1 (jitter = 0.3)        7.269
30      images/sec: 442.4 +/- 0.0 (jitter = 0.2)        7.271
40      images/sec: 442.4 +/- 0.0 (jitter = 0.2)        7.282
50      images/sec: 442.4 +/- 0.0 (jitter = 0.2)        7.291
60      images/sec: 442.4 +/- 0.0 (jitter = 0.2)        7.250
70      images/sec: 442.4 +/- 0.1 (jitter = 0.2)        7.278
80      images/sec: 442.4 +/- 0.0 (jitter = 0.2)        7.274
90      images/sec: 442.4 +/- 0.0 (jitter = 0.2)        7.286
100     images/sec: 442.4 +/- 0.0 (jitter = 0.2)        7.283
----------------------------------------------------------------
total images/sec: 442.20
----------------------------------------------------------------

 

--use_fp16

python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=64 --model=vgg16 --use_fp16
Step    Img/sec total_loss
1       images/sec: 687.4 +/- 0.0 (jitter = 0.0)        7.279
10      images/sec: 688.2 +/- 0.2 (jitter = 0.5)        7.255
20      images/sec: 688.0 +/- 0.1 (jitter = 0.5)        7.283
30      images/sec: 688.0 +/- 0.1 (jitter = 0.7)        7.254
40      images/sec: 687.9 +/- 0.1 (jitter = 0.7)        7.283
50      images/sec: 687.8 +/- 0.1 (jitter = 0.7)        7.249
60      images/sec: 687.7 +/- 0.1 (jitter = 0.8)        7.294
70      images/sec: 687.6 +/- 0.1 (jitter = 0.9)        7.278
80      images/sec: 687.6 +/- 0.1 (jitter = 0.9)        7.268
90      images/sec: 687.7 +/- 0.1 (jitter = 0.9)        7.264
100     images/sec: 687.6 +/- 0.1 (jitter = 0.9)        7.268
----------------------------------------------------------------
total images/sec: 687.07
----------------------------------------------------------------

 

GoogLeNet BS128

python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=128 --model=googlenet
Step    Img/sec total_loss
1       images/sec: 1577.4 +/- 0.0 (jitter = 0.0)       7.104
10      images/sec: 1565.9 +/- 4.1 (jitter = 12.5)      7.105
20      images/sec: 1561.7 +/- 3.1 (jitter = 20.4)      7.094
30      images/sec: 1562.3 +/- 2.5 (jitter = 15.1)      7.087
40      images/sec: 1561.5 +/- 2.2 (jitter = 16.1)      7.067
50      images/sec: 1561.6 +/- 2.0 (jitter = 15.6)      7.091
60      images/sec: 1561.5 +/- 1.8 (jitter = 15.7)      7.049
70      images/sec: 1560.3 +/- 1.9 (jitter = 15.3)      7.074
80      images/sec: 1558.8 +/- 1.9 (jitter = 17.2)      7.077
90      images/sec: 1558.2 +/- 1.8 (jitter = 17.2)      7.079
100     images/sec: 1557.5 +/- 1.8 (jitter = 17.6)      7.066
----------------------------------------------------------------
total images/sec: 1556.06
----------------------------------------------------------------

 

--use_fp16

python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=128 --model=googlenet --use_fp16
Step    Img/sec total_loss
1       images/sec: 2690.1 +/- 0.0 (jitter = 0.0)       7.173
10      images/sec: 2675.3 +/- 13.9 (jitter = 35.5)     7.068
20      images/sec: 2682.4 +/- 9.9 (jitter = 55.4)      7.086
30      images/sec: 2686.6 +/- 8.3 (jitter = 36.6)      7.075
40      images/sec: 2687.8 +/- 6.9 (jitter = 30.6)      7.084
50      images/sec: 2686.7 +/- 6.0 (jitter = 36.4)      7.076
60      images/sec: 2687.5 +/- 5.4 (jitter = 36.4)      7.075
70      images/sec: 2681.0 +/- 6.8 (jitter = 41.6)      7.075
80      images/sec: 2683.2 +/- 6.1 (jitter = 34.0)      7.065
90      images/sec: 2684.1 +/- 5.6 (jitter = 35.6)      7.092
100     images/sec: 2683.9 +/- 5.2 (jitter = 36.1)      7.052
----------------------------------------------------------------
total images/sec: 2680.27
----------------------------------------------------------------

 

ResNet152 BS32

python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=32 --model=resnet152
Step    Img/sec total_loss
1       images/sec: 225.6 +/- 0.0 (jitter = 0.0)        9.060
10      images/sec: 228.3 +/- 1.0 (jitter = 2.0)        8.594
20      images/sec: 228.3 +/- 0.6 (jitter = 2.0)        8.635
30      images/sec: 228.2 +/- 0.5 (jitter = 2.5)        8.719
40      images/sec: 227.9 +/- 0.5 (jitter = 2.8)        8.599
50      images/sec: 228.1 +/- 0.5 (jitter = 2.9)        8.791
60      images/sec: 228.3 +/- 0.4 (jitter = 3.6)        8.668
70      images/sec: 228.3 +/- 0.4 (jitter = 3.3)        9.072
80      images/sec: 228.3 +/- 0.4 (jitter = 3.5)        8.874
90      images/sec: 228.4 +/- 0.3 (jitter = 3.7)        9.030
100     images/sec: 228.4 +/- 0.3 (jitter = 3.7)        8.839
----------------------------------------------------------------
total images/sec: 228.29
----------------------------------------------------------------

 

--use_fp16

python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=32 --model=resnet152 --use_fp16
Step    Img/sec total_loss
1       images/sec: 392.9 +/- 0.0 (jitter = 0.0)        9.147
10      images/sec: 397.9 +/- 2.8 (jitter = 6.0)        9.000
20      images/sec: 399.0 +/- 2.1 (jitter = 8.6)        8.842
30      images/sec: 393.7 +/- 2.9 (jitter = 14.7)       8.813
40      images/sec: 394.4 +/- 2.3 (jitter = 15.2)       8.984
50      images/sec: 394.9 +/- 2.0 (jitter = 13.9)       8.647
60      images/sec: 395.7 +/- 1.8 (jitter = 13.9)       8.838
70      images/sec: 396.5 +/- 1.6 (jitter = 15.3)       8.941
80      images/sec: 395.9 +/- 1.4 (jitter = 13.4)       8.913
90      images/sec: 396.2 +/- 1.3 (jitter = 14.1)       8.807
100     images/sec: 395.7 +/- 1.3 (jitter = 14.5)       8.729
----------------------------------------------------------------
total images/sec: 395.34
----------------------------------------------------------------
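
To put the runs above side by side, the FP16 speedup for each model can be computed from the "total images/sec" lines. The snippet below is a small sketch: the numbers are copied from the logs in this post, while the results dictionary and the fp16_speedup helper are illustrative names. Batch sizes differ between models, so the ratios are comparable per model but the absolute throughputs are not.

```python
# total images/sec copied from the logs above: (FP32, FP16)
results = {
    "resnet50_v1.5": (606.23, 1322.76),
    "resnet50": (645.26, 1315.11),
    "alexnet": (8282.46, 10587.67),
    "inception3": (440.01, 946.03),
    "vgg16": (442.20, 687.07),
    "googlenet": (1556.06, 2680.27),
    "resnet152": (228.29, 395.34),
}

def fp16_speedup(fp32, fp16):
    """Ratio of FP16 throughput to FP32 throughput."""
    return fp16 / fp32

speedups = {model: fp16_speedup(*pair) for model, pair in results.items()}

# Print a plain-text table, largest speedup first.
print(f"{'model':<14} {'fp32':>9} {'fp16':>9} {'speedup':>8}")
for model, ratio in sorted(speedups.items(), key=lambda item: item[1], reverse=True):
    fp32, fp16 = results[model]
    print(f"{model:<14} {fp32:>9.2f} {fp16:>9.2f} {ratio:>7.2f}x")
```

On these numbers the ratio ranges from about 1.28x (AlexNet) to about 2.18x (ResNet50 v1.5).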

 

Performance comparison

Comparison of A100, V100 and 2080 Ti performance:
https://www.tonyisstark.com/383.html
