Tianshu Large-Scale Distributed Training Evaluation Report

<html>
<body>
<!DOCTYPE html><h1 cid="n0" mdtype="heading" class="md-end-block md-heading" style="box-sizing: border-box; break-after: avoid-page; break-inside: avoid; orphans: 4; font-size: 2.5rem; margin: 2em 0px 1.5rem; font-family: &quot;Lucida Grande&quot;, Corbel, sans-serif; font-weight: normal; clear: both; overflow-wrap: break-word; padding: 0px; color: rgb(222, 222, 222); line-height: 2.75rem; letter-spacing: -1.5px; white-space: pre-wrap; position: relative; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">Tianshu Large-Scale Distributed Training Evaluation Report</h1><h2 cid="n2" mdtype="heading" class="md-end-block md-heading" style="box-sizing: border-box; break-after: avoid-page; break-inside: avoid; orphans: 4; font-size: 1.63rem; margin: 0px 0px 1.5rem; font-family: &quot;Lucida Grande&quot;, Corbel, sans-serif; font-weight: bold; clear: both; overflow-wrap: break-word; padding: 0px; color: rgb(222, 222, 222); line-height: 1.875rem; letter-spacing: -1px; white-space: pre-wrap; position: relative; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">1. 
Introduction</h2><p cid="n3" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">This report compares several deep learning frameworks on distributed training of classic deep learning models, measuring throughput, speedup, and hardware utilization (GPU, CPU, memory, disk, network, and so on). All tests use the same dataset, hardware environment, and algorithm, so that only the speed differences between frameworks are compared.<p cid="n4" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">Results (expected outcomes):<ul class="ul-list" cid="n5" mdtype="list" data-mark="-" style="box-sizing: border-box; margin-top: 0px; margin-bottom: 1.5rem; padding: 0px 0px 0px 1.875rem; list-style: square; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 
400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;"><li class="md-list-item" cid="n6" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n7" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;">Distributed performance: with 20 or more virtual machines or servers combined, the speedup reaches at least 80% of linear scaling, a clear advantage over existing industry frameworks;&nbsp;</li><li class="md-list-item" cid="n8" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n9" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;">Resource utilization: during large-scale distributed training, average hardware resource utilization across the typical tasks is no lower than 80%.&nbsp;</li></ul><h2 cid="n10" mdtype="heading" class="md-end-block md-heading" style="box-sizing: border-box; break-after: avoid-page; break-inside: avoid; orphans: 4; font-size: 1.63rem; margin: 0px 0px 1.5rem; font-family: &quot;Lucida Grande&quot;, Corbel, sans-serif; font-weight: bold; clear: both; overflow-wrap: break-word; padding: 0px; color: rgb(222, 222, 222); line-height: 1.875rem; letter-spacing: -1px; white-space: pre-wrap; position: relative; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">2. 
Background</h2><h3 cid="n11" mdtype="heading" class="md-end-block md-heading" style="box-sizing: border-box; break-after: avoid-page; break-inside: avoid; orphans: 4; font-size: 1.17rem; margin: 0px 0px 1.5rem; font-family: &quot;Lucida Grande&quot;, Corbel, sans-serif; font-weight: bold; clear: both; overflow-wrap: break-word; padding: 0px; color: rgb(222, 222, 222); line-height: 1.5rem; letter-spacing: -1px; white-space: pre-wrap; position: relative; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">2.1 Evaluation Platform</h3><p cid="n12" mdtype="paragraph" class="md-end-block md-p md-focus" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">This evaluation runs on the Zhijiang Tianshu (Dubhe) platform. The platform workflow is outlined below:<p cid="n13" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; 
letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">1) Platform address: <a href="zjlab.dubhe.club" style="box-sizing: border-box; cursor: pointer; text-decoration: underline; outline: 0px; transition: all 0.2s ease-in-out 0s; color: rgb(224, 224, 224); -webkit-user-drag: none;">zjlab.dubhe.club</a><ul class="ul-list" cid="n14" mdtype="list" data-mark="-" style="box-sizing: border-box; margin-top: 0px; margin-bottom: 1.5rem; padding: 0px 0px 0px 1.875rem; list-style: square; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;"><li class="md-list-item" cid="n15" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n16" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;">Test account (contact Yu Zailiang (俞再亮) for details)</li><li class="md-list-item" cid="n17" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n18" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;"><span md-inline="image" 
data-src="https://gitee.com/yayeoCddy/figures/raw/master/img/20210927151225.png" class="md-image md-img-loaded" style="box-sizing: border-box; min-width: 10px; min-height: 10px; position: relative; word-break: break-all; font-family: monospace; vertical-align: top; display: inline-block; width: 1110px;"><img referrerpolicy="no-referrer" src="https://gitee.com/yayeoCddy/figures/raw/master/img/20210927151225.png" onerror="onImageErrorFunc(event)" onload="onLoadedFuncForQuickAction(event)" style="box-sizing: border-box; border-width: 0px 4px 0px 2px; border-top-style: initial; border-right-style: solid; border-bottom-style: initial; border-left-style: solid; border-top-color: initial; border-right-color: transparent; border-bottom-color: initial; border-left-color: transparent; border-image: initial; vertical-align: middle; max-width: 100%; image-orientation: from-image; cursor: default; display: block; margin: auto;"></li></ul><p cid="n19" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">2) Select the total resources (expandable)<ul class="ul-list" cid="n20" mdtype="list" data-mark="-" style="box-sizing: border-box; margin-top: 0px; margin-bottom: 1.5rem; padding: 0px 0px 0px 1.875rem; list-style: square; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; 
font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;"><li class="md-list-item" cid="n21" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n22" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;">Currently supported: 1 node with 1 GPU -&gt; 4 nodes with 32 GPUs<p cid="n23" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0.5rem 0px; overflow-wrap: break-word; white-space: pre-wrap; position: relative;"><span md-inline="image" data-src="https://gitee.com/yayeoCddy/figures/raw/master/img/20210927151414.png" class="md-image md-img-loaded" style="box-sizing: border-box; min-width: 10px; min-height: 10px; position: relative; word-break: break-all; font-family: monospace; vertical-align: top; display: inline-block; width: 1110px;"><img referrerpolicy="no-referrer" src="https://gitee.com/yayeoCddy/figures/raw/master/img/20210927151414.png" onerror="onImageErrorFunc(event)" onload="onLoadedFuncForQuickAction(event)" style="box-sizing: border-box; border-width: 0px 4px 0px 2px; border-top-style: initial; border-right-style: solid; border-bottom-style: initial; border-left-style: solid; border-top-color: initial; border-right-color: transparent; border-bottom-color: initial; border-left-color: transparent; border-image: initial; vertical-align: middle; max-width: 100%; image-orientation: from-image; cursor: default; display: block; margin: auto;"></li><li class="md-list-item" cid="n24" mdtype="list_item" 
style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n25" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;">Per-node configuration (at most 8 GPUs per node)</li><li class="md-list-item" cid="n26" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><ul class="ul-list" cid="n27" mdtype="list" data-mark="-" style="box-sizing: border-box; margin: 0px; padding: 0px 0px 0px 1.875rem; list-style: square; position: relative;"><li class="md-list-item" cid="n28" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n29" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;">Tesla V100S-PCIE-32GB x 8</li><li class="md-list-item" cid="n30" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n31" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;">Intel(R) Xeon(R) Gold 6248R CPU @ 3.00GHz</li></ul></li><li class="md-list-item" cid="n32" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><ul class="ul-list" cid="n33" mdtype="list" data-mark="-" style="box-sizing: border-box; margin: 0px; padding: 0px 0px 0px 1.875rem; list-style: square; position: relative;"><li class="md-list-item" cid="n34" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n35" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: 
inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;">Memory 754G</li><li class="md-list-item" cid="n36" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n37" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;">Ubuntu 18.04.5 LTS (GNU/Linux 4.4.0-142-generic x86_64)</li></ul></li><li class="md-list-item" cid="n38" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><ul class="ul-list" cid="n39" mdtype="list" data-mark="-" style="box-sizing: border-box; margin: 0px; padding: 0px 0px 0px 1.875rem; list-style: square; position: relative;"><li class="md-list-item" cid="n40" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n41" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;">CUDA Version: 11.1, Driver Version: 460.73.01</li><li class="md-list-item" cid="n42" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n43" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;"><code style="box-sizing: border-box; font-family: Monaco, Consolas, &quot;Andale Mono&quot;, &quot;DejaVu Sans Mono&quot;, monospace; text-align: left; vertical-align: initial; font-size: 0.875em; background: rgba(0, 0, 0, 0.05); padding: 2px 5px;">nvidia-smi topo -m</code><p cid="n44" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; 
line-height: inherit; orphans: 4; margin: 0.5rem 0px; overflow-wrap: break-word; white-space: pre-wrap; position: relative;"><span md-inline="image" data-src="https://gitee.com/yayeoCddy/figures/raw/master/img/20210927151424.png" class="md-image md-img-loaded" style="box-sizing: border-box; min-width: 10px; min-height: 10px; position: relative; word-break: break-all; font-family: monospace; vertical-align: top; display: inline-block; width: 1080px;"><img referrerpolicy="no-referrer" src="https://gitee.com/yayeoCddy/figures/raw/master/img/20210927151424.png" onerror="onImageErrorFunc(event)" onload="onLoadedFuncForQuickAction(event)" style="box-sizing: border-box; border-width: 0px 4px 0px 2px; border-top-style: initial; border-right-style: solid; border-bottom-style: initial; border-left-style: solid; border-top-color: initial; border-right-color: transparent; border-bottom-color: initial; border-left-color: transparent; border-image: initial; vertical-align: middle; max-width: 100%; image-orientation: from-image; cursor: default; display: block; margin: auto;"></li></ul></li></ul><h3 cid="n45" mdtype="heading" class="md-end-block md-heading" style="box-sizing: border-box; break-after: avoid-page; break-inside: avoid; orphans: 4; font-size: 1.17rem; margin: 0px 0px 1.5rem; font-family: &quot;Lucida Grande&quot;, Corbel, sans-serif; font-weight: bold; clear: both; overflow-wrap: break-word; padding: 0px; color: rgb(222, 222, 222); line-height: 1.5rem; letter-spacing: -1px; white-space: pre-wrap; position: relative; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">2.2 Frameworks Evaluated</h3><p cid="n46" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; 
margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">This evaluation covers 4 frameworks:<ol class="ol-list" start="" cid="n47" mdtype="list" style="box-sizing: border-box; margin-top: 0px; margin-bottom: 1.5rem; padding: 0px 0px 0px 1.875rem; list-style: decimal; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;"><li class="md-list-item" cid="n48" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n49" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;"><a href="https://github.com/Oneflow-Inc/oneflow" style="box-sizing: border-box; cursor: pointer; text-decoration: underline; outline: 0px; transition: all 0.2s ease-in-out 0s; color: rgb(224, 224, 224); -webkit-user-drag: none;">OneFlow</a></li><li class="md-list-item" cid="n50" mdtype="list_item" style="box-sizing: border-box; 
margin: 0px; position: relative; display: list-item;"><p cid="n51" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;"><a href="https://github.com/tensorflow/tensorflow" style="box-sizing: border-box; cursor: pointer; text-decoration: underline; outline: 0px; transition: all 0.2s ease-in-out 0s; color: rgb(224, 224, 224); -webkit-user-drag: none;">TensorFlow</a> 1.x &amp; 2.x</li><li class="md-list-item" cid="n52" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n53" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;"><a href="https://github.com/pytorch/pytorch" style="box-sizing: border-box; cursor: pointer; text-decoration: underline; outline: 0px; transition: all 0.2s ease-in-out 0s; color: rgb(224, 224, 224); -webkit-user-drag: none;">PyTorch</a></li><li class="md-list-item" cid="n56" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n57" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;"><a href="https://github.com/PaddlePaddle/Paddle" style="box-sizing: border-box; cursor: pointer; text-decoration: underline; outline: 0px; transition: all 0.2s ease-in-out 0s; color: rgb(224, 224, 224); -webkit-user-drag: none;">PaddlePaddle</a></li></ol><p cid="n58" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 
198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">TensorFlow 1.x and PyTorch use NVIDIA's deeply optimized builds, and their performance tests are reproduced in the <a href="https://docs.nvidia.com/deeplearning/frameworks/support-matrix/index.html" style="box-sizing: border-box; cursor: pointer; text-decoration: underline; outline: 0px; transition: all 0.2s ease-in-out 0s; color: rgb(224, 224, 224); -webkit-user-drag: none;">NGC 20.03</a> image. Performance tests for the other frameworks are reproduced in the same physical environment.<p cid="n59" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">The training script for each framework is taken either from that framework's official model repository or from the NVIDIA <a href="https://github.com/NVIDIA/DeepLearningExamples" style="box-sizing: border-box; cursor: pointer; text-decoration: underline; outline: 0px; transition: all 0.2s ease-in-out 0s; color: rgb(224, 224, 224); -webkit-user-drag: none;">DeepLearningExamples</a> repository.<h3 cid="n60" mdtype="heading" class="md-end-block md-heading" style="box-sizing: border-box; break-after: avoid-page; break-inside: avoid; orphans: 
4; font-size: 1.17rem; margin: 0px 0px 1.5rem; font-family: &quot;Lucida Grande&quot;, Corbel, sans-serif; font-weight: bold; clear: both; overflow-wrap: break-word; padding: 0px; color: rgb(222, 222, 222); line-height: 1.5rem; letter-spacing: -1px; white-space: pre-wrap; position: relative; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">2.3 Models Evaluated</h3><p cid="n61" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">On the frameworks above, this evaluation uses two classic, mainstream deep learning models:<p cid="n62" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; 
text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">1) <a href="https://arxiv.org/abs/1512.03385" style="box-sizing: border-box; cursor: pointer; text-decoration: underline; outline: 0px; transition: all 0.2s ease-in-out 0s; color: rgb(224, 224, 224); -webkit-user-drag: none;">ResNet-50 v1.5</a> <p cid="n63" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">2) <a href="https://arxiv.org/abs/1810.04805" style="box-sizing: border-box; cursor: pointer; text-decoration: underline; outline: 0px; transition: all 0.2s ease-in-out 0s; color: rgb(224, 224, 224); -webkit-user-drag: none;">BERT-Base</a> <p cid="n64" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; 
text-decoration-color: initial;">ResNet-50 is the most widely used deep learning model in computer vision, and BERT is the mainstream pre-training model in natural language processing (NLP).<p cid="n65" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">In addition, to verify the usability and scalability of the OneFlow framework, classic deep learning models for face recognition, large-scale pre-training, and click-through-rate prediction were tested separately on OneFlow:<p cid="n66" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">1) <a href="https://insightface.ai/" style="box-sizing: border-box; cursor: pointer; text-decoration: underline; outline: 0px; transition: all 0.2s ease-in-out 0s; color: rgb(224, 224, 224); -webkit-user-drag: none;">InsightFace</a><p cid="n67" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; 
margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">2) <a href="https://arxiv.org/abs/1606.07792" style="box-sizing: border-box; cursor: pointer; text-decoration: underline; outline: 0px; transition: all 0.2s ease-in-out 0s; color: rgb(224, 224, 224); -webkit-user-drag: none;">Wide &amp; Deep</a><p cid="n68" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">3) <a href="https://arxiv.org/abs/2005.14165" style="box-sizing: border-box; cursor: pointer; text-decoration: underline; outline: 0px; transition: all 0.2s ease-in-out 0s; color: rgb(224, 224, 224); -webkit-user-drag: none;">GPT2</a><h3 cid="n69" mdtype="heading" class="md-end-block md-heading" style="box-sizing: border-box; break-after: avoid-page; break-inside: avoid; orphans: 4; font-size: 1.17rem; margin: 0px 0px 1.5rem; 
font-family: &quot;Lucida Grande&quot;, Corbel, sans-serif; font-weight: bold; clear: both; overflow-wrap: break-word; padding: 0px; color: rgb(222, 222, 222); line-height: 1.5rem; letter-spacing: -1px; white-space: pre-wrap; position: relative; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">2.4 Evaluation Environment</h3><p cid="n70" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">To measure the performance of the frameworks themselves and keep the comparison fair, all tests in this evaluation were run on the same physical cluster with the same software environment.<p cid="n71" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; 
text-decoration-style: initial; text-decoration-color: initial;">The test environment has 1000 V100 GPUs in total. The hardware and software configuration is described below (to be filled in according to the actual experimental equipment, including model, size, speed, version, and so on):<ul class="ul-list" cid="n72" mdtype="list" data-mark="-" style="box-sizing: border-box; margin-top: 0px; margin-bottom: 1.5rem; padding: 0px 0px 0px 1.875rem; list-style: square; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;"><li class="md-list-item" cid="n73" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n74" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;">GPU specifications</li><li class="md-list-item" cid="n75" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n76" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;">Communication hardware</li><li class="md-list-item" cid="n77" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n78" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;">CPU specifications</li><li class="md-list-item" cid="n79" 
mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n80" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;">Memory size</li><li class="md-list-item" cid="n81" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n82" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;">OS version</li><li class="md-list-item" cid="n83" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n84" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;">CUDA version</li><li class="md-list-item" cid="n85" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n86" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;"><code style="box-sizing: border-box; font-family: Monaco, Consolas, &quot;Andale Mono&quot;, &quot;DejaVu Sans Mono&quot;, monospace; text-align: left; vertical-align: initial; font-size: 0.875em; background: rgba(0, 0, 0, 0.05); padding: 2px 5px;">nvidia-smi topo -m</code></li></ul><h3 cid="n87" mdtype="heading" class="md-end-block md-heading" style="box-sizing: border-box; break-after: avoid-page; break-inside: avoid; orphans: 4; font-size: 1.17rem; margin: 0px 0px 1.5rem; font-family: &quot;Lucida Grande&quot;, Corbel, sans-serif; font-weight: bold; clear: both; 
overflow-wrap: break-word; padding: 0px; color: rgb(222, 222, 222); line-height: 1.5rem; letter-spacing: -1px; white-space: pre-wrap; position: relative; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">2.5 Evaluation Configuration</h3><p cid="n88" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">For every model on every framework, we measured throughput in a distributed setting across different batch sizes, with and without XLA acceleration, and with and without automatic mixed precision training. The relevant concepts are introduced briefly below:<h4 cid="n89" mdtype="heading" class="md-end-block md-heading" style="box-sizing: border-box; break-after: avoid-page; break-inside: avoid; orphans: 4; font-size: 1.12rem; margin: 0px 0px 1.5rem; font-family: &quot;Lucida Grande&quot;, Corbel, sans-serif; font-weight: normal; clear: both; overflow-wrap: break-word; padding: 0px; color: white; line-height: 1.375rem; white-space: pre-wrap; position: relative; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: 
initial;">2.5.1 Batch Size</h4><p cid="n90" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">In this report, batch size means the number of samples on each device (GPU/card) during training, abbreviated bsz (batch size per GPU). Global batch size (global bsz) means the number of samples across all devices (GPUs) during training.<p cid="n91" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">Because deep learning frameworks differ in their GPU memory management strategies and in how far they optimize memory use, the maximum batch size each framework supports for the same model on GPUs with the same amount of memory differs. In general, a larger batch size gives a better measured performance result.<h4 cid="n92" mdtype="heading" class="md-end-block md-heading" style="box-sizing: border-box; break-after: avoid-page; break-inside: avoid; orphans: 4; font-size: 1.12rem; margin: 0px 0px 1.5rem; font-family: &quot;Lucida Grande&quot;, Corbel, sans-serif; font-weight: normal; 
clear: both; overflow-wrap: break-word; padding: 0px; color: white; line-height: 1.375rem; white-space: pre-wrap; position: relative; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">2.5.2 XLA</h4><p cid="n93" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;"><a href="https://www.tensorflow.org/xla" style="box-sizing: border-box; cursor: pointer; text-decoration: underline; outline: 0px; transition: all 0.2s ease-in-out 0s; color: rgb(224, 224, 224); -webkit-user-drag: none;">XLA</a> (Accelerated Linear Algebra) is a deep learning compiler that can accelerate linear algebra computation without changes to the source code. For frameworks that support XLA, we also test performance with XLA both enabled and disabled.<h4 cid="n94" mdtype="heading" class="md-end-block md-heading" style="box-sizing: border-box; break-after: avoid-page; break-inside: avoid; orphans: 4; font-size: 1.12rem; margin: 0px 0px 1.5rem; font-family: &quot;Lucida Grande&quot;, Corbel, sans-serif; font-weight: normal; clear: both; overflow-wrap: break-word; padding: 0px; color: white; line-height: 1.375rem; white-space: pre-wrap; position: relative; font-style: normal; font-variant-ligatures: normal; font-variant-caps: 
normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">2.5.3 AMP (Automatic Mixed Precision)</h4><p cid="n95" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">AMP (Automatic Mixed Precision) speeds up training on GPUs. Compared with Float32 precision, AMP can deliver a speedup of roughly 3x on some GPUs. For frameworks that support AMP, we test performance with AMP both enabled and disabled.<h3 cid="n96" mdtype="heading" class="md-end-block md-heading" style="box-sizing: border-box; break-after: avoid-page; break-inside: avoid; orphans: 4; font-size: 1.17rem; margin: 0px 0px 1.5rem; font-family: &quot;Lucida Grande&quot;, Corbel, sans-serif; font-weight: bold; clear: both; overflow-wrap: break-word; padding: 0px; color: rgb(222, 222, 222); line-height: 1.5rem; letter-spacing: -1px; white-space: pre-wrap; position: relative; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">2.6 Evaluation Rules</h3><p cid="n97" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; 
margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">Following the evaluation configuration described in Section 2.5, for each test case (one model on one framework) we sweep the following parameters:<p cid="n98" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">1) Number of machines (1, 2, 4, 8, 16, 32, 64, 125) and number of GPUs (1, 8, 16, 32, 64, 128, 256, 512, 1000)<p cid="n99" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; 
widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">2) Batch size per device<p cid="n100" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">3) Whether XLA is enabled<p cid="n101" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">4) Whether AMP is enabled<p cid="n102" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, 
sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">Notes:<ul class="ul-list" cid="n103" mdtype="list" data-mark="-" style="box-sizing: border-box; margin-top: 0px; margin-bottom: 1.5rem; padding: 0px 0px 0px 1.875rem; list-style: square; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;"><li class="md-list-item" cid="n104" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n105" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;">125 and 1000 are, respectively, the maximum number of machines and the maximum number of GPUs in this evaluation.</li><li class="md-list-item" cid="n106" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n107" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;">For every performance test of every framework, we tested at least 1 machine with 1 GPU, 1 machine with 8 GPUs, 2 machines with 16 GPUs, 4 machines with 32 GPUs, and so on up to 64 machines with 512 GPUs, in order to evaluate how well each framework scales out in distributed training.</li><li class="md-list-item" cid="n108" mdtype="list_item" 
style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n109" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;">Each test is repeated several times (5 to 7 runs), and the median of these runs is taken as the reported result. Taking the median screens out random fluctuations as far as possible, so that the reported result is close to the true value.</li></ul><h3 cid="n110" mdtype="heading" class="md-end-block md-heading" style="box-sizing: border-box; break-after: avoid-page; break-inside: avoid; orphans: 4; font-size: 1.17rem; margin: 0px 0px 1.5rem; font-family: &quot;Lucida Grande&quot;, Corbel, sans-serif; font-weight: bold; clear: both; overflow-wrap: break-word; padding: 0px; color: rgb(222, 222, 222); line-height: 1.5rem; letter-spacing: -1px; white-space: pre-wrap; position: relative; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">2.7 Evaluation Metrics</h3><p cid="n111" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">We use throughput, speedup, and hardware utilization (e.g., GPU, CPU, memory, disk, and network) as the evaluation metrics.<p cid="n112" mdtype="paragraph" class="md-end-block md-p" 
style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">Throughput reflects a framework's processing speed: the higher the throughput, the less time it takes to train a deep learning model, and the better the framework performs. Speedup reflects how well a framework scales across multiple machines and GPUs: the higher the speedup, the greater the benefit of each additional hardware device, and the better the framework's multi-machine scalability. Hardware utilization reflects how efficiently a framework uses resources: the higher the value, the better the framework performs.<h4 cid="n113" mdtype="heading" class="md-end-block md-heading" style="box-sizing: border-box; break-after: avoid-page; break-inside: avoid; orphans: 4; font-size: 1.12rem; margin: 0px 0px 1.5rem; font-family: &quot;Lucida Grande&quot;, Corbel, sans-serif; font-weight: normal; clear: both; overflow-wrap: break-word; padding: 0px; color: white; line-height: 1.375rem; white-space: pre-wrap; position: relative; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">2.7.1 Throughput</h4><p cid="n114" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: 
normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">Throughput is the number of samples a framework processes per second during training. For image classification it is the number of images processed per second (images/sec); for natural language processing it is the number of sentences processed per second (sentences/sec).<p cid="n115" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">To obtain a continuous, stable throughput, we discard the first few steps of training. In practice we usually discard the first 20 steps and compute throughput as the mean over the following 100 steps. (Some frameworks log only at multiples of 100 steps for some models; in that case we discard the first 100 steps and average over the following few hundred steps.)<h4 cid="n116" mdtype="heading" class="md-end-block md-heading" style="box-sizing: border-box; break-after: avoid-page; break-inside: avoid; orphans: 4; font-size: 1.12rem; margin: 0px 0px 1.5rem; font-family: &quot;Lucida Grande&quot;, Corbel, sans-serif; font-weight: normal; clear: both; overflow-wrap: break-word; padding: 0px; color: white; line-height: 1.375rem; white-space: pre-wrap; position: relative; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: 
initial;">2.7.2 Speedup</h4><p cid="n117" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">Speedup measures a framework's ability to scale out in a distributed training environment. It is the ratio of the framework's throughput under a given distributed configuration (e.g., n machines with m devices in total) to the same framework's throughput on a single machine with a single GPU under the same settings (same bsz per GPU, same parameters). Ideally the speedup is m (m&gt;1); in practice, because of communication overhead, each framework can only get as close to m as possible rather than reach or exceed it.<h4 cid="n118" mdtype="heading" class="md-end-block md-heading" style="box-sizing: border-box; break-after: avoid-page; break-inside: avoid; orphans: 4; font-size: 1.12rem; margin: 0px 0px 1.5rem; font-family: &quot;Lucida Grande&quot;, Corbel, sans-serif; font-weight: normal; clear: both; overflow-wrap: break-word; padding: 0px; color: white; line-height: 1.375rem; white-space: pre-wrap; position: relative; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">2.7.3 Hardware Utilization</h4><p cid="n119" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI 
Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">Hardware utilization, in particular the utilization of GPU, CPU, memory, disk, and network, shows how efficiently a framework uses the available resources. In practice, we take the average hardware utilization over a window of steps (the window is chosen as described in Section 2.7.1). The higher the value, the more efficient the framework and the better its resource scheduling.<h2 cid="n120" mdtype="heading" class="md-end-block md-heading" style="box-sizing: border-box; break-after: avoid-page; break-inside: avoid; orphans: 4; font-size: 1.63rem; margin: 0px 0px 1.5rem; font-family: &quot;Lucida Grande&quot;, Corbel, sans-serif; font-weight: bold; clear: both; overflow-wrap: break-word; padding: 0px; color: rgb(222, 222, 222); line-height: 1.875rem; letter-spacing: -1px; white-space: pre-wrap; position: relative; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">3. 
ResNet-50 v1.5 Performance Tests</h2><h3 cid="n121" mdtype="heading" class="md-end-block md-heading" style="box-sizing: border-box; break-after: avoid-page; break-inside: avoid; orphans: 4; font-size: 1.17rem; margin: 0px 0px 1.5rem; font-family: &quot;Lucida Grande&quot;, Corbel, sans-serif; font-weight: bold; clear: both; overflow-wrap: break-word; padding: 0px; color: rgb(222, 222, 222); line-height: 1.5rem; letter-spacing: -1px; white-space: pre-wrap; position: relative; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">3.1 Frameworks and Model Libraries Evaluated</h3><p cid="n122" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin-top: 0px; margin-bottom: 1.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">The frameworks, versions, model libraries, and extra features covered by this evaluation are listed in Table 3-1 (the versions in the table still need to be confirmed):<div spellcheck="false" class="md-htmlblock md-rawblock md-end-block" cid="n123" mdtype="html_block" style="box-sizing: border-box; margin-top: 1rem; margin-bottom: 1rem; white-space: normal; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; 
font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;"><div class="md-htmlblock-container md-rawblock-container" tabindex="-1" style="box-sizing: border-box; min-height: 20px; cursor: default;">Table 3-1 Frameworks in the ResNet-50 v1.5 performance evaluation</div></div><figure class="md-table-fig" cid="n124" mdtype="table" style="box-sizing: border-box; margin: 1.2em 0px; overflow-x: auto; max-width: calc(100% + 16px); padding: 0px; cursor: default; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">

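<table>
<thead>
<tr><th>Framework</th><th>Version</th><th>Docker Image</th><th>DNN Model Source</th><th>Features</th></tr>
</thead>
<tbody>
<tr><td>OneFlow</td><td>0.*.0</td><td>-</td><td>OneFlow-Benchmark</td><td>-</td></tr>
<tr><td>NGC MXNet</td><td>1.6.0</td><td>nvcr.io/nvidia/mxnet:20.03-py3</td><td>DeepLearningExamples/MxNet</td><td>DALI+Horovod</td></tr>
<tr><td>NGC TensorFlow 1.x</td><td>1.15.2</td><td>nvcr.io/nvidia/tensorflow:20.03-tf1-py3</td><td>DeepLearningExamples/TensorFlow</td><td>DALI+Horovod+XLA</td></tr>
<tr><td>NGC PyTorch</td><td>1.5.0a0+8f84ded</td><td>nvcr.io/nvidia/pytorch:20.03-py3</td><td>DeepLearningExamples/PyTorch</td><td>DALI+APEX</td></tr>
<tr><td>MXNet</td><td>1.6.0</td><td>-</td><td>gluon-cv</td><td>Horovod</td></tr>
<tr><td>TensorFlow 2.x</td><td>2.3.0</td><td>-</td><td>TensorFlow-models</td><td>-</td></tr>
<tr><td>PyTorch</td><td>1.6.0</td><td>-</td><td>pytorch/examples</td><td>-</td></tr>
<tr><td>PaddlePaddle</td><td>1.8.3.post107</td><td>-</td><td>PaddleCV</td><td>DALI</td></tr>
</tbody>
</table>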

</figure><h2 cid="n2873" mdtype="heading" class="md-end-block md-heading" style="box-sizing: border-box; break-after: avoid-page; break-inside: avoid; orphans: 4; font-size: 1.63rem; margin: 0px 0px 1.5rem; font-family: &quot;Lucida Grande&quot;, Corbel, sans-serif; font-weight: bold; clear: both; overflow-wrap: break-word; padding: 0px; color: rgb(222, 222, 222); line-height: 1.875rem; letter-spacing: -1px; white-space: pre-wrap; position: relative; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;">8. Open Questions</h2><ul class="ul-list" cid="n2874" mdtype="list" data-mark="-" style="box-sizing: border-box; margin-top: 0px; margin-bottom: 1.5rem; padding: 0px 0px 0px 1.875rem; list-style: square; position: relative; color: rgb(184, 191, 198); font-family: &quot;Helvetica Neue&quot;, Helvetica, Arial, &quot;Segoe UI Emoji&quot;, sans-serif; font-size: 16px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;"><li class="md-list-item" cid="n2875" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n2876" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;">Content marked 【注】 (Note) must be revised to reflect the actual experiments.</li><li class="md-list-item" cid="n2877" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; 
display: list-item;"><p cid="n2878" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;">The parameters for the different parallelism modes in the GPT2 experiments should be revised to reflect the actual experiments.</li><li class="md-list-item" cid="n2879" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; display: list-item;"><p cid="n2880" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; overflow-wrap: break-word; white-space: pre-wrap; position: relative;">Should latency be reported for every experiment? (It is currently reported only for Wide &amp; Deep and GPT2.)</li></ul> 
</body>
</html>
