{"id":787,"date":"2019-04-25T19:40:25","date_gmt":"2019-04-25T19:40:25","guid":{"rendered":"http:\/\/www.smart-bricks.net\/?p=787"},"modified":"2019-04-28T22:23:11","modified_gmt":"2019-04-28T22:23:11","slug":"google-open-sources-gpipe-library-for-faster-training-of-large-deep-learning-models","status":"publish","type":"post","link":"https:\/\/www.smart-bricks.net\/index.php\/2019\/04\/25\/google-open-sources-gpipe-library-for-faster-training-of-large-deep-learning-models\/","title":{"rendered":"Google Open-Sources GPipe Library for Faster Training of Large Deep-Learning Models"},"content":{"rendered":"\n<p><a href=\"https:\/\/ai.google\/\">Google AI<\/a> is <a href=\"https:\/\/ai.googleblog.com\/2019\/03\/introducing-gpipe-open-source-library.html\">open-sourcing GPipe<\/a>, a <a href=\"https:\/\/www.tensorflow.org\/\">TensorFlow<\/a> library for accelerating the training of large deep-learning models.<br>\n<br>\nDeep-neural-networks (DNN) are the tool of choice for solving many AI \ntasks, such as natural-language processing and visual object detection. \nNew methods for the latter are often benchmarked against winners of the <a href=\"http:\/\/image-net.org\/\">ImageNet challenge<\/a>.\n Each year&#8217;s winning entry has performed better than the last; however, \nthere is a corresponding increase in model complexity. The 2014 winner, <a href=\"http:\/\/deeplearning.net\/2014\/09\/19\/googles-entry-to-imagenet-2014-challenge\/\">GoogLeNet<\/a>, achieved 74.8% top-1 accuracy with 4 million model parameters. 2017&#8217;s winner, <a href=\"https:\/\/vitalab.github.io\/deep-learning\/2018\/07\/20\/SEnets.html\">Squeeze-and-Excitation Networks<\/a>, reached 82.7% top-1 accuracy with 145.8 million parameters.<br>\n<br>\nThe increase in model size poses a problem when&nbsp;training the networks. 
\nTo train the networks in a reasonable time, much of the computation is delegated to <em>accelerators<\/em>: special-purpose hardware such as GPUs or <a href=\"https:\/\/cloud.google.com\/tpu\/\">TPUs<\/a>. But these devices have limited memory, which restricts the size of the model that can be trained. There are ways to reduce the memory requirements, such as swapping data out of the accelerator&#8217;s memory, but these can slow down training. Another solution is to partition the model so that multiple accelerators can be used in parallel. The most obvious partition scheme in a sequential DNN is to split the model by layers and have each layer trained by a different accelerator. But the sequential nature of training multiple layers can result in only one accelerator working while the others sit idle, waiting for results to come from higher or lower in the stack.<\/p>\n\n\n\n<p>GPipe solves this problem by splitting training batches into \n&#8220;micro-batches&#8221; and pipelining them through the layers. Accelerators for successive layers can begin processing a micro-batch result from a previous layer without waiting for the full batch to be finished.<\/p>\n\n\n\n<p>Using GPipe and 8 TPUv2s, Google&#8217;s researchers were able to train visual object detection models with 1.8 billion parameters: 5.6 times the parameters that could be trained on a single TPUv2. Training these large models resulted in 84.7% top-1 accuracy on the ImageNet validation data, beating the 2017 winner&#8217;s score.<\/p>\n\n\n\n<p>In addition to supporting larger models, GPipe&#8217;s model partitioning allows for faster training of a given model simply by running more accelerators in parallel. 
Researchers reported that using &#8220;four times \nmore accelerators achieved 3.5 times speedup.&#8221;<\/p>\n\n\n\n<p>GPipe is available as part of the <a href=\"https:\/\/github.com\/tensorflow\/lingvo\">Lingvo framework<\/a>&nbsp;for building sequential neural-network models in TensorFlow.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Google AI is open-sourcing GPipe, a TensorFlow library for accelerating the training of large deep-learning models. Deep neural networks (DNNs) are the tool of choice for solving many AI tasks, such as natural-language processing and visual object detection. New methods for the latter are often benchmarked against winners of the ImageNet challenge. Each year&#8217;s winning entry has&hellip;&nbsp;<a href=\"https:\/\/www.smart-bricks.net\/index.php\/2019\/04\/25\/google-open-sources-gpipe-library-for-faster-training-of-large-deep-learning-models\/\" class=\"\" rel=\"bookmark\">Read More &raquo;<span class=\"screen-reader-text\">Google Open-Sources GPipe Library for Faster Training of Large Deep-Learning 
Models<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[19,26,34],"_links":{"self":[{"href":"https:\/\/www.smart-bricks.net\/index.php\/wp-json\/wp\/v2\/posts\/787"}],"collection":[{"href":"https:\/\/www.smart-bricks.net\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.smart-bricks.net\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.smart-bricks.net\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.smart-bricks.net\/index.php\/wp-json\/wp\/v2\/comments?post=787"}],"version-history":[{"count":1,"href":"https:\/\/www.smart-bricks.net\/index.php\/wp-json\/wp\/v2\/posts\/787\/revisions"}],"predecessor-version":[{"id":788,"href":"https:\/\/www.smart-bricks.net\/index.php\/wp-json\/wp\/v2\/posts\/787\/revisions\/788"}],"wp:attachment":[{"href":"https:\/\/www.smart-bricks.net\/index.php\/wp-json\/wp\/v2\/media?parent=787"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.smart-bricks.net\/index.php\/wp-json\/wp\/v2\/categories?post=787"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.smart-bricks.net\/index.php\/wp-json\/wp\/v2\/tags?post=787"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}