From 84824ae3f517c3d2e19ab007e794955b978a8246 Mon Sep 17 00:00:00 2001
From: Tianqi Chen <tqchen@users.noreply.github.com>
Date: Mon, 9 Jul 2018 22:27:59 -0700
Subject: [PATCH] [DOCS] Improve documents on deployment (#1412)

* [DOCS] Improve documents on deployment

* minor updates
---
 docs/deploy/index.rst                      | 39 +++++++++++++++++++++-
 docs/install/index.rst                     |  3 ++
 tutorials/cross_compilation_and_rpc.py     |  2 ++
 tutorials/nnvm/deploy_model_on_mali_gpu.py |  4 ++-
 tutorials/nnvm/deploy_model_on_rasp.py     |  2 ++
 tutorials/nnvm_quick_start.py              |  2 ++
 6 files changed, 50 insertions(+), 2 deletions(-)
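
Reviewer note on the `config.cmake` step in docs/deploy/index.rst: the new text
says extra runtimes such as OpenCL are enabled by modifying `config.cmake`, but
does not show the edit. A minimal sketch follows, assuming the copied file
contains the line `set(USE_OPENCL OFF)`. The helper name `enable_opencl` is
hypothetical and only for illustration, and the exact option line should be
checked against the `cmake/config.cmake` in your checkout.

```shell
# Sketch only: flip the OpenCL switch in a copied config.cmake.
# Assumption: the file has a line reading exactly `set(USE_OPENCL OFF)`.
enable_opencl() {
    # $1: path to the copied config.cmake (e.g. build/config.cmake)
    # -i.bak edits in place and keeps a backup; works with GNU and BSD sed.
    sed -i.bak 's/^set(USE_OPENCL OFF)/set(USE_OPENCL ON)/' "$1"
}
```

It would be run as `enable_opencl build/config.cmake` after the
`cp cmake/config.cmake build` step and before `cmake ..`. Other options in the
file are left untouched.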

diff --git a/docs/deploy/index.rst b/docs/deploy/index.rst
index bf607a4b8..0ef5cf5c8 100644
--- a/docs/deploy/index.rst
+++ b/docs/deploy/index.rst
@@ -1,3 +1,5 @@
+.. _deploy-and-integration:
+
 Deploy and Integration
 ======================
 
@@ -6,7 +8,42 @@ as well as how to integrate it with your project.
 
 .. image::  http://www.tvm.ai/images/release/tvm_flexible.png
 
-In order to integrate the compiled module, we do not have to ship the compiler stack. We only need to use a lightweight runtime API that can be integrated into various platforms.
+Unlike traditional deep learning frameworks, the TVM stack is divided into two major components:
+
+- TVM compiler, which does all the compilation and optimizations.
+- TVM runtime, which runs on the target devices.
+
+To integrate the compiled module, you **do not** need to build the entire TVM stack on the target device. You only need to build the TVM compiler stack on your desktop and use it to cross-compile modules that are deployed on the target device.
+On the target device, you only need a lightweight runtime API that can be integrated into various platforms.
+
+For example, you can run the following commands to build the runtime API
+on a Linux-based embedded system such as Raspberry Pi:
+
+.. code:: bash
+
+    git clone --recursive https://github.com/dmlc/tvm
+    cd tvm
+    mkdir build
+    cp cmake/config.cmake build
+    cd build
+    cmake ..
+    make runtime
+
+Note that we type ``make runtime`` to build only the runtime library.
+If you want to include additional runtimes such as OpenCL,
+you can modify ``config.cmake`` to enable these options.
+After you get the TVM runtime library, you can link the compiled library.
+
+The easiest and recommended way to test, tune and benchmark TVM kernels on
+embedded devices is through TVM's RPC API.
+Here are the links to the related tutorials.
+
+- :ref:`tutorial-cross-compilation-and-rpc`
+- :ref:`tutorial-deploy-model-on-mali-gpu`
+- :ref:`tutorial-deploy-model-on-rasp`
+
+After you finish tuning and benchmarking, you might need to deploy the model on the
+target device without relying on RPC. See the following resources on how to do so.
 
 .. toctree::
    :maxdepth: 2
diff --git a/docs/install/index.rst b/docs/install/index.rst
index 0653175f1..d351c3834 100644
--- a/docs/install/index.rst
+++ b/docs/install/index.rst
@@ -2,6 +2,9 @@ Installation
 ============
 
 To install TVM, please read :ref:`install-from-source`.
+If you are interested in deploying to mobile/embedded devices,
+you do not need to install the entire TVM stack on your device;
+you only need the runtime. Please read :ref:`deploy-and-integration`.
 If you would like to quickly try out TVM or do demo/tutorials, checkout :ref:`docker-images`
 
 .. toctree::
diff --git a/tutorials/cross_compilation_and_rpc.py b/tutorials/cross_compilation_and_rpc.py
index e9f80f13f..7ef0f0fc1 100644
--- a/tutorials/cross_compilation_and_rpc.py
+++ b/tutorials/cross_compilation_and_rpc.py
@@ -1,4 +1,6 @@
 """
+.. _tutorial-cross-compilation-and-rpc:
+
 Cross Compilation and RPC
 =========================
 **Author**: `Ziheng Jiang <https://github.com/ZihengJiang/>`_
diff --git a/tutorials/nnvm/deploy_model_on_mali_gpu.py b/tutorials/nnvm/deploy_model_on_mali_gpu.py
index 2f3c332da..51caf8dcb 100644
--- a/tutorials/nnvm/deploy_model_on_mali_gpu.py
+++ b/tutorials/nnvm/deploy_model_on_mali_gpu.py
@@ -1,6 +1,8 @@
 """
+.. _tutorial-deploy-model-on-mali-gpu:
+
 Deploy the Pretrained Model on ARM Mali GPU
-=======================================================
+===========================================
 **Author**: `Lianmin Zheng <https://lmzheng.net/>`_, `Ziheng Jiang <https://ziheng.org/>`_
 
 This is an example of using NNVM to compile a ResNet model and
diff --git a/tutorials/nnvm/deploy_model_on_rasp.py b/tutorials/nnvm/deploy_model_on_rasp.py
index f4527a547..37354e7a3 100644
--- a/tutorials/nnvm/deploy_model_on_rasp.py
+++ b/tutorials/nnvm/deploy_model_on_rasp.py
@@ -1,4 +1,6 @@
 """
+.. _tutorial-deploy-model-on-rasp:
+
 Deploy the Pretrained Model on Raspberry Pi
 ===========================================
 **Author**: `Ziheng Jiang <https://ziheng.org/>`_
diff --git a/tutorials/nnvm_quick_start.py b/tutorials/nnvm_quick_start.py
index 350b5b0ef..563d71b5e 100644
--- a/tutorials/nnvm_quick_start.py
+++ b/tutorials/nnvm_quick_start.py
@@ -1,4 +1,6 @@
 """
+.. _tutorial-nnvm-quick-start:
+
 Quick Start Tutorial for Compiling Deep Learning Models
 =======================================================
 **Author**: `Yao Wang <https://github.com/kevinthesun>`_
-- 
GitLab