As machine learning and artificial intelligence in general keep growing in importance, dedicated AI hardware is popping up from a number of companies. To get an overview of the current state of AI platforms, we took a closer look at two of them: NVIDIA’s Jetson Nano and Google’s new Coral USB Accelerator. In this article we discuss the typical workflow for these platforms and their pros and cons.

The Rivals

NVIDIA’s Jetson Nano is a single-board computer that, compared to something like a Raspberry Pi, packs quite a lot of CPU/GPU horsepower, at a much lower price than the other members of the Jetson family. It is currently available as a Developer Kit for around 109€, which contains a System-on-Module (SoM) and a carrier board that provides HDMI, USB 3.0 and Ethernet ports. A slightly different module (eMMC instead of an SD card) that can be integrated into custom designs is available for around 135€. Both run a flavor of Ubuntu 18.04 called “Linux for Tegra” (L4T). Setting it up can take some time since you need to install everything from scratch, but it should be fairly easy if you have worked with Linux before. Even if you have not, there are plenty of tutorials to help you through the setup.

NVIDIA Jetson Nano Developer Kit

In contrast, Google’s Coral uses a specialized ASIC for processing deep neural networks, called the Edge TPU (Tensor Processing Unit). It comes in multiple versions for different use cases. At the time of writing, you can either get the Coral Dev Board, a single-board computer similar to NVIDIA’s Jetson Nano that runs Mendel Linux, or opt for the Coral USB Accelerator with a host system of your choice. The Dev Board costs around 149€ and the USB Accelerator around 70€. In addition, Google has announced the release of the Edge TPU both as a Mini PCIe / M.2 chip for integration into existing systems and as a System-on-Module for use with your own custom baseboard.

Google Coral USB Accelerator (top) and Google Coral Dev Board (bottom)

Comparing the Workflow

Our objective is to compare the workflow of both platforms from setup to running an object detector. After some research we decided to use MobileNet SSD v2, primarily because the available Google Coral models were limited at the time of testing. However, Google has since released an update for their compiler, which makes deploying your own models easier.

On the Jetson Nano we used the TensorFlow model from the official TensorFlow model zoo and optimized it with TensorRT. On the Google Coral we used the optimized and pre-compiled TensorFlow Lite model from the Coral model zoo. In the following part we will go through the steps together and set up these models on the respective platforms.

How to run MobileNet SSD v2 on the NVIDIA Jetson Nano

To set up our Nano for the first time, we head over to NVIDIA’s getting started guide and follow the step-by-step instructions.

After completing the guide, we can focus on running MobileNet SSD v2 on the Nano. Doing all the work ourselves would go beyond the scope of this blog post, so for the sake of simplicity we will use an existing Git repository. We install the dependencies as described in the repository:

$ sudo apt-get install python3-pip libhdf5-serial-dev hdf5-tools
$ pip3 install --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v42 tensorflow-gpu==1.13.1+nv19.5 --user
$ pip3 install numpy pycuda --user

After that we clone the repository to our Nano and download the MobileNet SSD v2 model from the TensorFlow model zoo.

$ git clone https://github.com/AastaNV/TRT_object_detection.git
$ cd TRT_object_detection
$ mkdir model
$ wget http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v2_coco_2018_03_29.tar.gz
$ tar zxvf ssd_mobilenet_v2_coco_2018_03_29.tar.gz -C model/

To run the code in the repository, we have to edit the graphsurgeon converter, specifically this file: /usr/lib/python3.6/dist-packages/graphsurgeon/node_manipulation.py

diff --git a/node_manipulation.py b/node_manipulation.py
index d2d012a..1ef30a0 100644
--- a/node_manipulation.py
+++ b/node_manipulation.py
@@ -30,6 +30,7 @@ def create_node(name, op=None, _do_suffix=False, **kwargs):
     node = NodeDef()
     node.name = name
     node.op = op if op else name
+    node.attr["dtype"].type = 1
     for key, val in kwargs.items():
         if key == "dtype":
             node.attr["dtype"].type = val.as_datatype_enum

Next, we can optionally maximize the performance of the Jetson Nano by selecting the maximum-performance power mode with nvpmodel and then running the jetson_clocks script:

$ sudo nvpmodel -m 0
$ sudo jetson_clocks

Then we open main.py and add the following line at the top of the script, so it knows which model we want to use:

from config import model_ssd_mobilenet_v2_coco_2018_03_29 as model

Finally, we run the object detection with the following command, where [image] is the path to the image you want to run the detection on:

$ python3 main.py [image]

Running the script for the first time may take a couple of minutes because the model has to be converted into the TensorRT format, but subsequent runs should finish within a few seconds.
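The slow first run and the fast later runs follow the usual build-once, reuse-afterwards pattern. The sketch below illustrates only that pattern and is not the repository's code: `load_or_build_engine`, `fake_build` and the file name `model.bin` are hypothetical, and the byte string stands in for a serialized TensorRT engine.

```python
import os
import tempfile


def load_or_build_engine(cache_path, build_fn):
    """Return (engine_bytes, was_cached); build_fn runs only on a cache miss."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return f.read(), True
    engine = build_fn()  # the expensive conversion step
    with open(cache_path, "wb") as f:
        f.write(engine)
    return engine, False


def fake_build():
    # Hypothetical stand-in for the real, slow model conversion.
    return b"serialized-engine"


cache_file = os.path.join(tempfile.mkdtemp(), "model.bin")
first = load_or_build_engine(cache_file, fake_build)   # builds and stores
second = load_or_build_engine(cache_file, fake_build)  # loads from disk
print(first[1], second[1])
```

Deleting the cached file forces a rebuild, which is the usual remedy after changing the model or the software stack.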

And that’s it, we’ve just performed our first object detection on the NVIDIA Jetson Nano!

How to run MobileNet SSD v2 on the Google Coral

For this introduction we used the Coral USB Accelerator on a desktop computer running Ubuntu 18.04. To get started, we first have to install the Edge TPU runtime and the Python library. To do so, we execute the following commands:

$ sudo apt-get update
$ wget https://dl.google.com/coral/edgetpu_api/edgetpu_api_latest.tar.gz -O edgetpu_api.tar.gz --trust-server-names
$ tar xzf edgetpu_api.tar.gz
$ sudo edgetpu_api/install.sh

After running the install-script we can plug in the USB Accelerator and download both the compiled Edge TPU model and the corresponding labels file:

$ cd ~/Downloads/
$ wget https://dl.google.com/coral/canned_models/mobilenet_ssd_v2_coco_quant_postprocess_edgetpu.tflite \
https://dl.google.com/coral/canned_models/coco_labels.txt

Now everything should be ready to run a model, so we run the sample script for object detection, which was installed together with the Edge TPU runtime.

$ cd /usr/local/lib/python3.6/dist-packages/edgetpu/demo
$ python3 object_detection.py \
--model ~/Downloads/mobilenet_ssd_v2_coco_quant_postprocess_edgetpu.tflite \
--label ~/Downloads/coco_labels.txt \
--input [image] \
--output ~/Pictures/detection_output.jpg

Substitute [image] with the path to the image you want to detect objects in. To save the image with the detected bounding boxes, pass the desired path via the --output parameter.

After running the script, the results should be shown in the console like this:

-----------------------------------------
bear
score =  0.98046875
box =  [5.395890326395523, 56.6478157043457, 586.0, 636.3274383544922]
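If you want to post-process these results, the printed lines can be turned into structured data. The following is a minimal sketch that assumes exactly the output format shown above and that `box` holds `[x1, y1, x2, y2]` in pixels. `parse_detections` is a hypothetical helper and not part of the Edge TPU library.

```python
import ast

SAMPLE_OUTPUT = """\
-----------------------------------------
bear
score =  0.98046875
box =  [5.395890326395523, 56.6478157043457, 586.0, 636.3274383544922]
"""


def parse_detections(text):
    """Parse the demo script's console output into a list of dicts."""
    detections = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if set(line) == {"-"}:  # a separator line starts a new detection
            current = {}
            detections.append(current)
        elif current is None:
            continue  # ignore anything before the first separator
        elif line.startswith("score ="):
            current["score"] = float(line.split("=", 1)[1])
        elif line.startswith("box ="):
            current["box"] = ast.literal_eval(line.split("=", 1)[1].strip())
        else:
            current["label"] = line
    return detections


detections = parse_detections(SAMPLE_OUTPUT)
x1, y1, x2, y2 = detections[0]["box"]
print(detections[0]["label"], detections[0]["score"])
print("width:", round(x2 - x1, 1), "height:", round(y2 - y1, 1))
```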

The processed image should look somewhat like this:

Detected object with bounding box (Google Coral)

Hooray, we just performed our first object detection on the Coral Edge TPU! If you want to dive deeper into the usage of the platform and create your own application, you can go to Google’s API documentation to learn how to use the Python API for your own projects.

Performance Comparison

In addition to running MobileNet SSD v2 on a single image, we wanted to look at the performance of both platforms in terms of speed and accuracy when performing inference on a large number of images. For this we used the “2017 Val images” subset of the COCO dataset, which consists of 5000 images of “common objects in context”. To determine the speed, we measured the object detection time for each of the 5000 images and calculated the average frames per second.
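The timing procedure can be sketched as follows. This is a minimal illustration and not our benchmark script: `fake_detect` is a hypothetical placeholder for the real inference call, and the sketch assumes that average FPS means the number of images divided by the summed per-image detection time.

```python
import time


def average_fps(detect, images):
    """Time detect(image) for every image and return (fps, per-image seconds)."""
    per_image_seconds = []
    for image in images:
        start = time.perf_counter()
        detect(image)
        per_image_seconds.append(time.perf_counter() - start)
    total = sum(per_image_seconds)
    return len(per_image_seconds) / total, per_image_seconds


def fake_detect(image):
    # Placeholder workload; replace with the platform's inference call.
    time.sleep(0.002)


fps, timings = average_fps(fake_detect, range(20))
print(f"{fps:.1f} FPS over {len(timings)} images")
```

Timing only the detection call keeps image loading and decoding out of the measurement, so disk speed does not distort the comparison.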

As shown in the diagram below, there is a huge difference in FPS between the Jetson Nano and the Coral. On average, the Coral was around 6–7 times faster than its NVIDIA counterpart on this specific data set. This is especially notable considering that the Coral USB Accelerator consumes only around one quarter of the power (~2.5 W) of the Jetson Nano (~10 W).
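Combining the two rough figures gives a feeling for the efficiency gap. The arithmetic below uses only the approximate numbers quoted above. It treats the power values as nominal rather than measured during inference, and it ignores the power drawn by the host system the USB Accelerator is plugged into.

```python
def relative_fps_per_watt(speedup, power_fast_w, power_slow_w):
    """How many times more frames per watt the faster device delivers."""
    return speedup * (power_slow_w / power_fast_w)


CORAL_W, NANO_W = 2.5, 10.0  # approximate power figures from the text
low = relative_fps_per_watt(6, CORAL_W, NANO_W)
high = relative_fps_per_watt(7, CORAL_W, NANO_W)
print(f"Coral delivers roughly {low:.0f}x to {high:.0f}x the frames per watt")
```

Under these assumptions the Coral comes out at roughly 24 to 28 times the frames per watt of the Nano.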

The experimental setup might be somewhat biased in favor of the Coral, since we used the Coral USB Accelerator on a desktop computer with an Intel i7 CPU. But the main reason for the huge difference is most likely the higher efficiency and performance of the specialized Edge TPU ASIC compared to the much more general GPU architecture of the Jetson Nano. For example, the Coral uses only 8-bit integer values in its models, and the Edge TPU is built to take full advantage of that. However, this specialization comes with huge limitations in terms of usable models and overall flexibility in software development for such a platform.
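To make the 8-bit point more concrete, here is a minimal sketch of affine quantization, the scale and zero-point scheme that quantized TensorFlow Lite models are based on. It is a generic illustration and not the converter's actual implementation. The function names and the sample weights are made up for the example.

```python
def quantize(values, num_bits=8):
    """Map floats to unsigned integers using a scale and a zero point."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # The representable range must include 0.0.
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid a zero scale
    zero_point = round(qmin - lo / scale)
    q = [min(qmax, max(qmin, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate float values from the integers."""
    return [(x - zero_point) * scale for x in q]


weights = [-1.0, -0.25, 0.0, 0.1, 0.5, 1.0]  # made-up example values
q, scale, zero_point = quantize(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print("quantized:", q)
print("max round-trip error:", max_error, "scale:", scale)
```

Every value is stored as one of only 256 levels, so each weight carries a rounding error of up to about one quantization step. That is the price for the smaller and faster integer arithmetic.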

The detection accuracy is also expected to be lower because of the reduced integer precision. To calculate the accuracy of the platforms when performing object detection, we collected the bounding boxes of the detected objects and calculated the “mean average precision” (mAP) over all 80 classes in the COCO dataset. As the second bar in the graph below shows, the Nano is around 12.60% more accurate than the Coral, at least in this experiment.
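For readers unfamiliar with the metric, the following sketch shows the building blocks of mAP: the intersection over union (IoU) of two boxes, a per-class average precision, and the mean over classes. It is a simplified illustration with a single IoU threshold and made-up boxes. It is not the official COCO evaluation procedure and not the code we used for the measurements.

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_threshold=0.5):
    """Non-interpolated AP for one class.

    detections: list of (score, box); ground_truth: list of boxes.
    """
    if not ground_truth:
        return 0.0
    matched, hits = set(), []
    # Visit detections from most to least confident.
    for score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best, best_iou = None, iou_threshold
        for i, gt in enumerate(ground_truth):
            overlap = iou(box, gt)
            if i not in matched and overlap >= best_iou:
                best, best_iou = i, overlap
        if best is not None:
            matched.add(best)  # each ground-truth box can match only once
        hits.append(best is not None)
    # Sum the precision at every true positive, then normalize.
    ap, true_positives = 0.0, 0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            true_positives += 1
            ap += true_positives / rank
    return ap / len(ground_truth)


per_class = {
    "bear": average_precision(
        [(0.98, [0, 0, 10, 10]), (0.40, [50, 50, 60, 60])],
        [[1, 1, 10, 10]],
    ),
    "dog": average_precision(
        [(0.90, [100, 100, 110, 110]), (0.80, [0, 0, 10, 10])],
        [[0, 0, 10, 10]],
    ),
}
mean_ap = sum(per_class.values()) / len(per_class)
print(per_class, mean_ap)
```

In this toy example the correct "dog" box is ranked below a false positive, which halves that class's AP and pulls the mean down to 0.75.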

Flexibility of the platforms

The Jetson Nano is basically a small Linux computer with full functionality, which makes it very flexible in terms of software usage. It is able to run all the common Machine Learning frameworks, like TensorFlow, Caffe, PyTorch, Keras and MXNet. Code used to deploy a model on a desktop GPU can usually be transferred to the Jetson Nano with minor changes. Due to its small size, the Nano can easily be used in mobile scenarios where there is no room for a big desktop system.

In comparison, the Coral USB Accelerator is less flexible in terms of framework support than the Jetson Nano: it can only run models that were converted from Google’s TensorFlow into the TensorFlow Lite format and then converted a second time into a Coral-specific format that matches the hardware architecture of the Edge TPU. This makes deploying a model to the Coral more complex and time-consuming than deploying it to the Jetson Nano. Once you have completed all the necessary steps, though, the Coral makes full use of its power. It is able to run models extremely fast, usually much faster than the Nano, but at the cost of accuracy. Since the Coral can be plugged into any host system via USB, it can be used with stationary systems or in mobile applications, depending on your needs.

Conclusion

Both the Jetson Nano and the Google Coral USB Accelerator are amazing gadgets that make it possible to deploy state-of-the-art machine learning models at an affordable price. Which one is more suitable depends on your use case. If you need flexibility, the Jetson Nano is probably the better choice. If you want to focus on one framework only and are willing to adapt your model to the Coral’s needs, then the Google Coral Edge TPU should be the right one for you. There is one thing to keep in mind, though: the Google Coral is pretty new on the market, and it remains to be seen whether it will be supported and developed further in the long term or end up in the Google product graveyard.