YOLOv3 Step by Step

This document walks through, step by step, how to use our tools to compile and test a newly downloaded YOLOv3 model.

Step 0: Prepare environment and data

We need to download the latest toolchain docker image which contains all the tools we need.

docker pull kneron/toolchain:latest

Start the docker with a local folder mounted into the docker.

docker run --rm -it -v /your/folder/path/for/docker_mount:/data1 kneron/toolchain:latest

Go to the mounted folder and download a public Keras-based YOLOv3 model from GitHub: https://github.com/qqwweee/keras-yolo3

cd /data1 && git clone https://github.com/qqwweee/keras-yolo3.git keras_yolo3

Follow the model's document to save the pretrained model as an h5 file:

cd keras_yolo3
wget https://pjreddie.com/media/files/yolov3-tiny.weights
python convert.py yolov3-tiny.cfg yolov3-tiny.weights /data1/yolo.h5

We now have yolo.h5 under our mounted folder /data1.

We also need to prepare some images under the mounted folder. We have provided some example input images at http://doc.kneron.com/docs/toolchain/res/test_image10.zip.

Here is how you can get it:

cd /data1
wget http://doc.kneron.com/docs/toolchain/res/test_image10.zip
unzip test_image10.zip

Now we have images in folder test_image10/ at /data1; these are needed for quantization.

We also need some extra images for accuracy testing. To keep this document simple, we use only one image, which is already included in the toolchain docker.

cd /data1
cp /workspace/E2E_Simulator/app/test_image_folder/yolo/000000350003.jpg ./.

Now we have image 000000350003.jpg at /data1 for testing.
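
Before moving on, it can help to confirm that everything Step 0 produces is actually in the mounted folder. The following is a minimal sketch and not part of the toolchain: the helper name missing_inputs is hypothetical, and the file names are the ones used in this document.

```python
import os

# Files and folders that Step 0 is expected to leave under the mounted folder.
EXPECTED = ["yolo.h5", "test_image10", "000000350003.jpg"]

def missing_inputs(root, expected=EXPECTED):
    """Return the entries of `expected` that do not exist under `root`."""
    return [name for name in expected
            if not os.path.exists(os.path.join(root, name))]

if __name__ == "__main__":
    missing = missing_inputs("/data1")
    print("missing:", missing if missing else "nothing")
```

An empty list means the model file, the quantization images, and the test image are all in place.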

Step 1: Import KTC and required lib in python shell

Now we go through the whole toolchain flow with KTC (Kneron Toolchain), using its Python API in the Python shell.

Figure 1. python shell

import ktc
import numpy as np
import os
import onnx
from PIL import Image

Step 2: Convert and optimize the pretrained model

You can check the model architecture with Netron.

We find this model has no input shape, so it will be unable to run in our toolchain. We need to specify the input shape while doing the conversion.

# convert h5 model to onnx
m = ktc.onnx_optimizer.keras2onnx_flow("/data1/yolo.h5", input_shape = [1, 416, 416, 3])

Not only do we need to do conversion, but we also need to optimize it to make it efficient and compatible with our hardware.

m = ktc.onnx_optimizer.onnx2onnx_flow(m)

Now we have the optimized ONNX model in the variable m. We save it to disk at /data1/yolo.opt.onnx for further verification (for example with Netron or onnxruntime) in Step 4.

onnx.save(m, '/data1/yolo.opt.onnx')

Step 3: IP Evaluation

To make sure the ONNX model is as expected, we should check its performance and see whether there are any unsupported operators (or CPU nodes).

# npu (only) performance simulation
km = ktc.ModelConfig(19, "0001", "520", onnx_model=m)
eval_result = km.evaluate()
print("\nNpu performance evaluation result:\n" + str(eval_result))

The estimated FPS (NPU only) report on your terminal should look similar to this:

    ***** Warning: this model has 1 CPU ops which may cause that the report's fps is different from the actual fps *****
    ***** Warning: CPU ops types: KneronResize.

    [Evaluation Result]
    estimate FPS float = 22.5861
    total time = 44.2751 ms
    total theoretical covolution time = 16.7271 ms
    average DRAM bandwidth = 0.279219 GB/s
    MAC efficiency to total time = 37.7799 %
    MAC idle time = 3.85105 ms
    MAC running time = 40.424 ms
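
The figures in the sample report above are related by simple arithmetic, which can serve as a sanity check when reading your own report. The sketch below hard-codes the sample values; the relationships are inferred from those numbers and are an assumption, not something taken from the toolchain's documentation.

```python
# Values copied from the sample report above (milliseconds).
total_time_ms = 44.2751
conv_time_ms = 16.7271      # total theoretical convolution time
mac_idle_ms = 3.85105
mac_running_ms = 40.424

# FPS is the number of inferences that fit in one second.
fps = 1000.0 / total_time_ms

# MAC efficiency compares the theoretical convolution time to the total time.
mac_efficiency_pct = 100.0 * conv_time_ms / total_time_ms

# Idle plus running time should add up to roughly the total time.
mac_sum_ms = mac_idle_ms + mac_running_ms

print(f"fps = {fps:.4f}")
print(f"MAC efficiency = {mac_efficiency_pct:.4f} %")
print(f"idle + running = {mac_sum_ms:.4f} ms")
```

The printed values should agree with the report to within rounding.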

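There are two things to take note of in this report: the estimated FPS is 22.5861, and this figure covers the NPU only; and the model contains one CPU op, of type KneronResize, which is why the actual FPS may differ from the estimate.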

At the same time, a folder called compiler will be generated in your docker mounted folder (/data1); the evaluation result will be found in this folder. One important thing is to check the 'ioinfo.csv' in /data1/compiler, which looks like this:

    i,0,input_1_o0,3,416,416
    c,0,up_sampling2d_1_o0_kn,128,26,26
    o,0,conv2d_10_o0,255,13,13
    o,1,conv2d_13_o0,255,26,26

This file gives information about the special nodes in the ONNX. Each line shows the information of each node, and the first element shows the type of the special node.

Type explanation: i is an input node, c is a CPU node, and o is an output node.

We can see that, under KL520, there is one CPU node called up_sampling2d_1_o0_kn in our ONNX model.
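
For models with many special nodes, reading ioinfo.csv by eye becomes tedious. The sketch below groups the rows by their type letter. The function name parse_ioinfo is hypothetical, and the column layout (type, index, node name, then three dimensions that look like channel, height, width) is inferred from the sample above rather than from a format specification.

```python
import csv
import io
from collections import defaultdict

# Sample content copied from the ioinfo.csv shown above.
SAMPLE = """\
i,0,input_1_o0,3,416,416
c,0,up_sampling2d_1_o0_kn,128,26,26
o,0,conv2d_10_o0,255,13,13
o,1,conv2d_13_o0,255,26,26
"""

def parse_ioinfo(text):
    """Group ioinfo rows by their type letter.

    Returns {type: [(index, name, (d0, d1, d2)), ...]}.
    """
    nodes = defaultdict(list)
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        kind, index, name = row[0], int(row[1]), row[2]
        dims = tuple(int(v) for v in row[3:])
        nodes[kind].append((index, name, dims))
    return dict(nodes)

nodes = parse_ioinfo(SAMPLE)
print("CPU nodes:", [name for _, name, _ in nodes.get("c", [])])
```

To inspect your own result, pass the content of /data1/compiler/ioinfo.csv to parse_ioinfo instead of SAMPLE.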

Step 4: Check that the ONNX model, preprocess and postprocess are correct

If we get correct detection results from the ONNX model with the provided preprocess and postprocess functions, everything should be correct.

First, we need to check the preprocess and postprocess methods. Here is the relevant code.

The following is the extracted preprocess:

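import pathlib
import sys
# make the cloned repository importable before importing from it
sys.path.append(str(pathlib.Path("keras_yolo3").resolve()))
from yolo3.utils import letterbox_image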

def preprocess(pil_img):
    model_input_size = (416, 416)  # to match our model input size when converting
    boxed_image = letterbox_image(pil_img, model_input_size)
    np_data = np.array(boxed_image, dtype='float32')

    np_data /= 255.
    return np_data
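
To see what letterboxing does to an image of a given size, the geometry can be worked out without loading any image. The sketch below assumes that letterbox_image scales the image to fit inside the target while keeping its aspect ratio, then centers it on a padded canvas; this is our reading of the function in the cloned repository. The helper name letterbox_geometry is hypothetical.

```python
def letterbox_geometry(image_size, target_size):
    """Compute the resized size and paste offset for a letterboxed image.

    Both arguments are (width, height) tuples, as in PIL's Image.size.
    """
    iw, ih = image_size
    w, h = target_size
    scale = min(w / iw, h / ih)              # fit inside the target, keep aspect ratio
    nw, nh = int(iw * scale), int(ih * scale)
    offset = ((w - nw) // 2, (h - nh) // 2)  # center the resized image
    return (nw, nh), offset

# An 832x416 image fits a 416x416 input at half size, padded above and below.
print(letterbox_geometry((832, 416), (416, 416)))
```

The padding is why postprocess needs the original image shape: box coordinates have to be mapped back from the padded 416x416 canvas to the original image.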

This is the extracted postprocess:

import tensorflow as tf
import pathlib
import sys
sys.path.append(str(pathlib.Path("keras_yolo3").resolve()))
from yolo3.model import yolo_eval

def postprocess(inf_results, ori_image_shape):
    tensor_data = [tf.convert_to_tensor(data, dtype=tf.float32) for data in inf_results]

    # get anchor info
    anchors_path = "/data1/keras_yolo3/model_data/tiny_yolo_anchors.txt"
    with open(anchors_path) as f:
        anchors = f.readline()
    anchors = [float(x) for x in anchors.split(',')]
    anchors = np.array(anchors).reshape(-1, 2)

    # post process
    num_classes = 80
    boxes, scores, classes = yolo_eval(tensor_data, anchors, num_classes, ori_image_shape)
    with tf.Session() as sess:
        boxes = boxes.eval()
        scores = scores.eval()
        classes = classes.eval()

    return boxes, scores, classes
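
The values used in postprocess can be cross-checked against the output shapes reported in ioinfo.csv. The sketch below assumes the usual tiny YOLOv3 layout: three anchors per output scale, output strides of 32 and 16, and 4 box values plus 1 objectness score plus the class scores per anchor. The function name expected_output_shapes is hypothetical.

```python
def expected_output_shapes(input_size=416, num_classes=80,
                           anchors_per_scale=3, strides=(32, 16)):
    """Return (channels, height, width) for each detection output."""
    # Each anchor predicts 4 box values, 1 objectness score, and the class scores.
    channels = anchors_per_scale * (4 + 1 + num_classes)
    return [(channels, input_size // s, input_size // s) for s in strides]

print(expected_output_shapes())
```

With 80 classes and a 416x416 input this gives 255 channels on 13x13 and 26x26 grids, which matches the two output nodes listed in ioinfo.csv. If you change num_classes or the input size, the output shapes should change accordingly.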

Now we can check the ONNX inference result with the API 'ktc.kneron_inference'.

## onnx model check

input_image = Image.open('/data1/000000350003.jpg')

# resize and normalize input data
in_data = preprocess(input_image)

# onnx inference 
out_data = ktc.kneron_inference([in_data], onnx_file="/data1/yolo.opt.onnx", input_names=["input_1_o0"])

# onnx output data processing
det_res = postprocess(out_data, [input_image.size[1], input_image.size[0]])

print(det_res)

The result will be displayed on your terminal like this:

(array([[258.8878 , 470.29474, 297.01447, 524.3069 ],
       [233.62653, 218.19923, 306.79245, 381.78162]], dtype=float32), array([0.9248918, 0.786504 ], dtype=float32), array([2, 7], dtype=int32))

This result looks good.

Note that we only use one image as an example. Using more data to check accuracy is a good idea.
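
The raw tuple is hard to read at a glance. The sketch below formats each detection as one line. It rests on two assumptions you should verify against your copy of the repository: that each box is ordered (top, left, bottom, right), and that class indices follow model_data/coco_classes.txt, of which only the first eight names are listed here. The helper name describe is hypothetical.

```python
# First entries of the COCO class list as we understand it (an assumption).
COCO_HEAD = ["person", "bicycle", "car", "motorbike",
             "aeroplane", "bus", "train", "truck"]

def describe(boxes, scores, classes, names=COCO_HEAD):
    """Format each detection as a readable line."""
    lines = []
    for box, score, cls in zip(boxes, scores, classes):
        top, left, bottom, right = (round(float(v)) for v in box)
        idx = int(cls)
        label = names[idx] if idx < len(names) else f"class {idx}"
        lines.append(f"{label} {float(score):.2f} "
                     f"top={top} left={left} bottom={bottom} right={right}")
    return lines

# Values copied from the sample result above.
sample = describe(
    [[258.8878, 470.29474, 297.01447, 524.3069],
     [233.62653, 218.19923, 306.79245, 381.78162]],
    [0.9248918, 0.786504],
    [2, 7],
)
print("\n".join(sample))
```

In the Python shell, the same function can be called as describe(*det_res).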

Step 5: Quantization

Let us use the same preprocess on our quantization data and put it in a list:

# load and normalize all image data from folder
img_list = []
for (dir_path, _, file_names) in os.walk("/data1/test_image10"):
    for f_n in file_names:
        fullpath = os.path.join(dir_path, f_n)
        print("processing image: " + fullpath)

        image = Image.open(fullpath)
        img_data = preprocess(image)
        img_list.append(img_data)
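
Note that os.walk visits every file under the folder, so a stray non-image file (for example a hidden system file left over from unzipping) would make Image.open fail. A minimal sketch of a stricter file listing follows; the helper name list_image_files and the extension list are our own choices, not part of the toolchain.

```python
import os

IMAGE_EXTS = (".jpg", ".jpeg", ".png", ".bmp")

def list_image_files(root, exts=IMAGE_EXTS):
    """Return sorted paths of files under `root` whose extension is in `exts`."""
    found = []
    for dir_path, _, file_names in os.walk(root):
        for name in file_names:
            if name.lower().endswith(exts):  # case-insensitive extension match
                found.append(os.path.join(dir_path, name))
    return sorted(found)
```

Iterating over list_image_files("/data1/test_image10") instead of the raw os.walk output also gives a stable file order from run to run.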

Then, perform quantization. The BIE model will be generated at /data1/output.bie.

# fix point analysis
bie_model_path = km.analysis({"input_1_o0": img_list})
print("\nFix point analysis done. Save bie model to '" + str(bie_model_path) + "'")
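
For readers new to fixed-point arithmetic, the sketch below shows what a radix means for an 8-bit signed value: the stored integer is the float multiplied by 2 to the power of the radix, rounded and saturated. This is a generic illustration only. It is not the toolchain's quantization algorithm, and the function names are hypothetical.

```python
def to_fixed(value, radix, bits=8):
    """Quantize a float to a signed integer with `radix` fractional bits."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    q = round(value * (1 << radix))
    return max(lo, min(hi, q))       # saturate to the representable range

def from_fixed(q, radix):
    """Convert the integer back to the float it represents."""
    return q / (1 << radix)

x = 0.3
q = to_fixed(x, radix=7)
print(q, from_fixed(q, 7), abs(x - from_fixed(q, 7)))
```

The difference between x and the value recovered from q is the quantization error, which is why a small accuracy drop is expected after this step.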

Step 6: Check if BIE model accuracy is good enough

After quantization, a slight drop in model accuracy is expected. We should check whether the accuracy is still good enough to use.

The toolchain API ktc.kneron_inference can help us check this. Its usage is similar to Step 4, with several differences:

  1. The 2nd parameter is changed from onnx_file to bie_file.
  2. You need to provide the radix value, which can be obtained by ktc.get_radix with input images as the parameter.
  3. If the platform is not 520, you need to provide an extra parameter: platform, e.g. platform=720.

## bie model check
input_image = Image.open('/data1/000000350003.jpg')

# resize and normalize input data
in_data = preprocess(input_image)

# get the radix from the quantization data
radix = ktc.get_radix(img_list)

# bie inference 
out_data = ktc.kneron_inference([in_data], bie_file=bie_model_path, input_names=["input_1_o0"], radix=radix)

# bie output data processing
det_res = postprocess(out_data, [input_image.size[1], input_image.size[0]])
print(det_res)

The result will be displayed on your terminal like this:

(array([[258.51468, 467.71683, 293.07394, 529.15967]], dtype=float32), array([0.8253723], dtype=float32), array([2], dtype=int32))

This is slightly different from the result in Step 4: we lost one bounding box after quantization. A small loss like this is acceptable after quantization.

If you are running the example with 720 as the hardware platform, there might be one extra bounding box. This is normal; 520 and 720 may behave differently.
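
One way to put a number on how far a box moved after quantization is intersection over union (IoU). The sketch below compares the first box from the ONNX result in Step 4 with the box from the BIE result. It assumes boxes are ordered (top, left, bottom, right); the function name iou is our own.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (top, left, bottom, right)."""
    top, left = max(a[0], b[0]), max(a[1], b[1])
    bottom, right = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so that boxes that do not overlap give an empty intersection.
    inter = max(0.0, bottom - top) * max(0.0, right - left)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

onnx_box = (258.8878, 470.29474, 297.01447, 524.3069)   # from the ONNX result
bie_box = (258.51468, 467.71683, 293.07394, 529.15967)  # from the BIE result
print(f"IoU = {iou(onnx_box, bie_box):.3f}")
```

An IoU close to 1 means the two boxes nearly coincide. Which threshold counts as good enough depends on your application.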

Step 7: Compile

The final step is to compile the BIE model into an NEF model.

# compile
nef_model_path = ktc.compile([km])
print("\nCompile done. Save Nef file to '" + str(nef_model_path) + "'")

You can find the NEF file under /data1/batch_compile/models_520.nef. models_520.nef is the final compiled model.

(optional) Step 8: Check NEF model

The toolchain API ktc.kneron_inference also supports NEF model inference. The usage is similar to Step 4 and Step 6, with minor differences:

  1. The 2nd parameter is changed from bie_file to nef_file.
  2. You need to provide the radix value, which can be obtained by ktc.get_radix with input images as the parameter.
  3. If the platform is not 520, you need to provide an extra parameter: platform, e.g. platform=720.

# nef model check
input_image = Image.open('/data1/000000350003.jpg')

# resize and normalize input data
in_data = preprocess(input_image)

# check nef radix from quantization data
radix = ktc.get_radix(img_list)

# nef inference
out_data = ktc.kneron_inference([in_data], nef_file=nef_model_path, radix=radix)

# nef output data processing
det_res = postprocess(out_data, [input_image.size[1], input_image.size[0]])
print(det_res)

The result will be displayed on your terminal like this:

(array([[258.51468, 467.71683, 293.07394, 529.15967]], dtype=float32), array([0.8253723], dtype=float32), array([2], dtype=int32))

Note: the NEF model results should be exactly the same as the BIE model results.

Step 9: Prepare host_lib (do not do this inside the toolchain docker)

To run NEF on KL520, we need help from host_lib:

  1. Connect KL520 USB dongle and USB camera to your computer
  2. git clone https://github.com/kneron/host_lib.git
  3. Follow the instructions in the GitHub repository to set up the environment

Step 10: Run our YOLO NEF on KL520 with host_lib

We use the example code provided in host_lib to run our YOLO NEF.

  1. Replace host_lib/input_models/KL520/tiny_yolo_v3/models_520.nef with our YOLO NEF.
  2. Modify host_lib/python/examples_kl520/cam_dme_serial_post_host_yolo.py line 29. Change model input size from (224,224) to (416,416)

Figure 2. modify input_size in example

  3. Modify preprocess config from "Kneron" mode to "Yolo" mode

Figure 3. modify preprocess method in example

  4. Run example cam_dme_serial_post_host_yolo.py
    cd host_lib/python
    python main.py -t KL520-cam_dme_serial_post_host_yolo

Then you should see a window pop up, showing the YOLO NEF detection result from the camera:

Figure 4. detection result

Appendix

The whole model conversion process from the Keras h5 model to NEF (Steps 1-8) can be combined into one Python script:

import ktc
import os
import onnx
from PIL import Image
import numpy as np

###  post process function  ###
import tensorflow as tf
import pathlib
import sys
sys.path.append(str(pathlib.Path("keras_yolo3").resolve()))
from yolo3.model import yolo_eval

def postprocess(inf_results, ori_image_shape):
    tensor_data = [tf.convert_to_tensor(data, dtype=tf.float32) for data in inf_results]

    # get anchor info
    anchors_path = "/data1/keras_yolo3/model_data/tiny_yolo_anchors.txt"
    with open(anchors_path) as f:
        anchors = f.readline()
    anchors = [float(x) for x in anchors.split(',')]
    anchors = np.array(anchors).reshape(-1, 2)

    # post process
    num_classes = 80
    boxes, scores, classes = yolo_eval(tensor_data, anchors, num_classes, ori_image_shape)
    with tf.Session() as sess:
        boxes = boxes.eval()
        scores = scores.eval()
        classes = classes.eval()

    return boxes, scores, classes

###  pre process function  ###
from yolo3.utils import letterbox_image

def preprocess(pil_img):
    model_input_size = (416, 416)  # to match our model input size when converting
    boxed_image = letterbox_image(pil_img, model_input_size)
    np_data = np.array(boxed_image, dtype='float32')

    np_data /= 255.
    return np_data


# convert h5 model to onnx
m = ktc.onnx_optimizer.keras2onnx_flow("/data1/yolo.h5", input_shape = [1,416,416,3])
m = ktc.onnx_optimizer.onnx2onnx_flow(m)
onnx.save(m, '/data1/yolo.opt.onnx')


# npu(only) performance simulation
km = ktc.ModelConfig(19, "0001", "520", onnx_model=m)
eval_result = km.evaluate()
print("\nNpu performance evaluation result:\n" + str(eval_result))


## onnx model check
input_image = Image.open('/data1/000000350003.jpg')
in_data = preprocess(input_image)
out_data = ktc.kneron_inference([in_data], onnx_file="/data1/yolo.opt.onnx", input_names=["input_1_o0"])
det_res = postprocess(out_data, [input_image.size[1], input_image.size[0]])
print(det_res)

# load and normalize all image data from folder
img_list = []
for (dir_path, _, file_names) in os.walk("/data1/test_image10"):
    for f_n in file_names:
        fullpath = os.path.join(dir_path, f_n)
        print("processing image: " + fullpath)

        image = Image.open(fullpath)
        img_data = preprocess(image)
        img_list.append(img_data)


# fix point analysis
bie_model_path = km.analysis({"input_1_o0": img_list})
print("\nFix point analysis done. Save bie model to '" + str(bie_model_path) + "'")


# bie model check
input_image = Image.open('/data1/000000350003.jpg')
in_data = preprocess(input_image)
radix = ktc.get_radix(img_list)
out_data = ktc.kneron_inference([in_data], bie_file=bie_model_path, input_names=["input_1_o0"], radix=radix)
det_res = postprocess(out_data, [input_image.size[1], input_image.size[0]])
print(det_res)


# compile
nef_model_path = ktc.compile([km])
print("\nCompile done. Save Nef file to '" + str(nef_model_path) + "'")

# nef model check
input_image = Image.open('/data1/000000350003.jpg')
in_data = preprocess(input_image)
radix = ktc.get_radix(img_list)
out_data = ktc.kneron_inference([in_data], nef_file=nef_model_path, radix=radix)
det_res = postprocess(out_data, [input_image.size[1], input_image.size[0]])
print(det_res)