How to use the model zoo and your own trained models on a Raspberry Pi with the NCS 2

After Intel acquired Movidius, the second-generation Neural Compute Stick it released the following year broke with the original: it dropped NCSDK support in favor of OpenVINO, and the two toolkits are not interchangeable. At launch the NCS 2 did not support the Raspberry Pi (which has no Intel CPU), but under considerable user pressure Intel released, at the end of last year, a trimmed-down, inference-only OpenVINO build for Raspbian.

The two generations of NCS side by side: the NCS 2 on top, the original NCS below. The second generation carries only the Intel branding, with no Movidius lettering.

The current NCSDK/OpenVINO support matrix:

|  | OpenVINO | NCSDK 1.x | NCSDK 2.x |
| --- | --- | --- | --- |
| NCS (gen 1) | O | O | O |
| NCS 2 | O | X | X |
| ARM platform | O | O | O |
| Intel platform | O | O | O |

Installing OpenVINO for Raspbian

  1. Start from a clean system: download the official Raspbian image for the Raspberry Pi (the latest version at the time of writing is 4.14).

  2. Install everything below. You can paste the commands into a single .sh file and run them in one go; this list is based on https://www.pyimagesearch.com/2019/04/08/openvino-opencv-and-movidius-ncs-on-the-raspberry-pi/
sudo apt-get purge wolfram-engine -y
sudo apt-get purge libreoffice* -y
sudo apt-get clean
sudo apt-get autoremove -y
sudo apt-get update && sudo apt-get upgrade -y
sudo apt-get install build-essential cmake unzip pkg-config -y
sudo apt-get install libjpeg-dev libpng-dev libtiff-dev -y
sudo apt-get install libavcodec-dev libavformat-dev libswscale-dev libv4l-dev -y
sudo apt-get install libxvidcore-dev libx264-dev -y
sudo apt-get install libgtk-3-dev -y
sudo apt-get install libcanberra-gtk* -y
sudo apt-get install libatlas-base-dev gfortran -y
sudo apt-get install python3-dev -y
wget https://bootstrap.pypa.io/get-pip.py
sudo python3 get-pip.py
sudo pip install virtualenv virtualenvwrapper
sudo rm -rf ~/get-pip.py ~/.cache/pip
cd ~
mkdir openvino
virtualenv -p python3 envAI
source ~/envAI/bin/activate
echo "source ~/envAI/bin/activate" >> ~/.bashrc
pip install numpy
pip install "picamera[array]"
pip install imutils
cd ~/envAI/lib/python3.5/site-packages/
# The target of this symlink is extracted in the next step; the link resolves once OpenVINO is unpacked.
ln -s ~/openvino/inference_engine_vpu_arm/python/python3.5/cv2.cpython-35m-arm-linux-gnueabihf.so
echo "source ~/openvino/inference_engine_vpu_arm/bin/setupvars.sh" >> ~/.bashrc
  3. Run the commands below to download and extract OpenVINO (the latest Raspbian build can be found at http://download.01.org/openvinotoolkit). Note that wget occasionally saves an HTML error page instead of the archive; if the file size looks wrong and the archive will not extract, rerun the wget download.

cd openvino
wget http://download.01.org/openvinotoolkit/2018_R5/packages/l_openvino_toolkit_ie_p_2018.5.445.tgz
tar -zxvf l_openvino_toolkit_ie_p_2018.5.445.tgz
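If you want to verify the download before extracting, the gzip magic bytes are a quick check. This small helper is my own addition, not part of the original walkthrough; an HTML error page saved by wget will fail it:

```python
def looks_like_gzip(path):
    """Return True if the file starts with the gzip magic bytes (0x1f 0x8b).

    A real .tgz archive passes this check; the HTML error page that wget
    sometimes saves in its place does not.
    """
    with open(path, "rb") as f:
        return f.read(2) == b"\x1f\x8b"
```

For example, `looks_like_gzip("l_openvino_toolkit_ie_p_2018.5.445.tgz")` should return True before you run tar.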
  4. Edit setupvars.sh:

nano ~/openvino/inference_engine_vpu_arm/bin/setupvars.sh

Replace the installation-path placeholder in the script with the actual OpenVINO path (this is the only change needed): /home/pi/openvino/inference_engine_vpu_arm

  5. Set up the USB rules for the NCS and OpenVINO with root privileges:

sudo usermod -a -G users "$(whoami)"
sh openvino/inference_engine_vpu_arm/install_dependencies/install_NCS_udev_rules.sh
  6. Confirm that OpenCV loads successfully and that it is the build bundled with OpenVINO (the version string contains "openvino"):

(envAI) pi@fruit:~ $ python
Python 3.5.3 (default, Sep 27 2018, 17:25:39)
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> cv2.__version__
'4.0.1-openvino'

Running the Model Zoo samples

The model zoo (https://github.com/opencv/open_model_zoo) provides a large number of pre-trained OpenVINO IR models. Each model comes in an FP16 variant for the Neural Compute Stick (MYRIAD) and an FP32 variant for non-MYRIAD targets.

| Model | CPU | GPU | FPGA,CPU | MYRIAD |
| --- | --- | --- | --- | --- |
| face-detection-adas-0001 | Supported | Supported | Supported | Supported |
| age-gender-recognition-retail-0013 | Supported | Supported | Supported | Supported |
| head-pose-estimation-adas-0001 | Supported | Supported | Supported | Supported |
| emotions-recognition-retail-0003 | Supported | Supported | Supported | Supported |
| facial-landmarks-35-adas-0001 | Supported | Supported | Supported | |
| vehicle-license-plate-detection-barrier-0106 | Supported | Supported | Supported | Supported |
| vehicle-attributes-recognition-barrier-0039 | Supported | Supported | Supported | Supported |
| license-plate-recognition-barrier-0001 | Supported | Supported | Supported | Supported |
| person-detection-retail-0001 | Supported | Supported | Supported | |
| person-vehicle-bike-detection-crossroad-0078 | Supported | Supported | Supported | Supported |
| person-attributes-recognition-crossroad-0200 | Supported | Supported | | |
| person-reidentification-retail-0031 | Supported | Supported | Supported | Supported |
| person-reidentification-retail-0076 | Supported | Supported | Supported | Supported |
| person-reidentification-retail-0079 | Supported | Supported | Supported | Supported |
| road-segmentation-adas-0001 | Supported | Supported | | |
| semantic-segmentation-adas-0001 | Supported | Supported | | |
| person-detection-retail-0013 | Supported | Supported | Supported | Supported |
| face-detection-retail-0004 | Supported | Supported | Supported | Supported |
| face-person-detection-retail-0002 | Supported | Supported | Supported | Supported |
| pedestrian-detection-adas-0002 | Supported | Supported | Supported | |
| vehicle-detection-adas-0002 | Supported | Supported | Supported | Supported |
| pedestrian-and-vehicle-detector-adas-0001 | Supported | Supported | Supported | |
| person-detection-action-recognition-0003 | Supported | Supported | Supported | |
| landmarks-regression-retail-0009 | Supported | Supported | Supported | |
| face-reidentification-retail-0095 | Supported | Supported | | |
| human-pose-estimation-0001 | Supported | Supported | Supported | |
| single-image-super-resolution-0063 | Supported | | | |
| single-image-super-resolution-1011 | Supported | | | |
| single-image-super-resolution-1021 | Supported | | | |
| text-detection-0001 | Supported | Supported | | |

Face detection example

Take the first model, face-detection-adas-0001, as an example. It was converted from Caffe, uses MobileNet as its base CNN, and relies on depthwise convolutions to cut down computation. On 1080p video it can detect faces as small as 90×90 pixels, and for head sizes above 64 px its accuracy reaches 93.1% (see https://github.com/opencv/open_model_zoo/blob/master/intel_models/face-detection-adas-0001/description/face-detection-adas-0001.md).

Steps:

First download the xml (model topology) and bin (weights) files:

wget --no-check-certificate https://download.01.org/openvinotoolkit/2018_R4/open_model_zoo/face-detection-adas-0001/FP16/face-detection-adas-0001.bin
wget --no-check-certificate https://download.01.org/openvinotoolkit/2018_R4/open_model_zoo/face-detection-adas-0001/FP16/face-detection-adas-0001.xml

Then load and run the model with the program below:

import cv2
import time

model_path = "demo/face-detection-adas-0001.xml"    # network topology
weights_path = "demo/face-detection-adas-0001.bin"  # weights

net = cv2.dnn.readNet(model_path, weights_path)
# Run inference on the Neural Compute Stick
net.setPreferableTarget(cv2.dnn.DNN_TARGET_MYRIAD)

camera = cv2.VideoCapture(0)
frameID = 0
start_time = time.time()

while True:
    (grabbed, img) = camera.read()
    if not grabbed:
        break
    img = cv2.resize(img, (450, 320))
    frame = img.copy()

    # Prepare the input blob and run inference (the model expects 672x384 input)
    blob = cv2.dnn.blobFromImage(frame, size=(672, 384), ddepth=cv2.CV_8U)
    net.setInput(blob)
    out = net.forward()

    # Each detection row: [image_id, label, confidence, xmin, ymin, xmax, ymax]
    for detection in out.reshape(-1, 7):
        confidence = float(detection[2])
        xmin = int(detection[3] * frame.shape[1])
        ymin = int(detection[4] * frame.shape[0])
        xmax = int(detection[5] * frame.shape[1])
        ymax = int(detection[6] * frame.shape[0])
        if confidence > 0.5:
            cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), color=(0, 255, 0))

    cv2.imshow("FRAME", frame)
    frameID += 1
    fps = frameID / (time.time() - start_time)
    print("FPS:", fps)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

Notice that with the OpenCV build bundled with OpenVINO, the single crucial line net.setPreferableTarget(cv2.dnn.DNN_TARGET_MYRIAD) is all it takes to run an OpenVINO IR model on the NCS through the dnn module. With a 720p webcam the frame rate is about 3.5 to 3.6 FPS, and face detection is quite accurate.
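The post-processing loop above can also be factored into a plain-Python helper; the function name and the rows-as-a-list input (what `out.reshape(-1, 7)` yields) are my own framing of the same logic:

```python
def parse_detections(rows, frame_w, frame_h, conf_threshold=0.5):
    """Convert raw SSD-style detection rows into pixel-space boxes.

    Each row is [image_id, label, confidence, xmin, ymin, xmax, ymax],
    with coordinates normalized to 0..1 as in the script above.
    """
    boxes = []
    for det in rows:
        confidence = float(det[2])
        if confidence > conf_threshold:
            boxes.append((confidence,
                          int(det[3] * frame_w), int(det[4] * frame_h),
                          int(det[5] * frame_w), int(det[6] * frame_h)))
    return boxes
```

In the script it would be called as `parse_detections(out.reshape(-1, 7), frame.shape[1], frame.shape[0])`.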

Age and gender prediction example

The second example uses the age-gender-recognition model, also converted from Caffe. Because the training set contains no children, the model is only suitable for subjects between 18 and 75 years old. The mean error for age is 6.99 years, and gender accuracy is 95.8% (see http://docs.openvinotoolkit.org/latest/_age_gender_recognition_retail_0013_description_age_gender_recognition_retail_0013.html).
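Per the official model description, the age output encodes age divided by 100 and the probability output holds [female, male] scores. A small decoder along those lines (the man/woman labels mirror the script below; the function itself is my own sketch, not part of the original post):

```python
def decode_age_gender(age_raw, gender_probs):
    """Decode the raw outputs of age-gender-recognition-retail-0013.

    age_raw: the age output, a value normalized to age / 100.
    gender_probs: the two-element probability output, [female, male].
    """
    age = int(age_raw * 100)
    gender = "man" if gender_probs[1] > 0.5 else "woman"
    return age, gender
```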

Steps:

First download the xml (model topology) and bin (weights) files:

wget --no-check-certificate https://download.01.org/openvinotoolkit/2018_R5/open_model_zoo/age-gender-recognition-retail-0013/FP16/age-gender-recognition-retail-0013.bin
wget --no-check-certificate https://download.01.org/openvinotoolkit/2018_R5/open_model_zoo/age-gender-recognition-retail-0013/FP16/age-gender-recognition-retail-0013.xml

Then load and run the model with the program below:

import cv2
import time

model_path = "demo/age-gender-recognition-retail-0013.xml"
weights_path = "demo/age-gender-recognition-retail-0013.bin"

net = cv2.dnn.readNet(model_path, weights_path)
# Run inference on the Neural Compute Stick
net.setPreferableTarget(cv2.dnn.DNN_TARGET_MYRIAD)

# Haar cascade used to locate faces before classification
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
cascade_scale = 1.2
cascade_neighbors = 6
minFaceSize = (30, 30)

def getFaces(img):
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(
        gray,
        scaleFactor=cascade_scale,
        minNeighbors=cascade_neighbors,
        minSize=minFaceSize,
        flags=cv2.CASCADE_SCALE_IMAGE
    )
    bboxes = []
    for (x, y, w, h) in faces:
        if w > minFaceSize[0] and h > minFaceSize[1]:
            bboxes.append((x, y, w, h))
    return bboxes

camera = cv2.VideoCapture(0)
frameID = 0
start_time = time.time()

while True:
    (grabbed, img) = camera.read()
    if not grabbed:
        break
    img = cv2.resize(img, (550, 400))
    frame = img.copy()
    faces = getFaces(frame)
    i = 0
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        facearea = frame[y:y + h, x:x + w]
        # The model expects a 62x62 face crop
        blob = cv2.dnn.blobFromImage(facearea, size=(62, 62), ddepth=cv2.CV_8U)
        net.setInput(blob)
        out = net.forward()
        num_age = out[0][0][0][0]
        num_sex = out[0][1][0][0]
        age = int(num_age * 100)
        sex = "man" if num_sex > 0.5 else "woman"
        txt = "sex: {}, age: {}".format(sex, age)
        if age <= 1:
            txt = "sex: {}, age: {}".format(sex, '?')
        # Alternate label placement (top/bottom) so adjacent faces stay readable
        if i % 2 == 0:
            cv2.putText(frame, txt, (int(x), int(y)), cv2.FONT_HERSHEY_SIMPLEX, 0.65, (255, 255, 0), 2)
        else:
            cv2.putText(frame, txt, (int(x), int(y + h)), cv2.FONT_HERSHEY_SIMPLEX, 0.65, (255, 255, 0), 2)
        i += 1

    cv2.imshow("FRAME", frame)
    frameID += 1
    fps = frameID / (time.time() - start_time)
    print("FPS:", fps)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

In practice this model's results were not very satisfactory. The main reason is that the program above does not align the faces after cropping them; a secondary one is that the training dataset likely contains too few East Asian faces.
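Proper landmark-based alignment is the real fix, but a cheap partial mitigation is to pad the Haar crop with a margin so the model sees the whole head rather than a tight face box. This helper is my own suggestion, not part of the original script:

```python
def expand_box(x, y, w, h, frame_w, frame_h, margin=0.2):
    """Grow an (x, y, w, h) box by `margin` on each side, clamped to the frame."""
    dx, dy = int(w * margin), int(h * margin)
    nx, ny = max(0, x - dx), max(0, y - dy)
    nw = min(frame_w, x + w + dx) - nx
    nh = min(frame_h, y + h + dy) - ny
    return nx, ny, nw, nh
```

It would slot in right before the crop: `nx, ny, nw, nh = expand_box(x, y, w, h, frame.shape[1], frame.shape[0])` followed by `facearea = frame[ny:ny + nh, nx:nx + nw]`.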

Emotion recognition example

The third example is the emotion-recognition model, trained on five emotions from the AffectNet dataset (http://mohammadmahoor.com/affectnet/): neutral, happy, sad, surprise, and anger, with an accuracy of about 70.20% (see http://docs.openvinotoolkit.org/latest/_emotions_recognition_retail_0003_description_emotions_recognition_retail_0003.html).
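The model emits five probabilities in a fixed order; a tiny helper (my own addition, assuming the neutral/happy/sad/surprise/anger ordering cited above) turns that vector into a single label:

```python
EMOTIONS = ("neutral", "happy", "sad", "surprise", "anger")

def top_emotion(probs):
    """Return (label, probability) for the most likely of the five emotions."""
    best = max(range(len(EMOTIONS)), key=lambda i: probs[i])
    return EMOTIONS[best], probs[best]
```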

Steps:

First download the xml (model topology) and bin (weights) files:

wget --no-check-certificate https://download.01.org/openvinotoolkit/2018_R5/open_model_zoo/emotions-recognition-retail-0003/FP16/emotions-recognition-retail-0003.bin
wget --no-check-certificate https://download.01.org/openvinotoolkit/2018_R5/open_model_zoo/emotions-recognition-retail-0003/FP16/emotions-recognition-retail-0003.xml

Then load and run the model with the program below:

import cv2
import time

model_path = "demo/emotions-recognition-retail-0003.xml"
weights_path = "demo/emotions-recognition-retail-0003.bin"

net = cv2.dnn.readNet(model_path, weights_path)
# Run inference on the Neural Compute Stick
net.setPreferableTarget(cv2.dnn.DNN_TARGET_MYRIAD)

# Haar cascade used to locate faces before classification
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
cascade_scale = 1.2
cascade_neighbors = 6
minFaceSize = (30, 30)

def getFaces(img):
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(
        gray,
        scaleFactor=cascade_scale,
        minNeighbors=cascade_neighbors,
        minSize=minFaceSize,
        flags=cv2.CASCADE_SCALE_IMAGE
    )
    bboxes = []
    for (x, y, w, h) in faces:
        if w > minFaceSize[0] and h > minFaceSize[1]:
            bboxes.append((x, y, w, h))
    return bboxes

camera = cv2.VideoCapture(0)
frameID = 0
start_time = time.time()

while True:
    (grabbed, img) = camera.read()
    if not grabbed:
        break
    img = cv2.resize(img, (550, 400))
    frame = img.copy()
    faces = getFaces(frame)
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 255, 255), 1)
        facearea = frame[y:y + h, x:x + w]
        # Prepare the input blob and run inference (the model expects a 64x64 face crop)
        blob = cv2.dnn.blobFromImage(facearea, size=(64, 64), ddepth=cv2.CV_8U)
        net.setInput(blob)
        out = net.forward()
        # Output order: neutral, happy, sad, surprise, anger
        neutral = int(out[0][0][0][0] * 100)
        happy = int(out[0][1][0][0] * 100)
        sad = int(out[0][2][0][0] * 100)
        surprise = int(out[0][3][0][0] * 100)
        anger = int(out[0][4][0][0] * 100)

        # Draw the percentages in one column and the labels beside them
        values = "{}%\n{}%\n{}%\n{}%\n{}%".format(neutral, happy, sad, surprise, anger)
        y0, dy = y, 35
        for ii, txt in enumerate(values.split('\n')):
            yy = y0 + ii * dy
            cv2.putText(frame, txt, (x, yy), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (255, 0, 0), 2)
        labels = "Neutral:\nHappy:\nSad:\nSurprise:\nAnger:"
        for ii, txt in enumerate(labels.split('\n')):
            yy = y0 + ii * dy
            cv2.putText(frame, txt, (x + 55, yy), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2)

    # Show the annotated frame
    cv2.imshow("FRAME", frame)
    frameID += 1
    fps = frameID / (time.time() - start_time)
    print("FPS:", fps)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

Using your own model

Finally, let us run a model we trained ourselves on the NCS. If OpenVINO was installed on the Raspberry Pi as described above, the procedure is identical to using the standard OpenCV DNN module: after loading the model, add a single line, model.setPreferableTarget(cv2.dnn.DNN_TARGET_MYRIAD). (DNN_TARGET_MYRIAD runs inference on the NCS; DNN_TARGET_CPU runs it on the CPU.)

Through the NCS, SSD MobileNet V2 reaches more than 3.5 fps on the Raspberry Pi, which comes close to real-time detection.
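Because the MYRIAD plugin runs FP16 IR files while CPU inference uses FP32, it helps to pick the right pair of files per target. The directory layout below (FP16/FP32 subfolders, mirroring how the model zoo downloads are organized) and the helper name are my own assumptions:

```python
def ir_paths(model_dir, model_name, device="MYRIAD"):
    """Return the (xml, bin) file pair for an OpenVINO IR model.

    The NCS (MYRIAD) plugin expects FP16 weights; CPU inference uses FP32.
    """
    precision = "FP16" if device == "MYRIAD" else "FP32"
    base = "{}/{}/{}".format(model_dir, precision, model_name)
    return base + ".xml", base + ".bin"
```

The returned pair feeds straight into cv2.dnn.readNet(xml, bin), after which net.setPreferableTarget(cv2.dnn.DNN_TARGET_MYRIAD) sends inference to the stick, exactly as in the samples above.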

Speed of SSD MobileNet + OpenVINO + NCS 2 on the Raspberry Pi:

| 1080P (1920×1080) | 720P (1280×720) | VGA (640×480) |
| --- | --- | --- |
| 2.796 fps | 3.712 fps | 4.750 fps |