Tutorial - deploy YOLOv5-keypoint with ncnn #4983

changshanzhao · 2023-08-26T03:40:08Z


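Overview

This post builds on nihui's YOLOv5 deployment tutorial; the main change is to how the YOLOv5 output is handled. Each output row of the JLU-wind model has 16 values: the first 10 are the coordinates of 5 keypoints, the 11th is the bbox confidence, and the last 5 carry the class information.

The output contains no bbox, but the bounding rectangle of the 5 keypoints can be used as one.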
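
As a rough illustration of that row layout, here is a minimal sketch in plain C++ with no ncnn or OpenCV dependency. The names `Keypoints5`, `Box`, `read_keypoints`, `bounding_box` and the `k...Index` constants are hypothetical and only serve this example; the split of the last 5 values into 2 + 3 is an assumption taken from the `argmax` calls in the post-processing code.

```cpp
#include <algorithm>
#include <array>

// Assumed layout of one 16-value output row (offsets are illustrative names).
constexpr int kNumKeypoints = 5;  // values 0..9: x0 y0 x1 y1 ... x4 y4
constexpr int kConfIndex = 10;    // value 10: box confidence
constexpr int kColorIndex = 11;   // values 11..12: 2 color scores (assumed split)
constexpr int kClassIndex = 13;   // values 13..15: 3 class scores (assumed split)
constexpr int kRowSize = 16;

struct Keypoints5 {
    std::array<float, kNumKeypoints> x;
    std::array<float, kNumKeypoints> y;
};

struct Box {
    float x, y, w, h;
};

// Split the first 10 values of a row into 5 (x, y) pairs.
Keypoints5 read_keypoints(const float* row) {
    Keypoints5 k{};
    for (int i = 0; i < kNumKeypoints; i++) {
        k.x[i] = row[2 * i];
        k.y[i] = row[2 * i + 1];
    }
    return k;
}

// Axis-aligned bounding rectangle of the 5 keypoints, used in place of a bbox.
Box bounding_box(const Keypoints5& k) {
    const auto xs = std::minmax_element(k.x.begin(), k.x.end());
    const auto ys = std::minmax_element(k.y.begin(), k.y.end());
    return Box{*xs.first, *ys.first, *xs.second - *xs.first, *ys.second - *ys.first};
}
```

For a row whose keypoints are (10,20), (50,20), (50,60), (10,60), (30,40), `bounding_box` gives x=10, y=20, width=40, height=40, and the confidence is simply `row[kConfIndex]`.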

setup yolov5-KeyPoint pytorch

# if you need 5 keypoints
git clone https://github.com/changshanzhao/JLU-wind.git
# else if you need 4 keypoints
git clone https://github.com/changshanzhao/YOLO-armor.git
# install requirements
pip install -r requirements.txt --user

export torchscript and convert it to ncnn via pnnx

# export to torchscript, result saved to yolov5s.torchscript
python export.py --weights yolov5s.pt --include torchscript

# download latest pnnx from https://github.com/pnnx/pnnx/releases
wget https://github.com/pnnx/pnnx/releases/download/20230217/pnnx-20230217-ubuntu.zip
unzip pnnx-20230217-ubuntu.zip

# convert torchscript to pnnx and ncnn, result saved to yolov5s.ncnn.param yolov5s.ncnn.bin
./pnnx-20230217-ubuntu/pnnx yolov5s.torchscript inputshape=[1,3,640,640]

ncnn inference

// load ncnn model, auto-enable vulkan acceleration
ncnn::Net yolov5;
yolov5.opt.use_vulkan_compute = true;
yolov5.load_param("yolov5s.ncnn.param");
yolov5.load_model("yolov5s.ncnn.bin");

const int target_size = 640;
const float prob_threshold = 0.25f;
const float nms_threshold = 0.45f;

// load image, resize and pad to 640x640
cv::Mat bgr = cv::imread("/home/nihui/nbs.jpg", cv::IMREAD_COLOR);
const int img_w = bgr.cols;
const int img_h = bgr.rows;

// solve resize scale
int w = img_w;
int h = img_h;
float scale = 1.f;
if (w > h)
{
    scale = (float)target_size / w;
    w = target_size;
    h = h * scale;
}
else
{
    scale = (float)target_size / h;
    h = target_size;
    w = w * scale;
}

// construct ncnn::Mat from image pixel data, swap order from bgr to rgb
ncnn::Mat in = ncnn::Mat::from_pixels_resize(bgr.data, ncnn::Mat::PIXEL_BGR2RGB, img_w, img_h, w, h);

// pad to target_size rectangle
const int wpad = target_size - w;
const int hpad = target_size - h;
ncnn::Mat in_pad;
ncnn::copy_make_border(in, in_pad, hpad / 2, hpad - hpad / 2, wpad / 2, wpad - wpad / 2, ncnn::BORDER_CONSTANT, 114.f);

// apply yolov5 pre process, that is to normalize 0~255 to 0~1
const float norm_vals[3] = { 1 / 255.f, 1 / 255.f, 1 / 255.f };
in_pad.substract_mean_normalize(0, norm_vals);

// yolov5 model inference
ncnn::Extractor ex = yolov5.create_extractor();
ex.input("in0", in_pad);
ncnn::Mat out;
ex.extract("out0", out);

// the out blob would be a 2-dim tensor with w=16 h=25200
//
//        |x0 y0 x1 y1 x2 y2 x3 y3 x4 y4|box score(1)|color(2)|  class(3)  |
//         ----------------------------- ------------ -------- ------------
//        |53 50 70 50 70 80 53 80 61 65|    0.11    |0.1 0.0 |0.1 0.0 0.5 |
//   all /|              .              |      .     |    .   |      .     |
//  boxes |46 40 38 40 38 44 46 44 42 42|    0.95    |0.0 0.9 |0.0 0.9 0.0 |
// (25200)|              .              |      .     |    .   |      .     |
//       \|              .              |      .     |    .   |      .     |
//         ----------------------------- ------------ -------- ------------
//
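
The resize-and-pad arithmetic in the inference code above can be checked on its own, without ncnn or OpenCV. The sketch below repeats the same computation in a small helper; `Letterbox` and `compute_letterbox` are hypothetical names introduced only for this example.

```cpp
// Result of fitting an image into a target_size x target_size square.
struct Letterbox {
    float scale;     // resized size / original size
    int w, h;        // image size after resizing
    int wpad, hpad;  // total padding needed in each direction
    int left, top;   // padding placed before the image (wpad / 2, hpad / 2)
};

// Same arithmetic as the inference snippet: scale the longer side to
// target_size, then pad the shorter side up to target_size.
Letterbox compute_letterbox(int img_w, int img_h, int target_size) {
    Letterbox lb{};
    if (img_w > img_h) {
        lb.scale = static_cast<float>(target_size) / img_w;
        lb.w = target_size;
        lb.h = static_cast<int>(img_h * lb.scale);
    } else {
        lb.scale = static_cast<float>(target_size) / img_h;
        lb.h = target_size;
        lb.w = static_cast<int>(img_w * lb.scale);
    }
    lb.wpad = target_size - lb.w;
    lb.hpad = target_size - lb.h;
    lb.left = lb.wpad / 2;
    lb.top = lb.hpad / 2;
    return lb;
}
```

For a 1280x720 image and a target size of 640, this gives a scale of 0.5, a resized image of 640x360, and 280 rows of padding split as 140 above and 140 below. The same `scale`, `wpad` and `hpad` values are what the post-processing step needs to map detections back to the original image.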

implement yolov5 post-process in cpp

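#include <algorithm> // std::max, std::min
#include <utility>   // std::forward
#include <vector>

// index of the largest value in ptr[0..len)
static inline int argmax(const float *ptr, int len) {
    int max_arg = 0;
    for (int i = 1; i < len; i++) {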
        if (ptr[i] > ptr[max_arg]) max_arg = i;
    }
    return max_arg;
}

// fold a binary function over all arguments, right to left (needs C++17 for `if constexpr`)
template<class F, class T, class ...Ts>
T reduce(F &&func, T x, Ts... xs) {
    if constexpr (sizeof...(Ts) > 0){
        return func(x, reduce(std::forward<F>(func), xs...));
    } else {
        return x;
    }
}

template<class T, class ...Ts>
T reduce_max(T x, Ts... xs) {
    return reduce([](auto &&a, auto &&b){return std::max(a, b);}, x, xs...);
}

template<class T, class ...Ts>
T reduce_min(T x, Ts... xs) {
    return reduce([](auto &&a, auto &&b){return std::min(a, b);}, x, xs...);
}

struct Object
{
    int cls;
    int color;
    float confidence;
    cv::Point2f ppts[5];
    cv::Rect rect;
};

// enumerate all rows
std::vector<Object> proposals;
for (int i = 0; i < out.h; i++)
{
    const float* ptr = out.row(i);

    // row layout: 5 keypoints (10 values), box score, 2 color scores, 3 class scores
    cv::Point2f ppts[5];
    ppts[0] = cv::Point2f(ptr[0], ptr[1]);
    ppts[1] = cv::Point2f(ptr[2], ptr[3]);
    ppts[2] = cv::Point2f(ptr[4], ptr[5]);
    ppts[3] = cv::Point2f(ptr[6], ptr[7]);
    ppts[4] = cv::Point2f(ptr[8], ptr[9]);

    // the model outputs no bbox, so use the bounding rectangle of the 5 keypoints
    cv::Rect2f bbox1;
    bbox1.x = reduce_min(ptr[0], ptr[2], ptr[4], ptr[6], ptr[8]);
    bbox1.y = reduce_min(ptr[1], ptr[3], ptr[5], ptr[7], ptr[9]);
    bbox1.width = reduce_max(ptr[0], ptr[2], ptr[4], ptr[6], ptr[8]) - bbox1.x;
    bbox1.height = reduce_max(ptr[1], ptr[3], ptr[5], ptr[7], ptr[9]) - bbox1.y;
    const float box_score = ptr[10];
    int color = argmax(ptr + 11, 2);
    int cls = argmax(ptr + 13, 3);
    // the box score alone is used as the confidence; no class score is multiplied in
    float confidence = box_score;

    // drop candidates whose confidence is below prob_threshold
    if (confidence < prob_threshold)
        continue;

    // collect candidates
    Object obj;
    obj.confidence = confidence;
    obj.rect = bbox1;
    obj.color = color;
    obj.cls = cls;
    // C arrays cannot be assigned, so copy the keypoints one by one
    for (int k = 0; k < 5; k++)
        obj.ppts[k] = ppts[k];
    proposals.push_back(obj);
}

// sort all candidates by score from highest to lowest
// (qsort_descent_inplace, nms_sorted_bboxes and draw_objects are not defined in this post;
// presumably they are adapted from ncnn's examples/yolov5.cpp to this Object struct)
qsort_descent_inplace(proposals);

// apply non max suppression
std::vector<int> picked;
nms_sorted_bboxes(proposals, picked, nms_threshold);

// collect final result after nms
std::vector<Object> objects;
const int count = picked.size();
objects.resize(count);
for (int i = 0; i < count; i++)
{
    objects[i] = proposals[picked[i]];

    // adjust offset to original unpadded
    float x0 = (objects[i].rect.x - (wpad / 2)) / scale;
    float y0 = (objects[i].rect.y - (hpad / 2)) / scale;
    float x1 = (objects[i].rect.x + objects[i].rect.width - (wpad / 2)) / scale;
    float y1 = (objects[i].rect.y + objects[i].rect.height - (hpad / 2)) / scale;

    // clip
    x0 = std::max(std::min(x0, (float)(img_w - 1)), 0.f);
    y0 = std::max(std::min(y0, (float)(img_h - 1)), 0.f);
    x1 = std::max(std::min(x1, (float)(img_w - 1)), 0.f);
    y1 = std::max(std::min(y1, (float)(img_h - 1)), 0.f);

    objects[i].rect.x = x0;
    objects[i].rect.y = y0;
    objects[i].rect.width = x1 - x0;
    objects[i].rect.height = y1 - y0;
}

// draw result on image
draw_objects(bgr, objects);
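
The final loop above converts only `rect` back to original-image coordinates; the keypoints stored in `ppts` appear to stay in the padded 640x640 frame. If `draw_objects` expects keypoints in original-image coordinates (an assumption, since that function is not shown here), the same offset-and-scale step could be applied to each keypoint. A minimal sketch, where `Point` is a stand-in for `cv::Point2f` and `unpad_point` is a hypothetical helper:

```cpp
#include <algorithm>

// Stand-in for cv::Point2f so the sketch compiles without OpenCV.
struct Point {
    float x, y;
};

// Undo the padding offset and the resize scale, then clip to the image,
// mirroring what the loop above does for the corners of rect.
Point unpad_point(Point p, int wpad, int hpad, float scale, int img_w, int img_h) {
    float x = (p.x - (wpad / 2)) / scale;
    float y = (p.y - (hpad / 2)) / scale;
    x = std::max(std::min(x, (float)(img_w - 1)), 0.f);
    y = std::max(std::min(y, (float)(img_h - 1)), 0.f);
    return Point{x, y};
}
```

With a 1280x720 source image (scale 0.5, `hpad` 280), the padded-frame point (320, 320) maps back to (640, 360). Inside the final loop this would be applied to each `objects[i].ppts[k]` before drawing.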

changshanzhao · 2023-08-26T03:51:59Z


2 replies

changshanzhao Aug 26, 2023
Author

This is the result that can be achieved; the keypoint regression accuracy is quite good.

Hideousmon Oct 9, 2023

RMer！ 😆
