models/public: new background matting models (openvinotoolkit#3054)
* Add description for new models

* Minor fix

* Fix name of argument

Co-authored-by: Ekaterina Aidova <[email protected]>

* Update index.md

* Update device_support.md

* Fix model.yml

* Remove trailing space

* Update task types in model_downloader

* Fix link

* Fix input shapes

* Add AC configs and update documentation

* Add support of background-matting-v2 model

* Update

* Add another background-matting model

* Update

* Update documentation in AC

* Update index.md and name of model

* Fix annotation converter

* Update metrics

* Fix scale

* Fixes

* Pylint errors fix

* Add layout to inputs

Co-authored-by: Ekaterina Aidova <[email protected]>

* Add layout to inputs

Co-authored-by: Ekaterina Aidova <[email protected]>

* Fix year

Co-authored-by: Ekaterina Aidova <[email protected]>

* Update description in model.yml

* Rename to robust-video-matting-mobilenetv3

* Minor fix

Co-authored-by: Ekaterina Aidova <[email protected]>

* Bug fix

Co-authored-by: Ekaterina Aidova <[email protected]>

* Add output verification

* Pylint error fix

* Fix dataset_definitions.yml

* Fix converter

* Add link to demo

Co-authored-by: Anna Grebneva <[email protected]>

* Add link to demo

Co-authored-by: Anna Grebneva <[email protected]>

* Add models to demo and tests

* Fix framework

Co-authored-by: Ekaterina Aidova <[email protected]>

* Fix framework

Co-authored-by: Ekaterina Aidova <[email protected]>

* Fix model name in readme

* Fix reference metric

Co-authored-by: Anna Grebneva <[email protected]>

* Fix reference metric

Co-authored-by: Anna Grebneva <[email protected]>

* Fix reference metric

Co-authored-by: Anna Grebneva <[email protected]>

* Fix reference metric

Co-authored-by: Anna Grebneva <[email protected]>

* Fix reference metric

Co-authored-by: Anna Grebneva <[email protected]>

* Fix reference metric

Co-authored-by: Anna Grebneva <[email protected]>

* Fix reference metric

Co-authored-by: Anna Grebneva <[email protected]>

* Remove redundant metric from readme

Co-authored-by: Ekaterina Aidova <[email protected]>
Co-authored-by: Anna Grebneva <[email protected]>
3 people authored Feb 15, 2022
1 parent 8011247 commit 29ab2a8
Showing 25 changed files with 1,122 additions and 9 deletions.
15 changes: 15 additions & 0 deletions data/dataset_definitions.yml
@@ -1,4 +1,19 @@
datasets:
  - name: HumanMattingClips120
    annotation_conversion:
      converter: background_matting_sequential
      images_dir: HumanMattingClips120
      masks_dir: HumanMattingClips120
      backgrounds_dir: HumanMattingClips120
      image_prefix: com/
      mask_prefix: fgr/
      background_prefix: bgr/
      with_background: True
      with_alpha: True
    annotation: human_matting_120.pickle
    dataset_meta: human_matting_120_meta.json
    data_source: HumanMattingClips120

  - name: ms_coco_mask_rcnn
    annotation_conversion:
      converter: mscoco_mask_rcnn
2 changes: 1 addition & 1 deletion demos/background_subtraction_demo/python/README.md
@@ -78,7 +78,7 @@ omz_converter --list models.lst
* instance-segmentation-person-????
* yolact-resnet50-fpn-pytorch
* background-matting-mobilenetv2
-* robust-video-matting
+* robust-video-matting-mobilenetv3

> **NOTE**: Refer to the tables [Intel's Pre-Trained Models Device Support](../../../models/intel/device_support.md) and [Public Pre-Trained Models Device Support](../../../models/public/device_support.md) for details on model inference support on different devices.
4 changes: 2 additions & 2 deletions demos/background_subtraction_demo/python/models.lst
@@ -1,5 +1,5 @@
# This file can be used with the --list option of the model downloader.
instance-segmentation-person-????
yolact-resnet50-fpn-pytorch
-# TODO: background-matting-mobilenetv2
-# TODO: robust-video-matting
+background-matting-mobilenetv2
+robust-video-matting-mobilenetv3
4 changes: 2 additions & 2 deletions demos/tests/cases.py
@@ -717,8 +717,8 @@ def single_option_cases(key, *args):
}),
single_option_cases('-m',
ModelArg('instance-segmentation-person-0007'),
-# ModelArg('robust-video-matting'),
-# ModelArg('background-matting-mobilenetv2'),
+ModelArg('robust-video-matting-mobilenetv3'),
+ModelArg('background-matting-mobilenetv2'),
ModelArg('yolact-resnet50-fpn-pytorch')),
)),

139 changes: 139 additions & 0 deletions models/public/background-matting-mobilenetv2/README.md
@@ -0,0 +1,139 @@
# background-matting-mobilenetv2

## Use Case and High-Level Description

The `background-matting-mobilenetv2` model is a high-resolution background replacement technique based on
background matting (with a MobileNetV2 backbone), where an additional frame of the background is
captured and used in recovering the alpha matte and the foreground layer. This model is
pre-trained in the PyTorch\* framework and converted to ONNX\* format. More details are provided in
the [paper](https://arxiv.org/abs/2012.07810).
For details, see the [repository](https://github.com/PeterL1n/BackgroundMattingV2).
For details regarding export to ONNX, see [here](https://github.com/DmitriySidnev/BackgroundMattingV2/blob/master/export_onnx.py).

## Specification

| Metric | Value |
|---------------------------------|-------------------------------------------|
| Type | Background_matting |
| GFlops | 6.7419 |
| MParams | 5.052 |
| Source framework | PyTorch\* |

## Accuracy

Accuracy is measured on a dataset composed of foregrounds from the HumanMatting dataset and backgrounds from the OpenImagesV5 dataset, at an input resolution of 1280x720.

| Metric | Original model | Converted model |
| -------------- | -------------- | --------------- |
| Alpha MAD | 4.32 | 4.35 |
| Alpha MSE | 1.0 | 1.0 |
| Alpha GRAD | 2.48 | 2.49 |
| Foreground MSE | 2.7 | 2.69 |

* Alpha MAD - mean absolute difference for the alpha matte.
* Alpha MSE - mean squared error for the alpha matte.
* Alpha GRAD - spatial-gradient metric for the alpha matte.
* Foreground MSE - mean squared error for the foreground, masked by the ground-truth alpha.
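
The following is a rough, simplified sketch of how such metrics can be computed (illustrative only; the Accuracy Checker configuration shipped with the model applies its own masking and scaling, so absolute values may differ):

```python
import numpy as np

def alpha_mad(pred_alpha: np.ndarray, gt_alpha: np.ndarray) -> float:
    # Mean absolute difference between predicted and ground-truth alpha mattes in [0, 1].
    return float(np.abs(pred_alpha - gt_alpha).mean())

def alpha_mse(pred_alpha: np.ndarray, gt_alpha: np.ndarray) -> float:
    # Mean squared error between predicted and ground-truth alpha mattes.
    return float(((pred_alpha - gt_alpha) ** 2).mean())

def foreground_mse(pred_fgr: np.ndarray, gt_fgr: np.ndarray, gt_alpha: np.ndarray) -> float:
    # Foreground error is only meaningful where the ground-truth alpha is non-zero,
    # so the squared error is averaged over the masked region.
    mask = (gt_alpha > 0)[..., np.newaxis]            # H, W, 1 - broadcast over channels
    diff_sq = ((pred_fgr - gt_fgr) ** 2) * mask       # H, W, 3
    return float(diff_sq.sum() / (mask.sum() * pred_fgr.shape[-1] + 1e-8))
```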

## Input

### Original Model

Image, name: `src`, shape: `1, 3, 720, 1280`, format: `B, C, H, W`, where:

- `B` - batch size
- `C` - number of channels
- `H` - image height
- `W` - image width

Expected color order: `RGB`.
Scale factor: 255.

Image, name: `bgr`, shape: `1, 3, 720, 1280`, format: `B, C, H, W`, where:

- `B` - batch size
- `C` - number of channels
- `H` - image height
- `W` - image width

Expected color order: `RGB`.
Scale factor: 255.

### Converted Model

Image, name: `src`, shape: `1, 3, 720, 1280`, format: `B, C, H, W`, where:

- `B` - batch size
- `C` - number of channels
- `H` - image height
- `W` - image width

Expected color order: `BGR`.

Image, name: `bgr`, shape: `1, 3, 720, 1280`, format: `B, C, H, W`, where:

- `B` - batch size
- `C` - number of channels
- `H` - image height
- `W` - image width

Expected color order: `BGR`.

## Output

### Original Model

Alpha matte, name: `pha`, shape: `1, 1, 720, 1280`, format: `B, C, H, W`, where:

- `B` - batch size
- `C` - number of channels
- `H` - image height
- `W` - image width

Foreground, name: `fgr`, shape: `1, 3, 720, 1280`, format: `B, C, H, W`, where:

- `B` - batch size
- `C` - number of channels
- `H` - image height
- `W` - image width

### Converted Model

Alpha matte, name: `pha`, shape: `1, 1, 720, 1280`, format: `B, C, H, W`, where:

- `B` - batch size
- `C` - number of channels
- `H` - image height
- `W` - image width

Foreground, name: `fgr`, shape: `1, 3, 720, 1280`, format: `B, C, H, W`, where:

- `B` - batch size
- `C` - number of channels
- `H` - image height
- `W` - image width
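
For reference, here is a minimal sketch of running the converted model with the OpenVINO Runtime Python API. The model path, image file names, and device choice are assumptions for illustration; since `--scale=255` and `--reverse_input_channels` are folded into the IR during conversion, the converted model accepts BGR frames in the 0-255 range directly.

```python
import cv2
import numpy as np
from openvino.runtime import Core

# Assumed location: omz_converter places the generated IR under the "public" directory.
MODEL_XML = "public/background-matting-mobilenetv2/FP32/background-matting-mobilenetv2.xml"

def to_nchw(image_bgr: np.ndarray) -> np.ndarray:
    # Resize to the expected 1280x720 input resolution and convert HWC -> NCHW with a batch dim.
    resized = cv2.resize(image_bgr, (1280, 720)).astype(np.float32)
    return resized.transpose(2, 0, 1)[np.newaxis]

core = Core()
compiled = core.compile_model(core.read_model(MODEL_XML), "CPU")

src = to_nchw(cv2.imread("person.jpg"))       # frame containing the person
bgr = to_nchw(cv2.imread("background.jpg"))   # captured clean background of the same scene

results = compiled.infer_new_request({"src": src, "bgr": bgr})
pha = results[compiled.output("pha")]         # alpha matte, shape 1, 1, 720, 1280
fgr = results[compiled.output("fgr")]         # foreground,  shape 1, 3, 720, 1280
```

The alpha matte can then be used to composite the recovered foreground over a new background, `I = pha * fgr + (1 - pha) * background`, once all operands are brought to a common value range.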

## Download a Model and Convert it into Inference Engine Format

You can download models and, if necessary, convert them into Inference Engine format using the [Model Downloader and other automation tools](../../../tools/model_tools/README.md) as shown in the examples below.

An example of using the Model Downloader:
```
omz_downloader --name <model_name>
```

An example of using the Model Converter:
```
omz_converter --name <model_name>
```

## Demo usage

The model can be used in the following demos provided by the Open Model Zoo to show its capabilities:

* [Background subtraction Python\* Demo](../../../demos/background_subtraction_demo/python/README.md)

## Legal Information

The original model is distributed under the
[MIT License](https://github.com/DmitriySidnev/BackgroundMattingV2/blob/master/LICENSE).
54 changes: 54 additions & 0 deletions models/public/background-matting-mobilenetv2/accuracy-check.yml
@@ -0,0 +1,54 @@
models:
  - name: background-matting-mobilenetv2

    launchers:
      - framework: openvino
        adapter:
          type: background_matting_with_pha_and_fgr
          alpha_out: pha
          foreground_out: fgr
        inputs:
          - name: src
            type: INPUT
            value: com*
          - name: bgr
            type: INPUT
            value: bgr*

    datasets:
      - name: HumanMattingClips120
        reader: pillow_imread

        preprocessing:
          - type: rgb_to_bgr
          - type: resize
            dst_width: 1280
            dst_height: 720
            use_pillow: True

        metrics:
          - name: alpha_MAD
            type: mad
            prediction_source: pha
            process_type: alpha
            reference: 4.35

          - name: alpha_GRAD
            type: spatial_gradient
            prediction_source: pha
            process_type: alpha
            reference: 2.49

          - name: alpha_MSE
            type: mse_with_mask
            prediction_source: pha
            process_type: alpha
            use_mask: False
            reference: 1.0

          - name: foreground_MSE
            type: mse_with_mask
            prediction_source: fgr
            process_type: image
            use_mask: True
            reference: 2.69
38 changes: 38 additions & 0 deletions models/public/background-matting-mobilenetv2/model.yml
@@ -0,0 +1,38 @@
# Copyright (c) 2022 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

description: >-
  The "background-matting-mobilenetv2" model is a high-resolution background replacement
  technique based on background matting (with a MobileNetV2 backbone), where an additional
  frame of the background is captured and used in recovering the alpha matte and the
  foreground layer. This model is pre-trained in the PyTorch* framework and converted
  to ONNX* format. More details are provided in the paper <https://arxiv.org/abs/2012.07810>.
  For details, see the repository <https://github.com/PeterL1n/BackgroundMattingV2>.
  For details regarding export to ONNX, see here <https://github.com/DmitriySidnev/BackgroundMattingV2/blob/master/export_onnx.py>.
task_type: background_matting
files:
  - name: bgm_mobilenetv2.onnx
    size: 20006715
    checksum: 3fc7bad659149e3e9e0feb4ef531c177ab158e5bbcfdabba9952c3832053f353d23560ff488aeae857245734ef639e6b
    source: https://github.com/DmitriySidnev/BackgroundMattingV2/raw/master/onnx/bgm_mobilenetv2.onnx
model_optimizer_args:
  - --input_shape=[1,3,720,1280],[1,3,720,1280]
  - --input=src,bgr
  - --layout=src(NCHW),bgr(NCHW)
  - --output=pha,fgr
  - --scale=255
  - --reverse_input_channels
  - --input_model=$dl_dir/bgm_mobilenetv2.onnx
framework: onnx
license: https://github.com/DmitriySidnev/BackgroundMattingV2/blob/master/LICENSE
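
The `--scale=255` and `--reverse_input_channels` arguments above fold the original model's input preprocessing into the converted IR. A sketch of the equivalent manual preprocessing the original ONNX model would otherwise require (the helper name and use of OpenCV are assumptions for illustration):

```python
import cv2
import numpy as np

def preprocess_for_original_onnx(frame_bgr: np.ndarray) -> np.ndarray:
    # The exported ONNX model expects RGB input scaled to [0, 1] in NCHW layout;
    # --reverse_input_channels and --scale=255 move exactly these two steps into the IR,
    # so the converted model can consume BGR frames in the 0-255 range directly.
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)         # BGR -> RGB
    rgb = cv2.resize(rgb, (1280, 720)).astype(np.float32)    # W x H = 1280 x 720
    rgb /= 255.0                                             # scale factor 255
    return rgb.transpose(2, 0, 1)[np.newaxis]                # HWC -> NCHW with batch dim
```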
2 changes: 2 additions & 0 deletions models/public/device_support.md
Expand Up @@ -6,6 6,7 @@
| aclnet-int8 | YES | YES | |
| alexnet | YES | YES | YES |
| anti-spoof-mn3 | YES | YES | YES |
| background-matting-mobilenetv2 | YES | | |
| bert-base-ner | YES | YES | |
| brain-tumor-segmentation-0001 | YES | | |
| brain-tumor-segmentation-0002 | YES | | |
@@ -103,6 +104,7 @@
| retinanet-tf | YES | YES | |
| rexnet-v1-x1.0 | YES | YES | |
| rfcn-resnet101-coco-tf | YES | YES | YES |
| robust-video-matting-mobilenetv3 | YES | YES | |
| se-inception | YES | YES | YES |
| se-resnet-50 | YES | YES | YES |
| se-resnext-50 | YES | YES | YES |
11 changes: 11 additions & 0 deletions models/public/index.md
@@ -319,6 +319,17 @@ Named entity recognition (NER) is the task of tagging entities in text with their
| -------------- | -------------- | ------------------------------------------------------ | -------- | ------- | -------- |
| vehicle-reid-0001 | PyTorch\* | [vehicle-reid-0001](./vehicle-reid-0001/README.md) | 96.31%/85.15 % | 2.643 | 2.183 |

## Background matting

Background matting is a method of separating the foreground from the background in an image or video.
Some pixels may belong to both the foreground and the background; such pixels are called partial
or mixed pixels. This distinguishes background matting from segmentation approaches, where the result is a binary mask.
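Formally, each observed pixel is modeled as a blend of a foreground and a background color, `I = alpha * F + (1 - alpha) * B`, and the models below estimate the per-pixel `alpha` (and the foreground), which preserves soft boundaries such as hair.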

| Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
| -------------- | -------------- | ------------------------------------------------------ | -------- | ------- | -------- |
| background-matting-mobilenetv2 | PyTorch\* | [background-matting-mobilenetv2](./background-matting-mobilenetv2/README.md) | 4.32/1.0/2.48/2.7 | 6.7419 | 5.052 |
| robust-video-matting-mobilenetv3 | PyTorch\* | [robust-video-matting-mobilenetv3](./robust-video-matting-mobilenetv3/README.md) | 20.8/15.1/4.42/4.05 | 9.3892 | 3.7363 |

## See Also

* [Open Model Zoo Demos](../../demos/README.md)