models/public: new background matting models (openvinotoolkit#3054)
* Add description for new models

* Minor fix

* Fix name of argument

Co-authored-by: Ekaterina Aidova <[email protected]>

* Update index.md

* Update device_support.md

* Fix model.yml

* Remove trailing space

* Update task types in model_downloader

* Fix link

* Fix input shapes

* Add AC configs and update documentation

* Add support of background-matting-v2 model

* Update

* Add another background-matting model

* Update

* Update documentation in AC

* Update index.md and name of model

* Fix annotation converter

* Update metrics

* Fix scale

* Fixes

* Pylint errors fix

* Add layout to inputs

Co-authored-by: Ekaterina Aidova <[email protected]>

* Add layout to inputs

Co-authored-by: Ekaterina Aidova <[email protected]>

* Fix year

Co-authored-by: Ekaterina Aidova <[email protected]>

* Update description in model.yml

* Rename to robust-video-matting-mobilenetv3

* Minor fix

Co-authored-by: Ekaterina Aidova <[email protected]>

* Bug fix

Co-authored-by: Ekaterina Aidova <[email protected]>

* Add output verification

* Pylint error fix

* Fix dataset_definitions.yml

* Fix converter

* Add link to demo

Co-authored-by: Anna Grebneva <[email protected]>

* Add link to demo

Co-authored-by: Anna Grebneva <[email protected]>

* Add models to demo and tests

* Fix framework

Co-authored-by: Ekaterina Aidova <[email protected]>

* Fix framework

Co-authored-by: Ekaterina Aidova <[email protected]>

* Fix model name in readme

* Fix reference metric

Co-authored-by: Anna Grebneva <[email protected]>

* Fix reference metric

Co-authored-by: Anna Grebneva <[email protected]>

* Fix reference metric

Co-authored-by: Anna Grebneva <[email protected]>

* Fix reference metric

Co-authored-by: Anna Grebneva <[email protected]>

* Fix reference metric

Co-authored-by: Anna Grebneva <[email protected]>

* Fix reference metric

Co-authored-by: Anna Grebneva <[email protected]>

* Fix reference metric

Co-authored-by: Anna Grebneva <[email protected]>

* Remove redundant metric from readme

Co-authored-by: Ekaterina Aidova <[email protected]>
Co-authored-by: Anna Grebneva <[email protected]>
3 people authored Feb 15, 2022
1 parent 8011247 commit 29ab2a8
Showing 25 changed files with 1,122 additions and 9 deletions.
15 changes: 15 additions & 0 deletions data/dataset_definitions.yml
@@ -1,4 +1,19 @@
datasets:
  - name: HumanMattingClips120
    annotation_conversion:
      converter: background_matting_sequential
      images_dir: HumanMattingClips120
      masks_dir: HumanMattingClips120
      backgrounds_dir: HumanMattingClips120
      image_prefix: com/
      mask_prefix: fgr/
      background_prefix: bgr/
      with_background: True
      with_alpha: True
    annotation: human_matting_120.pickle
    dataset_meta: human_matting_120_meta.json
    data_source: HumanMattingClips120

  - name: ms_coco_mask_rcnn
    annotation_conversion:
      converter: mscoco_mask_rcnn
2 changes: 1 addition & 1 deletion demos/background_subtraction_demo/python/README.md
@@ -78,7 +78,7 @@ omz_converter --list models.lst
* instance-segmentation-person-????
* yolact-resnet50-fpn-pytorch
* background-matting-mobilenetv2
-* robust-video-matting
+* robust-video-matting-mobilenetv3

> **NOTE**: Refer to the tables [Intel's Pre-Trained Models Device Support](../../../models/intel/device_support.md) and [Public Pre-Trained Models Device Support](../../../models/public/device_support.md) for details on model inference support on different devices.
4 changes: 2 additions & 2 deletions demos/background_subtraction_demo/python/models.lst
@@ -1,5 +1,5 @@
# This file can be used with the --list option of the model downloader.
instance-segmentation-person-????
yolact-resnet50-fpn-pytorch
-# TODO: background-matting-mobilenetv2
-# TODO: robust-video-matting
+background-matting-mobilenetv2
+robust-video-matting-mobilenetv3
4 changes: 2 additions & 2 deletions demos/tests/cases.py
@@ -717,8 +717,8 @@ def single_option_cases(key, *args):
}),
single_option_cases('-m',
ModelArg('instance-segmentation-person-0007'),
-# ModelArg('robust-video-matting'),
-# ModelArg('background-matting-mobilenetv2'),
+ModelArg('robust-video-matting-mobilenetv3'),
+ModelArg('background-matting-mobilenetv2'),
ModelArg('yolact-resnet50-fpn-pytorch')),
)),

139 changes: 139 additions & 0 deletions models/public/background-matting-mobilenetv2/README.md
@@ -0,0 +1,139 @@
# background-matting-mobilenetv2

## Use Case and High-Level Description

The `background-matting-mobilenetv2` model is a high-resolution background replacement technique based on
background matting (with a MobileNetV2 backbone), where an additional frame of the background is
captured and used in recovering the alpha matte and the foreground layer. This model is
pre-trained in the PyTorch\* framework and converted to ONNX\* format. More details are provided in
the [paper](https://arxiv.org/abs/2012.07810).
For details, see the [repository](https://github.com/PeterL1n/BackgroundMattingV2).
For details regarding export to ONNX, see [here](https://github.com/DmitriySidnev/BackgroundMattingV2/blob/master/export_onnx.py).

## Specification

| Metric | Value |
|---------------------------------|-------------------------------------------|
| Type | Background_matting |
| GFlops | 6.7419 |
| MParams | 5.052 |
| Source framework | PyTorch\* |

## Accuracy

Accuracy is measured on a dataset composed of foregrounds from the HumanMatting dataset and backgrounds from the OpenImagesV5 dataset, at an input resolution of 1280x720.

| Metric | Original model | Converted model |
| -------------- | -------------- | --------------- |
| Alpha MAD | 4.32 | 4.35 |
| Alpha MSE | 1.0 | 1.0 |
| Alpha GRAD | 2.48 | 2.49 |
| Foreground MSE | 2.7 | 2.69 |

* Alpha MAD - mean absolute difference for the alpha matte.
* Alpha MSE - mean squared error for the alpha matte.
* Alpha GRAD - spatial-gradient metric for the alpha matte.
* Foreground MSE - mean squared error for the foreground, masked by the ground-truth alpha.
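
The following is a rough, simplified sketch of how such metrics can be computed (illustrative only; the Accuracy Checker configuration shipped with the model applies its own masking and scaling, so absolute values may differ):

```python
import numpy as np

def alpha_mad(pred_alpha: np.ndarray, gt_alpha: np.ndarray) -> float:
    # Mean absolute difference between predicted and ground-truth alpha mattes in [0, 1].
    return float(np.abs(pred_alpha - gt_alpha).mean())

def alpha_mse(pred_alpha: np.ndarray, gt_alpha: np.ndarray) -> float:
    # Mean squared error between predicted and ground-truth alpha mattes.
    return float(((pred_alpha - gt_alpha) ** 2).mean())

def foreground_mse(pred_fgr: np.ndarray, gt_fgr: np.ndarray, gt_alpha: np.ndarray) -> float:
    # Foreground error is only meaningful where the ground-truth alpha is non-zero,
    # so the squared error is averaged over the masked region.
    mask = (gt_alpha > 0)[..., np.newaxis]            # H, W, 1 - broadcast over channels
    diff_sq = ((pred_fgr - gt_fgr) ** 2) * mask       # H, W, 3
    return float(diff_sq.sum() / (mask.sum() * pred_fgr.shape[-1] + 1e-8))
```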

## Input

### Original Model

Image, name: `src`, shape: `1, 3, 720, 1280`, format: `B, C, H, W`, where:

- `B` - batch size
- `C` - number of channels
- `H` - image height
- `W` - image width

Expected color order: `RGB`.
Scale factor: 255.

Image, name: `bgr`, shape: `1, 3, 720, 1280`, format: `B, C, H, W`, where:

- `B` - batch size
- `C` - number of channels
- `H` - image height
- `W` - image width

Expected color order: `RGB`.
Scale factor: 255.

### Converted Model

Image, name: `src`, shape: `1, 3, 720, 1280`, format: `B, C, H, W`, where:

- `B` - batch size
- `C` - number of channels
- `H` - image height
- `W` - image width

Expected color order: `BGR`.

Image, name: `bgr`, shape: `1, 3, 720, 1280`, format: `B, C, H, W`, where:

- `B` - batch size
- `C` - number of channels
- `H` - image height
- `W` - image width

Expected color order: `BGR`.

## Output

### Original Model

Alpha matte, name: `pha`, shape: `1, 1, 720, 1280`, format: `B, C, H, W`, where:

- `B` - batch size
- `C` - number of channels
- `H` - image height
- `W` - image width

Foreground, name: `fgr`, shape: `1, 3, 720, 1280`, format: `B, C, H, W`, where:

- `B` - batch size
- `C` - number of channels
- `H` - image height
- `W` - image width

### Converted Model

Alpha matte, name: `pha`, shape: `1, 1, 720, 1280`, format: `B, C, H, W`, where:

- `B` - batch size
- `C` - number of channels
- `H` - image height
- `W` - image width

Foreground, name: `fgr`, shape: `1, 3, 720, 1280`, format: `B, C, H, W`, where:

- `B` - batch size
- `C` - number of channels
- `H` - image height
- `W` - image width
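
For reference, here is a minimal sketch of running the converted model with the OpenVINO Runtime Python API. The model path, image file names, and device choice are assumptions for illustration; since `--scale=255` and `--reverse_input_channels` are folded into the IR during conversion, the converted model accepts BGR frames in the 0-255 range directly.

```python
import cv2
import numpy as np
from openvino.runtime import Core

# Assumed location: omz_converter places the generated IR under the "public" directory.
MODEL_XML = "public/background-matting-mobilenetv2/FP32/background-matting-mobilenetv2.xml"

def to_nchw(image_bgr: np.ndarray) -> np.ndarray:
    # Resize to the expected 1280x720 input resolution and convert HWC -> NCHW with a batch dim.
    resized = cv2.resize(image_bgr, (1280, 720)).astype(np.float32)
    return resized.transpose(2, 0, 1)[np.newaxis]

core = Core()
compiled = core.compile_model(core.read_model(MODEL_XML), "CPU")

src = to_nchw(cv2.imread("person.jpg"))       # frame containing the person
bgr = to_nchw(cv2.imread("background.jpg"))   # captured clean background of the same scene

results = compiled.infer_new_request({"src": src, "bgr": bgr})
pha = results[compiled.output("pha")]         # alpha matte, shape 1, 1, 720, 1280
fgr = results[compiled.output("fgr")]         # foreground,  shape 1, 3, 720, 1280
```

The alpha matte can then be used to composite the recovered foreground over a new background, `I = pha * fgr + (1 - pha) * background`, once all operands are brought to a common value range.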

## Download a Model and Convert it into Inference Engine Format

You can download models and, if necessary, convert them into Inference Engine format using the [Model Downloader and other automation tools](../../../tools/model_tools/README.md) as shown in the examples below.

An example of using the Model Downloader:
```
omz_downloader --name <model_name>
```

An example of using the Model Converter:
```
omz_converter --name <model_name>
```

## Demo usage

The model can be used in the following demos provided by the Open Model Zoo to show its capabilities:

* [Background subtraction Python\* Demo](../../../demos/background_subtraction_demo/python/README.md)

## Legal Information

The original model is distributed under the
[MIT License](https://github.com/DmitriySidnev/BackgroundMattingV2/blob/master/LICENSE).
54 changes: 54 additions & 0 deletions models/public/background-matting-mobilenetv2/accuracy-check.yml
@@ -0,0 +1,54 @@
models:
  - name: background-matting-mobilenetv2

    launchers:
      - framework: openvino
        adapter:
          type: background_matting_with_pha_and_fgr
          alpha_out: pha
          foreground_out: fgr
        inputs:
          - name: src
            type: INPUT
            value: com*
          - name: bgr
            type: INPUT
            value: bgr*

    datasets:
      - name: HumanMattingClips120
        reader: pillow_imread

        preprocessing:
          - type: rgb_to_bgr
          - type: resize
            dst_width: 1280
            dst_height: 720
            use_pillow: True

        metrics:
          - name: alpha_MAD
            type: mad
            prediction_source: pha
            process_type: alpha
            reference: 4.35

          - name: alpha_GRAD
            type: spatial_gradient
            prediction_source: pha
            process_type: alpha
            reference: 2.49

          - name: alpha_MSE
            type: mse_with_mask
            prediction_source: pha
            process_type: alpha
            use_mask: False
            reference: 1.0

          - name: foreground_MSE
            type: mse_with_mask
            prediction_source: fgr
            process_type: image
            use_mask: True
            reference: 2.69
38 changes: 38 additions & 0 deletions models/public/background-matting-mobilenetv2/model.yml
@@ -0,0 +1,38 @@
# Copyright (c) 2022 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

description: >-
  The "background-matting-mobilenetv2" model is a high-resolution background replacement
  technique based on background matting (with a MobileNetV2 backbone), where an additional
  frame of the background is captured and used in recovering the alpha matte and the
  foreground layer. This model is pre-trained in the PyTorch* framework and converted
  to ONNX* format. More details are provided in the paper <https://arxiv.org/abs/2012.07810>.
  For details, see the repository <https://github.com/PeterL1n/BackgroundMattingV2>.
  For details regarding export to ONNX, see here <https://github.com/DmitriySidnev/BackgroundMattingV2/blob/master/export_onnx.py>.
task_type: background_matting
files:
  - name: bgm_mobilenetv2.onnx
    size: 20006715
    checksum: 3fc7bad659149e3e9e0feb4ef531c177ab158e5bbcfdabba9952c3832053f353d23560ff488aeae857245734ef639e6b
    source: https://github.com/DmitriySidnev/BackgroundMattingV2/raw/master/onnx/bgm_mobilenetv2.onnx
model_optimizer_args:
  - --input_shape=[1,3,720,1280],[1,3,720,1280]
  - --input=src,bgr
  - --layout=src(NCHW),bgr(NCHW)
  - --output=pha,fgr
  - --scale=255
  - --reverse_input_channels
  - --input_model=$dl_dir/bgm_mobilenetv2.onnx
framework: onnx
license: https://github.com/DmitriySidnev/BackgroundMattingV2/blob/master/LICENSE
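
The `--scale=255` and `--reverse_input_channels` arguments above fold the original model's input preprocessing into the converted IR. A sketch of the equivalent manual preprocessing the original ONNX model would otherwise require (the helper name and use of OpenCV are assumptions for illustration):

```python
import cv2
import numpy as np

def preprocess_for_original_onnx(frame_bgr: np.ndarray) -> np.ndarray:
    # The exported ONNX model expects RGB input scaled to [0, 1] in NCHW layout;
    # --reverse_input_channels and --scale=255 move exactly these two steps into the IR,
    # so the converted model can consume BGR frames in the 0-255 range directly.
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)         # BGR -> RGB
    rgb = cv2.resize(rgb, (1280, 720)).astype(np.float32)    # W x H = 1280 x 720
    rgb /= 255.0                                             # scale factor 255
    return rgb.transpose(2, 0, 1)[np.newaxis]                # HWC -> NCHW with batch dim
```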
2 changes: 2 additions & 0 deletions models/public/device_support.md
Expand Up @@ -6,6 6,7 @@
| aclnet-int8 | YES | YES | |
| alexnet | YES | YES | YES |
| anti-spoof-mn3 | YES | YES | YES |
| background-matting-mobilenetv2 | YES | | |
| bert-base-ner | YES | YES | |
| brain-tumor-segmentation-0001 | YES | | |
| brain-tumor-segmentation-0002 | YES | | |
@@ -103,6 +104,7 @@
| retinanet-tf | YES | YES | |
| rexnet-v1-x1.0 | YES | YES | |
| rfcn-resnet101-coco-tf | YES | YES | YES |
| robust-video-matting-mobilenetv3 | YES | YES | |
| se-inception | YES | YES | YES |
| se-resnet-50 | YES | YES | YES |
| se-resnext-50 | YES | YES | YES |
11 changes: 11 additions & 0 deletions models/public/index.md
@@ -319,6 +319,17 @@ Named entity recognition (NER) is the task of tagging entities in text with their
| -------------- | -------------- | ------------------------------------------------------ | -------- | ------- | -------- |
| vehicle-reid-0001 | PyTorch\* | [vehicle-reid-0001](./vehicle-reid-0001/README.md) | 96.31%/85.15 % | 2.643 | 2.183 |

## Background matting

Background matting is a method of separating the foreground from the background in an image or video.
Some pixels may belong to both the foreground and the background; such pixels are called partial
or mixed pixels. This distinguishes background matting from segmentation approaches, where the result is a binary mask.
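Formally, each observed pixel is modeled as a blend of a foreground and a background color, `I = alpha * F + (1 - alpha) * B`, and the models below estimate the per-pixel `alpha` (and the foreground), which preserves soft boundaries such as hair.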

| Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
| -------------- | -------------- | ------------------------------------------------------ | -------- | ------- | -------- |
| background-matting-mobilenetv2 | PyTorch\* | [background-matting-mobilenetv2](./background-matting-mobilenetv2/README.md) | 4.32/1.0/2.48/2.7 | 6.7419 | 5.052 |
| robust-video-matting-mobilenetv3 | PyTorch\* | [robust-video-matting-mobilenetv3](./robust-video-matting-mobilenetv3/README.md) | 20.8/15.1/4.42/4.05 | 9.3892 | 3.7363 |

## See Also

* [Open Model Zoo Demos](../../demos/README.md)