Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dump actual_subset_size #2769

Open
wants to merge 2 commits into
base: develop
Choose a base branch
from

Conversation

awayzjj
Copy link

@awayzjj awayzjj commented Jun 29, 2024

Changes

Dump actual_subset_size to ov.Model

Reason for changes

Inconsistencies arise when the dataset size is less than the provided or default 'subset_size'.

Related tickets

Closes: #2562

Tests

WIP

@awayzjj awayzjj requested a review from a team as a code owner June 29, 2024 03:31
@github-actions github-actions bot added the NNCF OpenVINO Pull requests that updates NNCF OpenVINO label Jun 29, 2024
@alexsu52 alexsu52 requested a review from l-bat July 1, 2024 10:29
@awayzjj
Copy link
Author

awayzjj commented Jul 9, 2024

@l-bat HI, could you please help me review the PR?

dataset_length = dataset.get_length()
if dataset_length:
return min(dataset_length, subset_size)
return subset_size
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if __len()__ is not implemented, the actual subset size should be obtained as:

actual_subset_size = 0
for data in islice(calibration_dataset.get_inference_data(), subset_size):
    actual_subset_size  = 1

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late reply. I found that the suggested changes do not work because the calibration_dataset.get_inference_data() iterator is consumed before dump_parameters. As a result, actual_subset_size ends up being zero. Could you please provide some suggestions on how to handle this issue?
Thank you very much! @l-bat

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@awayzjj after discussing within the team, we decided not to add actual_subset_size to rt_info because rt_info is intended to contain quantization parameters provided to algorithms. If the provided dataset is shorter than subset_size, we should display a warning during the statistics collection process:

description="Statistics collection",



@pytest.mark.parametrize("subset_size, expected_actual_subset_size", [[1, 1], [2, 1]])
def test_dump(subset_size, expected_actual_subset_size, tmp_path):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add 2 tests:

  1. __len()__ implemented
  2. __len()__ not implemented

@mlukasze
Copy link

hey @awayzjj will you continue work on this?

@awayzjj
Copy link
Author

awayzjj commented Jul 25, 2024

@mlukasze Yes, sorry for the delay, I will finish it this weekend :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NNCF OpenVINO Pull requests that updates NNCF OpenVINO
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Good First Issue][NNCF]: Dump actual_subset_size to ov.Model
3 participants