Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce DICOM write time #4164

Merged
merged 6 commits into from
Sep 16, 2024
Merged

Conversation

melissalinkert
Copy link
Member

Opening as a draft since this is a work in progress, with additional changes to both reading and writing planned.

4e81795 should reduce the total time to run bfconvert with DICOM output. This is just buffering the header writes, and does not change the tile writing. A simple test conversion using https://downloads.openmicroscopy.org/images/Vectra-QPTIFF/perkinelmer/PKI_scans/HandEcompressed_Scan1.qptiff:

bfconvert -no-upgrade -noflat -tilex 256 -tiley 256 -compression JPEG HandEcompressed_Scan1.qptiff test.dcm

is noticeably faster with 4e81795 when testing locally.

@melissalinkert melissalinkert marked this pull request as ready for review August 26, 2024 19:06
@melissalinkert melissalinkert added this to the 8.0.0 milestone Aug 26, 2024
Copy link
Member

@sbesson sbesson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested using the set of sample files previously mentioned in #4190 and previous efforts to add pre-compressed support as well as the QPTIFF file mentioned in the PR description.

The conversion was executed with different number of options using both Bio-Formats 7.3.0 as the ground truth and this PR.

Source Target precompressed Compression Tile size Bio-Formats 7.3.0 Bio-Formats 8.0.0-SNAPSHOT
CMU-1.svs CMU-1.dcm no JPEG 227.903s 227.787s
CMU-1.svs CMU-1.dcm no JPEG-2000 679.377s 683.509s
CMU-1.svs CMU-1.dcm yes 78.793s 79.611s
CMU-1.svs CMU-1.dcm no JPEG 512x512 180.712s 183.06s
CMU-1.ndpi CMU-1.dcm no JPEG 378.301s 111.163s
CMU-1.ndpi CMU-1.dcm no JPEG-2000 939.062s 645.053s
CMU-1.ndpi CMU-1.dcm yes N/A 15.825s
CMU-1_0_0.dcm CMU-1.ome.tiff no JPEG 239.903s 132.872s
CMU-1_0_0.dcm CMU-1.ome.tiff no JPEG-2000 1288.689s 1230.261s
HandEcompressed_Scan1.qptiff HandEcompressed_Scan1.dcm no JPEG 256x256 262.394s 266.762s
HandEcompressed_Scan1.qptiff HandEcompressed_Scan1.dcm no JPEG 512x512 132.6s 135.336s

As can be seen in this table, this PR does not cause any regression but in most scenarios there does not seem to be any significant improvement in terms of writing performance. Conversion times are largely identical to 7.3.0 with the notable exception of the NDPI -> DICOM conversion which seems to be improved.

Especially when repacking tiles, many duplicate calls to this method can get expensive.
@melissalinkert
Copy link
Member Author

Thanks, @sbesson. The NDPI difference may be due to picking more suitable default tile sizes, i.e. https://github.com/ome/bioformats/pull/4181/files#diff-32978063b81116de867b85238d17dd0f0a9fcc06c1e1689cb64bae18b7e3ee02L317.

9946211 is a better try at reducing tile processing times. Trying CMU-1.svs and HandEcompressed_Scan1.qptiff locally with this change:

$ bfconvert -version
Version: 8.0.0-SNAPSHOT
Build date: 29 August 2024
VCS revision: 99462118cc80f1f8e9c6407f4f8825c82b301c24
$ time bfconvert -no-upgrade -noflat -compression JPEG HandEcompressed_Scan1.qptiff 4164/qptiff.dcm
HandEcompressed_Scan1.qptiff
VectraReader initializing HandEcompressed_Scan1.qptiff
Reading IFDs
Populating metadata
Populating OME metadata
[PerkinElmer Vectra/QPTIFF] -> 4164/qptiff.dcm [DICOM]
Tile size = 512 x 512
	Series 0: converted 1/1 planes (100%)
Tile size = 512 x 512
	Series 0: converted 1/1 planes (100%)
Tile size = 512 x 512
	Series 0: converted 1/1 planes (100%)
	Series 0: converted 1/1 planes (100%)
	Series 0: converted 1/1 planes (100%)
	Series 1: converted 1/1 planes (100%)
	Series 2: converted 1/1 planes (100%)
	Series 3: converted 1/1 planes (100%)
[done]
124.583s elapsed (354.25 14912.5ms per plane, 1735ms overhead)

real	2m5.605s
user	2m0.488s
sys	0m0.707s
$ time bfconvert -no-upgrade -noflat -compression JPEG CMU-1.svs 4164/cmu1.dcm
CMU-1.svs
SVSReader initializing CMU-1.svs
Reading IFDs
Populating metadata
Populating OME metadata
[Aperio SVS] -> 4164/cmu1.dcm [DICOM]
More than 4GB of pixel data, compression will need to be used
Tile size = 256 x 256
	Series 0: converted 1/1 planes (100%)
Tile size = 256 x 256
	Series 0: converted 1/1 planes (100%)
	Series 0: converted 1/1 planes (100%)
	Series 1: converted 1/1 planes (100%)
	Series 2: converted 1/1 planes (100%)
[done]
177.498s elapsed (196.2 34790.8ms per plane, 1809ms overhead)

real	2m58.588s
user	2m51.613s
sys	0m1.601s

and with current state of develop:

$ bfconvert -version
Version: 8.0.0-SNAPSHOT
Build date: 29 August 2024
VCS revision: 54cf1e8106592e9f3f44817d053e65db799dbaf9
$ time bfconvert -no-upgrade -noflat -compression JPEG HandEcompressed_Scan1.qptiff develop/qptiff.dcm
HandEcompressed_Scan1.qptiff
VectraReader initializing HandEcompressed_Scan1.qptiff
Reading IFDs
Populating metadata
Populating OME metadata
[PerkinElmer Vectra/QPTIFF] -> develop/qptiff.dcm [DICOM]
Tile size = 512 x 512
	Series 0: converted 1/1 planes (100%)
Tile size = 512 x 512
	Series 0: converted 1/1 planes (100%)
Tile size = 512 x 512
	Series 0: converted 1/1 planes (100%)
	Series 0: converted 1/1 planes (100%)
	Series 0: converted 1/1 planes (100%)
	Series 1: converted 1/1 planes (100%)
	Series 2: converted 1/1 planes (100%)
	Series 3: converted 1/1 planes (100%)
[done]
145.46s elapsed (358.125 17468.25ms per plane, 1961ms overhead)

real	2m26.473s
user	2m23.024s
sys	0m0.733s
$ time bfconvert -no-upgrade -noflat -compression JPEG CMU-1.svs develop/cmu1.dcm
CMU-1.svs
SVSReader initializing CMU-1.svs
Reading IFDs
Populating metadata
Populating OME metadata
[Aperio SVS] -> develop/cmu1.dcm [DICOM]
More than 4GB of pixel data, compression will need to be used
Tile size = 256 x 256
	Series 0: converted 1/1 planes (100%)
Tile size = 256 x 256
	Series 0: converted 1/1 planes (100%)
	Series 0: converted 1/1 planes (100%)
	Series 1: converted 1/1 planes (100%)
	Series 2: converted 1/1 planes (100%)
[done]
207.057s elapsed (218.2 40727.0ms per plane, 1630ms overhead)

real	3m28.065s
user	3m23.033s
sys	0m1.451s

which appears to be a ~15% improvement in both cases.

Copy link
Member

@sbesson sbesson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Retested the set of sample files and conditions using the latest release of Bio-Formats 7.3.0, Bio-Formats 8.0.0-SNAPSHOT built from the current HEAD of develop 54cf1e8 and Bio-Formats 8.0.0-SNAPSHOT built from the HEAD of this Pull Request 9946211. Also validated that the output of these conversions read as expected.

Below is the table of the conversion times for the different samples and conditions:

Source Target precompressed Compression Tile size 7.3.0 54cf1e8 9946211
CMU-1.svs CMU-1.dcm no JPEG 226.751s 225.782s 202.474s
CMU-1.svs CMU-1.dcm no JPEG-2000 690.253s 684.681s 645.753s
CMU-1.svs CMU-1.dcm yes 78.024s 78.972s 82.928s
CMU-1.svs CMU-1.dcm no JPEG 512x512 181.503s 181.117s 157.338s
CMU-1.ndpi CMU-1.dcm no JPEG 379.319s 113.521s 113.802s
CMU-1.ndpi CMU-1.dcm no JPEG-2000 932.658s 647.821s 638.946s
CMU-1.ndpi CMU-1.dcm yes N/A 15.239s 16.066s
CMU-1_0_0.dcm CMU-1.ome.tiff no JPEG 238.852s 139.076s 135.982s
CMU-1_0_0.dcm CMU-1.ome.tiff no JPEG-2000 1315.766s 1180.745s 1132.103s
CMU-1_0_0.dcm CMU-1.ome.tiff yes JPEG 6.366s 6.747s
CMU-1_0_0.dcm CMU-1.ome.tiff yes JPEG-2000 15.124s 19.161s
HandEcompressed_Scan1.qptiff HandEcompressed_Scan1.dcm no JPEG 256x256 263.58s 266.288s 251.463s
HandEcompressed_Scan1.qptiff HandEcompressed_Scan1.dcm no JPEG 512x512 134.43s 137.138s 117.57s

This confirms that some of the performance improvement in the NDPI -> DICOM workflow noted in the previous review has been introduced prior to this PR in the development line.
Overall, the conversion times with this PR included are found to be either comparable or smaller than those without this PR included. Based on these numbers, I think it makes sense to include these changes in the upcoming release of Bio-Formats

@joshmoore
Copy link
Member

A quick thought: is there any possibility that the DEBUG-level logging of StopWatch could slow things down in a GUI application, e.g. ImageJ or QuPath?

@melissalinkert
Copy link
Member Author

In ImageJ, I see no issue - since the StopWatch logging is at DEBUG and the plugin uses INFO in a way that is not easy for a user to change (https://github.com/ome/bioformats/blob/develop/components/bio-formats-plugins/src/loci/plugins/LociImporter.java#L69), there shouldn't be any impact there.

In QuPath, I am having a hard time getting a test setup that includes this PR's changes, so haven't evaluated directly. As with ImageJ, the default log level is INFO, but this is easy to change. I think worst case would be someone has DEBUG/TRACE turned on and sees a problem, in which case we suggest turning it back to INFO.

If setting the StopWatches to TRACE instead of DEBUG would be preferable, I am fine with doing that.

@sbesson
Copy link
Member

sbesson commented Sep 13, 2024

@joshmoore any objection to merging this in the current state based on #4164 (comment) (before you going on the road for a few weeks)?

@joshmoore
Copy link
Member

No objections. Sorry. I didn't mean to hold this up.

@sbesson sbesson merged commit 2314528 into ome:develop Sep 16, 2024
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants