Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Euclidean distance transform (scipy.ndimage.distance_transform_edt) #7413

Merged
merged 11 commits into from
Jun 5, 2023

Conversation

grlee77
Copy link
Contributor

@grlee77 grlee77 commented Feb 17, 2023

closes #6919

cc @jakirkham, @gigony who helped review this previously for cuCIM. Given that this is part of the SciPy rather than scikit-image API, it seems that it could live in CuPy itself.

Background

This PR has an implementation of the Euclidean distance transform (corresponding to scipy.ndimage.distance_transform_edt). The kernels in pba_kernels_2d.h and pba_kernels_3d.h are adapted from MIT-licensed code in this repository. Details regarding the algorithm implementation are provided on the author's website. It is fairly complicated, so I tried to make sure the test cases are pretty extensive.

Details

The primary change to the C kernels here vs. that reference implementation are:

  • added a new function dominate_sp that can be used instead of dominate so that we can take into account floating point sample spacing. This is needed to implement the sampling kwarg from the SciPy API. (The original dominate function works with integer math, assuming spacing=1 on all axes)
  • being able to override the integer dtypes used. e.g. In the original 3D kernels there is a hard limit of size 1024 along each axis due to how three coordinates were packed into a single 32-bit integer. Here, the size can adjust to a larger dtype if necessary.

The set of kernels is different for the 2D and 3D cases so each has its own Python wrappers in _pba_2d.py and pba_3d.py, respectively. These mostly are concerned with:

  • selecting data types depending on image shape
  • selecting appropriate grid/block sizes for the kernel
  • pre/post processing steps, e.g. the raw CUDA kernels assume the size of the image is identical along all axes, so if this is not the case padding is used and then the output is cropped.

There are two CuPy-only keyword arguments introduced here that may need discussion.

  • It may be best to just not expose block_params in the API? (an advanced user could potentially fine tune to their hardware by changing this, but the default choice should be pretty reasonable).
  • The other kwarg is more generally useful. If float64_distances is set to False, then distance computations are done in 32-bit floating point math for efficiency. I left the default to True here to match SciPy's output dtype, but perhaps we should default to False?

One point of difference with SciPy is that when return_indices is True, the coordinates returned do not always match SciPy in the case of pixels that are equidistant from multiple background pixels. I think the result is equally valid and we will likely be unable to get an exact match. For that reason, not all points are checked for equality in tests involving the indices returned. The distances themselves do match SciPy to within floating point precision.

Benchmarks

Here are benchmarks vs. SciPy and ITK we did previously. These are for float64_distances=False and sampling=None.

The last two columns are the acceleration factors vs. scipy and (multi-threaded) ITK, respectively

shape return dist. return ind. SciPy ITK cuCIM accel. (CuPy vs SciPy) accel. (CuPy vs ITK)
256, 256 False True 0.00385 N/A 0.00021 18.49 N/A
256, 256 True False 0.00433 0.00370 0.00021 20.84 17.83
256, 256 True True 0.00436 N/A 0.00022 19.55 N/A
512, 512 False True 0.01724 N/A 0.00030 58.44 N/A
512, 512 True False 0.01893 0.00790 0.00029 66.42 27.73
512, 512 True True 0.01867 N/A 0.00030 62.65 N/A
2048, 2048 False True 0.33893 N/A 0.00154 220.25 N/A
2048, 2048 True False 0.40912 0.08219 0.00154 266.11 53.46
2048, 2048 True True 0.39056 N/A 0.00162 241.76 N/A
4096, 4096 False True 1.58069 N/A 0.00418 377.92 N/A
4096, 4096 True False 1.81860 0.55152 0.00418 435.36 132.03
4096, 4096 True True 1.80671 N/A 0.00448 402.85 N/A
32, 32, 32 False True 0.00319 N/A 0.00017 19.05 N/A
32, 32, 32 True False 0.00359 0.00514 0.00011 33.25 47.68
32, 32, 32 True True 0.00355 N/A 0.00019 18.37 N/A
128, 128, 128 False True 0.27217 N/A 0.00059 458.70 N/A
128, 128, 128 True False 0.31409 0.03384 0.00050 633.12 68.21
128, 128, 128 True True 0.31207 N/A 0.00066 474.26 N/A
256, 256, 256 False True 3.01718 N/A 0.00304 993.26 N/A
256, 256, 256 True False 3.32660 0.28979 0.00235 1418.07 123.53
256, 256, 256 True True 3.31874 N/A 0.00352 942.49 N/A
384, 384, 384 False True 10.67328 N/A 0.01000 1067.34 N/A
384, 384, 384 True False 11.69219 1.11274 0.00771 1517.46 144.42
384, 384, 384 True True 11.76073 N/A 0.01161 1013.03 N/A

These are adapted from the PBA  project at
[email protected]:orzzzjq/Parallel-Banding-Algorithm-plus.git

Changes have been made to support additional integer types and a variant
handling floating point pixel sizes was added
float64_distances=False is faster and uses less memory, but is inconsistent with SciPy
@emcastillo emcastillo self-assigned this Feb 20, 2023
@emcastillo emcastillo added cat:feature New features/APIs prio:medium labels Feb 20, 2023
@grlee77 grlee77 changed the title Add Euclidean distance transfrom (scipy.ndimage.distance_transform_edt) Add Euclidean distance transform (scipy.ndimage.distance_transform_edt) Feb 20, 2023
@emcastillo
Copy link
Member

/test mini

1 similar comment
@emcastillo
Copy link
Member

/test mini

@emcastillo
Copy link
Member

/test mini

@emcastillo
Copy link
Member

seems like it is not finding the header :(

FileNotFoundError: [Errno 2] No such file or directory: '/root/.local/lib/python3.11/site-packages/cupyx/scipy/ndimage/cuda/pba_kernels_2d.h'

@jakirkham
Copy link
Member

jakirkham commented May 19, 2023

Is it getting installed? If not, maybe the MANIFEST.in needs an update?

Edit: Or should they be added here?

cupy/setup.py

Lines 55 to 72 in 76906e5

# List of files that needs to be in the distribution (sdist/wheel).
# Notes:
# - Files only needed in sdist should be added to `MANIFEST.in`.
# - The following glob (`**`) ignores items starting with `.`.
cupy_package_data = [
'cupy/cuda/cupy_thrust.cu',
'cupy/cuda/cupy_cub.cu',
'cupy/cuda/cupy_cufftXt.cu', # for cuFFT callback
'cupy/cuda/cupy_cufftXt.h', # for cuFFT callback
'cupy/cuda/cupy_cufft.h', # for cuFFT callback
'cupy/cuda/cufft.pxd', # for cuFFT callback
'cupy/cuda/cufft.pyx', # for cuFFT callback
'cupy/random/cupy_distributions.cu',
'cupy/random/cupy_distributions.cuh',
] [
x for x in glob.glob('cupy/_core/include/cupy/**', recursive=True)
if os.path.isfile(x)
]

@emcastillo
Copy link
Member

emcastillo commented May 23, 2023

cupy/setup.py should be enough!

@jakirkham
Copy link
Member

@grlee77 could you please include the header files added here as items in the cupy_package_data list in setup.py?

@grlee77
Copy link
Contributor Author

grlee77 commented Jun 4, 2023

/test mini

@grlee77
Copy link
Contributor Author

grlee77 commented Jun 4, 2023

updated, please try running the tests again @emcastillo

@leofang
Copy link
Member

leofang commented Jun 4, 2023

/test mini

Copy link
Member

@emcastillo emcastillo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot!

@emcastillo emcastillo merged commit 9ac28dc into cupy:main Jun 5, 2023
@emcastillo emcastillo added this to the v13.0.0b1 milestone Jun 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

scipy.ndimage.distance_transform_edt (Euclidean distance transform)
4 participants