Package: rocsolver / 5.5.1-7~exp1

Metadata

Package: rocsolver
Version: 5.5.1-7~exp1
Patches format: 3.0 (quilt)

Patch series

Patch File delta Description
0001 use local mathjax.patch

docs/source/conf.py | 3 3 + 0 - 0 !
1 file changed, 3 insertions(+)

 use local mathjax

The sphinx.ext.mathjax extension defaults to loading mathjax from a
CDN, which results in the lintian warning 'privacy-breach-generic'.
Use a local copy of mathjax to prevent that problem.

0002 remove m2r2 dependency.patch

docs/source/attributions.rst | 3 2 + 1 - 0 !
docs/source/conf.py | 2 1 + 1 - 0 !
2 files changed, 3 insertions(+), 2 deletions(-)

 remove m2r2 dependency

The m2r2 (Markdown to ReStructured Text) package is not available in
Debian, but it's also not really needed.

0003 fmt compatibility.patch

clients/gtest/logging_gtest.cpp | 4 2 + 2 - 0 !
1 file changed, 2 insertions(+), 2 deletions(-)

 [patch] fix use of fmt 9.0 and later in logging test (#515)

The implicitly defined formatter that rocsolver was using for
std::filesystem::path has been removed from fmt 9.0 and later.

This change doesn't actually fix compatibility with the official
fmt 9.0 or 9.1 releases, because those releases are still
incompatible with HIP. However, the change is sufficient for using
rocsolver with fmt@9 from Spack (because the fix has been backported).
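
One way to avoid depending on an implicit formatter for std::filesystem::path is to convert the path to a string before it reaches the formatting library. The following is a minimal sketch under that assumption; `path_to_text` is a hypothetical helper, not a function from rocSOLVER or fmt, and the actual patch may take a different approach.

```cpp
#include <filesystem>
#include <string>

// Hypothetical helper (not from rocSOLVER): turn a path into a std::string,
// which formatting libraries can print without a path-specific formatter.
inline std::string path_to_text(const std::filesystem::path& p)
{
    // generic_string() uses '/' as the separator on every platform.
    return p.generic_string();
}
```

A call such as `fmt::format("{}", path_to_text(p))` would then rely only on the library's built-in string support.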

0004 hide kernel symbols.patch

library/src/include/lib_macros.hpp | 6 1 + 5 - 0 !
1 file changed, 1 insertion(+), 5 deletions(-)

 [patch] hide symbols for gpu kernels (#512)

The -fvisibility=hidden flag does not affect kernels and thus all
__global__ GPU functions become externally-visible symbols by default.
If a user defines a kernel with the same name and parameters as a
kernel that is defined in rocsolver, there will be a conflict.

There will not be any sort of compilation or linker error, but the
rocSOLVER definition of the function will be used instead of the user
definition. That is a problem, because our kernel function names do not
use any sort of prefix or namespace to prevent name conflicts with
functions defined by library users.

We have always built rocSOLVER in such a way that each translation unit
gets its own GPU code object in the final library binary. Unlike normal
template function instantiations, GPU template function instantiations
are not deduplicated by the linker. From the point of view of code
duplication, __global__ functions are already effectively static!

At some point, we should investigate whether rocSOLVER would benefit
from compiler options that combined the code objects to reduce the
amount of duplicated code in the library binary. For now, the default
behaviour means that there's no additional duplication introduced by
marking the GPU kernels as static.

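
The idea of giving kernels internal linkage can be sketched with a macro that adds `static` to the kernel qualifier. This is an illustration only: `EXAMPLE_KERNEL` and `scale_in_place` are hypothetical names, the real macro in lib_macros.hpp is not shown in this listing, and the host-only fallback exists just so the sketch compiles without a HIP compiler.

```cpp
#include <cstddef>

// Hypothetical macro for illustration; the real one lives in lib_macros.hpp
// and its exact definition is not shown in this listing.
#if defined(__HIPCC__)
#define EXAMPLE_KERNEL static __global__
#else
#define EXAMPLE_KERNEL static // host-only fallback so the sketch compiles anywhere
#endif

// Internal linkage: another translation unit, or user code, may define its
// own scale_in_place without clashing with this one.
EXAMPLE_KERNEL void scale_in_place(double* x, std::size_t n, double alpha)
{
    // Plain loop for simplicity; a real kernel would index by thread id.
    for(std::size_t i = 0; i < n; ++i)
        x[i] *= alpha;
}
```

Because the symbol is no longer exported, a user-defined kernel with the same name and parameters resolves to the user's own definition.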
0005 doxygen Add parent directory to inputs.patch

docs/Doxyfile | 2 1 + 1 - 0 !
1 file changed, 1 insertion(+), 1 deletion(-)

 doxygen: add parent directory to inputs

Needed for ../README.md and other inputs. Otherwise, this generates a
warning with newer doxygen, and build fails because of WARN_AS_ERROR.

Not forwarding, as documentation build has changed with ROCm 5.6, and
the issue should be reproduced there, first.

0006 rm immintrin include.patch

clients/rocblascommon/rocblas_math.hpp | 1 0 + 1 - 0 !
1 file changed, 1 deletion(-)

 rm immintrin include

Fix the build on ppc64el.

0007 fix reserved identifiers.patch

CONTRIBUTING.md | 3 1 + 2 - 0 !
library/include/rocsolver/rocsolver-extra-types.h | 8 4 + 4 - 0 !
library/include/rocsolver/rocsolver-functions.h | 8 4 + 4 - 0 !
library/include/rocsolver/rocsolver-version.h.in | 8 4 + 4 - 0 !
library/include/rocsolver/rocsolver.h | 6 3 + 3 - 0 !
5 files changed, 16 insertions(+), 17 deletions(-)

 [patch] fix reserved identifiers in include guards (#513)

The include guards have been changed to the filename in uppercase
letters with all non-alphanumeric symbols replaced by underscore.
This include guard pattern matches the guard that is used for the
generated file rocsolver-export.h.

There are two reasons for this:
1. The C and C++ standards reserve all identifiers that begin with
   underscore followed by a capital letter [C99 7.1.3]
   [C++11 17.6.4.3.2].
2. In user-visible code, the rocSOLVER library should prevent name
   conflicts by only using identifiers that begin with `rocblas`,
   `ROCBLAS`, `rocsolver` or `ROCSOLVER`.
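
The naming rule described above (filename in uppercase, every non-alphanumeric character replaced by an underscore) can be sketched as a small function. `include_guard_for` is a hypothetical helper written for illustration; it is not part of rocSOLVER or its build scripts.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: derive an include-guard name from a header filename.
inline std::string include_guard_for(const std::string& filename)
{
    std::string guard;
    guard.reserve(filename.size());
    for(unsigned char c : filename)
    {
        // Letters become uppercase, digits stay, everything else becomes '_'.
        guard += std::isalnum(c) ? static_cast<char>(std::toupper(c)) : '_';
    }
    return guard;
}
```

For the input `rocsolver-export.h` this function returns `ROCSOLVER_EXPORT_H`, which neither starts with an underscore nor leaves the `ROCSOLVER` prefix.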

0008 check for hip errors.patch

clients/gtest/memory_model_gtest.cpp | 8 4 + 4 - 0 !
clients/include/testing_managed_malloc.hpp | 50 25 + 25 - 0 !
clients/rocblascommon/d_vector.hpp | 77 2 + 75 - 0 !
common/CMakeLists.txt | 5 1 + 4 - 0 !
common/include/common_host_helpers.hpp | 25 19 + 6 - 0 !
common/src/common_host_helpers.cpp | 20 17 + 3 - 0 !
library/src/lapack/roclapack_syevdx_heevdx_inplace.hpp | 11 8 + 3 - 0 !
library/src/lapack/roclapack_syevj_heevj.hpp | 12 9 + 3 - 0 !
library/src/lapack/roclapack_sygvdx_hegvdx_inplace.hpp | 11 8 + 3 - 0 !
9 files changed, 93 insertions(+), 126 deletions(-)

 [patch] check hip api return values on all calls (#493)

* Check HIP API return values on all calls

* Check for HIP errors in SYEVJ/HEEVJ

* Drop padding checks from d_vector

* Fix error logging on success
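
Checking the return value of every API call is commonly done with a small wrapper macro. The sketch below shows the general pattern only; `example_status`, `EXAMPLE_CHECK` and `example_copy` are hypothetical stand-ins so the code compiles without HIP, and they do not reproduce the helpers this patch adds to common_host_helpers.hpp.

```cpp
#include <stdexcept>
#include <string>

// Stand-in status type so the sketch compiles without HIP; with HIP this role
// is played by hipError_t and hipSuccess.
enum class example_status { success, failure };

// Hypothetical macro: evaluate the call once and throw if it did not succeed.
#define EXAMPLE_CHECK(expr)                                                 \
    do                                                                      \
    {                                                                       \
        const example_status status_ = (expr);                              \
        if(status_ != example_status::success)                              \
            throw std::runtime_error(std::string("call failed: ") + #expr); \
    } while(0)

// Hypothetical stand-in for an API call that reports failure on a null pointer.
inline example_status example_copy(int* dst, const int* src)
{
    if(dst == nullptr || src == nullptr)
        return example_status::failure;
    *dst = *src;
    return example_status::success;
}
```

Wrapping each call, as in `EXAMPLE_CHECK(example_copy(&a, &b));`, means a failed call is reported at the point where it happens instead of being silently ignored.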

0009 verbose build of specialized kernels.patch

library/src/CMakeLists.txt | 3 3 + 0 - 0 !
1 file changed, 3 insertions(+)

 verbose build of specialized kernels

These files take so long to build on slower ppc64el and arm64 systems
that the build may time out due to inactivity. This is typically when
building for ten or more GPU architectures. Add the verbose flag so that
there is output printed as the compiler finishes building a translation
unit for each architecture.

0010 drop f16c instructions.patch

clients/benchmarks/CMakeLists.txt | 2 0 + 2 - 0 !
clients/gtest/CMakeLists.txt | 2 0 + 2 - 0 !
2 files changed, 4 deletions(-)

 [patch] drop use of f16c instructions (#768)

The -mf16c flag is troublesome as those instructions may not be
available on older CPUs. The clang compiler also seems to emit
AVX instructions when it is told that it can use F16C.
The flag can be dropped without consequence as rocSOLVER does not use
half precision.

This was done for rocBLAS in c6bc09073959a2881a701b88ae1ed9de469354f1.

0011 fmt 10 support.patch

clients/benchmarks/client.cpp | 4 3 + 1 - 0 !
clients/rocblascommon/rocblas_test.hpp | 20 10 + 10 - 0 !
2 files changed, 13 insertions(+), 11 deletions(-)

 [patch] fix libfmt build errors (#828)

* Fix build errors with libfmt 10.2.1

* Use `fmt::print` instead of `std::cout`