NVIDIA / cutlass Public

Issues: NVIDIA/cutlass

165 Open 968 Closed

Issues list

[QST] Why tma_load.get_slice(0) here always need 0? ? - Needs Triage question

Question

#1929 opened Nov 8, 2024 by ziyuhuang123

[QST] Does CUTLASS 3.5.1 support int4 x float16 GEMMs natively? ? - Needs Triage question

Question

#1928 opened Nov 7, 2024 by SimpleTheoryOfTypes

[QST] Question Regarding To The Use Of Swizzle ? - Needs Triage question

Question

#1927 opened Nov 7, 2024 by Yanksi

[QST] Why did I get a wrong result from GemmGrouped? ? - Needs Triage question

Question

#1924 opened Nov 7, 2024 by WangNorthSea

[QST] Is there a Cutlass GEMM example to read inputs with custom padding? ? - Needs Triage question

Question

#1922 opened Nov 6, 2024 by ghostplant

[FEA] Better grid size for H100 GPU with SXM5 ? - Needs Triage feature request

New feature or request

#1921 opened Nov 6, 2024 by zhipeng93

[BUG] Cutlass python does not detect GPU ? - Needs Triage bug

Something isn't working

#1919 opened Nov 5, 2024 by IzanCatalan

[QST] Modifying a conv2d kernel and using it with python and pytorch ? - Needs Triage question

Question

#1918 opened Nov 5, 2024 by IzanCatalan

[BUG] TMA Cooperative GeMM with Stream-K scheduler hangs ? - Needs Triage bug

Something isn't working

#1917 opened Nov 4, 2024 by NihalPotdar

[QST] Is cutlass::bfloat16_t x cutlass::int2b_t GEMM possible? ? - Needs Triage question

Question

#1915 opened Nov 3, 2024 by areddy2022

[QST] Understanding sgemm_sm80.cu with NVIDIA Nsight Compute ? - Needs Triage question

Question

#1914 opened Nov 2, 2024 by gohar94

[BUG] Unused variable ? - Needs Triage bug

Something isn't working

#1913 opened Oct 31, 2024 by r-barnes

[DOC]Need doc to migrate from cutlass::conv::kernel::DefaultConv2dFprop to cutlass::conv::kernel::ConvUniversal ? - Needs Triage documentation

Documentation

#1911 opened Oct 30, 2024 by chacha21

[QST] Inconsistency in Rounding Implementations: Round-to-Nearest for TFloat32 vs. Round-to-Nearest-Even for BFloat16 ? - Needs Triage question

Question

#1908 opened Oct 30, 2024 by shanliang1992

[QST]Synchronizing Threads Between Loading Q/K and V in WASP ? - Needs Triage question

Question

#1900 opened Oct 27, 2024 by ziyuhuang123

[QST]why 3090 get different result with 4090 or 3060 when call get<0>(tensor) ? - Needs Triage question

Question

#1898 opened Oct 25, 2024 by liuqi123123

how to use nvrtc to run a sm90 kernel ? - Needs Triage question

Question

#1889 opened Oct 21, 2024 by mengchihe

[QST] Would there have possibility that kernel's perf differ between unittest and real model? ? - Needs Triage question

Question

#1888 opened Oct 21, 2024 by foreverlms

[QST] Why is there bank conflict in this simple layout?

#1882 opened Oct 17, 2024 by seanxwzhang

[BUG] CUTLASS 3.6 profiler doesn't read Instantiation Level that we pass in (Hopper SM90) ? - Needs Triage bug

Something isn't working

#1881 opened Oct 17, 2024 by tonyjie

[QST] Can we use Ampere architecture Sparse Tensor Operations through CuTe API? ? - Needs Triage question

Question

#1875 opened Oct 15, 2024 by hyx1999

[QST] Don't know how to use predicate tensor. ? - Needs Triage question

Question

#1873 opened Oct 14, 2024 by ZhangZhiPku

[QST] Conv2D PyTorch Extension, leading wrong results. ? - Needs Triage question

Question

#1872 opened Oct 14, 2024 by sycz00

[BUG] Trying to optimize mixed input for kernels ? - Needs Triage bug

Something isn't working

#1868 opened Oct 14, 2024 by NihalPotdar

[QST] Incorrect matrix multiplication result with CuTe library ? - Needs Triage question

Question

#1866 opened Oct 12, 2024 by kimiwu0
