[BUG]Run error for nvidia v100 on branch-24.04 #2599

Open
mengyays opened this issue Mar 10, 2025 · 0 comments
Labels
bug Something isn't working

Comments

@mengyays

Describe the bug
When building and running the RAFT test case "cpp/build/gtests/NEIGHBORS_ANN_CAGRA_TEST", the test fails with the error output listed below.
The test was run on an NVIDIA V100 with CUDA 11.8 and NVIDIA driver 535.54.

/code/raft-24.04-2/cpp/include/raft/neighbors/detail/ivf_flat_interleaved_scan-inl.cuh:756: void raft::neighbors::ivf_flat::detail::interleaved_scan_kernel(Lambda, PostLambda, unsigned int, const T *, const unsigned int *, const T *const *, const unsigned int *, unsigned int, unsigned int, unsigned int, unsigned int, const unsigned int *, unsigned int, IvfSampleFilterT, unsigned int *, float *) [with int Capacity = 256; int Veclen = 1; __nv_bool Ascending = true; T = float; AccT = float; IdxT = signed long; IvfSampleFilterT = raft::neighbors::filtering::ivf_to_sample_filter<signed long, raft::neighbors::filtering::none_ivf_sample_filter>; Lambda = raft::neighbors::ivf_flat::detail::euclidean_dist<1, float, float>; PostLambda = raft::identity_op]: block: [0,131,0], thread: [126,0,0] Assertion `sample_offset + list_length <= max_samples` failed.
/code/raft-24.04-2/cpp/include/raft/neighbors/detail/ivf_flat_interleaved_scan-inl.cuh:756: void raft::neighbors::ivf_flat::detail::interleaved_scan_kernel(Lambda, PostLambda, unsigned int, const T *, const unsigned int *, const T *const *, const unsigned int *, unsigned int, unsigned int, unsigned int, unsigned int, const unsigned int *, unsigned int, IvfSampleFilterT, unsigned int *, float *) [with int Capacity = 256; int Veclen = 1; __nv_bool Ascending = true; T = float; AccT = float; IdxT = signed long; IvfSampleFilterT = raft::neighbors::filtering::ivf_to_sample_filter<signed long, raft::neighbors::filtering::none_ivf_sample_filter>; Lambda = raft::neighbors::ivf_flat::detail::euclidean_dist<1, float, float>; PostLambda = raft::identity_op]: block: [0,131,0], thread: [127,0,0] Assertion `sample_offset + list_length <= max_samples` failed.
CUDA Error detected. cudaErrorAssert device-side assert triggered
NEIGHBORS_ANN_CAGRA_TEST: /code/raft-24.04-2/cpp/build/_deps/rmm-src/include/rmm/mr/device/cuda_memory_resource.hpp:78: virtual void rmm::mr::cuda_memory_resource::do_deallocate(void*, std::size_t, rmm::cuda_stream_view): Assertion `status == cudaSuccess' failed.

mengyays added the "bug" label (Something isn't working) on Mar 10, 2025