[BUG] Failure to build cuhornet with rmm #1390

Closed
wolfram77 opened this issue Nov 26, 2023 · 10 comments
Labels: bug (Something isn't working), doc (Documentation)

Comments

@wolfram77

I am trying to build the cuhornet graph framework, which uses rmm as a dependency, and I get the following error. It used to work earlier, so I can confirm this is a regression.

In file included from /home/graphwork/Documents/Test/rapidsai--cuhornet/externals/rmm/include/rmm/exec_policy.hpp:24,
                 from /home/graphwork/Documents/Test/rapidsai--cuhornet/hornet/include/Core/HornetDevice/../SoA/impl/SoAPtr.i.cuh:37,
                 from /home/graphwork/Documents/Test/rapidsai--cuhornet/hornet/include/Core/HornetDevice/../SoA/SoAPtr.cuh:249,
                 from /home/graphwork/Documents/Test/rapidsai--cuhornet/hornet/include/Core/HornetDevice/HornetDevice.cuh:40,
                 from /home/graphwork/Documents/Test/rapidsai--cuhornet/hornet/include/Core/Hornet.cuh:41,
                 from /home/graphwork/Documents/Test/rapidsai--cuhornet/hornet/include/Hornet.hpp:4,
                 from /home/graphwork/Documents/Test/rapidsai--cuhornet/hornet/test/HornetDeleteTest.cu:1:
/home/graphwork/Documents/Test/rapidsai--cuhornet/externals/rmm/include/rmm/cuda_stream_view.hpp:23:10: fatal error: cuda/stream_ref: No such file or directory
   23 | #include <cuda/stream_ref>
      |          ^~~~~~~~~~~~~~~~~
compilation terminated.

Below are some details from CMake:

-- The CXX compiler identification is GNU 9.4.0
-- The CUDA compiler identification is NVIDIA 11.4.152
--- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found CUDA: /usr/local/cuda (found version "11.4")
-- CMAKE selecting appropriate gencodes for x86 or ppc64 CPU architectures
-- Building for GPU_ARCHS = 60;70;75;80;86
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- RMM: RMM_INCLUDE set to /home/graphwork/Documents/Test/rapidsai--cuhornet/externals/rmm/include
-- Using Nvidia Tools Extension
-- Configuring done (1.9s)
-- Generating done (0.0s)
-- Build files have been written to: /home/graphwork/Documents/Test/rapidsai--cuhornet/hornet/build

I am building this using the following script (on my fork of cuhornet):

#!/usr/bin/env bash
src="rapidsai--cuhornet"
out="$HOME/Logs/$src.log"
ulimit -s unlimited
printf "" > "$out"

# Download source code
if [[ "$DOWNLOAD" != "0" ]]; then
  rm -rf $src
  git clone --recursive https://github.com/wolfram77/$src
  mkdir -p $src/externals
  cd $src/externals
  git clone --recursive https://github.com/rapidsai/rmm
  cd ../..
fi
cd $src

# Build and run
export RMM_INCLUDE=$PWD/externals/rmm/include
cd hornet/build
cmake -DRMM_INCLUDE=$RMM_INCLUDE ..
make -j32
cd ../..
cd hornetsnest/build
cmake -DRMM_INCLUDE=$RMM_INCLUDE ..
make -j32

wolfram77 added the "? - Needs Triage" (Need team to review and classify) and "bug" (Something isn't working) labels on Nov 26, 2023
@wence-
Contributor

wence- commented Nov 27, 2023

With the merge of #1095, RMM now depends on libcudacxx in addition to thrust and cub, which it already required. If you include RMM in your CMake project as recommended at https://github.com/rapidsai/rmm#using-rmm-in-a-downstream-cmake-project, CMake will pull in libcudacxx when an appropriate version is not found.

From a cursory glance at the cuhornet cmake setup, it looks like it just points at $RMM_INCLUDE and treats it as a header-only library that has all its dependencies already installed. This will work, but now it is on you as the person using RMM to ensure all dependencies are in place.

However, I do note that the requirement for libcudacxx (and indeed, what version restrictions there are, if any) is not noted in the README. @miscco what minimum versions do we need? It looks like the cmake setup you added pulls in trunk?
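
When RMM is consumed as a bare include directory, one quick sanity check is to see whether the header named in the original error is present in any of the include directories handed to the compiler. The following is only a sketch: `find_header` is a hypothetical helper (not part of RMM or CMake), the default CUDA path is an assumption, and the check only tests that the file exists, not that the libcudacxx version is suitable.

```bash
#!/usr/bin/env bash
# Sketch: look for a header (e.g. cuda/stream_ref) in a list of include
# directories. find_header is a hypothetical helper, not an RMM or CMake tool.

# Usage: find_header <relative/header/path> <include_dir>...
# Prints the first directory that contains the header and returns 0;
# prints nothing and returns 1 if no directory contains it.
find_header() {
  local header="$1"
  shift
  local dir
  for dir in "$@"; do
    if [[ -f "$dir/$header" ]]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}

# Assumption: the CUDA toolkit lives under $CUDA_HOME or /usr/local/cuda.
cuda_include="${CUDA_HOME:-/usr/local/cuda}/include"
if found_in="$(find_header cuda/stream_ref "$cuda_include")"; then
  echo "cuda/stream_ref found in $found_in"
else
  echo "cuda/stream_ref not found in: $cuda_include"
fi
```

If the header is missing, the include path has to be extended with a libcudacxx checkout that provides it, or RMM has to be consumed through its CMake package so that the dependency is fetched automatically.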

wence- added the "doc" (Documentation) label and removed the "? - Needs Triage" (Need team to review and classify) label on Nov 27, 2023
@miscco
Contributor

miscco commented Nov 27, 2023

@wence- We should require 2.1.0 with two patches applied, which happens through the rapids-cmake pull.
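
For anyone wiring up the dependency by hand, a version comparison against the 2.1.0 minimum mentioned above could look like the sketch below. `version_ge` and the `LIBCUDACXX_VERSION` variable are hypothetical: how you determine the installed libcudacxx version is left to you, the helper relies on GNU `sort -V`, and a matching version number says nothing about whether the two patches have been applied.

```bash
#!/usr/bin/env bash
# Sketch: compare dotted version strings. version_ge is a hypothetical helper
# and depends on GNU coreutils `sort -V` (version sort).

# Usage: version_ge <have> <need>
# Succeeds (returns 0) when <have> is greater than or equal to <need>.
version_ge() {
  local have="$1" need="$2"
  # The smaller version sorts first, so have >= need exactly when
  # <need> is the first line of the sorted pair.
  [[ "$(printf '%s\n%s\n' "$need" "$have" | sort -V | head -n1)" == "$need" ]]
}

required="2.1.0"                      # minimum stated in this thread
found="${LIBCUDACXX_VERSION:-}"       # assumption: you supply this value

if [[ -z "$found" ]]; then
  echo "Set LIBCUDACXX_VERSION to the version you have installed."
elif version_ge "$found" "$required"; then
  echo "libcudacxx $found satisfies the >= $required requirement"
else
  echo "libcudacxx $found is older than $required"
fi
```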

@wence-
Contributor

wence- commented Nov 27, 2023

Hmm, I think we should figure out a way to call that out in the installation documentation, since that is a somewhat different requirement from what (I would think) the typical "header only" library usage looks like. I suppose we do kind of hint at this because we say "here are the pre-reqs, please then install RMM via cmake", but perhaps we should also indicate that if you don't do that, there's a lot more that you need to do to replicate the pulling in of transitive deps.

@wolfram77
Author

Should I install rmm (I am new to cmake)?

I downloaded the latest 23.10.00 release of rmm, and I get the following error when building:

[  0%] Building CXX object _deps/spdlog-build/CMakeFiles/spdlog.dir/src/spdlog.cpp.o
In file included from /home/graphwork/Downloads/rmm-23.10.00/build/_deps/spdlog-src/include/spdlog/spdlog.h:12,
                 from /home/graphwork/Downloads/rmm-23.10.00/build/_deps/spdlog-src/include/spdlog/spdlog-inl.h:7,
                 from /home/graphwork/Downloads/rmm-23.10.00/build/_deps/spdlog-src/src/spdlog.cpp:8:
/home/graphwork/Downloads/rmm-23.10.00/build/_deps/spdlog-src/include/spdlog/common.h:168:111: error: ‘basic_runtime’ is not a member of ‘fmt’
  168 |           std::is_convertible<T, fmt::basic_string_view<Char>>::value || std::is_same<remove_cvref_t<T>, fmt::basic_runtime<Char>>::value>
      |                                                                                                               ^~~~~~~~~~~~~
/home/graphwork/Downloads/rmm-23.10.00/build/_deps/spdlog-src/include/spdlog/common.h:168:111: error: ‘basic_runtime’ is not a member of ‘fmt’
/home/graphwork/Downloads/rmm-23.10.00/build/_deps/spdlog-src/include/spdlog/common.h:168:125: error: template argument 2 is invalid
  168 |           std::is_convertible<T, fmt::basic_string_view<Char>>::value || std::is_same<remove_cvref_t<T>, fmt::basic_runtime<Char>>::value>
      |                                                                                                                             ^~~~
/home/graphwork/Downloads/rmm-23.10.00/build/_deps/spdlog-src/include/spdlog/common.h:168:138: error: expected ‘{’ before ‘>’ token
  168 |           std::is_convertible<T, fmt::basic_string_view<Char>>::value || std::is_same<remove_cvref_t<T>, fmt::basic_runtime<Char>>::value>
      |                                                                                                                                          ^
/home/graphwork/Downloads/rmm-23.10.00/build/_deps/spdlog-src/include/spdlog/common.h: In instantiation of ‘struct spdlog::is_convertible_to_any_format_string<const fmt::v10::basic_string_view<char>&>’:
/home/graphwork/Downloads/rmm-23.10.00/build/_deps/spdlog-src/include/spdlog/logger.h:106:47:   required by substitution of ‘template<class T, typename std::enable_if<(! spdlog::is_convertible_to_any_format_string<const T&>::value), int>::type <anonymous> > void spdlog::logger::log(spdlog::source_loc, spdlog::level::level_enum, const T&) [with T = fmt::v10::basic_string_view<char>; typename std::enable_if<(! spdlog::is_convertible_to_any_format_string<const T&>::value), int>::type <anonymous> = <missing>’
/home/graphwork/Downloads/rmm-23.10.00/build/_deps/spdlog-src/include/spdlog/logger.h:140:35:   required from here
/home/graphwork/Downloads/rmm-23.10.00/build/_deps/spdlog-src/include/spdlog/common.h:188:129: error: incomplete type ‘spdlog::is_convertible_to_basic_format_string<const fmt::v10::basic_string_view<char>&, char>’ used in nested name specifier
  188 | struct is_convertible_to_any_format_string : std::integral_constant<bool, is_convertible_to_basic_format_string<T, char>::value ||
      |                                                                                                                           ~~~~~~^~
  189 |                                                                               is_convertible_to_basic_format_string<T, wchar_t>::value>
      |                                                                               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/graphwork/Downloads/rmm-23.10.00/build/_deps/spdlog-src/include/spdlog/common.h:188:129: error: incomplete type ‘spdlog::is_convertible_to_basic_format_string<const fmt::v10::basic_string_view<char>&, wchar_t>’ used in nested name specifier
make[2]: *** [_deps/spdlog-build/CMakeFiles/spdlog.dir/build.make:76: _deps/spdlog-build/CMakeFiles/spdlog.dir/src/spdlog.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:1113: _deps/spdlog-build/CMakeFiles/spdlog.dir/all] Error 2
make: *** [Makefile:166: all] Error 2

I am able to build cuhornet if I use rmm-23.08.00 from releases.

@harrism
Member

harrism commented Nov 28, 2023

For whatever reason, the version of fmt that cmake is finding on your system is too old for the version of spdlog. Make sure you start from a clean build directory if you changed from RMM 23.08 to 23.10.
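
A minimal sketch of what "start from a clean build directory" can mean in practice: remove the whole directory so that no stale `CMakeCache.txt` or `CMakeFiles/` from an earlier configure survives. `fresh_build_dir` is a hypothetical helper, and it assumes the build directory holds only generated files that are safe to delete.

```bash
#!/usr/bin/env bash
# Sketch: recreate a build directory from scratch so that cached CMake
# results from a previous configure are discarded.
# fresh_build_dir is a hypothetical helper name.

# Usage: fresh_build_dir <build_dir>
fresh_build_dir() {
  local dir="$1"
  # Refuse obviously dangerous arguments.
  if [[ -z "$dir" || "$dir" == "/" ]]; then
    echo "fresh_build_dir: refusing to remove '$dir'" >&2
    return 1
  fi
  rm -rf -- "$dir"     # drops CMakeCache.txt, CMakeFiles/, _deps/, ...
  mkdir -p -- "$dir"   # leave an empty directory ready for configuring
}

# Typical use from the source root (not executed here):
#   fresh_build_dir build
#   cmake -S . -B build
#   cmake --build build -j
```

Wiping the directory is blunt but removes any doubt about which cached package locations CMake is reusing.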

@bdice
Contributor

bdice commented Nov 28, 2023

It looks like rmm isn't being handled as a full CMake package, but only as an include directory. The rmm package declares exports of various other libraries, including fmt, that are needed to make rmm function.

I recommend taking this (old) code from cuhornet: https://github.com/rapidsai/cuhornet/blob/ab70d14a562bdaa950e820b412fe827c570c0ca3/compiler-util/CMakeLists.txt#L124

and replacing it with what cuDF is doing:

https://github.com/rapidsai/cudf/blob/5e58e71836fd69ead04fbed5fdccb5e2e2c4d95c/cpp/CMakeLists.txt#L190

https://github.com/rapidsai/cudf/blob/5e58e71836fd69ead04fbed5fdccb5e2e2c4d95c/cpp/cmake/thirdparty/get_rmm.cmake#L16-L24

You may need rapids-cmake to make this code work. See: https://github.com/rapidsai/rapids-cmake

@wolfram77
Author

I built and installed rmm 23.10 to ~/.local, and set RMM_INCLUDE to ~/.local/include/rmm while building cuhornet. This worked (on a different system). @bdice The steps are a bit complicated for me, at least for now. While building cuhornet I get a warning (which is treated as an error) when compiling triangle2.cu, so I disabled its build in the cmake file.

The error is below:

/home/subhajit/Documents/Test-cuhornet/cuhornet/hornetsnest/src/Static/TriangleCounting/triangle2.cu: In member function ‘hornets_nest::triangle_t hornets_nest::TriangleCounting2::countTriangles()’:
/home/subhajit/Documents/Test-cuhornet/cuhornet/hornetsnest/src/Static/TriangleCounting/triangle2.cu:140:5: error: ‘void free(void*)’ called on pointer returned from a mismatched allocation function [-Werror=mismatched-new-delete]
  140 |     free(h_triPerVertex);
      |     ^~~~~~~~~~~~~~~~
/home/subhajit/Documents/Test-cuhornet/cuhornet/hornetsnest/../primitives/StandardAPI.i.hpp:140:12: note: returned from ‘void* operator new [](std::size_t)’
  140 |     pointer = new T[num_items];
      |           ~^~~~~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors
make[2]: *** [CMakeFiles/hornetAlg.dir/src/Static/TriangleCounting/triangle2.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/hornetAlg.dir/all] Error 2
make: *** [all] Error 2

Thank you all for the help.

@harrism
Member

harrism commented Nov 28, 2023

I don't want to discourage you @wolfram77, but see rapidsai/cuhornet#65, which adds:

NOTE: The cuhornet repository is a copy of https://github.com/hornet-gt/hornet that is being maintained by the RAPIDS
cugraph team while we use it in our library.  We currently only use headers to provide the ktruss implementation.

This library does not currently build.  Since we only use headers, we are not maintaining the build processes for
this library.  We expect to drop support for this entirely in early 2024.

I think you may get better support by posting a question issue on the rapidsai/cugraph repo, but I think cuHornet is approaching its end of life.

@wolfram77
Author

I am considering cuhornet as a benchmark for my current report, and will switch to cugraph for the next work. Also thanks for the consolidation effort at NVIDIA. It is good for both users and researchers alike.

@harrism
Member

harrism commented Nov 29, 2023

Much appreciated. Can we close this issue?

5 participants