Unable to build v1.8.x with cpp_package in VS2019 #20099
Comments
Welcome to Apache MXNet (incubating)! We are on a mission to democratize AI, and we are glad that you are contributing to it by opening this issue.
Please try using the settings in https://github.com/apache/incubator-mxnet/blob/master/ci/build_windows.py
@leezu Will (should) it work if I add USE_CPP_PACKAGE=1 to that script?
Let's try that. Would you like to open a PR targeting the |
I tried to build with the settings in https://github.com/apache/incubator-mxnet/blob/master/ci/build_windows.py
After 1800000 more lines, the build log ends with ninja: build stopped: subcommand failed. A quick search for the words "error" or "failed" gives this:
There's also the same error message for bilinear_sampler.cu.obj and deformable_psroi_pooling.cu.obj. Can you advise on the next step?
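For a log of that size, the search for "error" or "failed" mentioned above can be scripted. The following is only a minimal sketch: the function name `find_failures` and the exact pattern are assumptions made for illustration, not part of the MXNet tooling.

```python
import re
from typing import Iterable, Iterator, Tuple

# Assumed pattern: the two search words used in the thread, matched as
# whole words and case-insensitively (ninja prints "FAILED:" in capitals).
PATTERN = re.compile(r"\b(error|failed)\b", re.IGNORECASE)


def find_failures(lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, text) for each line containing 'error' or 'failed'."""
    for number, line in enumerate(lines, start=1):
        if PATTERN.search(line):
            yield number, line.rstrip("\n")


# Hypothetical usage on a saved log; the file name is an assumption:
# with open("build.log", encoding="utf-8", errors="replace") as log:
#     for number, text in find_failures(log):
#         print(f"{number}: {text}")
```

Because the function iterates line by line, it does not need to load the whole log into memory.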
Please check the failing files in the v1.x / master branch. There might be fixes that need to be backported for v1.8.x to work.
Unfortunately I'm still facing the same errors. Perhaps something is wrong with the CUDA installation on my machine, or with the CUDA integration with Visual Studio. Are you actually cross-compiling MXNet in a Docker container? I want to try building this way now, but I don't quite understand what exactly I should do.
You could try updating to the latest CUDA and the latest MSVC. We don't cross-compile for Windows.
@leezu Turns out the solution was not to update to the latest MSVS but to downgrade to the previous version: https://devtalk.blender.org/t/cuda-compile-error-windows-10/17886/5
A quick search led to this issue, NVIDIA/thrust#1090, but I didn't find any Thrust headers in util_type.cuh. I'm also not sure if the error depends on the USE_CPP_PACKAGE=ON setting, so I opened a PR just to test it: #20108
Btw, what is the exact version of MSVS that CI uses?
This is a bug in the CUDA-MSVC integration and should be solved by upgrading to CUDA 11. Note that this issue is not Thrust-specific, but a bug in the nvcc compiler. That's why it occurs for you even without Thrust. For more details, see NVIDIA/thrust#1090. @josephevans may know the precise version. Alternatively, I think you can look at the compilation log of the CI build, which should contain a line with the version number.
@leezu Yes, building with CUDA 11 did help. Maybe the CUDA version in build_windows.py should reflect that? Anyway, I'm too lazy for another PR.
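Since the thread concludes that CUDA 11 resolves the failure, a build script could check the toolkit version before configuring. This is a rough sketch, not something taken from build_windows.py: the helper names are hypothetical, and it assumes the `nvcc --version` output contains a fragment of the form `release 10.2`, as it typically does.

```python
import re
from typing import Optional, Tuple


def parse_nvcc_release(version_output: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from text such as '... release 10.2, V10.2.89'."""
    match = re.search(r"release (\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version_output: str, minimum: Tuple[int, int] = (11, 0)) -> bool:
    """Return True if the reported release is at least `minimum`.

    The (11, 0) default reflects the conclusion of this thread only.
    """
    release = parse_nvcc_release(version_output)
    return release is not None and release >= minimum


# Hypothetical usage; requires nvcc on PATH:
# import subprocess
# text = subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout
# if not meets_minimum(text):
#     print("CUDA older than 11 detected; the nvcc/MSVC bug discussed here may apply.")
```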
Description
mxnet_60 project fails to build
Error Message
Lots of them but only of two kinds:
Code Description Project File Line
The rest of the errors are the same except for the name of the function or file
Steps to reproduce
-T cuda=10.2,host=x64^
-DCMAKE_POLICY_DEFAULT_CMP0104=NEW^
-DCMAKE_BUILD_TYPE=Release^
-DUSE_CPP_PACKAGE=1^
-DUSE_CUDA=1^
-DUSE_CUDNN=1^
-DUSE_OPENCV=0^
-DUSE_OPENMP=1^
-DUSE_BLAS=mkl^
-DMKL_ROOT="C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2020\windows\mkl"^
-DMXNET_CUDA_ARCH=6.0;6.2;7.0;7.2;7.5^
-DCUDNN_INCLUDE="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\include"^
-DCUDNN_LIBRARY="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\lib\x64\cudnn.lib"^
-S .\src -B .\build
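The same options can also be assembled as an argument list in Python, which sidesteps the `^` line continuations and the quoting of semicolons and spaces in cmd. This is only a sketch: `cmake_definitions` is a hypothetical helper, the generator argument is not shown in the report and is left out here, and the path-valued options (MKL_ROOT, CUDNN_INCLUDE, CUDNN_LIBRARY) are omitted for brevity.

```python
from typing import Dict, List


def cmake_definitions(options: Dict[str, str]) -> List[str]:
    """Turn {"USE_CUDA": "1"} into ["-DUSE_CUDA=1"], keeping insertion order."""
    return [f"-D{name}={value}" for name, value in options.items()]


# Values copied from the command in this report.
options = {
    "CMAKE_POLICY_DEFAULT_CMP0104": "NEW",
    "CMAKE_BUILD_TYPE": "Release",
    "USE_CPP_PACKAGE": "1",
    "USE_CUDA": "1",
    "USE_CUDNN": "1",
    "USE_OPENCV": "0",
    "USE_OPENMP": "1",
    "USE_BLAS": "mkl",
    "MXNET_CUDA_ARCH": "6.0;6.2;7.0;7.2;7.5",
}

args = [
    "cmake",
    "-T", "cuda=10.2,host=x64",
    *cmake_definitions(options),
    "-S", r".\src",
    "-B", r".\build",
]

# Hypothetical invocation; passing a list avoids cmd.exe interpreting the
# arguments:
# import subprocess
# subprocess.run(args, check=True)
```

Keeping the options in a dictionary makes it easy to change a single value, such as the toolset version in `-T`, without editing a long command line.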
What have you tried to solve it?
Version 16.9.2
VisualStudio.16.Release/16.9.2+31112.23
Microsoft .NET Framework
Version 4.8.04084
Installed Version: Community
Visual C++ 2019 00435-60000-00000-AA092
Microsoft Visual C++ 2019
NVIDIA CUDA 10.2 Wizards 10.2
NVIDIA Nsight Visual Studio Edition - CUDA support 2019.4.0.19274
Actually, that's how I wanted to build it initially, but I got a
compiler is out of heap space in pass 2
error for the same project, which I couldn't resolve by setting 'Link Time Code Optimization' to '/LTCG:PGOptimize' (as suggested somewhere here) because VS can't find the mxnet_60.pgd file. However, I've successfully built a version without CUDA with VS 2019.
Environment
----------System Info----------
Platform : Windows-10-10.0.19041-SP0
system : Windows
node : Slepcov-PC
release : 10
version : 10.0.19041
----------Hardware Info----------
machine : AMD64
processor : Intel64 Family 6 Model 165 Stepping 5, GenuineIntel
Name
Intel(R) Core(TM) i7-10700 CPU @ 2.90GHz
----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0338 sec, LOAD: 0.5731 sec.
Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.2435 sec, LOAD: 0.3744 sec.
Error open Gluon Tutorial(cn): https://zh.gluon.ai, <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1108)>, DNS finished in 0.18741607666015625 sec.
Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.1526 sec, LOAD: 0.7710 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0227 sec, LOAD: 0.7252 sec.
Error open Conda: https://repo.continuum.io/pkgs/free/, HTTP Error 403: Forbidden, DNS finished in 0.14266681671142578 sec.
----------Environment----------
MXNET_CUDA="D:\Recognition Technologies\AvtoUragan ver 3.8\ExtLibs\MxNet\Win64"