convert CTU hydro subroutines to C++ #525
Comments
some easy targets to start with:
|
we need the |
@zingale we are trying to build a Fortran community and tooling (https://fortran-lang.org) with the aim of eventually fixing some of the reasons why people currently have to move to C++. That is a long-term, multi-year effort. I was wondering if you would share with us some of the things that Fortran should fix, from your point of view, so that you could continue using it instead of rewriting in C++? It seems the main issues are portability and GPUs. For the portability part, I would be interested in more details, e.g. which platforms and compilers don't work well. For the GPU part I think I can guess: Fortran currently does not have a good solution besides CUDA Fortran, which only works with some GPUs and some compilers. That's something I would like to help fix with LFortran down the road. |
Hi Ondrej, Castro builds off of AMReX which is written in C++ and provides a lot of high-level abstractions that make it easy to run on CPUs or GPUs. Initially we built our own method to offload Fortran routines to GPUs, but it is too much to maintain, so we chose to move to C++ so we can take advantage of the features of AMReX and lessen our own development burden. |
Thanks Michael. Yes, we use AMReX also for some of our stuff at LANL (we also have it in Fortran). It seems the conclusion here is that Fortran compilers and related tooling maintained by the community (as opposed to by the Castro team) should handle the GPU offload, robustly and painlessly. That would go a long way. That might not fully fix all the issues with codes like Castro because my own experience with projects that use multiple languages (whether C++ and Python, or C/Cython/Python or C++ and Fortran) is that it adds lots of complexity and requires people to know both languages well, as well as how they interact. And it's much easier to just stick to one language (say C++) and do everything in it. So perhaps part of the story where Fortran can improve is also easier interaction with C++ libraries, so that projects like Castro don't have to maintain complicated wrappers, and if things can simplify both for AMReX developers as well as their users, then perhaps it would be more viable to use Fortran with AMReX. |
Yes, improved interoperability between C++ and Fortran would go a long way toward improving what we were trying to do. The use case of creating GPU kernels from C++ and calling Fortran functions on a per-element basis as device functions did not work very efficiently for us, in part because we were gluing together two programming models (CUDA C++ and CUDA Fortran) whose design principles did not include this paradigm. |
However, there are some programming methods that are just inherently easier to implement in C++. The case we care most about is that we have certain physics functions that are evaluated on a per-element basis across all of the elements in our computational domain, and these functions dominate the computational expense in certain cases. In Fortran these are most naturally expressed as module functions, allowing no opportunity to inline and thus efficiently optimize these function calls. In C++ we can (and did) implement these functions as a header-only library, allowing for efficient inlining. It's not clear to us that this is important on traditional CPU platforms, but we believe it is a major issue for GPU code. If link-time optimization had been available that was approximately as efficient as what we would get from direct inlining of the C++ source, that would also have gone a long way toward making our Fortran implementation viable. |
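As a rough illustration of the header-only pattern described above, here is a minimal sketch. It is not taken from Castro or AMReX: the function names `gamma_law_pressure` and `compute_pressure` and the ideal-gas closure are assumptions chosen only to show a per-element function whose body is visible at every call site, so the compiler is free to inline it.

```cpp
// Hypothetical header-only sketch; names are illustrative, not Castro or AMReX API.
#include <cassert>
#include <cstddef>
#include <vector>

// Per-element physics function, defined inline in the header so the compiler
// sees its body at every call site and can inline it.
// Assumption: an ideal-gas (gamma-law) closure, p = (gamma - 1) * rho * e.
inline double gamma_law_pressure(double rho, double e, double gamma = 5.0 / 3.0) {
    return (gamma - 1.0) * rho * e;
}

// Evaluate the per-element function across every element of the domain.
inline void compute_pressure(const std::vector<double>& rho,
                             const std::vector<double>& e,
                             std::vector<double>& p) {
    assert(rho.size() == e.size());  // inputs must describe the same elements
    p.resize(rho.size());
    for (std::size_t i = 0; i < rho.size(); ++i) {
        p[i] = gamma_law_pressure(rho[i], e[i]);
    }
}
```

Because both functions live entirely in a header, any translation unit that includes it can inline the inner call without relying on link-time optimization; whether a given compiler actually does so depends on its optimization settings.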
@maxpkatz thank you for this feedback. As a Fortran user, I share pretty much the same frustrations. Regarding inlining, why couldn't Fortran compilers simply inline module functions where it makes sense? I think link-time inlining is too late. I don't know, I just see a lot of opportunities to do things vastly better. I started a new compiler called LFortran (https://lfortran.org/). I am not yet at the point of tackling some of these issues, but it is my plan to do that. I will keep you updated. |
For best portability, especially with GPUs, we should start converting the various hydro routines over to C++. We can do this piece by piece, test them one-by-one, and merge.