FELTOR (Full-F ELectromagnetic code in TORoidal geometry) is both a numerical library and a scientific software package built on top of it.
Its main physical targets are plasma edge and scrape-off layer (gyro-)fluid simulations. The numerical methods centre around discontinuous Galerkin methods on structured grids. Our core level functions are parallelized for a variety of hardware, from multi-core CPUs to hybrid MPI+GPU systems, so the same code runs efficiently on each of these platforms. Note that the library ships with a multitude of test and benchmark programs.
This guide discusses how to set up, test and benchmark the FELTOR library installation. Please read it before you proceed to the user guide to learn how to use the library in your own programs. We also present how to use the FELTOR software package, which requires additional external libraries to be installed on the system.
If you run into trouble during the setup, there is a chance that we already know of the problem and have an entry in the troubleshooting page. If you cannot find a solution there, please inform us of the issue.
The first part assumes a Linux operating system. If you want to work on Windows, jump to Using Feltor on Windows.
Open a terminal and clone the repository into any folder you like
git clone https://www.github.com/feltor-dev/feltor
You also need to clone thrust and cusp, both distributed under the Apache-2.0 license, as well as Agner Fog’s vcl library (also Apache 2.0). So again in a folder of your choice
git clone https://www.github.com/nvidia/thrust
git clone https://www.github.com/cusplibrary/cusplibrary
git clone https://www.github.com/vectorclass/version1 vcl
# We need to checkout an older version of thrust compatible with cusp
cd thrust
git checkout 1.9.3
Our code only depends on external libraries that are themselves openly available. We use version1 of the vectorclass library, as version2 requires C++-17 and does not work with the Intel compiler.
| | Minimum system requirements | Recommended system requirements |
|---|---|---|
| CPU | Any | support for AVX and FMA instruction set |
| Compiler | gcc-5.1 or msvc-15 or icc-17.0 (C++-14 standard) | gcc-9.3, OpenMP-4 support, avx, fma instruction set flags |
| GPU | - | NVidia GPU with compute-capability > 6 and nvcc-11.0 |
| MPI | - | mpi installation compatible with compiler (ideally cuda-aware in case hybrid MPI+GPU is the target system) |
Our GPU backend uses the Nvidia-CUDA programming environment. In order to compile and run a program for a GPU, a user needs at least the nvcc-8.0 compiler (available free of charge) and an NVidia GPU. However, we explicitly note here that due to the modular design of our software a user needs neither a GPU nor the nvcc compiler. The CPU version of the backend is equally valid and provides the same functionality. Analogously, an MPI installation is only required if the user targets a distributed memory system.
In order to compile one of the many test and benchmark codes inside the FELTOR library you need to tell the FELTOR configuration where the external libraries are located on your computer. The default way to do this is to go into your HOME directory, make an include directory and link the paths in this directory:
cd ~
mkdir include
cd include
ln -s path/to/thrust/thrust # Yes, thrust is there twice!
ln -s path/to/cusplibrary/cusp
ln -s path/to/vcl
If you do not like this, you can also set the include paths in your own config file as described here.
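The linking step can be tried out safely before touching your HOME directory. The following is only a sketch: `link_deps` is a hypothetical helper (not part of FELTOR), and the scratch folder with empty stand-in directories is an assumption made so that the example runs anywhere.

```shell
# Sketch: create the three links from the text inside an include folder.
# link_deps is a hypothetical helper, not part of FELTOR.
link_deps() {
    local src="$1" inc="$2"   # src: folder holding the clones, inc: include folder
    mkdir -p "$inc"
    ln -s "$src/thrust/thrust"    "$inc/thrust"   # thrust is there twice
    ln -s "$src/cusplibrary/cusp" "$inc/cusp"
    ln -s "$src/vcl"              "$inc/vcl"
}

# Dry run in a scratch directory with empty stand-ins for the real clones
scratch="$(mktemp -d)"
mkdir -p "$scratch/src/thrust/thrust" "$scratch/src/cusplibrary/cusp" "$scratch/src/vcl"
link_deps "$scratch/src" "$scratch/include"
ls -l "$scratch/include"
```

Once the listing looks right, the same function could be pointed at the real clone folder and at `~/include`.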
Now let us compile the first benchmark program.
cd path/to/feltor/inc/dg
make blas_b device=cpu #(for a single thread CPU version)
make blas_b device=omp #(for an OpenMP version)
make blas_b device=gpu #(if you have a GPU and nvcc )
Run the code with ./blas_b
and when prompted for input vector sizes type for example 3 100 100 10
which makes a grid with 3 polynomial coefficients, 100 cells in x, 100 cells in y and 10 in z. If you compiled for OpenMP, you can set the number of threads with e.g. export OMP_NUM_THREADS=4
This benchmark program measures the performance of the elemental functions the library is built on. Go ahead and vary the input parameters and see how your hardware performs. You can compile and run any other program that ends in _t.cu (test programs) or _b.cu (benchmark programs) in feltor/inc/dg in this way.
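To get a feeling for the problem size behind the input 3 100 100 10, the number of grid points can be estimated with a few lines of shell arithmetic. This is a rough sketch under an assumption that is not stated in this guide: the polynomial coefficients are taken to apply per cell in x and in y, with a single point per cell in z.

```shell
# Assumed layout (not confirmed by this guide):
# n coefficients per cell in x and in y, one point per cell in z.
n=3; Nx=100; Ny=100; Nz=10
points=$(( n * n * Nx * Ny * Nz ))
bytes=$(( points * 8 ))            # 8 bytes per double precision value
echo "grid points: $points"
echo "bytes per vector: $bytes"
```

Doubling Nx or Ny doubles the vector size, which is a convenient way to scan how the benchmark timings scale on your hardware.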
Now, let us test the MPI setup.
You can of course skip this if you don’t have MPI installed on your computer. If you intend to use the MPI backend, an implementation library of the MPI standard is required. Per default mpic++ is used for compilation.
cd path/to/feltor/inc/dg
make blas_mpib device=cpu # (for MPI+CPU)
# or
make blas_mpib device=omp # (for MPI+OpenMP)
# or
make blas_mpib device=gpu # (for MPI+GPU, requires CUDA-aware MPI installation)
Run the code with $ mpirun -n '# of procs' ./blas_mpib
then tell how many processes you want to use in the x-, y- and z-direction, for example: 2 2 1 (i.e. 2 procs in x, 2 procs in y and 1 in z; the total number of procs is 4). When prompted for input vector sizes type for example 3 100 100 10 (the number of cells divided by the number of procs must be an integer in each direction). If you compiled for MPI+OpenMP, you can set the number of OpenMP threads with e.g. export OMP_NUM_THREADS=2
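The divisibility rule can be checked before launching the benchmark. The helper below is a minimal sketch; `check_decomposition` is a hypothetical name and not something FELTOR provides. It prints the total number of processes to pass to `mpirun -n` when the process grid fits, and fails otherwise.

```shell
# check_decomposition is a hypothetical helper, not part of FELTOR.
# Arguments: npx npy npz Nx Ny Nz
check_decomposition() {
    npx=$1; npy=$2; npz=$3; Nx=$4; Ny=$5; Nz=$6
    if [ $(( Nx % npx )) -ne 0 ] || [ $(( Ny % npy )) -ne 0 ] || [ $(( Nz % npz )) -ne 0 ]; then
        echo "cells are not divisible by the process grid" >&2
        return 1
    fi
    echo $(( npx * npy * npz ))    # total number of processes for mpirun -n
}

check_decomposition 2 2 1 100 100 10    # prints 4
check_decomposition 3 2 1 100 100 10 || echo "rejected: 100 cells do not split over 3 processes"
```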
Now, we want to compile and run a simulation program. To this end, we have to download and install some additional libraries for I/O-operations.
First, we need to install jsoncpp (distributed under the MIT License), which on Linux is available as libjsoncpp-dev through the package management system.
For a manual build check the instructions on JsonCpp.
# You may have to manually link the include path
cd ~/include
ln -s /usr/include/jsoncpp/json
For data output we use the NetCDF-C library under an MIT-like license (we use the netcdf-4 file format). The underlying HDF5 library also uses a very permissive license. Both can be installed easily on Linux through the libnetcdf-dev and libhdf5-dev packages.
For a manual build follow the build instructions in the netcdf-documentation.
Note that by default we use the serial netcdf and hdf5 libraries also in the MPI versions of applications.
Some desktop applications in FELTOR use the draw library (developed by us, also under MIT), which depends on glfw3, an OpenGL development library under a BSD-like license. There is a libglfw3-dev package for convenient installation. Again, link path/to/draw in the include folder.
If you are on an HPC cluster, you may need to set INCLUDE and LIB variables manually. For details on how FELTOR’s Makefiles are configured please see the config file. There are also examples of some existing Makefiles in the same folder.
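Before compiling a simulation program it can save time to confirm that every entry mentioned so far is present in the include folder. The following sketch uses a hypothetical helper `missing_includes` (not part of FELTOR) and assumes the entry names thrust, cusp, vcl, json and draw used in this guide.

```shell
# missing_includes is a hypothetical helper, not part of FELTOR.
# It prints the entries of this guide that are absent from an include folder.
missing_includes() {
    inc="$1"
    missing=""
    for name in thrust cusp vcl json draw; do
        [ -e "$inc/$name" ] || missing="$missing $name"
    done
    echo "${missing# }"            # strip the leading blank
}

# Example with a scratch folder that holds only the library dependencies
demo="$(mktemp -d)"
mkdir "$demo/thrust" "$demo/cusp" "$demo/vcl"
echo "missing: $(missing_includes "$demo")"   # prints "missing: json draw"
```

Calling `missing_includes ~/include` would perform the same check on the default location.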
We are now ready to compile and run a simulation program
cd path/to/feltor/src/toefl # or any other project in the src folder
make toefl device=gpu # (compile for gpu, cpu or omp)
cp input/default.json inputfile.json # create an inputfile
./toefl inputfile.json # (behold a live simulation with glfw output on screen)
# or
make toefl_hpc device=gpu # (compile for gpu, cpu or omp)
cp input/default_hpc.json inputfile_hpc.json # create an inputfile
./toefl_hpc inputfile_hpc.json outputfile.nc # (a single node simulation with output stored in a file)
# or
make toefl_mpi device=omp # (compile for gpu, cpu or omp)
export OMP_NUM_THREADS=2 # (set OpenMP thread number to 1 for pure MPI)
echo 2 2 | mpirun -n 4 ./toefl_mpi inputfile_hpc.json outputfile.nc
# (a multi node simulation with now in total 8 threads with output stored in a file)
# The mpi program will wait for you to type the number of processes in x and y direction before
# running. That is why the echo is there.
Default input files are located in path/to/feltor/src/toefl/input. All three programs solve the same equations. The technical documentation on what equations are discretized, input/output parameters, etc. can be generated as a pdf with make doc in the path/to/feltor/src/toefl directory.
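The MPI launch line above ties three numbers together: the process grid typed into the program, the `-n` argument of `mpirun`, and the OpenMP thread count. A small dry-run sketch makes the relation explicit. `launch_line` is a hypothetical helper that only prints the commands and does not start anything.

```shell
# launch_line is a hypothetical helper; it only prints the commands.
# Arguments: processes in x, processes in y, OpenMP threads per process
launch_line() {
    npx=$1; npy=$2; threads=$3
    procs=$(( npx * npy ))
    echo "export OMP_NUM_THREADS=$threads"
    echo "echo $npx $npy | mpirun -n $procs ./toefl_mpi inputfile_hpc.json outputfile.nc"
    echo "# total threads: $(( procs * threads ))"
}

launch_line 2 2 2
```

For 2 x 2 processes with 2 threads each this reproduces the command shown above and reports 8 threads in total.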
FELTOR’s library is the dg-library (from discontinuous Galerkin). Note that the library is header-only, which means that you just have to include the relevant header(s) and you’re good to go. For example in the following program we compute the square L2 norm of a function:
#include <iostream>
#include <cmath>
//include the basic dg-library
#include "dg/algorithm.h"
double function(double x, double y){return exp(x)*exp(y);}
int main()
{
    //create a 2d discretization of [0,2]x[0,2] with 3 polynomial coefficients
    dg::CartesianGrid2d g2d( 0, 2, 0, 2, 3, 20, 20);
    //discretize a function on this grid
    const dg::DVec x = dg::evaluate( function, g2d);
    //create the volume element
    const dg::DVec vol2d = dg::create::volume( g2d);
    //compute the square L2 norm on the device
    double norm = dg::blas2::dot( x, vol2d, x);
    // norm is now: (exp(4)-exp(0))^2/4
    std::cout << norm <<std::endl;
    return 0;
}
To compile and run this code for a GPU use (assuming the external libraries are linked in the include
folder as described above)
nvcc -x cu -std=c++14 --extended-lambda -Ipath/to/feltor/inc -Ipath/to/include test.cpp -o test
Or if you want to use OpenMP and gcc instead of CUDA for the device functions you can also use
g++ -std=c++14 -fopenmp -mavx -mfma -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_OMP -Ipath/to/feltor/inc -Ipath/to/include test.cpp -o test
If you do not want any parallelization, you can use a single thread version
g++ -std=c++14 -mavx -mfma -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_CPP -Ipath/to/feltor/inc -Ipath/to/include test.cpp -o test
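As a quick check on the example program, the value quoted in its comment can be derived by hand. The short derivation below only uses the function and the domain [0,2]x[0,2] from the example; the decimal value is rounded, and the program output should agree with it up to discretization error.

```latex
\[
\|f\|_{L_2}^2
  = \int_0^2\!\!\int_0^2 \left(e^{x}e^{y}\right)^2 \,dx\,dy
  = \left(\int_0^2 e^{2x}\,dx\right)^2
  = \left(\frac{e^{4}-e^{0}}{2}\right)^2
  = \frac{\left(e^{4}-e^{0}\right)^2}{4}
  \approx 718.19
\]
```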
If you want to use mpi, just include the MPI header before any other FELTOR header and use our convenient typedefs like so:
#include <iostream>
#include <cmath>
//activate MPI in FELTOR
#include "mpi.h"
#include "dg/algorithm.h"
double function(double x, double y){return exp(x)*exp(y);}
int main(int argc, char* argv[])
{
    //init MPI and create a 2d Cartesian Communicator assuming 4 MPI processes
    MPI_Init( &argc, &argv);
    int periods[2] = {true, true}, np[2] = {2,2};
    MPI_Comm comm;
    MPI_Cart_create( MPI_COMM_WORLD, 2, np, periods, true, &comm);
    //create a 2d discretization of [0,2]x[0,2] with 3 polynomial coefficients
    dg::CartesianMPIGrid2d g2d( 0, 2, 0, 2, 3, 20, 20, comm);
    //discretize a function on this grid
    const dg::MDVec x = dg::evaluate( function, g2d);
    //create the volume element
    const dg::MDVec vol2d = dg::create::volume( g2d);
    //compute the square L2 norm
    double norm = dg::blas2::dot( x, vol2d, x);
    //on every process norm is now: (exp(4)-exp(0))^2/4
    //be a good MPI citizen and clean up
    MPI_Finalize();
    return 0;
}
Compile e.g. for a hybrid MPI + OpenMP hardware platform with
mpic++ -std=c++14 -mavx -mfma -fopenmp -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_OMP -Ipath/to/feltor/inc -Ipath/to/include test_mpi.cpp -o test_mpi
mpirun -n 4 ./test_mpi
Note the striking similarity to the previous program. Especially the line calling the dot function did not change at all. The compiler chooses the correct implementation for you! This is a first example of a container free numerical algorithm.
In order to simplify compilation in your own project we suggest using a Makefile that imports the FELTOR configuration like so:
device=omp # default device
# use Feltor's configuration
include $(FELTOR_PATH)/config/default.mk
include $(FELTOR_PATH)/config/*.mk
include $(FELTOR_PATH)/config/devices/devices.mk
# See feltor/config/README for a list of all defined variables
# For example here the feltor/src/toefl project is set up
# The toefl.cpp program can be compiled in three different ways
# and each can be compiled for various device values e.g.
# make toefl device=gpu
all: toefl toefl_hpc toefl_mpi
# shared memory version using glfw, jsoncpp and netcdf
toefl: toefl.cpp toefl.h parameters.h diag.h
# shared memory version using jsoncpp and netcdf without glfw
toefl_hpc: toefl.cpp toefl.h parameters.h diag.h
$(CC) $(OPT) $(CFLAGS) $< -o $@ $(INCLUDE) $(LIBS) $(JSONLIB) -g
# mpi version using jsoncpp and netcdf without glfw
toefl_mpi: toefl.cpp toefl.h parameters.h diag.h
.PHONY: clean
clean:
	rm -f toefl toefl_hpc toefl_mpi
FELTOR has been developed mostly on Linux machines. Recently, it has also become possible to develop on Windows using Microsoft Visual Studio. Here we describe how to work with FELTOR’s OpenMP shared memory backend on Windows.
Unfortunately, the msvc compiler only supports an outdated OpenMP version, so expect a performance penalty of approximately a factor of 2 when running the OpenMP backend on Windows.
We suggest installing GitHub Desktop from https://desktop.github.com.
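Please clone all four of the following URLs using File → Clone repository…
https://www.github.com/feltor-dev/feltor
https://www.github.com/nvidia/thrust # Checkout thrust to 1.9.3
https://www.github.com/cusplibrary/cusplibrary
https://www.github.com/vectorclass/version1 # local path "vcl"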
Please also have a look at the relevant system requirements Table.
In Visual Studio we suggest creating a Property Sheet for FELTOR. The Property Sheet can then be conveniently added to any project that includes the FELTOR library headers dg/algorithm.h and/or dg/geometries/geometries.h
- Open an existing solution in Visual Studio or create a new project with File → New → Project … selecting Empty Project in Visual C++.
- In the Solution Explorer change to the Property Manager tab (you may have to google how to activate it if it is not there), then click on Add New Project Property Sheet, name it FeltorPropertySheet.props and save it to a convenient location.
- Double click on the new Property Sheet (expand your solution and any of the Debug or Release tabs to find it)
  - VC++ Directories → Include Directories: click on Edit, then add the four lines path\to\feltor\inc
  - C/C++ → Optimization → Enable Intrinsic Functions: select Yes (/Oi)
  - C/C++ → Preprocessor → Preprocessor Definitions (Selects the CPU backend in FELTOR)
  - C/C++ → Code Generation → Enable Enhanced Instruction Set: select Advanced Vector Extensions 2 (/arch:AVX2) (If your CPU supports it, of course)
Don’t forget to click
in the end.
That’s it.
You can add your Feltor Property Sheet to any new project by switching to the Property Manager, clicking Add Existing Property Sheet and selecting FeltorPropertySheet.
We suggest that you generate a new project for each executable program.
In order to test the Feltor Property Sheet let us add a source file to the project and compile.
- In the Solution Explorer right click on Source Files → Add → New Item … → C++ File (.cpp). As an example we name it test.cpp and copy the contents of test.cpp into it.
- Change the Platform from x86 to x64.
- Compile with Ctrl + F5 then run the code
If you want to prevent the console from closing on program exit, set
Properties → Linker → System → SubSystem → Console (/SUBSYSTEM:CONSOLE)
in your Property Sheet.
Our simulation codes typically depend on jsoncpp for parameter input, glfw3 for plotting or netcdf-4 for file output and come with a LaTeX file containing documentation. You will need to download these additional libraries and adapt the project properties accordingly.
- jsoncpp
Download and install Anaconda (to get a working python3 installation).
In Github desktop:
File → Clone repository…
Execute the file
. The only way to confirm its success is to look for a dist
folder containing jsoncpp.cpp
and a folder containing two header files. -
to Properties → VC++ Directories → Include Directories
In the Solution Explorer right click
Source Files → Add → Existing Item
and select path\to\jsoncpp\dist\jsoncpp.cpp
- Glfw3
In Github desktop:
File → Clone repository…
Download and extract the Windows binaries from https://www.glfw.org/download.html
to Properties → VC++ Directories → Include Directories
Properties → Linker → General → Additional Library Directories
Finally, in
Properties → Linker → Input → Additional Dependencies
add the lines glfw3.lib
(there needs to be a newline in between!)
- NetCDF
Download and install the
package from https://www.unidata.ucar.edu/downloads/netcdf/index.jsp (make sure to check "Add netCDF to system PATH" during the installation process) -
to Properties → VC++ Directories → Include Directories
Properties → Linker → General → Additional Library Directories
Finally, in
Properties → Linker → Input → Additional Dependencies
add the line netcdf.lib
- LaTeX
Install MiKTeX and TeXstudio (in that order) to be able to compile the tex file(s) of the documentation.
This error comes from the unmaintained cusp library, which is incompatible with the latest thrust version on GitHub. Currently, you can either go back to version 1.9.3 in thrust:
cd path/to/thrust
git checkout 1.9.3
or alternatively there is a fix in cusp that can be accessed via
cd path/to/cusplibrary
git checkout cuda10
