1. How to build and install galario

1.1. Operating system

galario runs on Linux and Mac OS X. Windows is not supported.

1.2. Quickest installation: using conda

By far the easiest way to install galario is via conda. If you are new to conda, you may want to start with the minimal miniconda. With conda all dependencies are installed automatically and you get access to galario’s C++ core and python bindings, both with support for multithreading. To install galario:

conda install -c conda-forge galario

To create a conda environment for galario, see Section 1.4, step 2.

Due to technical limitations, the conda package does not support GPUs at the moment. If you want to use a GPU, read on as you have to build galario by hand.

1.3. Build requirements

To compile galario you will need:

  • a working internet connection (to download 1.5 MB of an external library)

  • either g++>=4.8.1 or clang++>=3.3 with full support of C++11. To use multiple threads, the compiler has to support openMP

  • cmake: download from the cmake website or install with conda install -c conda-forge cmake

  • make

  • the FFTW libraries, for the CPU version: more details are given below

  • [optional] the CUDA toolkit >=8.0 for the GPU version: it can be easily installed from the NVIDIA website

  • [optional] Python and numpy for Python bindings to the CPU and GPU


If you want to use the GNU compilers on Mac OS, you need to manually download and install them. The default gcc/g++ commands shipped with the OS are aliases for the clang compiler, which supports openMP only as of version 3.7; unfortunately, Apple usually ships an older version of clang.
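The requirements above can be checked before starting a build. The following is a minimal sketch, not part of galario: the helper names version_ge and missing_tools are hypothetical, and the tool list in the usage comment is only an example.

```shell
# Hypothetical helpers, not shipped with galario.

# version_ge A B: succeed when dotted version A is >= dotted version B.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0   # missing components count as 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0   # all components are equal
    }'
}

# missing_tools NAME...: print every NAME that is not found on PATH;
# succeed only when all of them are found.
missing_tools() (
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    exit "$status"
)

# Example use (left as comments so that sourcing this file runs nothing):
#   missing_tools cmake make g++ || echo "install the tools listed above"
#   version_ge "$(g++ -dumpversion)" 4.8.1 || echo "g++ is too old for C++11"
```

The version comparison is purely numeric and component-wise, so it does not understand suffixes such as "rc1".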

1.4. Quick steps to build and install

Here is a quick summary of how to compile and install galario with the default options; more detailed instructions to fine-tune the build process follow below.

The following procedure always compiles and installs the CPU version of galario. On a system with a CUDA-enabled GPU card, the GPU version is compiled and installed as well. To manually turn the GPU CUDA compilation ON or OFF, see Section 1.5.5.

  1. Clone the repository and create a directory where to build galario:

    git clone https://github.com/mtazzari/galario.git
    cd galario
    mkdir build && cd build
  2. To make the compilation easier, let’s work in a Python environment. galario works with Python 2.7, 3.5, 3.6, 3.7, and 3.8.

    For example, if you are using the Anaconda distribution, you can create and activate a Python 3.7 environment with:

    conda create --name galario3 python=3.7 numpy cython pytest scipy
    source activate galario3
  3. Use cmake to prepare the compilation from within galario/build/:

    cmake ..

    This command will produce configuration and compilation logs listing all the libraries and the compilers that are being used. It will use the internet connection to automatically download an additional external library (1.5 MB).

  4. Use make to build galario and make install to install it inside the active environment:

    make && make install

    If the installation fails due to permission problems, you either have to use sudo make install, or see the instructions below to specify an alternate installation path. Permission problems may arise when you are using, e.g., a shared conda environment: in that case, it is preferable to create your own environment in a directory where you have write permissions.
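Permission problems at the install step can be anticipated by testing the target directory first. This is a small sketch with a hypothetical helper name, prefix_writable; the usage comment assumes the install goes to $CONDA_PREFIX when a conda environment is active and to /usr/local (the usual cmake default on Unix) otherwise.

```shell
# Hypothetical helper: succeed when DIR exists and the current user may write to it.
prefix_writable() {
    [ -d "$1" ] && [ -w "$1" ]
}

# Example use (assumed install locations, see the note above):
#   prefix_writable "${CONDA_PREFIX:-/usr/local}" \
#       || echo "no write access: use your own environment or another install path"
```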

These instructions should be sufficient in most cases, but if you have problems or want more fine-grained control, check out the details below. If you find issues or are stuck in one of these steps, consider writing us an email or opening an issue on GitHub.


If you compile galario only for the CPU, gcc/g++ >= 4.0 works fine. If you compile also the GPU version, check in the NVIDIA Docs which gcc/g++ versions are compatible with the nvcc compiler shipped with your CUDA Toolkit.

1.5. Detailed build instructions

The default configuration to build galario is

git clone https://github.com/mtazzari/galario.git
cd galario
mkdir build && cd build
cmake .. && make

There are many options to affect the build when cmake is invoked. When playing with options, it’s best to remove the cmake cache first:

rm build/CMakeCache.txt

In the following, we assume cmake is invoked from the build directory.

1.5.1. Compiler

Set the C and C++ compiler

export CC="/path/to/bin/gcc"
export CXX="/path/to/bin/g++"
cmake ..

# alternative
cmake -DCMAKE_C_COMPILER=/path/to/gcc -DCMAKE_CXX_COMPILER=/path/to/g++ ..

When changing the compiler, it is best to start with a fresh empty build directory.
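Starting from a fresh build directory can be scripted. The sketch below uses a hypothetical function, reset_build_dir, that empties a directory only if it is empty already or contains a CMakeCache.txt, as a guard against deleting a source tree by mistake.

```shell
# Hypothetical helper: recreate DIR as an empty build directory.
# Refuses to delete a non-empty directory that cmake never configured.
reset_build_dir() (
    dir=$1
    [ -n "$dir" ] || { echo "usage: reset_build_dir DIR" >&2; exit 2; }
    if [ -d "$dir" ]; then
        if [ -n "$(ls -A "$dir")" ] && [ ! -f "$dir/CMakeCache.txt" ]; then
            echo "refusing to remove $dir: it holds files but no CMakeCache.txt" >&2
            exit 1
        fi
        rm -rf "$dir"
    fi
    mkdir -p "$dir"
)

# Example use, from the top of the galario checkout:
#   reset_build_dir build && cd build
#   cmake -DCMAKE_C_COMPILER=/path/to/gcc -DCMAKE_CXX_COMPILER=/path/to/g++ ..
```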

1.5.2. Optimization level

By default galario is built with all the optimizations ON. You can check this with:

cmake --help-variable CMAKE_BUILD_TYPE

The default build type is Release, which is the fastest. If you want debug symbols as well, use RelWithDebInfo.

To turn on even more aggressive optimization, pass the flags directly. For example for g++:

cmake -DCMAKE_CXX_FLAGS='-march=native -ffast-math' ..

Note that these further optimizations might not work on every system.

To turn off optimizations:

cmake -DCMAKE_BUILD_TYPE=Debug ..
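To see which value a configured build directory actually recorded, the cache file can be inspected. The sketch below relies on the NAME:TYPE=VALUE line format of CMakeCache.txt; the function name cache_value is hypothetical. Whether CMAKE_BUILD_TYPE shows up with a non-empty value when you rely on the default depends on how the project sets that default, so treat an empty result as inconclusive.

```shell
# Hypothetical helper: print the value stored for VAR in a CMakeCache.txt file.
# Cache entries have the form NAME:TYPE=VALUE, e.g. CMAKE_BUILD_TYPE:STRING=Release.
cache_value() {
    sed -n "s/^$2:[A-Z]*=//p" "$1"
}

# Example use, from within the build directory:
#   cache_value CMakeCache.txt CMAKE_BUILD_TYPE
#   cache_value CMakeCache.txt CMAKE_CXX_FLAGS
```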

1.5.3. Python

To build the python bindings, we require python 2.7 or 3.x, numpy, cython, and pytest. To run the tests, we additionally need scipy>0.14.

Specify a Python version if Python 2.7 and 3.x are in the system and conflicting versions of the interpreter and the libraries are found and reported by cmake. In build/, do

cmake -DPython_ADDITIONAL_VERSIONS=3.5 ..

galario should work with both python 2 and 3. For example, if you are using the Anaconda distribution, you can create conda environments with

# python 2
conda create --name galario2 python=2 numpy cython pytest
source activate galario2

# or python3
conda create --name galario3 python=3 numpy cython pytest
source activate galario3

To run the tests, install some more dependencies within the environment

conda install scipy

cmake may confuse the conda python with the system python. This is a general problem: https://cmake.org/Bug/view.php?id=14809

A workaround to help cmake find the interpreter and the libs from the currently loaded conda environment is


If you still have problems, after the cmake command, check whether the FFTW libraries with openMP flags are found and whether the path to Python is correctly set to the path of the conda environment in use, e.g. /home/user/anaconda/envs/galario3.
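The check that a path belongs to the conda environment in use can be automated. This is a sketch with a hypothetical helper, path_under; it compares strings only and does not resolve symbolic links.

```shell
# Hypothetical helper: succeed when FILE lies inside directory PREFIX.
# usage: path_under PREFIX FILE
path_under() {
    [ -n "$1" ] || return 1          # an empty prefix would match everything
    case "$2" in
        "${1%/}"/*) return 0 ;;      # FILE starts with PREFIX/
        *)          return 1 ;;
    esac
}

# Example use, with a conda environment active:
#   path_under "$CONDA_PREFIX" "$(command -v python)" \
#       || echo "python on PATH is not the one from the active environment"
```

The same test can be applied to the Python paths that cmake prints during configuration.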

1.5.4. FFTW

The FFTW libraries are required for the CPU version of galario. You can check if they are installed on your system by checking if all libraries listed below are present, for example in /usr/lib or /usr/local/lib/.

galario requires the following FFTW libraries:

  • libfftw3: double precision

  • libfftw3f: single precision

  • libfftw3_threads: double precision with pthreads

  • libfftw3f_threads: single precision with pthreads

galario has been tested with FFTW 3.3.6.
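Checking a directory for the four libraries listed above can be scripted. The following sketch uses a hypothetical function, missing_fftw_libs, and accepts any file extension after the library name (for example .so, .a, or .dylib).

```shell
# Hypothetical helper: print the FFTW libraries from the list above that are
# absent from DIR; succeed only when all four are present.
missing_fftw_libs() (
    dir=$1
    missing=""
    for name in libfftw3 libfftw3f libfftw3_threads libfftw3f_threads; do
        found=no
        # Name followed by a dot and anything: libfftw3.so, libfftw3.a, libfftw3.3.dylib, ...
        for f in "$dir/$name".*; do
            if [ -e "$f" ]; then found=yes; break; fi
        done
        [ "$found" = yes ] || missing="$missing $name"
    done
    [ -z "$missing" ] && exit 0
    echo "${missing# }"
    exit 1
)

# Example use:
#   missing_fftw_libs /usr/lib || missing_fftw_libs /usr/local/lib
```

On some distributions the libraries live in an architecture-specific subdirectory, so a miss in /usr/lib alone is not conclusive.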

The easiest way to install FFTW is to use a package manager, for example apt on Debian/Ubuntu or homebrew on the Mac. For example,

sudo apt-get install libfftw3-3 libfftw3-dev

If you really want to build FFTW from source, for example because you don’t have admin rights, read on.

Manual compilation

To compile FFTW, download the .tar.gz from the FFTW website. On Mac OS, you have to explicitly enable the build of the dynamic (shared) libraries with the --enable-shared option, while on Linux this should be the default. You can create the libraries listed above with the following lines:

cd fftw-<version>/
mkdir d_p && cd d_p && \
  CC=/path/to/gcc ../configure --enable-shared && make && sudo make install && cd ..
mkdir s_p && cd s_p && \
  CC=/path/to/gcc ../configure --enable-shared --enable-single && make && sudo make install && cd ..
mkdir d_p_omp && cd d_p_omp && \
  CC=/path/to/gcc ../configure --enable-shared --enable-openmp && make && sudo make install && cd ..
mkdir s_p_omp && cd s_p_omp && \
  CC=/path/to/gcc ../configure --enable-shared --enable-single --enable-openmp && make && sudo make install && cd ..

If you have no sudo rights to install the FFTW libraries, provide an installation directory by passing --prefix="/path/to/fftw" to each configure command above, then run make install without sudo.


Before building galario, FFTW_HOME has to be set equal to the installation directory of FFTW, e.g. with:

export FFTW_HOME="/usr/local/lib/"

in the default case, or to the prefix specified during the FFTW installation. Also, you need to update the LD_LIBRARY_PATH to pick the FFTW libraries:


To speed up building FFTW, you may add the -jN flag to the make commands above, e.g. make -jN, where N is an integer equal to the number of cores you want to use. E.g., on a 4-core machine, you can do make -j4. To use -j4 as default, you can create an alias with:

alias make="make -j4"

Setting paths

To find FFTW3 in a nonstandard directory, say $FFTW_HOME, tell cmake about it:

cmake -DCMAKE_PREFIX_PATH=${FFTW_HOME} ..

For multiple directories, use a ; between directories and quote the argument so that the shell does not treat the ; as a command separator:

cmake -DCMAKE_PREFIX_PATH="${FFTW_HOME};/opt/something/else" ..
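Because an unquoted ; ends a shell command, it can be convenient to build the list in a helper and pass it quoted. This is a sketch only; join_prefix_path is a hypothetical name, and directory names containing a ; are not handled.

```shell
# Hypothetical helper: join all arguments with ';', the separator cmake
# expects between the entries of CMAKE_PREFIX_PATH.
join_prefix_path() (
    IFS=';'               # "$*" joins the arguments with the first character of IFS
    printf '%s\n' "$*"
)

# Example use; the double quotes keep the shell from splitting at the ';':
#   cmake -DCMAKE_PREFIX_PATH="$(join_prefix_path "$FFTW_HOME" /opt/something/else)" ..
```

The function body runs in a subshell, so the changed IFS does not leak into the calling shell.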

In case the directory with the header files is not inferred correctly:

cmake -DCMAKE_CXX_FLAGS="-I${FFTW_HOME}/include" ..

In case the openmp libraries are not in ${FFTW_HOME}/lib


1.5.5. CUDA

cmake tests for compilation on the GPU with cuda by default except on Mac OS, where version conflicts between the NVIDIA compiler and the C++ compiler often lead to problems; see for example this issue.

To manually enable or disable checking for cuda, do

cmake -DGALARIO_CHECK_CUDA=0 .. # don't check
cmake -DGALARIO_CHECK_CUDA=1 .. # check
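In a script that configures galario on several machines, the flag can be chosen automatically. The sketch below is an assumption-laden heuristic: it uses a hypothetical helper, cuda_check_flag, and treats the presence of nvcc on PATH as a sign that a CUDA toolkit is installed, which misses toolkits installed in non-standard directories.

```shell
# Hypothetical helper: print the GALARIO_CHECK_CUDA option to pass to cmake.
# The optional argument names the compiler to look for (default: nvcc).
cuda_check_flag() {
    if command -v "${1:-nvcc}" >/dev/null 2>&1; then
        echo "-DGALARIO_CHECK_CUDA=1"
    else
        echo "-DGALARIO_CHECK_CUDA=0"
    fi
}

# Example use, from within the build directory:
#   cmake "$(cuda_check_flag)" ..
```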

If cuda is installed in a non-standard directory or you want to specify the exact version, you can point cmake to it:

cmake -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-9.1 ..

1.5.6. Timing

For testing purposes, you can activate the timing features embedded in the code that produce detailed printouts to stdout of various portions of the functions. The times are measured in milliseconds. This feature is OFF by default and can be activated during the configuration stage with


1.5.7. Documentation

This documentation should be available online here. If you want to build the documentation locally, from within the build/ directory run:

make docs

which creates output in build/docs/html. The docs are not built by default, only upon request.

First install the build requirements with

conda install sphinx
pip install sphinx_py3doc_enhanced_theme sphinxcontrib-fulltoc

within the conda environment in use. This ensures that the sphinx version matches the Python version used to compile galario. If you still have problems, remove the CMakeCache.txt, rerun cmake, and observe which location of sphinx is reported in CMakeCache.txt, for example:

-- Found Sphinx: /home/myuser/.local/miniconda3/envs/galario3/bin/sphinx-build

The galario library must be importable when building the documentation: it is imported to extract the docstrings, and the docs build fails if that import fails.

To delete the sphinx cache in case the docs don’t update as expected, run:

rm -rf docs/_doctrees/

1.6. Install

To specify a path where to install the C libraries of galario (e.g., if you do not have sudo rights to install it in /usr/local/lib), do the conventional:

cmake -DCMAKE_INSTALL_PREFIX=/path/to/galario ..

and, after building, run:

make install

This will install the C libraries of galario in /path/to/galario/lib.


By default the C libraries and the Python bindings are installed under the same prefix. If you want to install the Python bindings elsewhere, there is an extra cache variable GALARIO_PYTHON_PKG_DIR that you can edit with ccmake . after running cmake.

If you are working inside an active conda environment, both the libraries and the python wrapper are installed inside the environment defined by $CONDA_PREFIX, e.g.:

conda activate myenv
cmake ..
make && make install

Example output during the install step

-- Installing: /path/to/conda/envs/myenv/lib/libgalario.so
-- Installing: /path/to/conda/envs/myenv/include/galario.h
-- Installing: /path/to/conda/envs/myenv/lib/python2.7/site-packages/galario/single/__init__.py

From the environment myenv it is now possible to import galario.
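A quick way to confirm this from the shell is to try the import and look at the exit status. This is a minimal sketch; can_import is a hypothetical helper, and the interpreter defaults to whatever python is first on PATH.

```shell
# Hypothetical helper: succeed when MODULE can be imported.
# usage: can_import MODULE [INTERPRETER]   (INTERPRETER defaults to python)
can_import() {
    "${2:-python}" -c "import $1" >/dev/null 2>&1
}

# Example use, inside the environment where galario was installed:
#   can_import galario || echo "galario is not importable from this environment"
```

The helper hides the traceback; run python -c "import galario" directly to see why an import fails.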

1.6.1. Uninstall

After installation, remove all installed files with

make uninstall

1.7. Tests

After building, just run ctest -V --output-on-failure from within the build/ directory.

Every time python/test_galario.py is modified, it has to be copied over to the build directory, because import pygalario only works when the test is run from there. The copy is performed in the configure step and cmake detects changes, so always run make first.

py.test fails if it cannot collect any tests. This can be caused by C errors. To debug the testing, first find out the exact command of the test:

make && ctest -V

py.test captures the output from the test, in particular from C to stderr. Force it to show all output:

make && python/py.test.sh -sv python_package/tests/test_galario.py

By default, tests do not run on the GPU. Activate them by setting an environment variable GALARIO_TEST_GPU; e.g. GALARIO_TEST_GPU=1 py.test.sh .... To select a given parametrized test named test_sample, just run py.test.sh -k sample.

A cuda error such as

[ERROR] Cuda call /home/user/workspace/galario/build/src/cuda_lib.cu: 815
invalid argument

can mean that code cannot be executed on the GPU at all rather than that specific call being invalid. Check whether nvidia-smi fails:

$ nvidia-smi
Failed to initialize NVML: Driver/library version mismatch
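The same check can gate the GPU tests in a script. The sketch below assumes that nvidia-smi exits with a non-zero status when it cannot talk to the driver, as in the failure shown above; gpu_usable is a hypothetical helper, and its optional argument exists only so that another probe command can be substituted.

```shell
# Hypothetical helper: succeed when the probe command exists and runs cleanly.
# usage: gpu_usable [PROBE]   (PROBE defaults to nvidia-smi)
gpu_usable() {
    set -- "${1:-nvidia-smi}"
    command -v "$1" >/dev/null 2>&1 && "$1" >/dev/null 2>&1
}

# Example use, from within the build directory:
#   if gpu_usable; then
#       GALARIO_TEST_GPU=1 python/py.test.sh -sv python_package/tests/test_galario.py
#   else
#       echo "GPU not usable, running CPU tests only"
#       python/py.test.sh -sv python_package/tests/test_galario.py
#   fi
```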