libgdf is a C library for implementing common functionality for a GPU Data Frame. For more project details, see the wiki.
The following instructions are tested on Linux and OSX systems.
Compiler requirement:
- g++-4.8 or
- g++-5.4
Note: This repo uses submodules. Make sure you cloned recursively:
git clone --recurse-submodules git@github.com:gpuopenanalytics/libgdf.git
Or, after cloning:
cd libgdf
git submodule update --init --recursive
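To confirm that the submodules were actually fetched, one option is to inspect the output of git submodule status, where a leading - marks a submodule that has not been initialized. The snippet below is a minimal sketch; the function name count_uninitialized is hypothetical and not part of this repo.

```shell
# Hypothetical helper (not part of libgdf): counts submodules that are
# not yet initialized. `git submodule status` prefixes such entries with '-'.
count_uninitialized() {
  # grep -c prints the number of matching lines; it exits non-zero when
  # nothing matches, so `|| true` keeps the function from failing then.
  grep -c '^-' || true
}

# Only query git when the current directory is inside a git checkout.
if git rev-parse --git-dir >/dev/null 2>&1; then
  # Prints 0 when every submodule has been initialized.
  git submodule status --recursive | count_uninitialized
fi
```

A non-zero count suggests running the git submodule update command shown above.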
Since cmake will download and build Apache Arrow, you may need to install the Boost C++ libraries (shown here for Debian/Ubuntu):
$ sudo apt-get install libboost-all-dev
To run the python tests, it is recommended to set up a conda environment for the dependencies.
# create the conda environment (assuming in build directory)
$ conda env create --name libgdf_dev --file ../conda_environments/dev_py35.yml
# activate the environment
$ source activate libgdf_dev
$ conda install arrow-cpp=0.7.1 -c conda-forge
$ conda install pyarrow=0.7.1 -c conda-forge
This installs the required cmake and pyarrow into the libgdf_dev conda environment and activates it.
For additional information, the python cffi wrapper code requires cffi and pytest. The testing code requires numba and cudatoolkit as additional dependencies. All of these are installed by the previous commands.
The environment can be updated from ../conda_environments/dev_py35.yml as development adds or changes the dependencies. To do so, run:
$ conda env update --name libgdf_dev --file ../conda_environments/dev_py35.yml
This project uses cmake for building the C/C++ library. To configure cmake, run:
mkdir build # create build directory for out-of-source build
cd build # enter the build directory
cmake .. # configure cmake (will download and build Apache Arrow and Google Test)
To build the C/C++ code, run make. This should produce a shared library named libgdf.so or libgdf.dylib.
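As a quick sanity check after the build, the snippet below looks for the expected library file in the current directory. It is a sketch that assumes make places the library at the top of the build directory, and that the .dylib name applies on OSX and the .so name on Linux; libgdf_name is a hypothetical helper, not something the project provides.

```shell
# Hypothetical helper: map the `uname -s` kernel name to the library file name.
libgdf_name() {
  case "$1" in
    Darwin) echo "libgdf.dylib" ;;  # OSX
    *)      echo "libgdf.so" ;;     # Linux and other platforms
  esac
}

lib="$(libgdf_name "$(uname -s)")"
# Assumes the current directory is the build directory.
if [ -f "$lib" ]; then
  echo "found $lib"
else
  echo "missing $lib (has make finished successfully?)"
fi
```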
If you run into compile errors about missing header files:
cub/device/device_segmented_radix_sort.cuh: No such file or directory
See the note about submodules in the Get dependencies section above.
To make development and testing more seamless, the python files and tests can be symlinked into the build directory by running make copy_python. With that, any changes to the python files are reflected in the build directory. To rebuild libgdf, run make again.
Currently, all tests are written in python with py.test. A make target is available to trigger the test execution. In the build directory (and with the conda environment activated), run the following to execute the tests:
make pytest # this automatically triggers the "copy_python" target