Installation

Requirements

Required:

  1. numpy

  2. pandas

  3. cramjam

  4. thrift

cramjam provides the compression codecs gzip, snappy, lz4, brotli, and zstd.

Optional compression codec:

  1. python-lzo/lzo
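Before installing, it can help to check which of the packages listed above are already importable in the current environment. The following is a minimal sketch that uses only the Python standard library. The helper name find_missing and the two tuples are illustrative and not part of fastparquet, and the import name "lzo" for python-lzo is an assumption based on that package's usual module name.

```python
import importlib.util

# Import names taken from the requirements listed above.
# "lzo" is assumed to be the module installed by the optional python-lzo package.
REQUIRED = ("numpy", "pandas", "cramjam", "thrift")
OPTIONAL = ("lzo",)


def find_missing(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("missing required:", find_missing(REQUIRED))
    print("missing optional:", find_missing(OPTIONAL))
```

An empty list for the required names means every listed dependency was found. A missing optional module only means the LZO codec is unavailable.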

Installation

Install using conda:

conda install -c conda-forge fastparquet

Install from PyPI:

pip install fastparquet

Or install the latest version from the “main” branch on GitHub:

pip install git+https://github.com/dask/fastparquet

When using pip, install numpy before fastparquet, because pip can sometimes fail to resolve the environment. Depending on your platform, pip may download a binary wheel or attempt to build fastparquet from source.
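After any of the installation routes above, one way to confirm what ended up in the environment is to query the installed distribution versions. This is a sketch using the standard-library importlib.metadata module (available from Python 3.8). The helper name installed_version is hypothetical and not part of fastparquet.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # numpy is checked first because it should be present before fastparquet.
    for name in ("numpy", "fastparquet"):
        print(name, installed_version(name) or "not installed")
```

If numpy is reported as not installed, install it first and then retry the fastparquet installation with pip.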

Dev requirements

To run all of the tests, you will need the following, in addition to the requirements above:

  1. python=3.8

  2. bson

  3. lz4

  4. lzo

  5. pytest

  6. dask

  7. moto/s3fs

  8. pytest-cov

  9. pyspark

Some of these (e.g., pyspark) are optional and will result in skipped tests if not present.

Tests use pytest.

Building Docs

The docs/ directory contains the source for the documentation. To build it, you will need sphinx and numpydoc. sphinx can produce output in many formats, including html:

# in directory docs/
make html

This will produce a build/html/ subdirectory, where the entry point is index.html.