simplified and condensed distributions
IMPORTANT: distl is currently still under development, is not yet well-tested, and is subject to significant API changes. Please check back until an official release is ready.
distl provides a python object-interface on top of several distribution (random variable) functions in scipy.stats and allows for:
- serialization of distributions (so they can be saved to disk or pickled and sent to processors within MPI)
- support for units and wrapping
- conversion between different types of distributions
- math between distributions, handling covariances from multivariate distributions wherever possible
- plotting convenience functions
distl requires the following dependencies:
- python 2.7+ or 3.6+
- scipy 1.0+
- numpy 1.10+
and the following optional dependencies:
- matplotlib 2.2+ (required for plotting distributions)
- corner (required for plotting multivariate distributions and distribution collections)
- astropy 1.0+ (required for units support)
- dill (required for saving/loading Function distributions)
To install the latest release via pip:
pip install distl
To install from source locally for a single user:
python setup.py build
python setup.py install --user
Or to install globally:
python setup.py build
sudo python setup.py install
Now from within python we can import the distl package and then create, sample from, and plot our first distribution:
import distl

g = distl.gaussian(10, 1)
print(g.sample())
print(g.sample(10))
g.plot(show=True)
Supported Distribution Types
Creation functions for the following distribution types are currently implemented and available at the top-level of distl:
- normal (shortcut to gaussian)
- boxcar (shortcut to uniform)
- histogram_from_data or histogram_from_bins
Converting Between Distribution Types
Distributions within distl can be converted to other distribution types. See the API documentation for the appropriate distribution type and look for the to_ methods, which describe the conversion options and limitations. Below is a summary of all implemented translation methods:
To sample from any distribution, call the sample method, optionally passing the number of desired samples.
g = distl.gaussian(10, 2)
g.sample(10)
array([ 8.07893271, 12.51150027, 7.56756268, 7.29151051, 5.55049747, 8.67495845, 11.61104165, 10.11544651, 11.96864228, 10.54677169])
See these sampling examples for more details.
To ensure consistent results (when needed), pass seed to sample or set the random seed in numpy prior to sampling.
g = distl.gaussian(10, 2)
g.sample(seed=1234)
g.sample(seed=1234)
np.random.seed(1234)
g.sample()
See this seeding example for more details.
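The effect of seeding can be illustrated without distl at all. The sketch below uses Python's standard-library random module as a stand-in sampler (an assumption for illustration; the helper sample_gaussian is hypothetical and not part of distl) to show that passing the same seed reproduces the same draws.

```python
import random

def sample_gaussian(loc, scale, size=1, seed=None):
    """Hypothetical helper: draw `size` values from a Gaussian."""
    # A private generator keeps the seed from leaking into global state.
    rng = random.Random(seed)
    return [rng.gauss(loc, scale) for _ in range(size)]

first = sample_gaussian(10, 2, size=3, seed=1234)
second = sample_gaussian(10, 2, size=3, seed=1234)
print(first == second)  # same seed, same draws
```

Using a per-call generator rather than a global seed is one design choice among several; it avoids side effects on other code that shares the global random state.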
NOTE: matplotlib is required for plotting support.
To plot the distribution, call one of the following:
gh = distl.gaussian(5, 3).to_histogram()
out = gh.plot(200, show=True, plot_gaussian=True)
See these plotting examples for more details.
g = distl.gaussian(5, 3)
g = distl.from_dict(g.to_dict())
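One way to picture this round trip is a plain dictionary that survives JSON encoding. The class below is a hypothetical stand-in, not distl's implementation; its field names (kind, loc, scale) are assumptions chosen for illustration.

```python
import json

class Gaussian:
    """Hypothetical minimal distribution with dictionary serialization."""

    def __init__(self, loc, scale):
        self.loc = loc
        self.scale = scale

    def to_dict(self):
        # Store only plain Python types so the result is JSON-safe.
        return {"kind": "gaussian", "loc": self.loc, "scale": self.scale}

    @classmethod
    def from_dict(cls, d):
        if d.get("kind") != "gaussian":
            raise ValueError("unsupported distribution kind: %r" % d.get("kind"))
        return cls(d["loc"], d["scale"])

g = Gaussian(5, 3)
text = json.dumps(g.to_dict())  # a string that can be saved or sent elsewhere
restored = Gaussian.from_dict(json.loads(text))
print(restored.loc, restored.scale)
```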
See the API docs on the following for more details:
Math with Distribution Objects
Any (supported) math operator between two Distribution objects, or between a Distribution object and a float or integer, will return another Distribution object. In most cases, this will return a Composite Distribution. In some cases where it is possible to return the same type of Distribution, that will be done instead. For example, a Gaussian Distribution multiplied by a float can return another Gaussian Distribution where that float is interpreted as a Delta Distribution with that value.
This means that in the following case 2 * g is equivalent to d * g, but not g + g:
g = distl.gaussian(10, 2)
d = distl.delta(2)
Currently supported operators include:
- multiplication, division, addition, subtraction
- np.sin, np.cos, np.tan (but not math.sin, etc)
See these math examples for more details.
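The closed-form case described in this section can be checked numerically: multiplying a Gaussian with loc 10 and scale 2 by the constant 2 should give a Gaussian with loc 20 and scale 4. The sketch below uses only the standard library, not distl, and the sample size and seed are arbitrary choices.

```python
import random
import statistics

rng = random.Random(0)
samples = [rng.gauss(10, 2) for _ in range(20000)]

# Multiplying every draw by a constant rescales both centre and width.
scaled = [2 * x for x in samples]

mean = statistics.mean(scaled)    # expected to be close to 2 * 10
std = statistics.pstdev(scaled)   # expected to be close to 2 * 2
print(round(mean, 1), round(std, 1))
```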
Support for Units
NOTE: astropy is required for units support.
Units can be set for a distribution by setting the unit, by passing unit to the constructor, or by multiplying the distribution object by an astropy.units object.
To change units, call the to method, which returns a new distribution in the requested units.
See these units examples for more details.
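For a linear unit change, converting a Gaussian amounts to scaling both its centre and its width by the conversion factor. The sketch below is a hypothetical stand-in with a hand-written factor table (an assumption for illustration; distl relies on astropy for units, and convert_gaussian is not a distl function).

```python
# Hypothetical conversion factors to metres; a real unit library covers far more.
TO_METRES = {"m": 1.0, "km": 1000.0}

def convert_gaussian(loc, scale, unit, new_unit):
    """Return (loc, scale) of a Gaussian re-expressed in `new_unit`."""
    factor = TO_METRES[unit] / TO_METRES[new_unit]
    # Both the centre and the width carry the unit, so both are rescaled.
    return loc * factor, scale * factor

print(convert_gaussian(10, 2, "km", "m"))
```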
g = distl.gaussian(10, 2, wrap_at=12)
out = g.plot(show=True)
See these wrapping examples for more details.
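A common way to implement wrapping is to reduce each sample modulo the wrap value, so that draws beyond it re-enter from zero. Whether distl follows exactly this convention is an assumption here; the sketch relies only on the standard library, and wrapped_samples is a hypothetical helper.

```python
import random

def wrapped_samples(loc, scale, wrap_at, size, seed=None):
    """Hypothetical helper: Gaussian draws folded into [0, wrap_at)."""
    rng = random.Random(seed)
    # Python's % gives a non-negative result for a positive modulus,
    # so negative draws are wrapped as well.
    return [rng.gauss(loc, scale) % wrap_at for _ in range(size)]

values = wrapped_samples(10, 2, wrap_at=12, size=1000, seed=0)
print(min(values) >= 0, max(values) < 12)
```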
Slicing Multivariate Distributions
mvg = distl.mvgaussian([5, 10, 12],
                       np.array([[ 2, 1, -1],
                                 [ 1, 2,  1],
                                 [-1, 1,  2]]),
                       allow_singular=True,
                       labels=['mvg_a', 'mvg_b', 'mvg_c'])
mvg_a = mvg.slice('a')
mvg_a.sample()
mvg_a.plot(show=True)
See these slicing examples for more details.
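Mathematically, the marginal of one dimension of a multivariate Gaussian is a univariate Gaussian built from the matching mean entry and the matching diagonal covariance entry. The sketch below illustrates that with plain lists; it is not distl's slice implementation, which may additionally retain covariances with the other dimensions, and the marginal helper is hypothetical.

```python
import math

mean = [5, 10, 12]
cov = [[2, 1, -1],
       [1, 2, 1],
       [-1, 1, 2]]
labels = ["mvg_a", "mvg_b", "mvg_c"]

def marginal(label):
    """Return (loc, scale) of the univariate marginal for `label`."""
    i = labels.index(label)
    # The diagonal entry is the variance; its square root is the scale.
    return mean[i], math.sqrt(cov[i][i])

print(marginal("mvg_a"))
```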
Drawing and Computing Probabilities for Multiple Distributions via DistributionCollections
g = distl.gaussian(10, 2, label='gaussian')
u = distl.uniform(0, 5, label='uniform')
dc = distl.DistributionCollection(g, u)
dc.plot(show=True)
See these collections examples for more details.
See the API documentation for full details on each type of available distribution.
Contributions are welcome! Feel free to file an issue or fork and create a pull-request.