import distl
import numpy as np

Copy vs Deepcopy

distl assigns each distribution a uniqueid which tracks whether copies of underlying distributions should be treated as from the same underlying distribution or independent.

Let's first create a gaussian distribution and access its automatically assigned random uniqueid.

d = distl.gaussian(5, 1)
print(d.uniqueid)
qXmRdLGhfHpHnaDtVqhD

When calling copy, this uniqueid is retained which links the new copy to the original.

print(d.copy().uniqueid)
qXmRdLGhfHpHnaDtVqhD

Calling deepcopy, however, assigns a new random uniqueid, effectively unlinking the two distribution objects.

print(d.deepcopy().uniqueid)
tRqFpWAZWlZvWbzHHrYQ

Implications in CompositeDistributions and math operations

By default, using math operators on distributions will use copy. To force unlinking the distributions, you must call deepcopy manually.

Here we'll show that the original distribution and the distribution after applying math operators maintain the same uniqueid, by default, and are therefore sampled simultaneously.

d = distl.gaussian(5, 1)
print(d.uniqueid)
yzjcXLHzImrgJDsJGKms
print((5*d).uniqueid)
yzjcXLHzImrgJDsJGKms

We can see the implications of this in action when a CompositeDistribution contains two or more "copies" of the same underlying distribution.

In this case, we'll create a distribution for d + 5*d. By default, the underlying references to d will remain linked when creating the CompositeDistribution (i.e. when sampling the CompositeDistribution, the same value will be drawn from all instances of d before applying the math operators between them).

comp = d + 5*d
_ = comp.plot(show=True)

png

If we instead force the two instances to be unlinked by using a deepcopy, the two references to d are sampled independently before applying the math operators. Here we can see that this does result in a slightly different distribution (in the case of simple operations between gaussian distributions, this only affects the spread on the resulting distribution).

comp_indep = d + 5*d.deepcopy()
_ = comp_indep.plot(show=True)

png

Implications in Function Distributions

Similarly, a Function Distribution will sample from any input distributions following these same rules. This does work even if vectorized=False, although it does get more computationally expensive.

def custom_math(a, b):
    return a + 5*b
f = distl.function(custom_math, args=(d, d))
_ = f.plot(show=True)

png

f_indep = distl.function(custom_math, args=(d, d.deepcopy()))
_ = f_indep.plot(show=True)

png

Implications in DistributionCollections

This sampling logic can be seen even more clearly in the corner plots within DistributionCollections. First we'll create a collection with d and 5*d. Since this creates a linked copy by default, we can see that the resulting samples strictly follow a 1:1 relationship.

dc = distl.DistributionCollection(d, comp, f)
_ = dc.plot(show=True)

png

If instead we force a deepcopy, both resulting Gaussian distributions are sampled independently from each other and no longer show this correlation.

dc_indep = distl.DistributionCollection(d, comp_indep, f_indep)
_ = dc_indep.plot(show=True)

png