histogram_approximation

pymc_extras.distributions.histogram_approximation(name, dist, *, observed, **h_kwargs)

Approximate a distribution with a histogram potential.

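Parameters:
  • name (str) – Name for the Potential.

  • dist (TensorVariable) – The output of pm.Distribution.dist().

  • observed (ArrayLike) – Observed values used to construct the histogram. The histogram is computed over the 0th axis. Dask is supported.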

  • h_kwargs – Additional keyword arguments for the histogram construction. The examples below pass min_count, n_quantiles, and zero_inflation this way.

Returns:

Potential

Return type:

TensorVariable

Examples

Discrete variables are reduced to unique values with repetition counts (up to min_count):

>>> import numpy as np
>>> import pymc as pm
>>> import pymc_extras as pmx
>>> production = np.random.poisson([1, 2, 5], size=(1000, 3))
>>> with pm.Model(coords=dict(plant=range(3))):
...     lam = pm.Exponential("lam", 1.0, dims="plant")
...     pot = pmx.distributions.histogram_approximation(
...         "pot", pm.Poisson.dist(lam), observed=production, min_count=2
...     )
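
The idea behind the discrete case can be sketched with NumPy alone. This is an illustration of the general technique, not the library's implementation: the rate `lam`, the helper `poisson_logpmf`, and the variable names are assumptions made for the example, and `min_count` is not modeled. Summing one log-probability term per unique value, weighted by how often that value occurs, gives the same total as summing over every observation.

```python
import math
import numpy as np

rng = np.random.default_rng(0)
lam = 2.0  # illustrative rate (assumption), not a fitted value
production = rng.poisson(lam, size=1000)


def poisson_logpmf(k, lam):
    """Log-probability of counts k under Poisson(lam) (hypothetical helper)."""
    k = np.asarray(k, dtype=float)
    log_factorial = np.array([math.lgamma(v + 1.0) for v in k])
    return k * np.log(lam) - lam - log_factorial


# Full log-likelihood: one term per observation.
full_logp = poisson_logpmf(production, lam).sum()

# Histogram form: one term per unique value, weighted by its count.
values, counts = np.unique(production, return_counts=True)
hist_logp = (counts * poisson_logpmf(values, lam)).sum()

print(len(production), "observations reduced to", len(values), "unique values")
print(np.isclose(full_logp, hist_logp))
```

For discrete data the two sums agree up to floating-point rounding, while the weighted form evaluates far fewer log-probability terms.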

Continuous variables are discretized into n_quantiles:

>>> measurements = np.random.normal([1, 2, 3], [0.1, 0.4, 0.2], size=(10000, 3))
>>> with pm.Model(coords=dict(tests=range(3))):
...     m = pm.Normal("m", dims="tests")
...     s = pm.LogNormal("s", dims="tests")
...     pot = pmx.distributions.histogram_approximation(
...         "pot", pm.Normal.dist(m, s),
...         observed=measurements, n_quantiles=50
...     )
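
A rough NumPy sketch of quantile binning follows. It assumes that bin edges are placed at evenly spaced quantiles and that each bin is represented by its midpoint; the library may construct its bins differently, and `normal_logpdf` and all variable names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 2.0, 0.4  # illustrative parameters (assumption)
measurements = rng.normal(mu, sigma, size=10_000)

n_quantiles = 50
# Bin edges at evenly spaced quantiles: n_quantiles edges give n_quantiles - 1 bins.
edges = np.quantile(measurements, np.linspace(0.0, 1.0, n_quantiles))
counts, _ = np.histogram(measurements, bins=edges)
mid = 0.5 * (edges[:-1] + edges[1:])  # one representative point per bin


def normal_logpdf(x, mu, sigma):
    """Log-density of Normal(mu, sigma) (hypothetical helper)."""
    return -0.5 * np.log(2.0 * np.pi * sigma**2) - 0.5 * ((x - mu) / sigma) ** 2


# Weighted sum over bins stands in for the sum over all observations.
approx_logp = (counts * normal_logpdf(mid, mu, sigma)).sum()
full_logp = normal_logpdf(measurements, mu, sigma).sum()
print(approx_logp, full_logp)
```

Unlike the discrete case, this is only an approximation: every point in a bin is treated as if it sat at the bin midpoint, so the result depends on how many quantiles are used.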

For special cases such as zero inflation in continuous variables, there is a flag, zero_inflation. It adds a separate bin for zeros:

>>> measurements = abs(measurements)
>>> measurements[100:] = 0
>>> with pm.Model(coords=dict(tests=range(3))):
...     m = pm.Normal("m", dims="tests")
...     s = pm.LogNormal("s", dims="tests")
...     pot = pmx.distributions.histogram_approximation(
...         "pot", pm.Normal.dist(m, s),
...         observed=measurements, n_quantiles=50, zero_inflation=True
...     )
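
One way to picture the separate zero bin is sketched below in plain NumPy. This is an assumption about the general approach, not the library's code: exact zeros are counted on their own, and only the non-zero values are binned by quantiles. All names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
measurements = np.abs(rng.normal(2.0, 0.4, size=10_000))
measurements[100:] = 0.0  # only the first 100 values stay non-zero

# Count exact zeros separately so they do not distort the quantile bins.
is_zero = measurements == 0
zero_count = int(is_zero.sum())
nonzero = measurements[~is_zero]

n_quantiles = 50
edges = np.quantile(nonzero, np.linspace(0.0, 1.0, n_quantiles))
counts, _ = np.histogram(nonzero, bins=edges)
mid = 0.5 * (edges[:-1] + edges[1:])

# Prepend the dedicated zero bin to the quantile bins.
all_mid = np.concatenate([[0.0], mid])
all_counts = np.concatenate([[zero_count], counts])
print(zero_count, all_counts.sum())
```

Without the separate bin, most quantile edges in this data would collapse onto zero, leaving almost no resolution for the non-zero values.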