R2D2M2CP

pymc_extras.distributions.R2D2M2CP(name: str, output_sigma: Variable | Sequence[Variable] | ArrayLike, input_sigma: Variable | Sequence[Variable] | ArrayLike, *, dims: Sequence[str], r2: Variable | Sequence[Variable] | ArrayLike, variables_importance: Variable | Sequence[Variable] | ArrayLike | None = None, variance_explained: Variable | Sequence[Variable] | ArrayLike | None = None, importance_concentration: Variable | Sequence[Variable] | ArrayLike | None = None, r2_std: Variable | Sequence[Variable] | ArrayLike | None = None, positive_probs: Variable | Sequence[Variable] | ArrayLike | None = 0.5, positive_probs_std: Variable | Sequence[Variable] | ArrayLike | None = None, centered: bool = False) -> R2D2M2CPOut

R2D2M2CP prior.

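Parameters:
  • name (str) – Name of the distribution

  • output_sigma (Tensor) – Output standard deviation

  • input_sigma (Tensor) – Input standard deviation

  • dims (Union[str, Sequence[str]]) – Dims of the distribution

  • r2 (Tensor) – \(R^2\) estimate

  • variables_importance (Tensor, optional) – Optional estimate of variable importance, positive, by default None

  • variance_explained (Tensor, optional) – Alternative estimate of variable importance, given as a point estimate of variance explained; should sum to one, by default None

  • importance_concentration (Tensor, optional) – Confidence in the variance explained or variable importance estimate

  • r2_std (Tensor, optional) – Optional uncertainty over \(R^2\), by default None

  • positive_probs (Tensor, optional) – Optional probability that a variable's contribution is positive, by default 0.5

  • positive_probs_std (Tensor, optional) – Optional uncertainty over the effect direction probability, by default None

  • centered (bool, optional) – Centered or non-centered parametrization of the distribution, by default non-centered. Checking both is advised.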
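
The relationship between variables_importance, variance_explained and importance_concentration can be illustrated with a small NumPy sketch. It assumes these arguments behave like Dirichlet concentration parameters over each variable's share of the explained variance; that reading is an assumption made for illustration, and the helper name expected_shares is hypothetical, not part of the library.

```python
import numpy as np


def expected_shares(concentration):
    """Mean of a Dirichlet distribution with the given concentration vector."""
    concentration = np.asarray(concentration, dtype=float)
    return concentration / concentration.sum()


# Relative importances, in the form accepted by variables_importance
importance = np.array([10.0, 1.0, 34.0])
shares = expected_shares(importance)

# Assumed equivalent form: point estimates of variance explained (summing to 1)
# scaled by a single confidence value
variance_explained = shares
importance_concentration = importance.sum()
same_concentration = variance_explained * importance_concentration

print(shares)
```

Under this assumption, only the ratios between importances set the expected shares, while their overall magnitude expresses how confident the estimate is.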

Returns:

The output variance (sigma squared) is split into residual variance and explained variance.

Return type:

residual_sigma, coefficients
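
The variance split described above can be worked through numerically. The sketch below assumes the explained share equals r2 and the residual share equals 1 - r2 for a fixed r2; the numbers are illustrative and the code does not call the library.

```python
import numpy as np

output_sigma = 2.0  # standard deviation of the response (illustrative value)
r2 = 0.8            # assumed share of the output variance that is explained

total_variance = output_sigma**2
explained_variance = r2 * total_variance
residual_variance = (1.0 - r2) * total_variance

# The residual standard deviation is the square root of the residual variance
residual_sigma = np.sqrt(residual_variance)

print(explained_variance, residual_variance, residual_sigma)
```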

Raises:

TypeError – If the parametrization is wrong.

Notes

The R2D2M2CP prior is a modification of the R2D2M2 prior.

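Examples

The arguments are explained below in a synthetic example.

Warning

To use the prior in a linear regression:

  • make sure \(X\) is centered around zero

  • the intercept represents the prior predictive mean when \(X\) is centered

  • setting named dims is required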
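
The centering requirement can be checked with plain NumPy. This sketch uses made-up data and fixed coefficients (coef and intercept are illustrative names) to show that the mean of the linear predictor equals the intercept only when the regressors are centered.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=3.0, scale=2.0, size=(200, 3))  # regressors with nonzero means
coef = np.array([0.5, -1.0, 2.0])                   # illustrative coefficients
intercept = 5.0

# Center each column of X around zero
Xc = X - X.mean(axis=0)

mu_centered = intercept + Xc @ coef  # mean of this predictor is the intercept
mu_raw = intercept + X @ coef        # mean is shifted by X.mean(0) @ coef

print(mu_centered.mean(), mu_raw.mean())
```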

import pymc_extras as pmx
import pymc as pm
import numpy as np
X = np.random.randn(10, 3)
b = np.random.randn(3)
y = X @ b + np.random.randn(10) * 0.04 + 5
with pm.Model(coords=dict(variables=["a", "b", "c"])) as model:
    eps, beta = pmx.distributions.R2D2M2CP(
        "beta",
        y.std(),
        X.std(0),
        dims="variables",
        # NOTE: global shrinkage
        r2=0.8,
        # NOTE: if you are unsure about r2
        r2_std=0.2,
        # NOTE: if you know the likely sign of a variable's effect,
        # set the probability that it is positive;
        # if you do not know, leave as 0.5
        positive_probs=[0.8, 0.5, 0.1],
        # NOTE: uncertainty about the sign probabilities above.
        # If you put 0.5 previously,
        # just put 0.1 there, but other
        # sigmas should work fine too
        positive_probs_std=[0.3, 0.1, 0.2],
        # NOTE: variable importances are relative to each other,
        # but larger numbers put "more" weight in the relation;
        # use
        # * 1-10 for small confidence
        # * 10-30 for moderate confidence
        # * 30+ for high confidence
        # EXAMPLE:
        # "a" - is likely to be useful
        # "b" - no idea if it is useful
        # "c" - a must have in the relation
        variables_importance=[10, 1, 34],
        # NOTE: try both
        centered=True
    )
    # intercept prior centering should be around prior predictive mean
    intercept = y.mean()
    # regressors should be centered around zero
    Xc = X - X.mean(0)
    obs = pm.Normal("obs", intercept + Xc @ beta, eps, observed=y)
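
To see what r2=0.8 with r2_std=0.2 implies, the sketch below converts a mean and standard deviation into the shape parameters of a Beta distribution. It assumes that an uncertain \(R^2\) is treated as Beta-distributed with that mean and standard deviation; this is an assumption made for illustration, and the helper name beta_shape_from_mean_std is hypothetical.

```python
import math


def beta_shape_from_mean_std(mean, std):
    """Return (alpha, beta) of a Beta distribution with the given mean and std."""
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    max_variance = mean * (1.0 - mean)
    if not 0.0 < std**2 < max_variance:
        raise ValueError("std**2 must be positive and below mean * (1 - mean)")
    # kappa = alpha + beta, the total concentration
    kappa = max_variance / std**2 - 1.0
    return mean * kappa, (1.0 - mean) * kappa


alpha, beta = beta_shape_from_mean_std(0.8, 0.2)

# Recover the mean and standard deviation from the shape parameters
recovered_mean = alpha / (alpha + beta)
recovered_std = math.sqrt(
    alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1.0))
)
print(alpha, beta, recovered_mean, recovered_std)
```

The second check in the helper shows why r2_std cannot be arbitrarily large: for a mean of 0.8 the standard deviation must stay below 0.4.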

Special cases arise from choosing a specific set of arguments.

Here the prior distribution of beta is Normal(0, y.std() * r2 ** .5)

with pm.Model(coords=dict(variables=["a", "b", "c"])) as model:
    eps, beta = pmx.distributions.R2D2M2CP(
        "beta",
        y.std(),
        X.std(0),
        dims="variables",
        # NOTE: global shrinkage
        r2=0.8,
        # NOTE: if you are unsure about r2
        r2_std=0.2,
        # NOTE: positive_probs is not set here,
        # so it stays at its default of 0.5
        centered=False
    )
    # intercept prior centering should be around prior predictive mean
    intercept = y.mean()
    # regressors should be centered around zero
    Xc = X - X.mean(0)
    obs = pm.Normal("obs", intercept + Xc @ beta, eps, observed=y)
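
The centered flag switches between two ways of writing the same prior. The sketch below is a generic NumPy illustration for a single normal coefficient, not the library's code: both forms produce draws with the same mean and standard deviation.

```python
import numpy as np

rng = np.random.default_rng(42)
mu, sigma, n = 0.0, 1.5, 200_000

# Centered: draw the coefficient directly on its own scale
beta_centered = rng.normal(loc=mu, scale=sigma, size=n)

# Non-centered: draw a standard normal, then shift and scale it
z = rng.standard_normal(n)
beta_noncentered = mu + sigma * z

print(beta_centered.std(), beta_noncentered.std())
```

Although the two forms describe the same distribution, a sampler can behave quite differently on each, which is why checking both is advised.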

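It is fine to leave some of the _std arguments unspecified. You can also specify only positive_probs, in which case all variables are assumed to explain the same amount of variance (the same importance).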

with pm.Model(coords=dict(variables=["a", "b", "c"])) as model:
    eps, beta = pmx.distributions.R2D2M2CP(
        "beta",
        y.std(),
        X.std(0),
        dims="variables",
        # NOTE: global shrinkage
        r2=0.8,
        # NOTE: if you are unsure about r2
        r2_std=0.2,
        # NOTE: if you know where a variable should go
        # if you do not know, leave as 0.5
        positive_probs=[0.8, 0.5, 0.1],
        # NOTE: try both
        centered=True
    )
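    # intercept prior centering should be around prior predictive mean
    intercept = y.mean()
    # regressors should be centered around zero
    Xc = X - X.mean(0)
    obs = pm.Normal("obs", intercept + Xc @ beta, eps, observed=y)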
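
One way a value such as positive_probs=0.8 could be encoded in a normal prior is to choose its mean and standard deviation so that the probability of a positive draw matches, while the second moment stays fixed. The construction below is an assumption made for illustration, not a description of the library's internals; the function name normal_with_sign_probability is hypothetical.

```python
import math
from statistics import NormalDist


def normal_with_sign_probability(p_positive, second_moment):
    """Return (mu, sigma) with P(x > 0) = p_positive and mu**2 + sigma**2 = second_moment."""
    # P(x > 0) = Phi(mu / sigma), so the ratio mu / sigma is the normal quantile
    z = NormalDist().inv_cdf(p_positive)
    sigma = math.sqrt(second_moment / (1.0 + z**2))
    mu = z * sigma
    return mu, sigma


results = {}
for p in (0.8, 0.5, 0.1):
    mu, sigma = normal_with_sign_probability(p, second_moment=1.0)
    # Check the probability of a positive draw under Normal(mu, sigma)
    p_check = 1.0 - NormalDist(mu, sigma).cdf(0.0)
    results[p] = (mu, sigma, p_check)
    print(p, mu, sigma, p_check)
```

With a probability of 0.5 this reduces to a zero-mean normal, matching the special case where no sign information is given.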

Notes

To cite the R2D2M2CP implementation, you can use the following BibTeX entry:

@misc{pymc-extras-r2d2m2cp,
    title = {pymc-devs/pymc-extras: {P}ull {R}equest 137, {R2D2M2CP}},
    url = {https://github.com/pymc-devs/pymc-extras/pull/137},
    author = {Max Kochurov},
    howpublished = {GitHub},
    year = {2023}
}