BuildModelFromDAG
- class pymc_marketing.mmm.causal.BuildModelFromDAG(*, dag, df, target, dims, coords, model_config=None)
Build a PyMC probabilistic model directly from a Causal DAG and a tabular dataset.
The class interprets a Directed Acyclic Graph (DAG) where each node is a column in the provided df. For every edge A -> B it creates a slope prior for the contribution of A to the mean of B. Each node receives a likelihood prior. Dims and coords are used to align and index observed data via pm.Data and xarray.
- Parameters:
- dag : str
  DAG in DOT format (e.g. digraph { A -> B; B -> C; }) or as a simple comma/newline separated list of edges (e.g. "A->B, B->C").
- df : pandas.DataFrame
  DataFrame that contains a column for every node present in the DAG and all columns named by the provided dims.
- target : str
  Name of the target node present in both the DAG and df. This is not used to restrict modeling but is validated to exist in the DAG.
- dims : tuple[str, …]
  Dims for the observed variables and likelihoods (e.g. ("date", "channel")).
- coords : dict
  Mapping from dim names to coordinate values. All coord keys must exist as columns in df and will be used to pivot the data to match dims.
- model_config : dict, optional
  Optional configuration with priors for the keys "intercept", "slope" and "likelihood". Values should be pymc_extras.prior.Prior instances. Missing keys fall back to default_model_config.
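The two accepted dag formats can be illustrated with a small standard-library parser. This is only a sketch of the idea, not the parser the class uses: parse_edges and topological_order are hypothetical helpers, and the sketch assumes edges are written with "->" and separated by ";", "," or newlines.

```python
import re
from graphlib import CycleError, TopologicalSorter


def parse_edges(dag: str) -> list[tuple[str, str]]:
    """Hypothetical helper: extract (parent, child) pairs from a DOT or edge-list string."""
    # Drop an optional "digraph { ... }" wrapper so both formats look alike.
    match = re.search(r"\{(.*)\}", dag, flags=re.DOTALL)
    body = match.group(1) if match else dag

    edges: list[tuple[str, str]] = []
    # Statements are separated by ';', ',' or newlines.
    for statement in re.split(r"[;,\n]", body):
        statement = statement.strip()
        if not statement:
            continue
        # A chain such as "A -> B -> C" yields the edges A->B and B->C.
        nodes = [name.strip().strip('"') for name in statement.split("->")]
        if len(nodes) < 2 or not all(nodes):
            raise ValueError(f"Not an edge: {statement!r}")
        edges.extend(zip(nodes, nodes[1:]))
    return edges


def topological_order(edges: list[tuple[str, str]]) -> list[str]:
    """Hypothetical helper: order nodes parents-first, rejecting cyclic graphs."""
    sorter = TopologicalSorter()
    for parent, child in edges:
        sorter.add(child, parent)  # the child depends on its parent
    try:
        return list(sorter.static_order())
    except CycleError as err:
        raise ValueError("The graph contains a cycle, so it is not a DAG") from err


dot_edges = parse_edges("digraph { A -> B; B -> C; }")
list_edges = parse_edges("A->B, B->C")
print(dot_edges == list_edges)       # both formats describe the same edges
print(topological_order(dot_edges))  # parents come before children
```

Every node name returned here would need a matching column in df, which is the requirement stated for the df parameter.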
Examples
Minimal example using DOT format:
    import numpy as np
    import pandas as pd

    from pymc_marketing.mmm.causal import BuildModelFromDAG

    dates = pd.date_range("2024-01-01", periods=5, freq="D")
    df = pd.DataFrame(
        {
            "date": dates,
            "X": np.random.normal(size=5),
            "Y": np.random.normal(size=5),
        }
    )

    dag = "digraph { X -> Y; }"
    dims = ("date",)
    coords = {"date": dates}

    builder = BuildModelFromDAG(
        dag=dag, df=df, target="Y", dims=dims, coords=coords
    )
    model = builder.build()
Edge-list format and custom likelihood prior:
    from pymc_extras.prior import Prior

    dag = "X->Y"  # equivalent to the DOT example above

    model_config = {
        "likelihood": Prior(
            "StudentT", nu=5, sigma=Prior("HalfNormal", sigma=1), dims=("date",)
        ),
    }

    builder = BuildModelFromDAG(
        dag=dag,
        df=df,
        target="Y",
        dims=("date",),
        coords={"date": dates},
        model_config=model_config,
    )
    model = builder.build()
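Both examples encode the single edge X -> Y. As a rough illustration of what a slope on that edge means, the sketch below computes the mean of a node for one data row as an intercept plus slope-weighted parent values. The additive linear form is an assumption drawn from the "intercept" and "slope" prior names, the fixed numbers stand in for values that the real model would draw from priors, and expected_mean is a hypothetical helper, not part of pymc_marketing.

```python
def expected_mean(node, row, edges, intercepts, slopes):
    """Hypothetical helper: mean of `node` for one row of observed data.

    Assumes mean = intercept[node] + sum(slope[parent -> node] * row[parent]).
    """
    parents = [parent for parent, child in edges if child == node]
    contribution = sum(slopes[(parent, node)] * row[parent] for parent in parents)
    return intercepts[node] + contribution


edges = [("X", "Y")]            # the DAG "X->Y"
row = {"X": 1.5}                # one observed value of the parent column
intercepts = {"X": 0.0, "Y": 0.5}
slopes = {("X", "Y"): 2.0}      # stand-in for a draw from the slope prior

print(expected_mean("Y", row, edges, intercepts, slopes))  # 0.5 + 2.0 * 1.5 = 3.5
print(expected_mean("X", row, edges, intercepts, slopes))  # no parents: intercept only
```

In the built model the likelihood prior for each node would then describe how observations scatter around a mean of this kind.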
Methods
BuildModelFromDAG.__init__(*[, dag, df, ...])
BuildModelFromDAG.build()
Construct and return the PyMC model implied by the DAG and data.
Return a copy of the parsed DAG as a NetworkX directed graph.
Return a Graphviz visualization of the built PyMC model.
Attributes
default_model_config
Default priors for intercepts, slopes and likelihood using pymc_extras.Prior.
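The fallback rule for model_config (supplied keys win, missing keys come from default_model_config) behaves like a dictionary merge. The sketch below shows that rule with placeholder strings in place of Prior objects. resolve_model_config is a hypothetical helper, and the placeholder values are not the library's actual defaults.

```python
def resolve_model_config(defaults, overrides=None):
    """Hypothetical helper: supplied keys replace defaults, the rest are kept."""
    # Build a new dict so neither input is mutated.
    return {**defaults, **(overrides or {})}


# Placeholders only; the real defaults are Prior instances.
defaults = {
    "intercept": "<default intercept prior>",
    "slope": "<default slope prior>",
    "likelihood": "<default likelihood prior>",
}
custom = {"likelihood": "<custom StudentT likelihood>"}

resolved = resolve_model_config(defaults, custom)
print(resolved["likelihood"])  # the supplied value
print(resolved["slope"])       # falls back to the default
```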