Causal inference - Luke Report 🫡

# Goal

Translate a proof of Robert Spekkens's [no-go theorem on non-contextuality](https://arxiv.org/abs/quant-ph/0406166) into the language of causal inference, counterfactuals, and [consistency](https://pubmed.ncbi.nlm.nih.gov/19829187/).

## Non-contextuality [[@spekkensContextualityPreparationsTransformations2005]]

0. An operational theory is specified by the probabilities $p(k|P,M)$ for outcomes $k$ that may result from a measurement procedure $M$ given a particular preparation procedure $P$. Operational equivalence between preparation procedures is defined as
   $$P \simeq P' \iff p(k|P, M) = p(k|P',M) \text{ for all } M$$
   and likewise operational equivalence for measurements
   $$M \simeq M' \iff p(k|P, M) = p(k|P, M') \text{ for all } P.$$
1. An ontological model of a preparation and measurement procedure provides a specification of:
    1. for every preparation procedure $P$, a probability density $\mu_P : \Omega \to [0,1]$ over the ontological model's variables $\lambda \in \Omega$ such that $\int \mu_P(\lambda) d\lambda = 1$.
    2. for every measurement procedure $M$ and outcome $k$, a response function $\xi_{M, k} : \Omega \to [0,1]$ such that $\sum_k \xi_{M, k}(\lambda) = 1$ for all $\lambda$.

    which satisfy

    3. $p(k|P, M) = \int_\Omega d\lambda \, \xi_{M, k}(\lambda) \mu_P(\lambda)$ for all $P$ and $M$.
2. Preparation non-contextuality is defined as: $\mu_P = \mu_{P'} \text{ for all } P \simeq P'$
3. Measurement non-contextuality is defined as: $\xi_{M, k} = \xi_{M',k} \text{ for all } M \simeq M'$
4. An ontological model of an operational theory is called non-contextual if it is both preparation and measurement non-contextual.

## Consistency [[@vanderweeleConcerningConsistencyAssumption2009]]

We denote by $Y_j(x,k)$ the potential outcome for individual $j$ if exposure $X$ is set to the value $x$ by means $k$, and by $p(Y_j|x, k)$ the distribution of potential outcomes for individual $j$ under the intervention $x$ by means $k$.

1. Treatment-variation irrelevance is defined for individuals as
   $$Y_j(x,k_x) = Y_j(x, k_x') \text{ for all } k_x, k_x' \in K_x$$
   which says the potential outcome for individual $j$ under the intervention $x$ does not depend on the means by which the intervention was performed. For ensembles of individuals,
   $$p(Y_j|x,k_x) = p(Y_j|x,k_x') \text{ for all } k_x, k_x' \in K_x$$
   says that the distribution of potential outcomes for individual $j$ depends only on the type of treatment, not the means of treatment (e.g. which doctor performed it).
2. If treatment-variation irrelevance holds, then consistency is defined as
   $$\exists k_x \in K_x \text{ such that } Y_j^\text{obs} = Y_j(x,k_x)$$
   and
   $$\exists k_x \in K_x \text{ such that } p(Y_j|x) = p(Y_j|x,k_x)$$
   which says that the observed outcome, and likewise the empirical distribution of outcomes under $x$, coincides with the potential outcome (distribution) for some means $k_x$ of performing $x$.

# Translation

For simplicity, I am folding the transformation procedures into the preparations and measurements.

- Intervention $x$ : Procedure equivalence class $([P], [M])$
    - $P, P' \in [P] \iff P \simeq P'$ as defined above.
    - Likewise for $M$.
- Means $k_x \in K_x$ : Contextual features of a procedure $(P, M) / ([P], [M])$
    - Example: idiosyncrasies of doctor and patient undergoing a procedure $x$.
    - "It follows that one can distinguish two types of features of an experimental procedure: the first type of feature is one that is specified by specifying **the equivalence class that the procedure falls in**, while the second type is **one that is not**. The set of features of the second type – those that are not specified by specifying the equivalence class – we call the context of the experimental procedure." [[@spekkensContextualityPreparationsTransformations2005]]
- Together we identify: $X = (x, k_x) = (P, M)$
    - Spekkens labels the densities $\mu_P$ and $\xi_{M, k}$ with $P$ or $M$, but we should have a map that, given a state $\lambda$ of the ontological model, tells us what the context is: $X = X(\lambda) = (P(\lambda), M(\lambda))$.
- [ ] Counterfactual $Y_j(x,k_x)$ : Define a function $Y : \Omega \to \mathcal P \times \mathcal M \to \mathcal K$ such that $Y(X(\lambda)) = Y(P(\lambda), M(\lambda)) \in \mathcal K$ given model state $\lambda$.
    - Or should we allow e.g. an average over model variables to recover the state? What's the difference between a state and the degrees of freedom of the system?
- Stochastic counterfactual $p_{Y_j}(x, k)$ : Ontic densities $(\mu_P : \Omega \to [0,1], \xi_{M, k} : \Omega \to [0,1])$
- Observed value $Y^\text{obs}(x,k_x)$ : Measurement outcome $k|P,M$
- Observed values of distribution $p(Y_j^\text{obs} |x,k_x)$ : Empirical distribution $p(k|P, M)$
- [ ] Study stochastic counterfactuals in more detail [[@vanderweeleStochasticCounterfactualsStochastic2012]]
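
To make the definitions of operational equivalence and preparation non-contextuality concrete, here is a minimal Python sketch on a finite toy model. Everything in it is an assumption made for illustration and is not taken from Spekkens's paper: the three ontic states, the preparation names `Pa`, `Pb`, `Pc`, the single measurement `M`, and all the numbers. The integral over $\Omega$ becomes a finite sum, and equivalence is only checked against the measurements listed in the model.

```python
from fractions import Fraction as F
from itertools import combinations

# Hypothetical toy model with three ontic states, lambda in {0, 1, 2}.
# mu[P][lam] is the probability of ontic state lam under preparation P.
mu = {
    "Pa": (F(1, 2), F(0), F(1, 2)),
    "Pb": (F(0), F(1), F(0)),
    "Pc": (F(1), F(0), F(0)),
}

# xi[M][k][lam] is the probability of outcome k given ontic state lam.
xi = {
    "M": {
        0: (F(1), F(1, 2), F(0)),
        1: (F(0), F(1, 2), F(1)),
    },
}

def prob(k, P, M):
    """p(k|P,M) as the finite sum over lambda of xi_{M,k}(lambda) * mu_P(lambda)."""
    return sum(x * m for x, m in zip(xi[M][k], mu[P]))

def equivalent_preparations(P, Q):
    """P ~ Q iff both give the same statistics for every listed measurement."""
    return all(prob(k, P, M) == prob(k, Q, M) for M in xi for k in xi[M])

def preparation_noncontextual():
    """True iff mu_P == mu_Q for every operationally equivalent pair (P, Q)."""
    return all(
        mu[P] == mu[Q]
        for P, Q in combinations(mu, 2)
        if equivalent_preparations(P, Q)
    )

# Sanity checks: densities and response functions are normalized.
assert all(sum(density) == 1 for density in mu.values())
assert all(
    sum(xi[M][k][lam] for k in xi[M]) == 1
    for M in xi
    for lam in range(3)
)
```

In this toy model `Pa` and `Pb` are operationally equivalent (both give outcome 0 with probability 1/2) but have different ontic distributions, so `preparation_noncontextual()` returns `False`. Under the dictionary above, one tentative reading is that `Pa` and `Pb` are two means $k_x, k_x'$ of the same intervention $x = [P]$ whose stochastic counterfactuals differ at the ontic level.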