A Tutorial in Data Science: Lecture 4 – Statistical Inference via Systems of Hypothesis-Trees

by | Jan 12, 2021 | Math Lecture

As in Lecture 1, let \(X^n\) be a random variable representing the n qualities that can be measured for the thing under investigation, \(\Omega\), which is itself the collection of all its possible appearances \(\omega \in \Omega\), such that each manifestation can be measured as real numbers, i.e. \(X^n:\omega \rightarrow {\mathbb{R}}^n\). Each sampled measurement of \(X^n\) through interaction with \(\omega\) is given as an \(\hat{X}^n(t_i)\), each one constituting a unit of indexable time in the catalogable measurement process. Thus, the set of sampled measurements, a sample space, is indexed by a partition \(\pi\) of ‘internally orderable’ test times within the measurement action, \(\{ \hat{X}^n(t): t \in \pi \}\).
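As a minimal sketch of this setup, the snippet below builds a sample space indexed by ordered test times. The function names `measure` and `omega_at`, the Gaussian noise, and the constant state are illustrative assumptions and do not come from the lecture.

```python
import random

def measure(omega, n, rng):
    """Hypothetical X^n: maps one appearance omega to n real-valued readings.
    The additive Gaussian noise is an assumption made for illustration."""
    return tuple(omega + rng.gauss(0.0, 1.0) for _ in range(n))

def omega_at(t):
    """Hypothetical state-system Omega(t) = omega; held constant for simplicity."""
    return 10.0

rng = random.Random(0)      # fixed seed so the sketch is reproducible
pi = [0, 1, 2, 3, 4]        # ordered test times t_i

# The sample space {X_hat^n(t) : t in pi}, stored as a mapping from time to measurement.
sample_space = {t: measure(omega_at(t), n=3, rng=rng) for t in pi}

for t, x_hat in sample_space.items():
    print(t, x_hat)
```

Each key plays the role of a test time \(t_i\), and each value is one \(\hat{X}^n(t_i)\) with n = 3.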

\(\Omega\) is a state-system, i.e. the spatio-temporality of the thing in question, in that it has specific space-states \(\omega\) at different times, \(\Omega(t)=\omega\). \(X\) is the function that measures \(\omega\). What if the measurement is not Real, but Complex: \(X: \Omega \rightarrow \mathbb{C}\)? While a real number results from a finite, approximate, or open-ended process of objective empirical measurement, an imaginary number results from a subjective intuition or presupposition to measurement. Every interaction with \(\Omega\) lets it appear as \(\omega\), which is quantified by \(X\). From these interactions, we seek to establish truths about \(\Omega\) by quantifying the probability that the Claim \(C\), itself a quantifiable statement about \(\Omega\), is correct.

Ultimately, we seek the nature of how \(\Omega\) appears differently depending on one’s interactions with it (i.e. samplings), and thus the actual distribution (\(\mathcal{D}\)) of the observed measurements under our measurement apparatus \(X\); that is, we ask about \(\mathcal{D}_X(\Omega)=f_{X}(\Omega)\). The assumptions will describe the class \(\mathcal{C}\) of the family \(\mathcal{F}\) of distribution functions to which \(f_X\) belongs, i.e. \(f_X \in \mathcal{F}_{\mathcal{C}}\), for the \(\hat{X}\) measurements of the appearances of \(\Omega\), while the sampling will give the parameter \(\theta\), such that \(f_X =f_{\mathcal{C}}(\theta)\). The hypothesis distribution-parameter (\(\theta^*\)) may be established either by prior knowledge (\(\theta_0\)) or by the present n-sampling of the state-system (\(\theta_1\)). Thus, the parameter obtained from the present sampling, \(\hat{\theta}=\Theta(\hat{X_1}, \cdots, \hat{X_n})\), is either used to judge the validity of a prior parameter estimation (\(\theta^*=\theta_0\)) or is assessed in its own right (i.e. \(\theta^*=\theta_1=\hat{\theta}\)) as representative of the actual object’s state-system distribution. The difference between the two hypothesis set-ups, a priori vs. a posteriori, is whether the present experiment is seen as having a bias or not. In either the a priori or the a posteriori case, \(H_{-}:\theta_0=\theta|\hat{\theta}\) or \(H_{+}:\hat{\theta}=\theta\), one uses the present sampling to establish the validity of a certain parameter value. If \(\hat{\Delta} \theta =\theta_0-\hat{\theta}\) is the expected bias of the experiment, then \(H_{-}:\hat{\theta}+\hat{\Delta}\theta=\theta|\hat{\theta}\) and \(H_{+}:\hat{\theta}=\theta|\hat{\theta}\). Thus, in all experiments, the statistical question is primarily that of the bias of the experiment that samples a parameter, whether it is 0 or not, i.e. \(H_{-}:|\hat{\Delta}\theta|>0\) or \(H_{+}:\hat{\Delta}\theta=0\).
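One way to make the bias question concrete is sketched below, under assumptions the lecture does not state: the parameter is a mean, \(\hat{\theta}\) is the sample mean, and the decision between \(H_{+}:\hat{\Delta}\theta=0\) and \(H_{-}:|\hat{\Delta}\theta|>0\) uses a two-sided z-test with a normal approximation. The function name `bias_test`, the level `alpha`, and the simulated data are all hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def bias_test(sample, theta_0, alpha=0.05):
    """Two-sided z-test of H+: delta = 0 against H-: |delta| > 0,
    where delta_hat = theta_0 - theta_hat (normal approximation assumed)."""
    n = len(sample)
    theta_hat = mean(sample)                  # parameter estimated from the present sampling
    delta_hat = theta_0 - theta_hat           # estimated bias relative to the prior value
    std_err = stdev(sample) / math.sqrt(n)    # standard error of the sample mean
    z = delta_hat / std_err
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    reject_h_plus = p_value < alpha           # True means the data favour H-
    return theta_hat, delta_hat, p_value, reject_h_plus

# Simulated measurements; the true mean of 5.0 is an illustrative choice.
rng = random.Random(1)
sample = [rng.gauss(5.0, 2.0) for _ in range(200)]

print(bias_test(sample, theta_0=5.0))   # prior value equal to the simulated mean
print(bias_test(sample, theta_0=6.0))   # prior value shifted by one unit
```

The returned tuple keeps \(\hat{\theta}\) and \(\hat{\Delta}\theta\) next to the decision, so the a priori and a posteriori readings can be compared on the same sample.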

The truth of the bias of the experiment, i.e. how representative it is, can only be given by our prior assumptions, \(A\), so that we can know the validity of our claim about the state-system’s distributional parameter, \(P(C|A)=P(\theta=\theta^*|\hat{\theta})=P(\Delta \theta=\hat{\Delta}\theta)\), as the probability that our expectation of bias is correct. Our prior assumption, \(A: f_X \in \mathcal{F}_{\mathcal{C}}\), is about the distribution of the \(k\)-parameters in the class-family of distributions, where \(\mathcal{F}_{\mathcal{C}}=\{f(k)\}, \ s.t. \ f_X=f(\theta)\); that is, it is about \(\mathcal{D}_K(\mathcal{F}_{\mathcal{C}})\). Here, \(K\) is a random variable that samples state-systems in the wider class of generally known objects, or equivalently their distributions (i.e. functional representations), measuring the \(k\)-parameter of their distribution, such that \(f_K(\mathcal{F}_{\mathcal{C}})=\mathcal{D}_K(\mathcal{F}_{\mathcal{C}})\). The distributed-objects in \(\mathcal{F}_{\mathcal{C}}\) are themselves relative to the measurement system \(X\), although they may be transformed into other measurement units, in that this distribution class consists of all possible state-systems which \(X\) might measure sample-wise. Among these, we seek to know specifically about the \(\Omega\) in question, to obtain its distributional \(k\)-parameter value \(\theta\). Essentially, the assumption \(A\) is about a meta-state-system, the set of all objects \(X\) can measure, and thus has more to do with \(X\), the subject’s method of measurement, and \(\Theta\), the parametrical aggregation of interest, than with \(\Omega\), the specific object of measurement.
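The role of the assumption \(A\) can be sketched numerically by reading \(\mathcal{D}_K\) as a prior weight over candidate \(k\)-values. This is one possible interpretation, not the lecture's stated method. The Normal(k, sigma) family, the flat prior, the grid, the data, and the tolerance around \(\theta^*\) are all assumptions chosen for illustration.

```python
import math

def posterior_on_grid(sample, grid, prior, sigma=1.0):
    """Posterior weights over candidate k-values, assuming each candidate
    distribution in the family is Normal(k, sigma) (an illustrative choice)."""
    log_post = []
    for k, weight in zip(grid, prior):
        log_lik = sum(-0.5 * ((x - k) / sigma) ** 2 for x in sample)
        log_post.append(math.log(weight) + log_lik)
    top = max(log_post)                               # subtract the max for numerical stability
    unnormalised = [math.exp(lp - top) for lp in log_post]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]

grid = [i / 10 for i in range(101)]       # candidate k-values 0.0, 0.1, ..., 10.0
prior = [1.0 / len(grid)] * len(grid)     # flat prior standing in for assumption A
sample = [4.8, 5.1, 5.3, 4.9, 5.0]        # hypothetical measurements

post = posterior_on_grid(sample, grid, prior)

theta_star = 5.0                          # claimed parameter value
tolerance = 0.5                           # hypothetical width for "theta equals theta_star"
p_claim = sum(p for k, p in zip(grid, post) if abs(k - theta_star) <= tolerance)
print(p_claim)
```

Changing `prior` while keeping `sample` fixed changes `p_claim`, which illustrates how the probability of the claim depends on the assumed class of distributions and not only on the present sampling.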

Here \(\theta \in \Theta\), where \(\Theta\) is the set of all the parameters of the family \(\mathcal{F}\) of relevant distributions, in that a parameter in \(\Theta\) uniquely determines \(f\): there exists a map \(M: \Theta \rightarrow \mathcal{F}\) sending each parameter to its distribution \(f \in \mathcal{F}\), or \(f=\mathcal{F}(\Theta)\).
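A minimal sketch of such a map follows, assuming for illustration that the family consists of Normal(θ, 1) densities. The unit variance and the function name `M` mirror the notation above but are otherwise hypothetical.

```python
import math

def M(theta):
    """Hypothetical map M: theta -> f, returning the Normal(theta, 1) density.
    The choice of family is an assumption made for illustration."""
    def f(x):
        return math.exp(-0.5 * (x - theta) ** 2) / math.sqrt(2.0 * math.pi)
    return f

f_0 = M(0.0)   # the member of the family determined by theta = 0
f_2 = M(2.0)   # a different theta determines a different member

print(f_0(0.5), f_2(0.5))
```

Each parameter value yields exactly one function, which is the sense in which the parameter determines \(f\).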
