Updated: Apr 6
A test statistic is a random variable that measures how far the data from your experiment (or survey) deviate from what is expected under the null hypothesis. Its value is calculated from the sample data and is then used in a hypothesis test to decide whether to reject the null hypothesis in favor of the alternative hypothesis. The test statistic is also used to calculate the p-value.
For instance, a pharma company claims that Vaccine X will cure 70% of COVID-19 patients. A widely accepted fact (the null hypothesis) is that all the symptoms go away naturally within 2 to 3 weeks. A trial is conducted on infected patients, and it finds that 76% of them are cured with Vaccine X.
Is this result significant?
Does the vaccine work?
Is the 76% positive result due to biased sampling?
All such questions can be answered using a test statistic and hypothesis testing.
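To make the vaccine example concrete, here is a minimal sketch of a one-sample Z-test for a proportion. It treats the claimed 70% cure rate as the value under the null hypothesis, which is an assumption made for this sketch. The trial size n = 200 is also hypothetical, since the example above does not state how many patients took part.

```python
from math import sqrt
from statistics import NormalDist

# Assumed inputs: the example gives the two percentages but no sample size,
# so n = 200 is a hypothetical value chosen only for illustration.
p0 = 0.70      # cure rate assumed under the null hypothesis
p_hat = 0.76   # cure rate observed in the trial
n = 200        # hypothetical number of patients in the trial

# Standard error of the sample proportion when the null hypothesis is true.
std_error = sqrt(p0 * (1 - p0) / n)

# Z-statistic: how many standard errors the observed rate lies above p0.
z = (p_hat - p0) / std_error

# One-sided p-value: probability of a Z at least this large under the null.
p_value = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}, one-sided p-value = {p_value:.3f}")
```

With these assumed numbers, z comes out at about 1.85 and the one-sided p-value falls between 0.03 and 0.04, so the result would be significant at the 5% level. A smaller trial with the same percentages would give a smaller z, which is why the sample size matters as much as the observed cure rate.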
A different statistic is used for each hypothesis test, depending on the probability model assumed under the null hypothesis. For example, the test statistic for a Z-test is the Z-statistic (also called a Z-value), which follows the standard normal distribution under the null hypothesis. The most common test statistics in machine learning are:
Z-Score: Used with Z-Test
T-Score: Used with T-Test
F-statistic: Used with ANOVA test
Chi-square statistic: Used with Chi-Square Test
Hypothesis tests usually rely on a standardized test statistic, whose generic formula is:
Standardized test statistic = (statistic - parameter) / (standard deviation of the statistic)
While this is the generic form of a standardized statistic, the actual formula varies with the specific test being used.
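As a rough sketch of how the generic form specializes, the snippet below wraps the formula in a helper and applies it to a one-sample T-test, where the standard deviation of the sample mean is estimated as s / sqrt(n). The helper name standardized_statistic, the sample values, and the null-hypothesis mean mu0 are all hypothetical and invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def standardized_statistic(statistic, parameter, std_of_statistic):
    """Generic form: (statistic - parameter) / (standard deviation of the statistic)."""
    return (statistic - parameter) / std_of_statistic


# Hypothetical sample and null-hypothesis mean, invented for illustration.
sample = [5.1, 4.9, 5.6, 5.3, 5.0, 5.4, 5.2, 5.5]
mu0 = 5.0

n = len(sample)
sample_mean = mean(sample)

# T-test: the population standard deviation is unknown, so the standard
# deviation of the sample mean is estimated from the sample as s / sqrt(n).
s = stdev(sample)
t = standardized_statistic(sample_mean, mu0, s / sqrt(n))

print(f"sample mean = {sample_mean:.2f}, t = {t:.2f} with {n - 1} degrees of freedom")
```

The same helper would give a Z-statistic if a known population standard deviation were passed in instead of the sample estimate. Only the third argument changes from test to test.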
Example for Z-Scores: Z = (X - µ) / σ
X is the observed value of the sample statistic
µ is the mean of the statistic under the null hypothesis (the population parameter)
σ is the standard deviation of the statistic X
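The Z-score formula can be tried out with a few lines of Python. The values of x, mu and sigma below are hypothetical, and the two-sided p-value is one possible way to read the result, assuming the statistic is normally distributed under the null hypothesis.

```python
from statistics import NormalDist

# Hypothetical values, invented for illustration.
x = 85.0      # observed value of the statistic X
mu = 70.0     # mean of X under the null hypothesis
sigma = 10.0  # standard deviation of X

z = (x - mu) / sigma  # Z = (X - µ) / σ

# Two-sided p-value: probability of a Z at least this far from 0
# in either direction under the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"Z = {z}, two-sided p-value = {p_value:.3f}")
```

Here Z is 1.5, meaning the observed value sits one and a half standard deviations above the mean expected under the null hypothesis.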