Hypothesis testing is a statistical method used to make inferences or draw conclusions about a population based on sample data. It involves formulating two competing hypotheses: the null hypothesis ($H_0$) and the alternative hypothesis ($H_1$). The null hypothesis typically represents a statement of no effect or no difference, while the alternative hypothesis indicates the presence of an effect or difference.
The process of hypothesis testing involves the following steps:
1. Formulate the null hypothesis ($H_0$) and the alternative hypothesis ($H_1$).
2. Choose a significance level ($\alpha$), commonly 0.05.
3. Select an appropriate test and compute the test statistic from the sample data.
4. Determine the p-value, or compare the test statistic to a critical value.
5. Reject or fail to reject $H_0$ and interpret the result in context.
Parametric tests assume underlying statistical distributions in the data (e.g., normal distribution) and have specific parameters (e.g., mean, variance) that describe these distributions. Common parametric tests include the t-test and ANOVA. However, when data do not meet these assumptions, non-parametric tests offer a robust alternative.
Non-parametric tests, also known as distribution-free tests, do not rely on data belonging to any particular distribution. They are particularly useful when dealing with ordinal data, nominal data, or when sample sizes are small and do not meet the assumptions required for parametric tests.
Several non-parametric tests are frequently used in hypothesis testing:
The Mann-Whitney U test is a non-parametric alternative to the independent-samples t-test. It evaluates whether there is a significant difference between the distributions of two independent groups. $$ U = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1 $$ Where:
- $n_1$, $n_2$ = sample sizes of the two groups.
- $R_1$ = sum of the ranks assigned to the first group in the pooled ranking.
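The formula above can be sketched in a few lines of pure Python. This is an illustrative implementation, not a full statistical routine (it returns only the U statistic, no p-value); ties in the pooled sample receive the average rank.

```python
def mann_whitney_u(sample1, sample2):
    """Mann-Whitney U statistic: U = n1*n2 + n1*(n1+1)/2 - R1."""
    pooled = sorted(sample1 + sample2)
    # Map each value to its average rank in the pooled sample (handles ties).
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2  # average of ranks i+1 .. j
        i = j
    n1, n2 = len(sample1), len(sample2)
    r1 = sum(ranks[x] for x in sample1)  # R1: rank sum of group 1
    return n1 * n2 + n1 * (n1 + 1) / 2 - r1
```

For a two-sided test, the U value is then compared against tabulated critical values (or the normal approximation discussed below for large samples).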
The Wilcoxon Signed-Rank test is used for comparing two related samples or repeated measurements on a single sample to assess whether their population mean ranks differ. It is the non-parametric counterpart to the paired t-test. **Steps:**
1. Compute the difference for each pair and discard zero differences.
2. Rank the absolute differences, assigning average ranks to ties.
3. Sum the ranks of the positive differences and the ranks of the negative differences.
4. Use the resulting rank sum (conventionally the smaller of the two, or the sum of positive ranks) as the test statistic and compare it to the critical value.
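The steps above can be sketched as a small Python function. This is a minimal illustration that returns the sum of ranks of the positive differences ($W$); zero differences are discarded and tied absolute differences share the average rank.

```python
def wilcoxon_w(before, after):
    """W = sum of ranks of positive paired differences d_i = before_i - after_i."""
    diffs = [b - a for b, a in zip(before, after) if b != a]  # drop zeros
    abs_sorted = sorted(abs(d) for d in diffs)

    def avg_rank(v):
        # Average rank of |d| = v among all absolute differences (tie handling).
        first = abs_sorted.index(v) + 1
        last = first + abs_sorted.count(v) - 1
        return (first + last) / 2

    return sum(avg_rank(abs(d)) for d in diffs if d > 0)
```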
The Kruskal-Wallis H test extends the Mann-Whitney U test to more than two independent groups. It assesses whether the medians of the groups are different. **Formula:** $$ H = \left(\frac{12}{N(N+1)}\right) \sum \frac{R_i^2}{n_i} - 3(N+1) $$ Where:
- $N$ = total number of observations across all groups.
- $n_i$ = number of observations in group $i$.
- $R_i$ = sum of the ranks of group $i$ in the pooled ranking.
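A minimal Python sketch of the H formula above, assuming no tie-correction factor (real implementations adjust H when many ties are present):

```python
def kruskal_h(*groups):
    """Kruskal-Wallis H from pooled rank sums (no tie correction)."""
    pooled = sorted(x for g in groups for x in g)

    def avg_rank(v):
        # Average rank of value v in the pooled sample (ties share ranks).
        first = pooled.index(v) + 1
        last = first + pooled.count(v) - 1
        return (first + last) / 2

    n = len(pooled)  # N: total observations
    # Sum of R_i^2 / n_i over the groups.
    total = sum(sum(avg_rank(x) for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)
```

The resulting H is compared to a chi-square distribution with $k-1$ degrees of freedom, where $k$ is the number of groups.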
The Chi-Square test evaluates the association between two categorical variables. It determines whether observed frequencies differ from expected frequencies under the null hypothesis of independence. **Formula:** $$ \chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} $$ Where:
- $O_i$ = observed frequency in cell $i$.
- $E_i$ = expected frequency in cell $i$ under the null hypothesis of independence.
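For a contingency table, the expected counts under independence are $E_{ij} = (\text{row total}_i \times \text{column total}_j)/N$. A small Python sketch combining that with the $\chi^2$ formula above:

```python
def chi_square_independence(table):
    """Pearson chi-square statistic for a rows-by-columns contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)  # grand total N
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n  # E under independence
            chi2 += (obs - exp) ** 2 / exp
    return chi2
```

A table whose rows are proportional (e.g., `[[10, 10], [10, 10]]`) gives $\chi^2 = 0$, reflecting perfect independence.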
**Example 1: Mann-Whitney U Test** A researcher wants to compare the effectiveness of two teaching methods on student performance. Due to the non-normal distribution of test scores, the Mann-Whitney U test is appropriate.

**Example 2: Wilcoxon Signed-Rank Test** Assessing whether a new diet plan leads to weight loss by comparing pre-diet and post-diet weights of the same individuals.

**Example 3: Chi-Square Test** Investigating the association between gender and voting preference in an election.
Non-parametric hypothesis testing is grounded in rank-based methods. Instead of relying on parameter estimates such as means and variances, these tests use the order, or ranks, of data points to derive statistical measures. This approach makes non-parametric tests distribution-free, providing flexibility in analyzing data that do not conform to traditional parametric assumptions.

**Ranks and Medians:** Ranks ($R_i$) play a pivotal role in non-parametric tests. By transforming data into ranks, these tests mitigate the influence of extreme values and non-normal distributions. The median often serves as the measure of central tendency in non-parametric statistics, replacing the mean used in parametric tests.

**Wilcoxon Signed-Rank Test Derivation:** The Wilcoxon Signed-Rank test statistic is derived from the differences between paired observations: $$ d_i = X_i - Y_i $$ Each difference is ranked by its absolute value: $$ R_i = \text{rank}(|d_i|) $$ The test statistic ($W$) is the sum of the ranks of the positive differences: $$ W = \sum_{d_i > 0} R_i $$ Under the null hypothesis, the distribution of $W$ is symmetric, allowing p-values to be calculated without assuming normality.
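The rank transform that underlies all of these tests can be demonstrated in a few lines. This sketch shows why ranking mitigates outliers: only the *position* of a value matters, not its magnitude. Tied values share the average rank.

```python
def ranks(data):
    """Rank-transform a sample; tied values share the average rank."""
    s = sorted(data)
    out = []
    for v in data:
        first = s.index(v) + 1         # rank of first occurrence of v
        last = first + s.count(v) - 1  # rank of last occurrence of v
        out.append((first + last) / 2) # average rank across the tie
    return out
```

For example, `ranks([1, 2, 1000])` gives `[1.0, 2.0, 3.0]`: the extreme value 1000 contributes no more leverage than any other largest observation would.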
As sample sizes increase, non-parametric test statistics often approximate normal distributions due to the Central Limit Theorem. This property allows the use of Z-scores and related methods in large samples, facilitating easier interpretation and comparison with parametric counterparts.

**Example: Mann-Whitney U Test as Normal Approximation** For large sample sizes, the U statistic is approximately normally distributed with mean and standard deviation: $$ \mu_U = \frac{n_1 n_2}{2} $$ $$ \sigma_U = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} $$ Thus, the Z-score is: $$ Z = \frac{U - \mu_U}{\sigma_U} $$
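The normal approximation translates directly into code. A minimal sketch (without the continuity correction that some software applies):

```python
import math

def mann_whitney_z(u, n1, n2):
    """Large-sample normal approximation Z = (U - mu_U) / sigma_U."""
    mu = n1 * n2 / 2                                   # mu_U
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)    # sigma_U
    return (u - mu) / sigma
```

With $n_1 = n_2 = 10$, an observed $U = 50$ sits exactly at the null mean, giving $Z = 0$.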
The power of a statistical test refers to its ability to correctly reject a false null hypothesis. While non-parametric tests are more flexible, they can be less powerful than parametric tests when the assumptions of parametric tests are met. However, when data violate these assumptions, non-parametric tests can be more powerful. **Factors Influencing Power:**
- Sample size: larger samples increase power.
- Effect size: larger true effects are easier to detect.
- Significance level ($\alpha$): a higher $\alpha$ increases power at the cost of more Type I errors.
- Choice of test: a test whose assumptions match the data yields greater power.
Non-parametric hypothesis testing intersects with various fields, highlighting its broad applicability: medicine (comparing treatment outcomes with skewed data), psychology (analyzing ordinal survey responses), economics (income data with heavy tails), and ecology (small samples of count data).
Solving complex problems using non-parametric methods often involves multiple steps and the integration of various statistical concepts. Consider the following advanced problem:

**Problem:** A researcher conducts a study to evaluate the effectiveness of three different diets on weight loss. The data collected include the weights of participants before and after each diet. However, the distribution of weight loss does not follow a normal distribution. How should the researcher proceed with hypothesis testing?

**Solution:**
1. **Identify the Appropriate Test:** Since each participant provides related measurements across the three diets, the Friedman Test is suitable.
2. **Formulate Hypotheses:**
   - $H_0$: There is no difference in weight loss across the three diets.
   - $H_1$: At least one diet results in different weight loss.
3. **Data Preparation:**
   - Rank the weight loss within each participant across the three diets.
   - Sum the ranks for each diet.
4. **Calculate the Friedman Test Statistic:** $$ \chi^2_F = \frac{12}{n k (k+1)} \sum R_j^2 - 3 n (k+1) $$ Where:
   - $n$ = number of participants.
   - $k$ = number of diets.
   - $R_j$ = sum of ranks for diet $j$.
5. **Determine Significance:** Compare $\chi^2_F$ to the chi-square distribution with $k-1$ degrees of freedom.
6. **Interpret Results:** If $\chi^2_F$ exceeds the critical value, reject $H_0$ and conclude that at least one diet differs significantly in effectiveness.
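The solution steps above can be sketched in Python. This is an illustrative implementation of the Friedman formula only (no tie correction, no p-value lookup); `data[i][j]` is assumed to hold participant $i$'s weight loss under diet $j$.

```python
def friedman_chi2(data):
    """Friedman statistic: rank within each participant, then apply the formula."""
    n = len(data)      # participants
    k = len(data[0])   # treatments (diets)
    rank_sums = [0.0] * k
    for row in data:
        s = sorted(row)
        for j, v in enumerate(row):
            first = s.index(v) + 1         # within-participant rank, ties averaged
            last = first + s.count(v) - 1
            rank_sums[j] += (first + last) / 2
    total = sum(r ** 2 for r in rank_sums)  # sum of R_j^2
    return 12 / (n * k * (k + 1)) * total - 3 * n * (k + 1)
```

If every participant ranks the diets identically (maximum agreement), the statistic reaches its largest value for that $n$ and $k$, which is then compared to the chi-square critical value with $k-1$ degrees of freedom.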
| Aspect | Parametric Methods | Non-parametric Methods |
|---|---|---|
| Data Assumptions | Assume specific distributions (e.g., normality) | Do not assume specific distributions |
| Data Type | Interval or ratio data | Ordinal, nominal, or non-normal interval data |
| Examples of Tests | t-test, ANOVA | Mann-Whitney U, Wilcoxon, Chi-Square |
| Power | Higher when assumptions are met | Generally lower, but more robust with non-normal data |
| Robustness | Sensitive to outliers and assumption violations | Less sensitive to outliers and assumption violations |
- **Memorize Test Purposes:** Know which non-parametric test suits your data type and research question.
- **Rank Carefully:** Always rank your data accurately, handling ties appropriately to ensure correct test statistics.
- **Understand Assumptions:** Even though non-parametric tests are flexible, they have their own assumptions, such as independent samples for the Mann-Whitney U test.
- **Use Visual Aids:** Graphs like box plots can help visualize differences between groups before performing tests.
- **Practice with Examples:** Regularly solve varied problems to reinforce your understanding and application skills for exams.
Non-parametric methods have been pivotal in groundbreaking research. For instance, the Mann-Whitney U test was employed in early clinical trials to compare patient responses before parametric methods were widely accepted. Additionally, the Chi-Square test played a crucial role in the development of genetics by helping scientists understand the distribution of traits in populations. These tests continue to be essential tools in various scientific discoveries today.
1. **Misapplying Tests:** Students often use non-parametric tests for data that meet parametric assumptions, leading to less powerful results.
   - Incorrect: Using Mann-Whitney U for normally distributed data.
   - Correct: Use a t-test when data are normally distributed.
2. **Ignoring Ties in Ranks:** Failing to properly account for tied ranks can skew results.
   - Incorrect: Treating tied values as unique ranks.
   - Correct: Assign the average rank to tied values.
3. **Confusing Hypotheses:** Mixing up null and alternative hypotheses regarding median differences.
   - Incorrect: Stating $H_0$ as "there is a difference."
   - Correct: $H_0$: There is no difference in medians.