## Research Methods for Ph. D. Studies: Statistical Research Methods Part 1 of 2.

Written by Dr Hannes Nel

Introduction

The value of qualitative research has never been as evident as it is now that we are facing a virus that will drastically change the world and how we interact socially and economically.

And we have never needed quantitative research as much as we do now if we are to survive the COVID-19 onslaught.

Now we know that both approaches to research are critically important if we are to survive and cope with the new world order that has only just begun.

Statistical research methods can be complicated and specific to a field of natural science.

Therefore, even if I had the knowledge, it would take many more than the 40 posts that I have posted so far to share it with you.

Besides, statistical research methods mostly require the use of software developed specifically for a field of research.

Those of you who wish to embark on such research will probably already know the computer programme that you will use for your research.

Otherwise, you will need to familiarize yourself with the software before you do your research.

I will discuss the following issues related to statistical research methods in this article:

1. Investigating a statistical hypothesis.
2. Conducting statistical regression analysis.

Investigating a statistical hypothesis

You will mostly use a hypothesis in statistical research, although it is also possible to base your research on a problem statement or question.

You will need to formulate two opposing hypotheses – the null hypothesis and the alternative hypothesis.

The null hypothesis, indicated with H0 (H-naught), is a statement about the population that you believe to be true.

The alternative hypothesis, indicated with H1, is a claim about the population that is contradictory to H0. It is what you will conclude when you reject H0.

A null hypothesis can often be tested by means of statistical research.

The information in your sample will favour either the H0 hypothesis or the H1 hypothesis.

You will reject the H0 hypothesis if the sample information favours the H1 hypothesis.

Or you will not reject the H0 hypothesis if the sample information is insufficient to reject it.

For example, your H0 hypothesis can be:

30% or less of the people who contracted the COVID-19 virus lived in rural areas.

You can also write the null hypothesis like this: H0: p ≤ 0.3, where p is the proportion of people who contracted the virus and lived in rural areas.

Your H1 hypothesis will then be:

More than 30% of the people who contracted the COVID-19 virus lived in rural areas.

You can also write the alternative hypothesis like this: H1: p > 0.3

You will also need to calculate the size of the sample that you need to achieve a given level of accuracy and confidence.

Dedicated computer programmes will do this for you.
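To give you a rough idea of what such a programme calculates, here is a minimal sketch that uses the standard normal-approximation formula for estimating a proportion, n = z² × p × (1 − p) / e². The function name, the 95% confidence level, the expected proportion of 0.3 and the 5% margin of error are my assumptions for the sake of the illustration, and dedicated software will typically offer more refined methods.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(expected_p, margin_of_error, confidence=0.95):
    """Approximate sample size needed to estimate a proportion.

    Uses the normal-approximation formula n = z^2 * p * (1 - p) / e^2.
    All argument values used below are illustrative assumptions.
    """
    # Two-sided critical value of the standard normal distribution
    # (roughly 1.96 for 95% confidence).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z ** 2) * expected_p * (1 - expected_p) / (margin_of_error ** 2)
    # Round up, because you cannot survey a fraction of a person.
    return math.ceil(n)


# Hypothetical inputs: expected proportion 0.3, margin of error 5 points.
n_needed = sample_size_for_proportion(expected_p=0.3, margin_of_error=0.05)
print(n_needed)
```

A smaller margin of error or a higher confidence level will increase the sample size that you need.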

Once you have composed a sample that will give you answers with an acceptable level of probability, you will need to interpret the data, which will probably have been analyzed with dedicated software.

You will need to set certain norms, or criteria, for the analysis of the data that you collected for the population first.

The samples also need to meet those norms, criteria or parameters.

A null hypothesis needs to be tested by comparing two sets of data.

If you reject the null hypothesis, then you can assume that there is enough evidence to support the alternative hypothesis.

That is: More than 30% of the people who contracted the COVID-19 virus lived in rural areas.

You will probably compare the mean of observations or responses for the two sets of data.

It might sometimes be necessary to use the mode, median or correlation between the sets of data.

Random variability between different samples will also always be present.

There might also be small differences between the statistical relationship in the sample and the population.

This can be just a matter of sampling error.
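You can see this variability for yourself with a small simulation. The sketch below assumes a hypothetical population in which exactly 30% of people have some characteristic, and draws five random samples of 200 people each. The population proportion, the sample size and the function name are assumptions for the illustration only.

```python
import random

random.seed(42)  # fixed seed so that the illustration is repeatable

TRUE_PROPORTION = 0.3  # assumed population proportion (hypothetical)
SAMPLE_SIZE = 200      # assumed number of people in each sample


def sample_proportion(p, n):
    """Draw n yes/no observations with probability p of 'yes'
    and return the share of 'yes' answers in the sample."""
    return sum(random.random() < p for _ in range(n)) / n


# Five independent samples drawn from the same population.
proportions = [sample_proportion(TRUE_PROPORTION, SAMPLE_SIZE) for _ in range(5)]
print(proportions)
```

The five sample proportions will scatter around 0.3 even though the population never changed, which is the sampling error that you need to keep in mind.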

Dedicated computer software will do the statistical calculations for you.

A hypothesis test does not “prove” anything to be true; it can only provide evidence that the null hypothesis is false.

If you cannot show the two phenomena or populations to be different, then you have no grounds for claiming that they differ.

Then again, if the statistical analysis does not enable you to reject the null hypothesis, it does not necessarily mean that the null hypothesis is true.
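The decision to reject or not to reject can be sketched for the rural-areas example, with H0: p ≤ 0.3 and H1: p > 0.3. The sketch uses a one-sided z-test for a proportion, which is one common choice and not necessarily what your software will use. The sample figures (180 out of 500), the 5% significance level and the function name are assumptions that I made up for the illustration.

```python
import math
from statistics import NormalDist


def one_sided_proportion_test(successes, n, p0=0.3, alpha=0.05):
    """One-sided z-test of H0: p <= p0 against H1: p > p0.

    Returns the z statistic, the p-value and the decision to reject H0.
    """
    p_hat = successes / n
    # Standard error calculated under the null hypothesis value p0.
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    # Upper-tail p-value: how likely a z this large is if H0 holds.
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value, p_value < alpha


# Hypothetical sample: 180 of 500 infected people lived in rural areas.
z, p_value, reject_h0 = one_sided_proportion_test(successes=180, n=500)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

If the function does not reject H0, it only means that the sample did not provide enough evidence against it, not that H0 has been shown to be true.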

Conducting statistical regression

Statistical regression analysis is a generic term for all methods in which quantitative data is collected and interpreted to numerically express the relationship between two groups of variables.

The expression may be used to describe the relationship between the two groups of variables.

It can also be used to predict values, although one must be careful of trying to predict future trends based on statistical data.

The two data groups, popularly represented by X and Y, are compared numerically or graphically to identify a relationship between the items or groups of items X and Y.

You can mostly use such comparisons to determine trends and correlation between variables.

It might, for example, be possible to identify a correlation between the hours that a student spends studying and his or her eventual performance in the exam.
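The study-hours example can be expressed numerically by fitting a straight line through the data. The sketch below uses ordinary least squares, the simplest form of regression. The five data points and the function name are invented for the illustration and do not come from any real study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours studied (X) and exam mark in percent (Y).
hours = [2, 4, 6, 8, 10]
marks = [50, 58, 61, 70, 76]

slope, intercept = fit_line(hours, marks)

# Estimated mark for a student who studies 7 hours, based on the fitted line.
predicted_7h = intercept + slope * 7
print(f"mark = {intercept:.1f} + {slope:.1f} * hours; 7 hours -> {predicted_7h:.1f}")
```

The slope tells you how many percentage points the mark changes, on average, for each extra hour of study in this made-up data set.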

Correlation measures the strength of association between two variables and the direction of the relationship.

In terms of the strength of the relationship, the values of the correlation coefficient will always vary between +1 and -1.

A value of +1 indicates a perfect positive association between the two variables.

That means that if one thing happens, then something else will also happen.

For example, if you cut your arm you will bleed.

A value of -1 indicates a perfect negative relationship between two variables.

For example, the faster you drive, the less time it will take you to reach your destination.
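The range from +1 to -1 is easy to check with a few numbers. The sketch below calculates the Pearson correlation coefficient, which is the most widely used measure of linear correlation, for three small data sets. The data sets and the function name are assumptions that I invented for the illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


x = [1, 2, 3, 4]

r_positive = pearson_r(x, [2, 4, 6, 8])  # y rises in step with x
r_negative = pearson_r(x, [8, 6, 4, 2])  # y falls in step with x
r_mixed = pearson_r(x, [2, 1, 4, 3])     # y rises overall, but not in step

print(r_positive, r_negative, r_mixed)
```

The first two data sets lie exactly on a straight line and give the extreme values, while the third gives a value in between.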

A correlation does not, however, tell you what will happen next: the decrease in the number of new individuals who test positive for the COVID-19 virus does not enable us to predict when the pandemic will come to an end.

You can, perhaps, argue that correlation enables you to predict what will happen to one variable if a second variable changes.

However, predicting that such change will take place is often difficult, if not impossible, in the social sciences.

You can predict with a good measure of accuracy what will happen if you add certain amounts of yeast to the dough for baking bread.

But you cannot always predict how the baker will respond if she or he serves you the bread and you criticize it.

The situation is different in exact sciences, such as chemistry, where the scientist can initiate the change and control the size, measure and frequency of change.

Summary

You will probably use two opposing hypotheses in statistical research.

The null hypothesis is a statement about a population that you believe to be true.

The alternative hypothesis should contradict the null hypothesis.

You will use two samples to test your hypothesis.

The findings that you gather from your analysis of the samples should apply to the population as well.

There might, however, be a sampling error of which you should take note.

Statistical regression analysis investigates the relationship between two sets of variables.

It can show a correlation between the sets of variables.

It can also sometimes be used to predict values.

I would rather call it “foresee” values, because prediction based on statistics can be risky.

Relationships can be compared numerically or graphically.

Correlation between two variables can be anything between +1 and -1.

A value of +1 would indicate a perfectly positive correlation.

A value of -1 would indicate a perfectly negative correlation.