## ARTICLE 91: Research Methods for Ph.D. and Master’s Degree Studies: Data Analysis: Part 4 of 7 Parts: Elementary Analysis

Written by Dr. Hannes Nel

Most social qualitative research requires the analysis of several variables simultaneously, which is called “multivariate analysis”. Analysing the simultaneous association of age, education, and gender with a phenomenon is one example. Specific techniques for conducting a multivariate analysis include factor analysis, multiple correlation, regression analysis, and path analysis. All these techniques are based on the preparation and interpretation of comparative tables and graphs, so you should practise doing this if you do not already know how.

These are largely quantitative techniques. Fortunately, the statistical calculations are done for you by the computer, so you mainly need to understand what each technique does and what its results mean.

Factor analysis. Factor analysis is a statistical procedure used to uncover relationships among many variables. This allows numerous inter-correlated variables to be condensed into fewer dimensions, called factors. It is possible, for example, that variations in three or four observed variables mainly reflect the variations in a single unobserved variable, or in a reduced number of unobserved variables. Clearly this type of analysis is mostly numerical in nature. Factors are analysed inductively to determine trends, relationships, correlations, causes of phenomena, etc. Factor analysis searches for variations in response to variables that are difficult to observe and that are suspected to have an influence on events or phenomena.
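The idea that several observed variables may mainly reflect one unobserved variable can be illustrated with a small sketch. It assumes NumPy is available and uses made-up data; the function name `dominant_factor_share` is hypothetical. It also takes a shortcut: it inspects the eigenvalues of the correlation matrix (a principal-component style approximation) rather than fitting a full factor model, so treat it as an illustration of the principle only.

```python
import numpy as np

def dominant_factor_share(data):
    """Share of total variance carried by the largest eigenvalue of the
    correlation matrix of `data` (rows = cases, columns = variables)."""
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)  # returned in ascending order
    return eigenvalues[-1] / eigenvalues.sum()

# Hypothetical data: one unobserved variable drives four observed ones.
rng = np.random.default_rng(0)
latent = rng.normal(size=500)
observed = np.column_stack(
    [latent + 0.5 * rng.normal(size=500) for _ in range(4)]
)

share = dominant_factor_share(observed)
print(f"Share of variance on the first dimension: {share:.2f}")
```

A share close to one suggests that the four observed variables could be condensed into a single factor; a share close to one quarter (for four variables) suggests that they are largely unrelated.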

Multiple correlation. Multiple correlation is a statistical technique that predicts values of one variable based on two or more other variables. For example, what will happen to the incidence of HIV/AIDS (the variable that we are doing research on) in a particular area if unemployment increases (variable 1), famine breaks out (variable 2) and the incidence of TB (variable 3) increases?

Multiple correlation is a linear relationship among more than two variables. It is measured by the coefficient of multiple determination, usually written R², which is a measure of the fit of a linear regression. This coefficient falls somewhere between zero and one (assuming a constant term has been included in the regression). A higher value indicates a stronger relationship between the variables, with a value of one indicating a perfect relationship and a value of zero indicating no linear relationship at all between the independent variables collectively and the dependent variable.
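As a rough illustration of the coefficient of multiple determination, the sketch below fits a linear regression with a constant term by least squares and computes the coefficient as one minus the ratio of residual to total variation. It assumes NumPy; the data are invented and the function name `multiple_r_squared` is hypothetical.

```python
import numpy as np

def multiple_r_squared(X, y):
    """Coefficient of multiple determination for a least-squares fit
    of y on the columns of X, with a constant term included."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])  # add the constant term
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    ss_residual = np.sum((y - fitted) ** 2)
    ss_total = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_residual / ss_total

# Hypothetical data: three independent variables, one dependent variable.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y_perfect = 2.0 + X @ np.array([1.5, -0.7, 0.3])       # exact linear relation
y_noisy = y_perfect + rng.normal(scale=2.0, size=200)  # relation plus noise

r2_perfect = multiple_r_squared(X, y_perfect)
r2_noisy = multiple_r_squared(X, y_noisy)
print(f"Perfect relationship: {r2_perfect:.3f}")
print(f"Noisy relationship:   {r2_noisy:.3f}")
```

The first value should be one to within rounding error, and the second should fall strictly between zero and one.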

Path analysis. Path analysis can be a statistical method of finding cause/effect relationships, a method for finding the trail that leads users to websites, or an operations research technique. We also have “critical path analysis”, which is mostly used in project management and is a method by which the activities in a project are planned to be executed in a logical sequence so that the project is completed efficiently and effectively. Here we are concerned with path analysis as a statistical method of finding cause/effect relationships.

Path analysis is a method of decomposing correlations into separate components, such as direct and indirect effects, so that each can be interpreted (e.g. how does parental education influence children’s income when they are adults?). Path analysis is closely related to multiple regression; you might say that regression is a special case of path analysis. It is a “causal model” because it allows us to test theoretical propositions about cause and effect without manipulating variables.
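The decomposition of a correlation can be sketched for the parental education example. The model below is an assumption made purely for illustration: parental education influences a child's income directly and also indirectly through the child's own education. The data are invented, NumPy is assumed, and the names `standardise` and `path_decomposition` are hypothetical.

```python
import numpy as np

def standardise(values):
    """Convert a variable to z-scores (mean 0, standard deviation 1)."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()

def path_decomposition(x, m, y):
    """Split the correlation between x and y into a direct path (x -> y)
    and an indirect path through m (x -> m -> y)."""
    x, m, y = standardise(x), standardise(m), standardise(y)
    # Path a: effect of x on the intervening variable m.
    a = np.linalg.lstsq(x[:, None], m, rcond=None)[0][0]
    # Direct path from x to y, and path b from m to y.
    direct, b = np.linalg.lstsq(np.column_stack([x, m]), y, rcond=None)[0]
    indirect = a * b
    return {"direct": direct, "indirect": indirect, "total": direct + indirect}

# Hypothetical data for the assumed causal model.
rng = np.random.default_rng(2)
parent_education = rng.normal(size=1000)
child_education = 0.6 * parent_education + rng.normal(size=1000)
child_income = (0.2 * parent_education + 0.5 * child_education
                + rng.normal(size=1000))

paths = path_decomposition(parent_education, child_education, child_income)
correlation = np.corrcoef(parent_education, child_income)[0, 1]
print(paths)
print(f"Correlation between parental education and income: {correlation:.3f}")
```

In this simple model the direct and indirect components add up to the observed correlation between parental education and the child's income. The arrows themselves come from theory; the calculation only shows how the correlation divides if that theory holds.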

Regression analysis. Regression analysis can be used to determine which factors influence events, phenomena, or relationships.

Regression analysis includes a variety of techniques for modelling and analysing several variables when the focus is on the relationship between a dependent variable and one or more independent variables. If, for example, you wish to determine the effect of tax, legislation and education on levels of employment, the level of employment is the dependent variable while tax, legislation and education are the independent variables. More specifically, regression analysis helps one understand how the dependent variable changes when one or more of the independent variables change, which in turn suggests how the dependent variable might be influenced. In the employment example, you might wish to know what should be done in terms of tax, legislation and education to improve employment, or at least to maintain a healthy level of employment. The regression function describes how much the level of employment would change, and in what direction, if all, some or one of the independent variables change by a particular value. It is also of interest to characterise the variation of the dependent variable around the regression function, which can be described by a probability distribution.

Regression analysis typically estimates the conditional expectation of the dependent variable given the independent variables, that is, the average value of the dependent variable when the independent variables are held fixed at particular values. Seen from this perspective, the employment example would mean estimating the average level of employment to be expected at given, fixed values of tax, legislation and education.
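A minimal sketch of estimating such a conditional expectation follows. It assumes NumPy and uses invented figures for the employment example, limited to two independent variables (a tax rate and average years of education) for brevity. The function names and all numbers are hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares coefficients for y on X; the constant term comes first."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def expected_value(coef, values):
    """Estimated average of the dependent variable at fixed `values`
    of the independent variables."""
    return coef[0] + np.dot(coef[1:], values)

# Hypothetical data: tax rate (%), average years of education, employment (%).
rng = np.random.default_rng(3)
n = 300
tax = rng.uniform(20, 40, size=n)
education = rng.uniform(8, 14, size=n)
employment = 60 - 0.5 * tax + 2.0 * education + rng.normal(scale=2.0, size=n)

coef = fit_linear(np.column_stack([tax, education]), employment)
estimate = expected_value(coef, [30.0, 12.0])
print(f"Expected employment at 30% tax and 12 years of education: {estimate:.1f}")
```

Because the data were generated with a known rule, the estimate can be compared with the value that rule gives at the same point (69 in this invented case). With real data no such check exists, which is one reason to forecast with circumspection.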

Regression analysis is widely used for prediction and forecasting, although this should be done with circumspection. It is also used to understand which of the independent variables are related to the dependent variable, and to explore the forms of these relationships. Researchers often use regression analysis on the assumption that the independent variables influence the dependent variable, but a regression by itself shows association rather than causation, and investigation can also show that no such relationship exists. An example of using regression analysis with several independent variables, also called “multiple regression”, is to determine which of colour, paper type, number of advertisements and content (independent variables) have the biggest effect on the number of magazines sold (dependent variable).
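One common way to compare the size of effects in a multiple regression is to standardise all variables first, so that the coefficients are expressed in comparable units. The sketch below applies this to the magazine example. It assumes NumPy; the variable codings, the sales figures and the function name `standardised_coefficients` are all invented for illustration.

```python
import numpy as np

def standardised_coefficients(X, y):
    """Regression coefficients after converting every variable to z-scores,
    which makes the independent variables comparable in scale."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    X_z = (X - X.mean(axis=0)) / X.std(axis=0)
    y_z = (y - y.mean()) / y.std()
    beta, *_ = np.linalg.lstsq(X_z, y_z, rcond=None)
    return beta

# Hypothetical magazine data (all codings and figures are invented).
rng = np.random.default_rng(4)
n = 400
colour = rng.integers(0, 2, size=n)      # 0 = black and white, 1 = colour
paper = rng.integers(1, 4, size=n)       # paper quality score, 1 to 3
adverts = rng.integers(10, 41, size=n)   # number of advertisements
content = rng.integers(40, 101, size=n)  # pages of editorial content
sold = (1000 + 50 * colour + 20 * paper - 5 * adverts + 12 * content
        + rng.normal(scale=50, size=n))

names = ["colour", "paper", "adverts", "content"]
beta = standardised_coefficients(
    np.column_stack([colour, paper, adverts, content]), sold
)
ranking = sorted(zip(names, beta), key=lambda pair: abs(pair[1]), reverse=True)
for name, value in ranking:
    print(f"{name:8s} {value:+.2f}")
```

The variable at the top of the ranking has the largest standardised association with sales in this invented data set. Whether that association reflects a causal effect is a separate question that the regression alone cannot settle.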

Summary

Multivariate analysis can be used for the analysis of several variables simultaneously.

Techniques that can be used for conducting multivariate analysis include factor analysis, multiple correlation, path analysis and regression analysis.

Factor analysis is used to uncover relationships among many variables.

Factors are analysed inductively to determine trends, relationships, correlations, cause of phenomena, etc.

Multiple correlation predicts values of one variable based on two or more other variables.

Multiple correlation is a linear relationship among more than two variables.

Path analysis seeks cause/effect relationships.

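The term is also used for tracing the trail that leads users to websites and, as “critical path analysis”, for planning the sequence of activities in a project.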

Regression analysis can be used to determine which factors influence events, phenomena or relationships.

It includes a variety of techniques for modelling and analysing several variables when the focus is on the relationship between a dependent variable and one or more independent variables.

Regression analysis helps us to understand how a dependent variable changes when the independent variables change, and therefore how it might be influenced.

Close

Statistics are a wonderfully flexible way in which to analyse data.

Dedicated computer software can do the calculations for us and show us the numbers in tabular and graphic format.

All we need to do is to analyse the numbers or graphs.

It is mostly quite easy to interpret visual material.

And you will impress your study leader, lecturer and other stakeholders in your research if you use such analysis techniques.

Most importantly, it will be much easier and faster to arrive at findings and to derive valid and accurate conclusions from them.