ME-413 quantitative data analysis

Dear members, please find our first draft for your comments and action. Thank you.

University of Agder

Master in Development Management

ME-413 Research Methods in Development Studies

Task 2.3, Activity B Deadline: 24/02/12 Pages:

** Conducting Quantitative Data Analysis **

** Gbowee group: **

Susan Ajambo (**weaver**) Hege Lovise Lundgreen Linn Vestbø Teklebrhan [H1] Gebremichael Veronica Donoso Orgaz

// Copying text written by other people or using the work of other people without cross-referencing, may be considered as cheating. //
 * I/we confirm that I/we do not refer to others or in any other way use the work of others without stating it hence I/we confirm that all references are given in the bibliography. || Yes || No ||

=Question 1: variables=

1.1 Types of variables
Variables are classified into four main types: interval/ratio, ordinal, nominal and dichotomous variables (Bryman, 2008, p. 321).
 * **Interval/ratio variables:** These are variables where the distances between the categories are identical across the range of categories.
 * **Ordinal variables:** These are variables whose categories can be rank ordered, but the distances between the categories are not equal across the range.
 * **Nominal variables:** These variables, also known as categorical variables, consist of categories that cannot be rank ordered.
 * **Dichotomous variables:** These variables contain data that have only two categories.

The six variables presented in Sheet 1 of the “Statistics Notebook” are therefore classified as follows:
 * Variable 1 = Nominal
 * Variable 2 = Dichotomous
 * Variable 3 = Nominal
 * Variable 4 = Interval/ratio
 * Variable 5 = Interval/ratio
 * Variable 6 = Interval/ratio

1.2 Why is it important to distinguish between the four different types of variable?
Distinguishing between the four types of variable is important because it helps the researcher to understand his/her data and to determine the appropriate methods of analysis. Quantitative research generates different types of data: real numbers or lists of categories, some of which can be rank ordered while others cannot (Bryman, 2008, p. 321). Classifying this diverse data into types of variable makes it easier to handle.

In addition, some methods of data analysis can be used with some types of variable but not with others. According to Bryman (2008, pp. 325-327), methods such as the arithmetic mean and Pearson’s r, for example, should only be used in relation to interval/ratio variables. Data analysis therefore involves matching the techniques to the types of variable created and used, and distinguishing between variables is essential for choosing the appropriate methods of analysis.

=Question 2: Univariate Analysis=

Univariate analysis refers to the analysis of one variable at a time. It uses techniques such as the frequency table and diagrams like the bar chart, pie chart or histogram. A frequency table provides the number of respondents and the percentage for each category of the variable, and it can be used with all types of variable. Diagrams can also be used to display data and are relatively easy to interpret and understand. However, the type of variable determines which diagram to use: bar charts and pie charts are appropriate for nominal or ordinal variables, whereas histograms are used for interval/ratio variables (Bryman, 2008, pp. 322-324).

2.1 Frequency table showing the gender breakdown
Table 1: Gender breakdown
|| **Gender** || || **Frequency** || **Percent** || **Valid Percent** ||
|| Valid || Female || 5 || 50.0 || 50.0 ||
|| || Male || 5 || 50.0 || 50.0 ||
|| || Total || 10 || 100.0 || 100.0 ||


|| **Statistics: Gender** || || ||
|| N || Valid || 10 ||
|| || Missing || 0 [H2] ||

There are equal numbers of males and females in the distribution. [H3]
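As an illustration only, a frequency table of this kind could be computed with a few lines of Python rather than in a statistics package. The sketch below rebuilds the ten gender values from the counts in Table 1 (the order of the values is an assumption), and the function name `frequency_table` is our own.

```python
from collections import Counter

# Gender values rebuilt from the counts in Table 1 (5 female, 5 male).
# The order of the individual respondents is assumed, not taken from the data sheet.
genders = ["Female"] * 5 + ["Male"] * 5


def frequency_table(values):
    """Return {category: (count, percent)} for a list of category labels."""
    counts = Counter(values)
    total = len(values)
    return {cat: (n, 100 * n / total) for cat, n in sorted(counts.items())}


table = frequency_table(genders)
print(f"{'Gender':<8}{'Freq':>6}{'Percent':>9}")
for category, (count, percent) in table.items():
    print(f"{category:<8}{count:>6}{percent:>9.1f}")
```

The same function works for any nominal or ordinal variable, since it only counts labels.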

2.2 A bar chart showing the country of birth breakdown
Table 2: Country of birth breakdown
|| **Country of Birth** || || **Frequency** || **Percent** || **Valid Percent** ||
|| Valid || Chile || 1 || 10.0 || 10.0 ||
|| || Ghana || 2 || 20.0 || 20.0 ||
|| || Norway || 1 || 10.0 || 10.0 ||
|| || South Africa || 2 || 20.0 || 20.0 ||
|| || Uganda || 3 || 30.0 || 30.0 ||
|| || USA || 1 || 10.0 || 10.0 ||
|| || Total || 10 || 100.0 || 100.0 ||


|| **Statistics: Country of Birth** || || ||
|| N || Valid || 10 ||
|| || Missing || 0 ||
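Since the country of birth is a nominal variable, a bar chart is an appropriate diagram. As a rough sketch, the frequencies in Table 2 can be drawn as a plain-text bar chart; the helper name `text_bar_chart` is our own and the output is only meant to illustrate the idea, not to replace a proper chart.

```python
# Frequencies copied from Table 2.
country_counts = {
    "Chile": 1,
    "Ghana": 2,
    "Norway": 1,
    "South Africa": 2,
    "Uganda": 3,
    "USA": 1,
}


def text_bar_chart(counts):
    """Return one text line per category, with one '#' per respondent."""
    width = max(len(name) for name in counts)
    return [f"{name:<{width}} | {'#' * n} ({n})" for name, n in counts.items()]


bars = text_bar_chart(country_counts)
print("\n".join(bars))
```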

2.3 Calculate the //mean//, //median//, and //mode// of the first class test results (measure of central tendency)
A measure of central tendency summarizes a distribution of values in a single figure: it presents an average for the distribution. Quantitative data analysis uses three different forms of average, explained below:

 * **//Mean//** is the arithmetic average: all values in a distribution are summed and the total is divided by the number of values.
 * **//Median//** is the mid-point of a distribution when its values are arranged in order. While the mean is vulnerable to outliers (extreme values at either end of the distribution), the median is not. When there is an even number of values, the mean of the two middle values is taken as the median.
 * **//Mode//** is the value that occurs most often in a distribution (Bryman, 2008, p. 325).
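The three averages defined above can be sketched with Python's standard `statistics` module. The scores below are hypothetical and are not the class test results from the data sheet; they were chosen so that two values tie for the mode, which is why `multimode` is used instead of `mode`.

```python
import statistics

# Hypothetical scores out of 20 (NOT the actual class test 1 results).
scores = [8, 10, 12, 14, 14, 15, 16, 18, 20, 20]

mean_score = statistics.mean(scores)      # sum of values / number of values
median_score = statistics.median(scores)  # mean of the two middle values (even n)
modes = statistics.multimode(scores)      # every value that occurs most often

print("mean:", mean_score)
print("median:", median_score)
print("mode(s):", modes)
```

With an even number of values the median falls between the two middle scores, and a distribution can have more than one mode.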

Table 3: Class test 1- mean median and mode [N4]

The mean is 14, the median is also 14 and the mode is …. So the central tendency is 14 correct answers out of 20. [N5]

2.4 Calculate the //range// of the second class test results (measure of dispersion)
A measure of dispersion describes the amount of variation in a sample. It can be measured using the range or the standard deviation.

The **range** is the difference between the maximum and the minimum value in a distribution of values associated with an interval/ratio variable (Bryman, 2008, p. 325).
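A short sketch of both measures of dispersion follows. Only the minimum (6) and maximum (20) are taken from our class test 2 results; the values in between are hypothetical, so the standard deviation shown is illustrative only.

```python
import statistics

# Only the lowest (6) and highest (20) scores come from class test 2;
# the other values are hypothetical placeholders.
scores = [6, 9, 11, 12, 14, 15, 16, 17, 19, 20]

value_range = max(scores) - min(scores)     # range = maximum - minimum
std_deviation = statistics.stdev(scores)    # sample standard deviation

print("range:", value_range)
print("standard deviation:", round(std_deviation, 2))
```

The range depends only on the two extreme values, whereas the standard deviation uses every value in the distribution.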

Table 4: Class test 2- Range

The range is 14: the lowest number of correct answers is 6 and the highest is 20, so the difference between the two is 14.

=Question 3: Bivariate Analysis [N6]=

3.1 Which measure of correlation should be used to determine a possible relationship between class test 1 and class test 2? Explain your answer.
One cannot simply apply any method of analysis to any kind of data. As Bryman states: "Techniques have to be appropriately matched to the types of variables that you have created through your research" (2008, p.314). On this basis, Pearson’s r is the most appropriate measure of correlation for determining a possible relationship between class tests 1 and 2.

The reason is that both class test 1 and class test 2 are interval/ratio variables, and Pearson’s r is the method best suited to examining relationships between interval/ratio variables (Bryman, 2008, p. 327). The coefficient indicates the strength of the relationship between the two class tests (its size varies between 0 and 1, where 1 is a perfect relationship and 0 is no relationship between the two variables) and also the direction of the relationship (which depends on whether the coefficient is positive or negative). If the correlation is below 1, for example, this means that class test 2 is related to at least one other variable in addition to class test 1.

In addition, Pearson’s r can only be applied when the relationship between the two variables is broadly linear rather than curved (Bryman, 2008, pp. 326-329), as is the case for class tests 1 and 2. It is therefore the most appropriate measure.
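To make the calculation concrete, here is a minimal sketch of Pearson’s r written from its textbook definition (covariation divided by the product of the two spreads). The function name `pearson_r` and both lists of scores are our own illustrative assumptions; the scores are not the actual class test results.

```python
import math


def pearson_r(x, y):
    """Pearson's r for two equally long lists of interval/ratio values."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("r is undefined when a variable does not vary")
    return covariation / (spread_x * spread_y)


# Hypothetical scores out of 20 for ten students (NOT the real data).
test1 = [6, 9, 11, 12, 14, 15, 17, 18, 19, 20]
test2 = [8, 7, 12, 11, 15, 14, 16, 20, 17, 19]

r = pearson_r(test1, test2)
print("Pearson's r:", round(r, 2))
```

The sign of the result gives the direction of the relationship and its size gives the strength.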

In order to determine a possible relationship between class test 1 and class test 2, the measure of correlation that should be used is Pearson’s r. The reason is that both class test 1 and class test 2 can be defined as interval/ratio variables, as the distance between the categories is identical across the range of categories. The class tests record the number of correct answers that a person gives in a test consisting of 20 questions. For example, 14 correct answers is 3 fewer than 17 correct answers, and 9 correct answers is 3 more than 6 correct answers. Pearson’s r, which is a method for examining relationships between interval/ratio variables, gives an indication of both the strength of the relationship between the two class tests (its size varies between 0 and 1, where 1 is a perfect relationship and 0 is no relationship between the two variables) and the direction of the relationship (which depends on whether the coefficient is positive or negative). If the correlation is below 1, this means that class test 2 is related to at least one other variable in addition to class test 1. Pearson’s r can only be applied when the relationship between the two variables is linear, and not curved (Bryman, 2008, pp. 326-329). That is why this measure is suited to interval/ratio variables, where the distances between the categories are identical, such as class test 1 and class test 2.

**This is my version. I prefer that we use this one, so I hope that is ok.**

3.2 Which measure of correlation should be used to determine a possible relationship between age and class test 1? Explain your answer.
Similarly, determining a possible relationship between age and class test 1 requires using the Pearson’s r correlation measure.

As in the answer above, both age and class test 1 are interval/ratio variables. However, if one of the variables is grouped into categories, such as below 20, 21-30, 31-40 etc., it is transformed into an ordinal variable (Bryman, 2008, p. 321), and in that case Spearman’s rho should be used to determine a possible relationship. This is because the relationship being examined would then be between an ordinal and an interval/ratio variable, for which Spearman’s rho is the most appropriate measure (Bryman, 2008, p. 329). As with Pearson’s r, the computed value of Spearman’s rho can be either positive or negative, and its size ranges between 0 and 1.
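A minimal sketch of Spearman’s rho follows, computed in the usual way as Pearson’s r applied to the ranks of the values, with tied values sharing their average rank. The helper names are our own. The age-group codes follow the sizes of the three groups in our sample (2, 6 and 2 students), but the scores paired with them are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariation / (spread_x * spread_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Age groups coded as ordinal categories: 1 = 20 and under, 2 = 21-30, 3 = 31 and over.
age_group = [1, 1, 2, 2, 2, 2, 2, 2, 3, 3]
# Hypothetical class test scores (NOT the real data).
scores = [9, 12, 14, 17, 6, 15, 18, 14, 14, 20]

rho = spearman_rho(age_group, scores)
print("Spearman's rho:", round(rho, 2))
```

Because only the rank order is used, the unequal widths of the age categories do not affect the result.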

As Bryman highlights, it is important to keep in mind that these methods can only uncover relationships between variables. They cannot establish with confidence which variable causes the other, that is, which one is the independent and which the dependent variable (Bryman, 2008, p. 326).

When determining a possible relationship between age and class test 1, Pearson’s r should also be used, as both of these variables are defined as interval/ratio variables. As in the answer above, the distance between the categories is identical across the range of categories. This is the case not only for class test 1 but also for age, as ages are separated by intervals of one year. It should, however, be mentioned that if people’s ages had been grouped into categories, e.g. below 20, 21-30, 31-40 etc., the age variable would have to be defined as an ordinal variable, where the categories can be ranked in order but the distances between the categories are not necessarily equal across the range (Bryman, 2008, p. 321). In that case Spearman’s rho should be used to determine a possible relationship. Pearson’s r cannot be used, because the relationship between the variables will then not necessarily be linear (Bryman, 2008, p. 329). With Spearman’s rho, too, the outcome will be either positive or negative, and its size will range between 0 and 1.

**This is my version of question 3.2. I hope we can use this instead of the edited version that Susan has made!**


3.3 Could you determine the relationship between ‘Country of birth’ and class test 1? Explain your answer
Yes, the relationship between country of birth and class test 1 can be determined using various measures of correlation, as described below.

To determine the relationship between “country of birth” and class test 1, the first step is to establish what kinds of variable they are. As “country of birth” has categories that cannot be placed in ranked order, it is a nominal variable (as also stated in question 1). It cannot be said that being born in South Africa is something more or less than being born in Chile; it is just different. Class test 1, on the other hand, is an interval/ratio variable: its values can be placed in ranked order, and the distance between the categories is identical across the range. According to Bryman (2008, p. 326), a contingency table with chi-square and Cramér’s //V// can thus be used to determine the relationship between “country of birth” and class test 1. If the interval/ratio variable (in this case class test 1) can be identified as the dependent variable, comparing means with eta could also be used (Bryman, 2008, p. 326). Thus, there is a range of alternatives for determining such a relationship.

The contingency table, which is the most flexible method for analyzing relationships, allows two variables to be analyzed simultaneously, so that relationships and patterns of association can be searched for (Bryman, 2008, pp. 326-327). Cramér’s //V//, which is often reported along with a contingency table, can only indicate the strength of a relationship and not its direction, as this statistic can only take a positive value (Bryman, 2008, p. 330). The chi-square test is also applied to contingency tables, and establishes how confident one can be that there is a relationship between two variables in the population. It should be mentioned that the chi-square value says nothing on its own: it has to be interpreted in relation to the associated level of statistical significance (Bryman, 2008, pp. 334-335).

Comparing means together with eta is, as mentioned above, an approach that can be very fruitful when measuring a relationship between an interval/ratio variable and a nominal variable, provided that the interval/ratio variable can relatively unambiguously be identified as the dependent variable (Bryman, 2008, p. 330). In the case of the relationship between “country of birth” and class test 1, it can be argued that class test 1, the interval/ratio variable, is the dependent variable: the number of correct answers in class test 1 can by no means affect which country you are born in. It can, however, be interesting to see whether the country you come from affects the number of correct answers in class test 1. Comparing means involves comparing the mean of the interval/ratio variable for each subgroup of the nominal variable, and this is often accompanied by eta, a statistic that expresses the level of association between the two variables.

Eta-squared expresses the amount of variation in the interval/ratio variable that is due to the nominal variable. This value will always be positive. Another strength of eta is that the relationship between the variables need not be linear (Bryman, 2008, p. 330). [N7]
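As a sketch of the comparing-means approach, eta-squared can be computed as the between-group sum of squares divided by the total sum of squares. The function name `eta_squared` is our own. The group sizes below follow Table 2, but the scores assigned to each country are hypothetical and do not come from the data sheet.

```python
def eta_squared(groups):
    """Share of the variation in the scores that lies between the groups.

    `groups` maps each category of the nominal variable to a list of values
    of the interval/ratio variable.
    """
    all_values = [v for values in groups.values() for v in values]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_between = sum(
        len(values) * (sum(values) / len(values) - grand_mean) ** 2
        for values in groups.values()
    )
    return ss_between / ss_total


# Hypothetical class test 1 scores per country of birth (NOT the real data).
scores_by_country = {
    "Chile": [14],
    "Ghana": [9, 12],
    "Norway": [17],
    "South Africa": [14, 20],
    "Uganda": [6, 15, 18],
    "USA": [14],
}

for country, values in scores_by_country.items():
    print(f"{country:<13} mean = {sum(values) / len(values):.1f}")

eta_sq = eta_squared(scores_by_country)
print("eta-squared:", round(eta_sq, 2))
```

With groups of only one to three people, as here, the group means are very unstable, so any value obtained this way should be read with great caution.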

It should be mentioned that generalizing from such measures of correlation might be problematic in this case, as the sample in this dataset is very small, consisting of only 10 people (n=10). Each country is represented by only one or two persons, and at most three. It can be argued that the sample is too small to be representative of the population from which it is selected. Even though the correlation results might be of interest, this limitation is worth keeping in mind.

=Question 4: Multivariate analysis=

4.1 What is a spurious relationship?
A spurious relationship is an apparent relationship between two variables that is in fact produced by a third variable. Thus, the relationship between the two variables is not real. In other words, a spurious relationship occurs when a third variable affects both of the other two variables (Bryman, 2008, pp. 330-331). If the third variable is controlled for, the relationship between the two other variables disappears, as it is not a direct one (Bryman, 2008, p. 699).

The following example illustrates what a spurious relationship is. There may seem to be a relationship between income and voting behavior. However, as Alan Bryman states, one may ask: “could the relationship be an artefact of age?” (2008, p.331). Income is likely to increase the older one gets, and age is also known to influence voting behavior. Thus, if age is found to produce the apparent relationship between income and voting behavior, the relationship is spurious (Bryman, 2008, p. 331). [N8]

4.2 Draw a contingency table showing the relationship between age and ‘Country of birth’
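As an illustration of how such a cross-tabulation can be produced, the sketch below counts (age-group, country of birth) pairs and computes the percentage within each country. The ten pairs are read off the counts in our cross-tabulation; the function name `crosstab` is our own.

```python
from collections import Counter

# One (age-group, country of birth) pair per student, read off the cell counts.
records = (
    [("21-30", "Chile")]
    + [("20 and under", "Ghana")] * 2
    + [("21-30", "Norway")]
    + [("31 and over", "South Africa")] * 2
    + [("21-30", "Uganda")] * 3
    + [("21-30", "USA")]
)


def crosstab(pairs):
    """Return row labels, column labels and a nested dict of cell counts."""
    cell_counts = Counter(pairs)
    rows = sorted({row for row, _ in pairs})
    cols = sorted({col for _, col in pairs})
    counts = {r: {c: cell_counts[(r, c)] for c in cols} for r in rows}
    return rows, cols, counts


rows, cols, counts = crosstab(records)
row_totals = {r: sum(counts[r].values()) for r in rows}
col_totals = {c: sum(counts[r][c] for r in rows) for c in cols}
# Percentage of each country's students that fall in each age group.
col_percent = {r: {c: 100 * counts[r][c] / col_totals[c] for c in cols} for r in rows}

for r in rows:
    print(r, [counts[r][c] for c in cols], "total:", row_totals[r])
```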
Table 5: The relationship between age and country of birth
**Age-group * Country of birth cross-tabulation**
|| **Age-group** || || **Chile** || **Ghana** || **Norway** || **South Africa** || **Uganda** || **USA** || **Total** ||
|| 20 and under || Count || 0 || 2 || 0 || 0 || 0 || 0 || 2 ||
|| || % within Country of birth || 0.0% || 100.0% || 0.0% || 0.0% || 0.0% || 0.0% || 20.0% ||
|| 21-30 || Count || 1 || 0 || 1 || 0 || 3 || 1 || 6 ||
|| || % within Country of birth || 100.0% || 0.0% || 100.0% || 0.0% || 100.0% || 100.0% || 60.0% ||
|| 31 and over || Count || 0 || 0 || 0 || 2 || 0 || 0 || 2 ||
|| || % within Country of birth || 0.0% || 0.0% || 0.0% || 100.0% || 0.0% || 0.0% || 20.0% ||
|| Total || Count || 1 || 2 || 1 || 2 || 3 || 1 || 10 ||
|| || % within Country of birth || 100.0% || 100.0% || 100.0% || 100.0% || 100.0% || 100.0% || 100.0% ||

From this contingency table showing the relationship between age and country of birth, one can read that 20% of the students in the sample are in the age category “20 and under”, 60% are in the “21-30” age group, and 20% are 31 or over. Thus most of the students in the sample are concentrated in the “21-30” category. One can also read, for example, that both South Africans are 31 or over and both students from Ghana are 20 or under. All the students from Chile, Norway, Uganda and the USA are between 21 and 30 years old. Contingency tables are useful because they make it possible to examine relationships and can reveal patterns of association (Bryman, 2008, p. 327). [N9]

=Question 5: Statistical significance=

According to Bryman (2008, pp. 333, 699), statistical significance is an estimate of how confident a researcher can be that the results from a randomly selected sample are generalizable to the population from which the sample was drawn. The test gives the researcher insight into the risk of concluding that a relationship exists when it does not.

The level of statistical significance is therefore the level of risk that a researcher is prepared to take when he/she infers that there is a relationship between two variables when no such relationship exists. Levels of statistical significance are expressed as probability levels, and most social researchers agree that the maximum acceptable level of statistical significance is p < 0.05 //(p means probability)//. This implies that there are fewer than 5 chances in 100 that a researcher could have a sample that shows a relationship when in reality there is none, so the risk is fairly small. By comparison, accepting a significance level of p < 0.1 would imply a greater risk, of up to 10 chances in 100.

Undertaking this test requires the researcher to set up a null hypothesis (a hypothesis stipulating that the two variables are not related), which is then tested. If the findings indicate that the level of statistical significance is p < 0.05 (the generally accepted level), this implies a low risk of inferring a relationship when it does not actually exist. The researcher would then reject the null hypothesis and infer that the relationship does exist.

==5.1 //What does it mean to say that a correlation of 0.78 is statistically significant at p < 0.05?//==
Based on the explanation above, to say that a correlation of 0.78 is statistically significant at p < 0.05 means that the risk of inferring a relationship between two variables whose correlation is 0.78, when no such relationship exists in the population, is fairly low: fewer than 5 chances in 100. The null hypothesis can thus be rejected, since there are fewer than 5 chances in 100 that a correlation of 0.78 could have arisen by chance alone.
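The idea of a correlation "arising by chance alone" can be illustrated with a permutation test: one variable is shuffled many times, and we count how often the shuffled data give a correlation at least as strong as the observed one. This is a swapped-in technique chosen for illustration; it is not the procedure described by Bryman, and statistics packages normally derive p from a theoretical distribution instead. The function names and the scores are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariation / (spread_x * spread_y)


def permutation_p_value(x, y, n_permutations=2000, seed=0):
    """Estimate p: the share of shuffles with |r| at least as large as observed."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # breaks any real link between x and y
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12:
            extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical scores for ten students (NOT the real data).
test1 = [6, 9, 11, 12, 14, 15, 17, 18, 19, 20]
test2 = [8, 7, 12, 11, 15, 14, 16, 20, 17, 19]

r = pearson_r(test1, test2)
p = permutation_p_value(test1, test2)
print("r =", round(r, 2), " p =", round(p, 4))
print("significant at p < 0.05:", p < 0.05)
```

If p is below 0.05, fewer than 5 in 100 shuffles produced a correlation this strong, so the null hypothesis of no relationship would be rejected.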
=Question 6: More on mean and median=

6.1 Based on the data on ‘sheet 2’ (salary figures for 166 workers), calculate the mean wage and the median wage for public sector workers and for private sector workers.
This question was worked out in Excel using the following procedure. The data was filtered to separate public sector workers from private sector workers. There were **89** public sector workers and **77** private sector workers. The arithmetic mean and median were derived using the appropriate formulas, e.g. =AVERAGE applied to the cells containing the wages (e.g. D2:D78) to get the mean. The results are presented in the table below. [N10]

Table 6: Mean and median wage for public and private sector workers
|| **Employee** || **Mean** || **Median** ||
|| Public workers || 573.1461 || 580 ||
|| Private workers || 597.5974 || 530 ||

6.2 In the public sector, the median is higher than the mean, but in the private sector, the mean is higher than the median. Why is this?
According to Bryman (2008, p. 325), the mean is vulnerable to outliers (extreme values at either end of the distribution), which exert considerable upward or downward pressure on the mean. In the public sector, the median is higher than the mean, which suggests that some comparatively low wages pull the mean down. In the private sector, the median is lower than the mean because a few considerably higher wages, namely 2550 and 3500, inflate the mean.
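The effect of outliers on the mean can be shown with a small sketch. The two high wages (2550 and 3500) are the ones found among the private sector workers; the other six wages are hypothetical stand-ins for typical values, so the figures below do not reproduce Table 6.

```python
import statistics

# Six hypothetical "typical" wages (NOT taken from the data sheet).
typical_wages = [480, 500, 510, 530, 540, 560]
# The two high private sector wages mentioned in the answer above.
outliers = [2550, 3500]

mean_without = statistics.mean(typical_wages)
median_without = statistics.median(typical_wages)
mean_with = statistics.mean(typical_wages + outliers)
median_with = statistics.median(typical_wages + outliers)

print("without outliers: mean =", mean_without, " median =", median_without)
print("with outliers:    mean =", mean_with, " median =", median_with)
```

Adding the two extreme wages more than doubles the mean, while the median moves only slightly, which is the pattern described above for the private sector.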

**6.3** //Do you think it is better to use the mean or the median when examining the income of a group?//
We argue that it is better to use the mean when examining the income of a group, although this does not mean that the median should be disregarded completely, as explained below. The median only reports the income at the middle of the distribution, whereas the mean takes all values in the distribution into account and can therefore deviate from the mid-point. This makes the mean a better option than the median. We do, however, acknowledge that the mean, unlike the median, is vulnerable to outliers. To limit the effect of this on the results, we argue that the median can be employed alongside the mean to countercheck the results. We believe that employing more than one measure for a single variable increases measurement validity. In addition, wages are an interval/ratio variable, so using the mean to examine them is appropriate (Bryman, 2008, p. 325). Similarly, employing the median to countercheck the mean is fitting, since it can be used with both interval/ratio and ordinal variables.

=Question 7: Factors influencing wages=

In this group of people, how strongly does a person's sex affect their wage? How strongly does it matter whether they live in an urban or a rural area? (Keeping in mind that in this example, London is urban and all other locations are assumed to be rural.) Do you think the relationship between a person's sex or location and their wages shows correlation, causation, coincidence, or nothing at all? Show and explain all details (e.g. tables, charts, etc.) of the methods you use.

[H1] Do you think we should include him on this assignment? Well I do not think we should.

[H2] Do we have to include this table as well in the final assignment?

[H3] Do we need to include such analysis for all our tables?

[N4] I do not seem to see the mode presented in this table- I tried to calculate in excel and the mode was 20 but by just looking at the distribution there are two numbers, 20 and 14.

I was wondering if we can construct a simple table showing the mean, median and mode for both class tests – would it make our work clearer?

[N5] Considering that the central tendency refers to one figure that summarises the distribution which can be the mean, the median or the mode, I think this statement is not relevant. We just have to mention the mode too. What do you think?

[N6] I have for the mean time removed parts in Linn’s answers that were explaining the variables to be interval because I think this would not be necessary at this point especially after we have classified the variable in question one. However, if the group thinks they are necessary, I will include them- no cause for alarm

[N7] Dear Linn, you mention various measures of correlation in this part yet you do not show how they can be used save for a few. I request you to revisit this and mention only those measures of correlation that can be used to determine this relationship and how they can be used please.

[N8] Do we need to include Bryman’s example here?

[N9] This does not explain the relationship between the age and the country- you are simply interpreting the table- repeating the data already presented in the table.

[N10] Do you think this is necessary?