A plant distills liquid air to produce oxygen, nitrogen, and argon. The percentage of impurity in the oxygen is thought to be linearly related to the amount of impurities in the air as measured by the “pollution count” in parts per million (ppm). A sample of plant operating data is shown below:

Purity (%)             93.3  92.0  92.4  91.7  94.0  94.6  93.6
Pollution count (ppm)  1.10  1.45  1.36  1.59  1.08  0.75  1.20

Purity (%)             93.1  93.2  92.9  92.2  91.3  90.1  91.6  91.9
Pollution count (ppm)  0.99  0.83  1.22  1.47  1.81  2.03  1.75  1.68

(a) Fit a linear regression model to the data.
(b) Test for significance of regression.
(c) Find a 95% confidence interval on
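Parts (a) and (b) can be checked with a short least-squares calculation. The following is a minimal sketch, not a prescribed solution: it assumes purity is the response and pollution count is the predictor, and it takes the two-sided 5% critical value for t with 13 degrees of freedom (2.160) from standard t tables. The function and variable names are illustrative.

```python
import math

# Plant operating data transcribed from the problem statement (15 observations).
pollution = [1.10, 1.45, 1.36, 1.59, 1.08, 0.75, 1.20,
             0.99, 0.83, 1.22, 1.47, 1.81, 2.03, 1.75, 1.68]  # ppm
purity = [93.3, 92.0, 92.4, 91.7, 94.0, 94.6, 93.6,
          93.1, 93.2, 92.9, 92.2, 91.3, 90.1, 91.6, 91.9]     # %


def fit_simple_regression(x, y):
    """Least-squares fit of y = intercept + slope * x, plus the slope t statistic."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares and its mean square on n - 2 degrees of freedom.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)
    se_slope = math.sqrt(mse / s_xx)
    t_stat = slope / se_slope  # tests H0: slope = 0
    return slope, intercept, se_slope, t_stat


slope, intercept, se_slope, t_stat = fit_simple_regression(pollution, purity)

# Assumption: critical value t(0.025, 13) = 2.160, read from a standard t table.
T_CRIT = 2.160
significant = abs(t_stat) > T_CRIT

print(f"fitted line: purity = {intercept:.3f} + ({slope:.3f}) * pollution")
print(f"t statistic for the slope: {t_stat:.2f}; significant at 5%: {significant}")
```

Testing the slope with a t statistic is equivalent to the ANOVA F test for significance of regression in the single-predictor case, since F equals the square of t.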
Analysis of Variance and Regression
Question 1
Choice of statistical analysis is based largely on the way in which the variables have been measured. Consider the following variable and identify if it is more likely to be measured on a metric or categorical scale:
Variable: Type of transport (bus, train, tram)
is the answer (please highlight the answer or remove the wrong answer)
Metric or Categorical
Question 2
Choice of statistical analysis is based largely on the way in which the variables have been measured. Consider the following variable and identify if it is more likely to be measured on a metric or categorical scale:
Variable: Cost of ticket ($)
is the answer (please highlight the answer or remove the wrong answer)
Metric or Categorical
Question 3
Choice of statistical analysis is based largely on the way in which the variables have been measured. Consider the following variable and identify if it is more likely to be measured on a metric or categorical scale:
Variable: Was the carriage/tram/bus overcrowded? (Yes/No)
is the answer (please highlight the answer or remove the wrong answer)
Metric or Categorical
Question 4
Choice of statistical analysis is based largely on the way in which the variables have been measured. Consider the following variable and identify if it is more likely to be measured on a metric or categorical scale:
Variable: Time taken (in minutes) to reach destination
is the answer (please highlight the answer or remove the wrong answer)
Metric or Categorical
Question 5
Apart from describing our variables as metric or categorical, we can indicate the level of measurement of said variables. Consider the following variables and select the most appropriate level of measurement:
On a scale from 1 to 5, how would you rate your satisfaction with the service provided?
1                    2          3          4          5
Very dissatisfied               Neutral               Very satisfied
is the answer (please highlight the answer or remove the wrong answers)
Interval or Ordinal or Nominal or None of the above
Question 6
Apart from describing our variables as metric or categorical, we can indicate the level of measurement of said variables. Consider the following variables and select the most appropriate level of measurement:
Make of car (Holden, Ford, Mitsubishi, Other)
is the answer (please highlight the answer or remove the wrong answers)
Interval or Ordinal or Nominal or None of the above
Question 7
The following SPSS output was produced:
Choose the most appropriate statement from one of the following options:
is the answer (please highlight the answer or remove the wrong answers)
It is appropriate to use Pearson’s r.
or
It is not appropriate to use Pearson’s r because the relationship is curved.
or
It is not appropriate to use Pearson’s r because there are outliers.
Question 8
Give the regression coefficient (slope) correct to three (3) decimal places.
Answer =
Question 9
What is the best interpretation of the regression coefficient?
is the answer (please highlight the answer or remove the wrong answers)
For each additional hour of sleep, on average, people made 2.72 fewer errors.
or
For each error made, on average, people had 2.72 hours less sleep.
or
For each additional 2.72 hours of sleep, on average, people made 1 more error.
or
For each additional 2.72 errors, on average, sleep was reduced by 1 hour.
Question 10
What is the best interpretation of the 95% confidence interval for the correlation?
is the answer (please highlight the answer or remove the wrong answers)
We can be 95% confident that the strength of the correlation between amount of sleep and number of errors is between -0.54 and -0.75.
or
We can be 95% confident that the proportion of errors made is between -0.54 and 0.75.
or
We can be 95% confident that the sample correlation between amount of sleep and number of errors is between -0.54 and 0.75.
Bivariate and ANOVA Statistics Assignment
All cases require the use of a computer and software (SPSS). Use a 5% significance level unless specified otherwise.
Research background
The survey was conducted to assess the satisfaction levels of staff from an educational institution with branches in a number of locations across Australia. Staff were asked to complete a short, anonymous questionnaire containing questions about their opinion of various aspects of the organisation and the treatment they have received as employees. This study explores employment status differences in staff satisfaction scores. The two variables used are employment status (employstatus) and total staff satisfaction score (totsatis), which is the total score that participants recorded on a ten-item Staff Satisfaction Scale. Use the satisfactionsurvey.sav file to answer the following questions.
Case 1
Based on the study of staff satisfaction above, answer the following questions:
The educational institution has set the standard mean staff satisfaction score at 41.5. If the mean staff satisfaction score is higher than 41.5, it can be concluded that the staff are satisfied.
Research question: Are the educational institution's staff (permanent and casual) satisfied?
a. Write and explain the null and alternative hypotheses for this research question.
b. Test to determine whether the hypothesis is supported.
c. Explain your findings from the statistical analysis above.
Case 2
Based on the study of staff satisfaction above, answer the following questions:
Research question: Is there a significant difference in the mean staff satisfaction scores for permanent and casual staff?
a. Write and explain the null and alternative hypotheses for this research question.
b. Test to determine whether the hypothesis is supported.
c. Explain your findings from the statistical analysis above.
Case 3
Based on the study of staff satisfaction above, answer the following questions:
Research question: Are there significant differences in the mean staff satisfaction scores across the length of service categories (use the servicegp3 variable)?
a. Write and explain the null and alternative hypotheses for this research question.
b. Test to determine whether the hypothesis is supported.
c. Explain your findings from the statistical analysis above.
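Case 3 compares a mean across more than two groups, which is the setting for a one-way ANOVA. As a rough cross-check on the F statistic SPSS reports, the calculation can be sketched as below. The function name is illustrative, and the scores in the usage lines are hypothetical placeholder values, not data from satisfactionsurvey.sav.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-groups variation: group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-groups variation: scores around their own group mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical placeholder scores for three service-length groups (not survey data).
toy_groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, df_b, df_w = one_way_anova_f(toy_groups)
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

The F value is then compared with the F distribution on (df_between, df_within) degrees of freedom at the 5% level, which SPSS reports directly as the Sig. value.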
Case 4
In the study of staff satisfaction, the respondents were asked ten questions on their agreement with work environment satisfaction factors and ten questions on the importance of those factors. The SPSS file contains the totals for both the agreement scale (agree_satf) and the importance scale (importance_satf). Using this survey data, answer the following question:
Research question: Is there a significant difference between staff satisfaction agreement factors and staff satisfaction importance factors?
a. Write and explain the null and alternative hypotheses for this research question.
b. Test to determine whether the hypothesis is supported.
c. Explain your findings from the statistical analysis above.
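Because each respondent in Case 4 supplies both an agreement total and an importance total, the two scores are paired, and one common choice is a paired-samples t-test on the per-respondent differences. The sketch below shows that calculation under this assumption. The function name is illustrative, and the numbers in the usage lines are hypothetical placeholders, not values from the survey file.

```python
import math


def paired_t_statistic(first, second):
    """Return (t, df) for the paired differences first[i] - second[i]."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Sample variance of the differences (n - 1 in the denominator).
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    t_stat = mean_diff / math.sqrt(var_diff / n)
    return t_stat, n - 1


# Hypothetical placeholder totals for four respondents (not survey data).
toy_agree = [5, 7, 9, 6]
toy_importance = [4, 5, 6, 5]
t_stat, df = paired_t_statistic(toy_agree, toy_importance)
print(f"t({df}) = {t_stat:.3f}")
```

The null hypothesis in this setup is that the mean difference between the two totals is zero. SPSS reports the same t statistic in its Paired Samples Test table.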
This document should be read in conjunction with the following, which are available on the Assessment 2 page of Blackboard:
1. PUN105 Assessment 2.pdf, and
2. PUN105 Criteria for Marking.pdf
You should analyse Framingham_A.sav and write an analytical report which answers the following questions.
1. Describe the characteristics of the sample at the baseline exam. [Hint: Make sure you describe all of the variables, including demographic and clinical.]
2. Is there an association between the use of anti-hypertensive medication at baseline and participant sex?
3. Is there an association between the use of anti-hypertensive medication at baseline and the education level of the participant?
4. Is there a difference in systolic blood pressure at baseline between participants with different education levels?
5. Is the difference between systolic and diastolic blood pressure at baseline the same for participants using anti-hypertensive medication and those not using anti-hypertensive medication?
6. In a model that predicts systolic blood pressure at baseline exam, how much variation is explained by age and body mass index? Which variable is most important? Describe how systolic blood pressure changes with age. Describe how systolic blood pressure changes with body mass index.
7. Develop the minimum model to explain variation in systolic blood pressure at baseline exam by considering the following variables: age, body mass index, sex, education level and anti-hypertensive medication use. When developing your minimum model you should also consider whether sex is an effect modifier of the relationship between systolic blood pressure and use of anti-hypertensive medication.
8. Is there a change in the proportion of the sample who use anti-hypertensive medication between the baseline and follow-up examinations? If so, over what time period has this change occurred?
9. (Optional question) Is there any potential bias introduced into the follow-up data due to missing data? Explain your answer and provide evidence. Hint: are the participant demographic characteristics the same for those with and without data for the follow-up examination, and what impact may this have?
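Questions 2 and 3 ask about an association between two categorical variables, for which a chi-square test of independence on the cross-tabulation is a common choice. The following sketch shows how the Pearson chi-square statistic is built from observed and expected counts. The function name is illustrative, and the counts in the usage lines are hypothetical placeholders, not values from Framingham_A.sav.

```python
def chi_square_statistic(table):
    """Return (chi2, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical placeholder counts: rows = medication use (yes/no), columns = sex.
toy_table = [[10, 20],
             [20, 10]]
chi2, df = chi_square_statistic(toy_table)
print(f"chi-square = {chi2:.3f} on {df} degree(s) of freedom")
```

This is the uncorrected Pearson statistic. SPSS also reports a continuity-corrected value for 2x2 tables, and the expected counts should be checked against the usual minimum-count guideline before the result is interpreted.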