R Exam project case study

Instructions: You need to upload two files on Blackboard as answers for this test: (1) a Word document with descriptive answers to the questions; (2) a .txt file (TextEdit on Mac, Notepad on Windows) with the
input and output from the R console. Copy and paste your work from the R console into the .txt file. Answer all
questions:

  1. Load the “Affairs” dataset from the AER library. It has data which would allow us to analyze
    the determinants of the number of extra-marital affairs that people may have.
    It is a data frame containing 601 observations on 9 variables:
    affairs numeric. How often engaged in extramarital sexual intercourse during the past year? 0
      = none, 1 = once, 2 = twice, 3 = 3 times, 7 = 4–10 times, 12 = monthly, 12 = weekly, 12 = daily.
    gender factor indicating gender.
    age numeric variable coding age in years: 17.5 = under 20, 22 = 20–24, 27 = 25–29, 32 =
      30–34, 37 = 35–39, 42 = 40–44, 47 = 45–49, 52 = 50–54, 57 = 55 or over.
    yearsmarried numeric variable coding number of years married: 0.125 = 3 months or less,
      0.417 = 4–6 months, 0.75 = 6 months–1 year, 1.5 = 1–2 years, 4 = 3–5 years, 7 = 6–8 years, 10
      = 9–11 years, 15 = 12 or more years.
    children factor. Are there children in the marriage?
    religiousness numeric variable coding religiousness: 1 = anti, 2 = not at all, 3 = slightly, 4 =
      somewhat, 5 = very.
    education numeric variable coding level of education: 9 = grade school, 12 = high school
      graduate, 14 = some college, 16 = college graduate, 17 = some graduate work, 18 = master’s
      degree, 20 = Ph.D., M.D., or other advanced degree.
    occupation numeric variable coding occupation according to Hollingshead classification
      (reverse numbering).
    rating numeric variable coding self rating of marriage: 1 = very unhappy, 2 = somewhat
      unhappy, 3 = average, 4 = happier than average, 5 = very happy.
  2. Carry out the following operations on the data.
    a. Present a brief description of the data. In particular, display the first six rows of the
    data to reveal the type of variables the dataset has. [6 points]
    b. Plot affairs against the rating variable. What kind of regression model specification
    would be suitable to represent the relationship between the two variables? Run the
    appropriate regression model. Create additional variables if required for this model.
    Interpret the variables. [17 points]

    c. Report the summary of results. Report the confidence intervals of the coefficients.
    Interpret the coefficients and comment on their individual significance. Comment on
    the goodness of fit of the regression model. [15 points]
    d. Describe the omitted variable bias that may arise in the regression specification you
    have estimated above. [15 points]
    e. Run a regression model by including the gender, age, years married, children,
    religiousness, education, occupation variables in addition to the specification you have
    estimated in (b). [5 points]
    f. Compare the summary statistics from (c) with the specification we ran in (e). [16
    points]
    g. Report and interpret the F statistic. Why might the t-test for significance of coefficients
    be inadequate in the specification described in (e)? [8 points]
    h. Compute and report the standard error of regression, SER. [8 points]
    i. Do you think the specification in (e) is a better representation of determinants of
    affairs? [10 points]


Affairs
NN
2022-12-14

R Markdown

library(AER)
## Warning: package ‘AER’ was built under R version 4.1.3
## Loading required package: car
## Warning: package ‘car’ was built under R version 4.1.2
## Loading required package: carData
## Loading required package: lmtest
## Warning: package ‘lmtest’ was built under R version 4.1.2
## Loading required package: zoo
##
## Attaching package: ‘zoo’
## The following objects are masked from ‘package:base’:
##
##     as.Date, as.Date.numeric
## Loading required package: sandwich
## Warning: package ‘sandwich’ was built under R version 4.1.2
## Loading required package: survival
## Warning: package ‘survival’ was built under R version 4.1.2
data("Affairs")
attach(Affairs)

a) Description of the data

summary(Affairs)
##     affairs          gender         age         yearsmarried    children
##  Min.   : 0.000   female:315   Min.   :17.50   Min.   : 0.125   no :171
##  1st Qu.: 0.000   male  :286   1st Qu.:27.00   1st Qu.: 4.000   yes:430
##  Median : 0.000                Median :32.00   Median : 7.000
##  Mean   : 1.456                Mean   :32.49   Mean   : 8.178
##  3rd Qu.: 0.000                3rd Qu.:37.00   3rd Qu.:15.000
##  Max.   :12.000                Max.   :57.00   Max.   :15.000
##  religiousness     education       occupation        rating
##  Min.   :1.000   Min.   : 9.00   Min.   :1.000   Min.   :1.000
##  1st Qu.:2.000   1st Qu.:14.00   1st Qu.:3.000   1st Qu.:3.000
##  Median :3.000   Median :16.00   Median :5.000   Median :4.000
##  Mean   :3.116   Mean   :16.17   Mean   :4.195   Mean   :3.932
##  3rd Qu.:4.000   3rd Qu.:18.00   3rd Qu.:6.000   3rd Qu.:5.000
##  Max.   :5.000   Max.   :20.00   Max.   :7.000   Max.   :5.000

head(Affairs)
##    affairs gender age yearsmarried children religiousness education occupation
## 4        0   male  37        10.00       no             3        18          7
## 5        0 female  27         4.00       no             4        14          6
## 11       0 female  32        15.00      yes             1        12          1
## 16       0   male  57        15.00      yes             5        18          6
## 23       0   male  22         0.75       no             2        17          6
## 29       0 female  32         1.50       no             2        17          5
##    rating
## 4       4
## 5       4
## 11      4
## 16      5
## 23      3
## 29      5

b) Plot affairs against the rating variable.

plot(rating, affairs, main = "Affairs against the rating variable",
     xlab = "Ratings", ylab = "Affairs", pch = 19, frame = FALSE)

# Add regression line
plot(rating, affairs, main = "Affairs against the rating variable",
     xlab = "Ratings", ylab = "Affairs", pch = 19, frame = FALSE)
abline(lm(affairs ~ rating, data = Affairs), col = "blue")

A linear regression model would be suitable to represent the relationship between affairs and ratings.

model <- lm(affairs ~ rating)
summary(model)
##
## Call:
## lm(formula = affairs ~ rating)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3.9063 -1.3989 -0.5631 -0.5631 11.4369
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   4.7421     0.4790   9.900   <2e-16 ***
## rating       -0.8358     0.1173  -7.125    3e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.17 on 599 degrees of freedom
## Multiple R-squared:  0.07813, Adjusted R-squared:  0.07659
## F-statistic: 50.76 on 1 and 599 DF,  p-value: 3.002e-12

Creating additional variables for the model

model2 <- lm(affairs ~ rating + children)
summary(model2)
##
## Call:
## lm(formula = affairs ~ rating + children)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3.9246 -1.5072 -0.7014 -0.3280 11.6720
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   4.3570     0.5657   7.702 5.57e-14 ***
## rating       -0.8058     0.1196  -6.739 3.76e-11 ***
## childrenyes   0.3734     0.2921   1.278    0.202
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.168 on 598 degrees of freedom
## Multiple R-squared:  0.08064, Adjusted R-squared:  0.07756
## F-statistic: 26.23 on 2 and 598 DF,  p-value: 1.209e-11

The additional variable in the model was children. It was chosen on the belief that children change how much attention spouses give each other: some couples are pulled apart by the demands of child care, while others are drawn closer together by their children.
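
A minimal sketch of how a two-level factor such as children enters a regression formula. The data frame `toy` and its values are hypothetical stand-ins, not rows from the Affairs data; the point is only that R's `model.matrix()` expands the factor into a 0/1 column named after the non-reference level, which is where the name `childrenyes` in the coefficient table comes from.

```r
# Hypothetical toy data: a two-level factor with "no" as the reference level
toy <- data.frame(
  children = factor(c("no", "yes", "yes", "no"), levels = c("no", "yes"))
)

# model.matrix() builds the design matrix that lm() would use for ~ children
X <- model.matrix(~ children, data = toy)

# Column names: the intercept plus one dummy for the non-reference level
print(colnames(X))

# The dummy is 1 where children == "yes" and 0 otherwise
print(X[, "childrenyes"])
```

With this coding, the coefficient on the dummy measures the difference in the expected outcome between the "yes" group and the "no" reference group, holding the other regressors fixed.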
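The analysis entailed creating a childrenyes dummy variable, since children is a binary categorical variable (no/yes).

c) Reporting the summary results

The analysis produced coefficients equal to 4.3570, -0.8058, and 0.3734 for the intercept, rating, and childrenyes, respectively. The standard errors were 0.5657, 0.1196, and 0.2921 for the constant, rating, and childrenyes, respectively. The rating coefficient is statistically significant (p = 3.76e-11): a one-point increase in the marriage rating is associated with about 0.81 fewer affairs on the coded scale, holding children constant. The childrenyes coefficient is not significant at the 5% level (p = 0.202). The R-squared value was 0.08064, so the model explains only about 8% of the variation in affairs, which indicates a poor fit to the data. Hence, rating and childrenyes together explain little of the variation in affairs.

d) Describing the omitted variable bias

Omitted variable bias arises when a variable that is left out of the regression both affects affairs and is correlated with an included regressor; part of its effect is then attributed to the included coefficient. In the specification above, the marriage rating is plausibly correlated with omitted factors such as years married and religiousness, which may themselves affect affairs, so the rating coefficient may be biased. The low R-squared value shows that the independent variables explain very little of the variation in affairs, which is consistent with relevant variables being left out, although a low R-squared does not by itself demonstrate bias. The bias can be reduced by adding the relevant variables to the model.

e) Regression including more independent variables

model3 <- lm(affairs ~ rating + gender + age + yearsmarried + children + religiousness + education + occupation)
summary(model3)
##
## Call:
## lm(formula = affairs ~ rating + gender + age + yearsmarried +
##     children + religiousness + education + occupation)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -5.0503 -1.7226 -0.7947  0.2101 12.7036
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)    5.87201    1.13750   5.162 3.34e-07 ***
## rating        -0.71188    0.12001  -5.932 5.09e-09 ***
## gendermale     0.05409    0.30049   0.180   0.8572
## age           -0.05098    0.02262  -2.254   0.0246 *
## yearsmarried   0.16947    0.04122   4.111 4.50e-05 ***
## childrenyes   -0.14262    0.35020  -0.407   0.6840
## religiousness -0.47761    0.11173  -4.275 2.23e-05 ***
## education     -0.01375    0.06414  -0.214   0.8303
## occupation     0.10492    0.08888   1.180   0.2383
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.095 on 592 degrees of freedom
## Multiple R-squared:  0.1317, Adjusted R-squared:  0.12
## F-statistic: 11.23 on 8 and 592 DF,  p-value: 7.472e-15

f) Comparing the summary statistics

The summary statistics produced an R-squared value of 0.1317, which implies that the independent variables explained 13.17% of the variation in affairs. Compared with the R-squared of 0.08064 for the rating and children model in (b), the new model has a higher value. The adjusted R-squared, which penalizes additional regressors, also rose from 0.07756 to 0.12, and the residual standard error fell from 3.168 to 3.095. Hence, adding more variables improved the model. The regression revealed that rating, age, yearsmarried, and religiousness had significant effects on affairs at the 5% level. The other variables, gendermale, childrenyes, education, and occupation, had insignificant effects on affairs.

g) Report the F-statistic

The F-statistic was F = 11.23 on 8 and 592 degrees of freedom, with a p-value of 7.472e-15. Because the p-value is less than 0.05, the test rejects the null hypothesis that all slope coefficients are jointly zero. Therefore, the independent variables taken together are significant predictors of affairs, even though the R-squared remains modest. The individual t-tests may be inadequate here because each one tests a single coefficient in isolation. When regressors are correlated, as age and yearsmarried are likely to be, individual coefficients can be imprecisely estimated even if the variables matter jointly, and running eight separate tests at the 5% level raises the chance of at least one false rejection. The F-test avoids both problems by testing the restrictions jointly.

h) Standard error of the regression

The standard error of the regression is the square root of the sum of squared residuals divided by the residual degrees of freedom (n - k - 1 = 601 - 8 - 1 = 592). R reports it as the residual standard error in the summary for (e), and sigma() extracts it from the fitted model.

stder <- sigma(model3)
round(stder, 3)
## [1] 3.095

i) Is the specification in (e) better?

The specification in (e) is a better representation of the determinants of affairs. The F-test yielded a significant result with a p-value less than 0.05. Besides, the analysis produced a greater R-squared value (0.1317 against 0.08064) and a greater adjusted R-squared value (0.12 against 0.07756) than the earlier model in (b).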
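
The figures reported in (c) to (i) can be cross-checked from the printed summaries alone. The sketch below, which assumes only the numbers copied from the output above, computes the 95% confidence intervals that (c) asks for as estimate ± t × standard error, and recomputes the adjusted R-squared and F-statistic from R-squared, the sample size n = 601, and the number of slopes k. The helper name `check_fit` is hypothetical and is introduced here only for illustration.

```r
# Figures copied from the printed regression summaries
n <- 601

# (c) 95% confidence intervals for the rating + children model
est <- c("(Intercept)" = 4.3570, rating = -0.8058, childrenyes = 0.3734)
se  <- c("(Intercept)" = 0.5657, rating = 0.1196,  childrenyes = 0.2921)
df_resid <- 598                      # residual degrees of freedom: n - k - 1
t_crit <- qt(0.975, df_resid)        # two-sided 5% critical value
ci <- cbind(lower = est - t_crit * se, upper = est + t_crit * se)
print(round(ci, 3))

# Hypothetical helper: recover adjusted R-squared and the overall F-statistic
# from R-squared, the sample size n, and the number of slope coefficients k
check_fit <- function(r2, n, k) {
  adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - k - 1)
  f_stat <- (r2 / k) / ((1 - r2) / (n - k - 1))
  p_value <- pf(f_stat, k, n - k - 1, lower.tail = FALSE)
  c(adj_r2 = adj_r2, f_stat = f_stat, p_value = p_value)
}

fit_b <- check_fit(0.08064, n, k = 2)   # rating + children
fit_e <- check_fit(0.1317,  n, k = 8)   # full specification in (e)
print(rbind(fit_b, fit_e))
```

An interval that excludes zero corresponds to a coefficient that is significant at the 5% level, so the intervals give the same verdict as the t-tests in the summary. On the fitted objects themselves, `confint(model2)` returns the intervals directly.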