R Dataset / Package datasets / anscombe

Submitted by pmagunia on March 9, 2018 - 1:06 PM
Dataset License
GNU General Public License v2.0
Attachment          Size
dataset-84646.csv   364 bytes
Documentation

Anscombe's Quartet of ‘Identical’ Simple Linear Regressions

Description

Four x-y datasets which have the same traditional statistical properties (mean, variance, correlation, regression line, etc.), yet are quite different.
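
The claim above can be checked directly. The following is a minimal sketch, assuming only base R with the built-in `anscombe` data frame; the names `stats_by_set` and `set1` to `set4` are illustrative and do not come from the original documentation.

```r
# Compute the classical summary statistics for each of the four x-y pairs.
# stats_by_set is an illustrative name, not part of the dataset or its docs.
stats_by_set <- sapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  c(mean_x = mean(x), mean_y = mean(y),
    var_x  = var(x),  var_y  = var(y),
    cor_xy = cor(x, y))
})
colnames(stats_by_set) <- paste0("set", 1:4)

# One column per dataset; each row should be nearly constant across columns.
print(round(stats_by_set, 3))
```

Each row of the printed matrix holds one statistic, so near-constant rows are what "the same traditional statistical properties" looks like numerically. The differences between the sets only show up when the data are plotted.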

Usage

anscombe

Format

A data frame with 11 observations on 8 variables.

x1 == x2 == x3    the integers 4:14, specially arranged
x4                values 8 and 19
y1, y2, y3, y4    numbers in (3, 13) with mean 7.5 and standard deviation 2.03
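
The layout described above can be inspected as follows. This is a small sketch in base R; the variable names (`dims`, `same_x`, `is_4_to_14`, `x4_values`, `y_summary`) are illustrative and not part of the original documentation.

```r
# Dimensions: 11 observations (rows) on 8 variables (columns).
dims <- dim(anscombe)
print(dims)

# x1, x2 and x3 hold the same values in the same order,
# and those values are a rearrangement of 4:14.
same_x     <- all(anscombe$x1 == anscombe$x2) && all(anscombe$x2 == anscombe$x3)
is_4_to_14 <- all(sort(anscombe$x1) == 4:14)
print(c(same_x = same_x, is_4_to_14 = is_4_to_14))

# x4 takes only two distinct values.
x4_values <- sort(unique(anscombe$x4))
print(x4_values)

# Minimum, maximum, mean and standard deviation of each y column.
y_summary <- sapply(anscombe[paste0("y", 1:4)],
                    function(y) c(min = min(y), max = max(y),
                                  mean = mean(y), sd = sd(y)))
print(round(y_summary, 2))
```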

Source

Tufte, Edward R. (1989) The Visual Display of Quantitative Information, 13–14. Graphics Press.

References

Anscombe, Francis J. (1973) Graphs in statistical analysis. American Statistician, 27, 17–21.

Examples

require(stats); require(graphics)
summary(anscombe)

##-- now some "magic" to do the 4 regressions in a loop:
ff <- y ~ x
mods <- setNames(as.list(1:4), paste0("lm", 1:4))
for(i in 1:4) {
  ff[2:3] <- lapply(paste0(c("y","x"), i), as.name)
  ## or   ff[[2]] <- as.name(paste0("y", i))
  ##      ff[[3]] <- as.name(paste0("x", i))
  mods[[i]] <- lmi <- lm(ff, data = anscombe)
  print(anova(lmi))
}

## See how close they are (numerically!)
sapply(mods, coef)
lapply(mods, function(fm) coef(summary(fm)))

## Now, do what you should have done in the first place: PLOTS
op <- par(mfrow = c(2, 2), mar = 0.1+c(4,4,1,1), oma =  c(0, 0, 2, 0))
for(i in 1:4) {
  ff[2:3] <- lapply(paste0(c("y","x"), i), as.name)
  plot(ff, data = anscombe, col = "red", pch = 21, bg = "orange", cex = 1.2,
       xlim = c(3, 19), ylim = c(3, 13))
  abline(mods[[i]], col = "blue")
}
mtext("Anscombe's 4 Regression data sets", outer = TRUE, cex = 1.5)
par(op)
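
The loop in the example rewrites the two sides of the formula `ff` in place. As an alternative sketch (not part of the original example), the same four fits can be built with `stats::reformulate()`, which constructs a formula from character strings. The names `mods2` and `coefs` are illustrative.

```r
# Build y_i ~ x_i from strings and fit each model, without editing a formula in place.
# mods2 and coefs are illustrative names, not from the original example.
mods2 <- lapply(1:4, function(i) {
  f <- reformulate(paste0("x", i), response = paste0("y", i))
  lm(f, data = anscombe)
})
names(mods2) <- paste0("lm", 1:4)

# Rows are (Intercept) and slope; columns are the four fitted models.
coefs <- sapply(mods2, function(fm) unname(coef(fm)))
rownames(coefs) <- c("intercept", "slope")
print(round(coefs, 3))
```

Building the formula from strings keeps each iteration independent of the previous one. The in-place version in the example is more compact and also shows that a formula can be indexed and modified like a list.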

Dataset imported from https://www.r-project.org.

Documentation License
GNU General Public License v2.0
