An illustrative exercise in why summary statistics should never be trusted without also visualizing the underlying data.

Datasaurus

## Format

A data frame with 1,846 observations on the following 3 variables.

dataset

the particular data set, one of 12

x

a random variable

y

another random variable

## Details

Data were created by Alberto Cairo to illustrate why you should always visualize your data rather than rely on summary statistics alone. These are 12 data sets, in long form, each with a mean of x of about 54.26 and a mean of y of about 47.83. The standard deviation of x is about 16.76 and the standard deviation of y is about 26.93. Within each data set, x and y correlate weakly, at about -0.06.
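The per-data-set summaries described above can be reproduced from any long-form table by grouping rows on the `dataset` column. The following is a minimal sketch in Python using only the standard library. The rows in `toy_rows` are hypothetical stand-in values, not the Datasaurus data, and the names `summarize` and `pearson` are illustrative helpers rather than part of any package.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def summarize(rows):
    """Group long-form (dataset, x, y) rows and summarize each group."""
    groups = defaultdict(list)
    for dataset, x, y in rows:
        groups[dataset].append((x, y))

    summary = {}
    for name, points in groups.items():
        xs = [p[0] for p in points]
        ys = [p[1] for p in points]
        summary[name] = {
            "n": len(points),
            "mean_x": mean(xs),
            "mean_y": mean(ys),
            "sd_x": stdev(xs),  # sample standard deviation
            "sd_y": stdev(ys),
            "cor": pearson(xs, ys),
        }
    return summary


# Hypothetical toy rows in the same long form (dataset, x, y).
# These are NOT the Datasaurus values; they only show the grouping.
toy_rows = [
    ("rising", 1.0, 1.0),
    ("rising", 2.0, 2.0),
    ("rising", 3.0, 3.0),
    ("falling", 1.0, 3.0),
    ("falling", 2.0, 2.0),
    ("falling", 3.0, 1.0),
]

summary = summarize(toy_rows)
for name, stats in summary.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

In this toy example the two groups share the same means and standard deviations yet have opposite correlations. In the Datasaurus data the correlations also agree across data sets, so the differences only become visible once each data set is plotted.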

## References

Cairo, Alberto. 2016. "Download the Datasaurus: Never trust summary statistics alone; always visualize your data." URL: http://www.thefunctionalart.com/2016/08/download-datasaurus-never-trust-summary.html

Matejka, Justin and George Fitzmaurice. 2017. "Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing." ACM SIGCHI Conference on Human Factors in Computing Systems. URL: https://www.autodesk.com/research/publications/same-stats-different-graphs

## Author

Alberto Cairo, Justin Matejka, George Fitzmaurice