Simulating multivariate normally distributed data in R

In my graduate class on path analysis, we do a lot of analysis on our own data. This year, I suggested that people consider analyzing simulated data based on the summary statistics of their own data. That way they work with a data set that looks like theirs, but they aren’t doing a lot of model fitting on data they care about and want to use in real research. So today I typed up a quick guide to simulating multivariate normal data in R for use in our class.
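
To give a flavor of the idea, here is a minimal sketch in base R. It is not taken from the class guide. The means, standard deviations, and correlations are made-up stand-ins for the summary statistics of a real data set, and the variable names (x1, x2, y) are hypothetical. The sketch builds a covariance matrix from those statistics and draws multivariate normal data using a Cholesky factor.

```r
# Hypothetical summary statistics standing in for those of a real data set
mu  <- c(x1 = 10, x2 = 5, y = 20)   # means (assumed values)
sds <- c(2, 1.5, 4)                 # standard deviations (assumed values)
R   <- matrix(c(1.0, 0.3, 0.5,
                0.3, 1.0, 0.4,
                0.5, 0.4, 1.0),
              nrow = 3, byrow = TRUE)   # correlation matrix (assumed values)

# Covariance matrix from standard deviations and correlations: Sigma = D R D
Sigma <- diag(sds) %*% R %*% diag(sds)

set.seed(123)   # make the draw reproducible
n <- 500        # simulated sample size

# chol() returns an upper-triangular U with t(U) %*% U equal to Sigma,
# so rows of Z %*% U have covariance Sigma when Z is standard normal
U <- chol(Sigma)
Z <- matrix(rnorm(n * length(mu)), nrow = n)
X <- sweep(Z %*% U, 2, mu, "+")   # shift each column by its mean

sim <- as.data.frame(X)
names(sim) <- names(mu)

# Compare the simulated statistics with the targets
round(colMeans(sim), 2)
round(cor(sim), 2)
```

The sample means and correlations will be close to the targets but not identical, because this is a random draw. If the simulated data should reproduce the target statistics exactly, mvrnorm() in the MASS package has an empirical = TRUE option for that.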

If you find typos, errors, etc., please let me know.

Significance testing, p-values, and confidence intervals

You have to enjoy the introduction to the article by Sander Greenland et al. in the supplemental material posted with the American Statistical Association’s statement on p-values (full text here):

“Misinterpretation and abuse of statistical tests, confidence intervals, and statistical power have been decried for decades, yet remain rampant. A key problem is that there are no interpretations of these concepts that are at once simple, intuitive, correct, and foolproof. Instead, correct use and interpretation of these statistics requires an attention to detail which seems to tax the patience of working scientists. This high cognitive demand has led to an epidemic of shortcut definitions and interpretations that are simply wrong, sometimes disastrously so—and yet these misinterpretations dominate much of the scientific literature.” (p. 1, emphasis mine)

Working scientists should be able to handle this.