Sampling from a Normal distribution and adding correlation

Original by reinie, 2016 

This summary note was Posted on

General Case

If you want to draw an N×1 random vector x from a multivariate normal distribution with mean zero and N×N variance matrix Σ, do the following:

  • Draw N independent and identically distributed N(0,1) random variables, and stack them up in a vector Z.
  • Calculate the Cholesky decomposition of Σ (the CC′ form, Σ = CC′). The same construction gives correlated variables: if Z is a vector of length k of independent random variables with unit (or at least constant) standard deviation, and S is a correlation matrix with Cholesky decomposition S = CC′, then CZ will have population correlation S.
  • Compute the product x = CZ.
  • Now, x is distributed normal with mean zero and variance Σ.
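
The steps above can be sketched in R roughly as follows. The matrix Sigma, the number of draws n_draws, and the seed are illustrative assumptions, not values from the original note. One detail to watch: R's chol() returns the upper-triangular factor, so it is transposed here to get the C in Σ = CC′.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Illustrative 2 x 2 variance matrix (an assumption for this example)
Sigma <- matrix(c(1.0, 0.8,
                  0.8, 2.0), nrow = 2, byrow = TRUE)
N <- nrow(Sigma)
n_draws <- 10000  # number of independent vectors x to draw

# chol() returns upper-triangular U with t(U) %*% U equal to Sigma,
# so the lower-triangular C with Sigma = C C' is its transpose.
C <- t(chol(Sigma))

# Each column of Z is one N x 1 vector of i.i.d. N(0, 1) draws.
Z <- matrix(rnorm(N * n_draws), nrow = N)

# Each column of X is one draw x = C Z.
X <- C %*% Z

# Sample variance matrix of the draws; it should be close to Sigma.
sample_cov <- cov(t(X))
print(sample_cov)
```

With many draws, the printed sample variance matrix should be close to Sigma, though it will not match exactly.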

Two Distributions Case

Population correlation. In the bivariate case this is simple: take two independent random variables with the same standard deviation and build a third variable from them that has the required correlation with one of the two. If X1 and X2 are independent standard normal variables, then Y = r*X2 + sqrt(1 - r^2)*X1 will have correlation r with X2.
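
A short check of why this works, using only that X1 and X2 are independent with unit variance (written in LaTeX notation):

\[
\operatorname{Var}(Y) = r^2 \operatorname{Var}(X_2) + (1 - r^2)\operatorname{Var}(X_1) = r^2 + (1 - r^2) = 1,
\]
\[
\operatorname{Cov}(Y, X_2) = r \operatorname{Var}(X_2) + \sqrt{1 - r^2}\,\operatorname{Cov}(X_1, X_2) = r,
\]
\[
\operatorname{Corr}(Y, X_2) = \frac{\operatorname{Cov}(Y, X_2)}{\sqrt{\operatorname{Var}(Y)\operatorname{Var}(X_2)}} = r.
\]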

Here’s an example in R:

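n <- 10                              # sample size
r <- 0.8                             # target population correlation
x1 <- rnorm(n)                       # independent standard normal draws
x2 <- rnorm(n)
y1 <- r * x2 + sqrt(1 - r^2) * x1    # y1 has population correlation r with x2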
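
With n = 10 the sample correlation can land quite far from r, because r is only the population value. A minimal sketch of a check with a larger sample follows. The sample size and seed are assumptions chosen for illustration.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible

n <- 100000   # large sample so the estimate is close to the population value
r <- 0.8      # target population correlation

x1 <- rnorm(n)
x2 <- rnorm(n)
y1 <- r * x2 + sqrt(1 - r^2) * x1

# Sample correlation between y1 and x2; should be near r
sample_r <- cor(y1, x2)
print(sample_r)

# y1 should also have variance close to 1
print(var(y1))
```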

This is a mash up from: