Statistics and probability is a foundational course for most degrees in the current information age. Learning the key mathematical concepts is more fruitful than repetitively working through problem sets whose theory you don’t fully understand. Let us look at the difference between variance and covariance.

**What is the Difference Between Variance and Covariance?**

To understand the difference between variance and covariance, we need to go back to the basics. Googling their definitions is not enough to complete your Maths assignments. First, we must understand what random variables are, what an expected value is, and how both tie into variance and covariance. We will explain the building blocks of statistics, and then break down the meaning of variance and covariance.

**What is a Random Variable?**

Let’s say your friend offers you a bet – they’re going to throw a pair of six-sided dice exactly ten times for you, and if you get a sum of 7 even once in those ten throws, you will receive 50 dollars from them. Otherwise, you’ll have to pay them 50 dollars. You’re a safe gambler – you’ll take the bet but only if you know you’ll win it.

How many ways can you get a sum of 7 with a pair of six-sided dice?

(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), or (6, 1).

There are 6 outcomes where you get a sum of 7, out of a total of 36 possible outcomes. The terms of the game are simple – you need to get a sum of 7 at least once in those ten rolls to get that 50 dollars. So it doesn’t matter to you whether you roll a 3 and a 4 or a 1 and a 6; all you’re interested in is the sum and the probability of getting it.

Guess what? The sum of the two dice is your random variable! A random variable assigns a number to each outcome of the experiment, and in statistics, we assign probabilities to each of the values it can take. Getting a sum of 7 is just one of those values.

We can call our random variable X. We’d like to find the probability of X = 7.

$P\{X = 7\} = P\{(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)\} = \frac{6}{36} = \frac{1}{6}$

In simple terms, the probability that our random variable X equals 7 is 6 outcomes out of a total of 36 possible outcomes, which is a 1 in 6 chance of getting a sum of 7 on any single throw. With 10 tries? I’d take that bet.
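To put a number on that bet, here is a minimal Python sketch that counts the outcomes and then works out the chance of seeing at least one 7 in ten throws. It assumes the dice are fair and the throws are independent; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered outcomes of two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# The outcomes whose faces add up to 7.
sevens = [pair for pair in outcomes if sum(pair) == 7]

# Probability of a sum of 7 on a single throw.
p_seven = Fraction(len(sevens), len(outcomes))

# Probability of at least one 7 in ten independent throws:
# one minus the probability of missing on all ten.
p_win = 1 - (1 - p_seven) ** 10

print(p_seven)                 # 1/6
print(round(float(p_win), 4))  # 0.8385
```

Under these assumptions you would win roughly 84% of the time, which is a good bet but not a guaranteed one.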

**What is the Expectation?**

For a game of dice, random variables are very simple and discrete. With one fair die, you have a 1/6 chance of getting a given number. With two dice? Each ordered pair of faces has a 1/36 chance, although some sums (such as 7) can be made in more ways than others. In real life, the possible values of a random variable usually come with different probabilities.

The most important concept in statistics and probability theory is the expectation or “expected value” of a random variable, denoted by E[X], where X is our random variable.

The golden formula for a discrete random variable is $E[X] = \sum_i x_i \, P\{X = x_i\}$

To translate the language of maths, what this equation means in layman’s terms is that the expected value of your random variable is a *weighted* average of the possible values that X can be, with each of these possible values being weighted by the probability that X will be that value.

For example, $p(0) = \frac{1}{2}$ means that the probability that X will be 0 is 1/2. In statistics, the lowercase p is called the “probability mass function” of a given random variable X.

Let’s say we have two probability mass functions provided to us:

**Example 1**

$p(0) = \frac{1}{2}$ and $p(1) = \frac{1}{2}$

What is the expected value for our random variable X?

$E[X] = 0\left(\frac{1}{2}\right) + 1\left(\frac{1}{2}\right) = \frac{1}{2}$

Quite simply, this is a plain average of the two possible values, 0 and 1, that X could assume.

**Example 2**

$p(0) = \frac{1}{3}$ and $p(1) = \frac{2}{3}$

What is the expected value for our random variable X?

$E[X] = 0\left(\frac{1}{3}\right) + 1\left(\frac{2}{3}\right) = \frac{2}{3}$

Now, this is a weighted average. For two possible values, X = 0 and X = 1, we have an expected value of 2/3, because 1 is given more weight than 0. To simplify it, one outcome is more likely than the other.
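The two examples can be checked with a few lines of Python. This is only a sketch: the helper name `expected_value` and the choice to store a probability mass function as a dictionary are assumptions made for illustration.

```python
from fractions import Fraction

def expected_value(pmf):
    """Weighted average: each value times the probability of that value."""
    return sum(x * p for x, p in pmf.items())

# Probability mass functions from Example 1 and Example 2,
# stored as {value: probability}.
example_1 = {0: Fraction(1, 2), 1: Fraction(1, 2)}
example_2 = {0: Fraction(1, 3), 1: Fraction(2, 3)}

print(expected_value(example_1))  # 1/2
print(expected_value(example_2))  # 2/3
```

Using `Fraction` keeps the arithmetic exact, so the results match the hand calculations rather than showing rounded decimals.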

**What is the Expected Value?**

By this point, most of you will assume that an “expected value” is what we expect our random variable X to be. This is wrong!

While we call E[X] the “expectation of X”, it is not the outcome we expect our random variable X to take, but rather the average value of X after a lot of repetitions of the same experiment. Repetition is what probability and statistics are all about!

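One simple way to showcase this is to find E[X] when X is the outcome of rolling a fair six-sided die. For a fair die, there is no difference in weight between the outcomes, because each outcome has a probability of 1/6. In mathematical terms, $p(1) = p(2) = p(3) = p(4) = p(5) = p(6) = \frac{1}{6}$.

What is the expected value E[X]?

$E[X] = 1\left(\frac{1}{6}\right) + 2\left(\frac{1}{6}\right) + 3\left(\frac{1}{6}\right) + 4\left(\frac{1}{6}\right) + 5\left(\frac{1}{6}\right) + 6\left(\frac{1}{6}\right)$

$E[X] = \frac{1}{6} + \frac{2}{6} + \frac{3}{6} + \frac{4}{6} + \frac{5}{6} + \frac{6}{6} = \frac{21}{6} = \frac{7}{2}$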

Note here, that our random variable X couldn’t possibly yield an outcome of 7/2 or 3.5. There is no 3.5 on the die. That’s how we know that the expected value is an average, and not the value you expect X to be.

In this case, if you kept rolling that die again and again, eventually you’d realise all the outcomes average out to around 7/2. Try it!
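If you don’t have a die to hand, a short simulation gives the same picture. The sketch below assumes Python’s built-in pseudo-random generator is a good enough stand-in for a fair die; the seed and the helper name `average_of_rolls` are arbitrary choices made for illustration.

```python
import random
from fractions import Fraction

# Exact expected value of a fair six-sided die, for comparison.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}
exact = sum(face * p for face, p in die_pmf.items())
print(exact)  # 7/2

random.seed(0)  # fixed seed so the run is repeatable; any seed would do

def average_of_rolls(n):
    """Roll a simulated fair die n times and return the average result."""
    rolls = [random.randint(1, 6) for _ in range(n)]
    return sum(rolls) / n

# The average wanders for small n and settles near 3.5 as n grows.
for n in (10, 1_000, 100_000):
    print(n, average_of_rolls(n))
```

No single roll is ever 3.5, but the running average gets closer to it the longer you keep rolling.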

**What is a Probability Distribution?**

Now that we know what a random variable is, recall that each one has a probability mass function, from which we get a single expected value. It’s only natural to want to see the full spread of values that our random variable X could take, and with what probabilities. Rarely can we do anything with just the average or expected value!

This is the essence of a probability distribution. We’re looking at the spread of our data, and most of the time you will see a probability distribution depicted as a bell curve. Please note that there are many kinds of distributions; the most common is the Normal distribution, which has this bell shape.

To boil the bell curve down to brass tacks, it’s quite simple. The very middle, denoted by $\mu$, is our expected value: $E[X] = \mu$. Meanwhile, the rest is the spread of our random variable’s outcomes, measured by the standard deviation $\sigma$. Assuming our expected value is a score of 50% on a test and our standard deviation is 5 percentage points, about 68% of the students will receive a score between 45% and 55%.
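The 68% figure can be reproduced with Python’s standard library. This sketch assumes the test scores follow a Normal distribution with mean 50 and standard deviation 5, which is an idealisation rather than a claim about any real class.

```python
from statistics import NormalDist

# Assumed for illustration: scores are Normally distributed with
# mean 50 and standard deviation 5 (both in percentage points).
scores = NormalDist(mu=50, sigma=5)

# Share of students within one standard deviation of the mean,
# i.e. between 45 and 55.
share_within_one_sd = scores.cdf(55) - scores.cdf(45)

print(round(share_within_one_sd, 4))  # 0.6827
```

The same share, about 68%, lies within one standard deviation of the mean for any Normal distribution, whatever its mean and standard deviation.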

**What is Variance?**

Variance is how we measure how far our data spreads from the expected value. Why do we want to know the spread of data?

Let’s say we have random variables X and Y, each with its own probability mass function.

X = -1 with p(-1)=1/2

X = 1 with p(1)=1/2

Y = -100 with p(-100)=1/2

Y = 100 with p(100)=1/2

Both random variables X and Y will have the same expected value.

$E[X] = -1\left(\frac{1}{2}\right) + 1\left(\frac{1}{2}\right) = 0$

$E[Y] = -100\left(\frac{1}{2}\right) + 100\left(\frac{1}{2}\right) = 0$

That’s crazy! Random variable X spreads just between a value of -1 and 1, whereas random variable Y spreads from -100 all the way to 100. With only the expected values, we would think these two distributions are exactly the same. This is why we need variance.

Variance is essentially a measure of how far a random variable X falls from its mean value $\mu = E[X]$, on average. If the mean is a 50% score, how far is any given student’s score from it on average?

It would be incredibly easy to just use $E[X - \mu]$ as our formula, but that poses a complication: deviations above and below the mean cancel each other out, so this quantity is always 0. Mathematicians therefore use a formula based on squared deviations, which are never negative.

Our golden formula for variance is thus:

$\mathrm{Var}(X) = E[(X - \mu)^2]$

or $\mathrm{Var}(X) = E[X^2] - (E[X])^2$, which is derived from the formula above.

What it means is simple: if X is a random variable with a mean value of $\mu$, then its variance is the expected value of the squared distance between X and $\mu$. Equivalently, it is the difference between the expected value of the square of X and the square of the expected value of X.
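For readers who want to see where the second form comes from, here is a short derivation. It writes $\mu = E[X]$ and relies only on the fact that expectation is linear, so constants such as $\mu$ can be pulled outside.

```latex
\begin{aligned}
\mathrm{Var}(X) &= E\left[(X - \mu)^2\right] \\
                &= E\left[X^2 - 2\mu X + \mu^2\right] \\
                &= E[X^2] - 2\mu\,E[X] + \mu^2 \\
                &= E[X^2] - 2\mu^2 + \mu^2 \\
                &= E[X^2] - (E[X])^2
\end{aligned}
```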

At its core, we’re just measuring how far the random variable X tends to fall from the mean in either direction.
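Returning to the X and Y from earlier, a small Python sketch shows how variance separates two distributions that share the same expected value. The helper names and the dictionary representation of each probability mass function are assumptions made for illustration.

```python
from fractions import Fraction

def expected_value(pmf):
    """Weighted average of the values in a {value: probability} mapping."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Expected squared distance from the mean."""
    mu = expected_value(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

def variance_shortcut(pmf):
    """Expected value of the square minus the square of the expected value."""
    mean_of_squares = sum(x ** 2 * p for x, p in pmf.items())
    return mean_of_squares - expected_value(pmf) ** 2

half = Fraction(1, 2)
pmf_x = {-1: half, 1: half}
pmf_y = {-100: half, 100: half}

print(expected_value(pmf_x), expected_value(pmf_y))  # 0 0
print(variance(pmf_x), variance(pmf_y))              # 1 10000
```

Both forms of the formula give the same answer here: a variance of 1 for X and 10000 for Y, even though both expected values are 0.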

**What is Covariance?**

Covariance is how we measure the *joint variability* of two random variables. Variance is the spread of data for a single random variable X. Now say we have another random variable Y, which is related to X. The covariance of the two random variables X and Y, denoted mathematically by Cov(X, Y), is

$\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$

Here $\mu_X = E[X]$ and $\mu_Y = E[Y]$. Equivalently, $\mathrm{Cov}(X, Y) = E[XY] - E[X]\,E[Y]$, which is derived from the formula above.

This translates to “the covariance of random variables X and Y is the expected value of their product, minus the product of their individual expected values.”
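The step from the first form to the second mirrors the one for variance. Writing $\mu_X = E[X]$ and $\mu_Y = E[Y]$, and again using only the linearity of expectation:

```latex
\begin{aligned}
\mathrm{Cov}(X, Y) &= E\left[(X - \mu_X)(Y - \mu_Y)\right] \\
                   &= E\left[XY - \mu_Y X - \mu_X Y + \mu_X \mu_Y\right] \\
                   &= E[XY] - \mu_Y\,E[X] - \mu_X\,E[Y] + \mu_X \mu_Y \\
                   &= E[XY] - \mu_X \mu_Y - \mu_X \mu_Y + \mu_X \mu_Y \\
                   &= E[XY] - E[X]\,E[Y]
\end{aligned}
```

Setting Y = X in the last line gives $E[X^2] - (E[X])^2$, so the covariance of a random variable with itself is simply its variance.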

Why do we need covariance at all? Let’s say random variable X is the scores of students on an Algebra III test. Random variable Y is the scores of students on a Physics I test.

Professors have found that performance in one of these subjects generally goes hand in hand with performance in the other, so they want to see how students’ scores vary on them together. Will a student who scores low in Algebra III also score low in Physics I?

Covariance can help us find the answer to that question! Knowing the variance of each distribution alone cannot.
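As a rough illustration, the sketch below computes the covariance for five students. The scores are invented for this example, and each student is treated as equally likely, so the expected values are plain averages.

```python
def mean(values):
    """Plain average: every entry is treated as equally likely."""
    return sum(values) / len(values)

def covariance(xs, ys):
    """Expected value of the product minus the product of expected values."""
    products = [x * y for x, y in zip(xs, ys)]
    return mean(products) - mean(xs) * mean(ys)

# Hypothetical scores for five students, invented for illustration.
# Entry i of each list belongs to the same student.
algebra = [40, 50, 60, 70, 80]
physics = [45, 50, 65, 65, 85]

print(covariance(algebra, physics))  # 190.0
print(covariance(algebra, algebra))  # 200.0, the variance of the algebra scores
```

The positive covariance says that, in this made-up class, students who score above average in Algebra III tend to score above average in Physics I as well. A negative value would mean the scores tend to move in opposite directions.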

**Conclusion**

In conclusion, the difference between variance and covariance is quite simple. While variance measures the spread of one random variable’s distribution around its expected value, covariance measures how two random variables vary together around their respective expected values.

All in all, understanding how random variables work and what expected values are is key to understanding the difference between the two, and it is the basis for statistics and probability as a whole.