In this video we discuss the difference between the mean, or average, and the median of a data set. We go through an example and cover how skewed data sets can affect the mean and median.
Transcript/notes
The average (mean) vs the median
Two statistics that are often used to help in decision making are the mean, or average, and the median. Both are considered measures of central tendency. In this video we are going to compare them.
Let's say that this is a data set of the amount of money spent by 10 different couples at a restaurant.
To calculate the mean or average, we add up all the values and divide by the total number of values. Adding them up we get $625, and there are 10 total values. So, $625 divided by 10 equals $62.50, which is the mean or average spent per couple.
Now for the median, which is the middle value of the ordered data or, when there is an even number of values, the average of the two middle values. Since we have 10 values, an even number, the median will be between the two middle values. We arrange the values in ascending order and draw a line here in the middle, where we have 5 values lower and 5 values higher. Then we add the 2 middle values together and divide that total by 2, which gives us $53, our median.
So, we have an average of $62.50 and a median of $53, a $9.50 difference.
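The same calculation can be sketched in Python. The individual amounts are shown on screen in the video and are not listed in these notes, so the values below are hypothetical, chosen only so that they reproduce the stated total of $625, the median of $53, and the $133 data point. They are not the video's actual data.

```python
from statistics import mean, median

# Hypothetical amounts (in dollars) spent by 10 couples.
# NOT the video's actual data: chosen only to match the stated
# total ($625), median ($53), and largest value ($133).
spent = [38, 42, 45, 48, 52, 54, 58, 70, 85, 133]

total = sum(spent)           # add up all the values
avg = mean(spent)            # total divided by the number of values
mid = median(spent)          # sorts the data, then averages the two middle values

print(f"total  = ${total}")
print(f"mean   = ${avg:.2f}")
print(f"median = ${mid:.2f}")
print(f"difference = ${avg - mid:.2f}")
```

Note that `statistics.median` sorts the data itself, so the list does not have to be arranged in ascending order first.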
Both of these measures, the average and the median, can be influenced by the skew of the data. For instance, a symmetrical distribution, as you see in this graph, has some high values, some low values, and some values in between. So, the mean and median will lie near the middle.
In a positively skewed distribution, the higher values in the data set will pull the mean upwards, and the median will be less than the mean, and more towards the center of the graph.
And in a negatively skewed distribution, the lower values will pull the mean downwards, and the median will again be near the center of the graph, higher than the mean.
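As a rough illustration of these three cases, here is a small Python sketch. The three lists are made-up values, not data from the video, and comparing the mean with the median is only a rule of thumb for spotting skew, not a formal test.

```python
from statistics import mean, median

# Made-up example values, one list per shape described above.
symmetric = [1, 2, 3, 4, 5]        # balanced around the middle
positive_skew = [1, 2, 3, 4, 20]   # one high value stretches the right side
negative_skew = [1, 17, 18, 19, 20]  # one low value stretches the left side

for name, values in [("symmetric", symmetric),
                     ("positive skew", positive_skew),
                     ("negative skew", negative_skew)]:
    m, med = mean(values), median(values)
    # Compare where the mean sits relative to the median.
    if m > med:
        relation = "mean above median"
    elif m < med:
        relation = "mean below median"
    else:
        relation = "mean equals median"
    print(f"{name}: mean={m}, median={med} -> {relation}")
```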
Here is a dot plot for our example data set of money spent by couples, and you can see this $133 data point is really far away from the other data points, so this is pulling the average upwards.
The mean or average is the most used measure of central tendency, because it uses all of the values in a data set. But for skewed data the median can be more useful, because it is much less affected by extremely large or small values. So, the median is less sensitive to outliers.
In statistics, it’s always a good idea to look at the mean or average and the median, and also to take a look at a dot plot or other type of graph to see if the data is skewed in any way.
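To see how much a single extreme value moves each measure, here is a minimal Python sketch. The amounts are hypothetical stand-ins, not the video's actual data; only the $133 point, the $62.50 mean, and the $53 median come from the notes.

```python
from statistics import mean, median

# Hypothetical amounts (in dollars), NOT the video's actual data.
# Chosen to give a mean of $62.50 and a median of $53 with a $133 outlier.
spent = [38, 42, 45, 48, 52, 54, 58, 70, 85, 133]

# The same data with the $133 outlier left out.
without_outlier = [x for x in spent if x != 133]

mean_shift = mean(spent) - mean(without_outlier)
median_shift = median(spent) - median(without_outlier)

print(f"mean:   {mean(spent):.2f} -> {mean(without_outlier):.2f} (shift {mean_shift:.2f})")
print(f"median: {median(spent):.2f} -> {median(without_outlier):.2f} (shift {median_shift:.2f})")
```

With these assumed values, dropping the outlier moves the mean by several dollars while the median moves by only one dollar.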
Timestamps
0:00 Measures of central tendency
0:18 How to calculate the mean or average
0:39 How to calculate or find the median
1:14 How skewed data affects the mean and median
1:50 Example dot plot comparing the mean and median
2:05 Differences between the mean and median