
StandDev1

In this video tutorial we will look at sample data and use it to compute some measures of central tendency and measures of dispersion. The sample data will be the six test scores you see: 80, 60, 70, 80, 100 and 90.
Let's start with the measures of central tendency: the sample mean, the median and the mode. The sample mean is written x bar, whereas the population mean is represented by mu. We are not working with a population here, so we will set mu aside and look at the equation for the sample mean: x bar equals the sum of x (sigma is the summation symbol, and x stands for all the data values) divided by n. In other words, the sample mean equals the sum of the data values divided by the number of data values, which this video also calls the frequency. To find the sum, we take 80, plus 60, plus 70, plus 80, plus 100, plus 90, and divide by 6. The sum of the data values, sigma x, is 480, and when we divide 480 by 6 we find that the sample mean is 80.
Now the median. The median is the middle data value, and to find it we first put the data values in order: 60, 70, 80, 80, 90, 100. Don't try to find the median unless your data values are ordered. If there is an odd number of data values, the middle value is the median. If there is an even number of data values, as there is in this case, you find the average of the two middle values. Obviously the median here is 80, since both middle values are 80, but just to show what you would do: add the two middle values and divide by 2. So we have 160 divided by 2, and that is also 80.
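The mean and median steps above can be checked with a short Python sketch. This is only an illustration, not part of the video; the variable names (`scores`, `x_bar`, `med`) are made up for the example, and the last line cross-checks the hand calculation against Python's standard `statistics` module.

```python
from statistics import mean, median

scores = [80, 60, 70, 80, 100, 90]  # the six test scores from the video

# Sample mean: sum of the data values divided by n
n = len(scores)
total = sum(scores)
x_bar = total / n

# Median: order the values first; with an even n, average the two middle values
ordered = sorted(scores)
mid = n // 2
if n % 2 == 0:
    med = (ordered[mid - 1] + ordered[mid]) / 2
else:
    med = ordered[mid]

print("sum:", total, "mean:", x_bar, "median:", med)

# Cross-check the hand calculation against the standard library
assert x_bar == mean(scores) and med == median(scores)
```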
So our mean is 80 and our median is 80. The mode is the data value that occurs the most, and by most we mean more frequently than the other data values. There could be no mode: if all the data values occur once, or all occur twice, or all occur three times, there is no mode. But if one data value occurs more often than the others, then there is a mode. In this case 80 occurs twice and the other data values occur only once, so the mode is 80.
So all three of our measures of central tendency are 80. What you have learned in this chapter is that when the mean, the median and the mode are equal, as they are here, that points to a symmetric distribution, which this chapter treats as a Normal Distribution. And in a Normal Distribution the mean is the best measure of central tendency. If the distribution were skewed, in other words if the mean were greater than or less than the median, that would indicate that the mean is being drawn toward an outlier. If the mean were less than the median, there would be a low-lying outlier; if the mean were greater than the median, there would be a high-valued outlier. Either one draws the mean away from the center, leaving the median as the better measure of central tendency.
OK, now for our measures of dispersion: the range, the sample variance and the sample standard deviation. The range is simply the maximum value minus the minimum value. In our case the range is 100 minus 60, so our range is 40. All the range really tells you is the width, if you will, of your x-axis: we have a range of 40, with a low-end value of 60 and a high-end value of 100.
The sample variance and sample standard deviation are a little more confusing. The notation for the sample variance is s-squared.
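The mode, the mean-versus-median comparison and the range described above can be sketched in Python as follows. The helpers `find_modes` and `shape_hint` are hypothetical names invented for this illustration. Returning several modes when values tie for most frequent is an assumption that goes beyond what the video covers.

```python
from collections import Counter

def find_modes(data):
    """Return the most frequent value(s); an empty list means no mode."""
    counts = Counter(data)
    top = max(counts.values())
    modes = sorted(v for v, c in counts.items() if c == top)
    # If every distinct value occurs equally often, there is no mode
    if len(counts) > 1 and len(modes) == len(counts):
        return []
    return modes

def shape_hint(mean_value, median_value):
    """Rough reading of mean vs. median, following the rule in the video."""
    if mean_value < median_value:
        return "mean pulled low (possible low outlier)"
    if mean_value > median_value:
        return "mean pulled high (possible high outlier)"
    return "mean equals median (symmetric)"

scores = [80, 60, 70, 80, 100, 90]

modes = find_modes(scores)
data_range = max(scores) - min(scores)  # maximum value minus minimum value

print("mode(s):", modes)
print("range:", data_range)
print(shape_hint(80, 80))
```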
If we had a population variance it would be sigma-squared, but again we are talking about a sample here, not a population. The formula for the sample variance is: s-squared equals the sum of each data value minus the mean, quantity squared, divided by n minus 1. If this were population data we would replace n minus 1 with a capital N and use all the data values in our population, but with sample data we divide by 1 less than the sample size.
To start, I find that a chart helps the best. We will write down our x values, our data values, and they are 60, 70, 80, 80, 90 and 100. The first step in computing the variance and standard deviation is to compute the mean, and we have already done that: the mean is 80. Once you have the mean, you need to compute the deviations. The deviations are simply the differences between each data value and the mean, so we take our data values and subtract the mean. When we say deviation, we are referring to the change from the mean: 60 is 20 below the mean, hence the negative 20; 70 is 10 below the mean of 80, so negative 10; 80 is the mean, so each 80 has a deviation of 0; 90 is 10 greater than the mean, so positive 10; and 100 is 20 greater than the mean, so positive 20.
Now a quick check. If we did not round our mean, which we didn't in this case since it came out to exactly 80, the sum of the deviations should be zero. If you have to round your mean, this may not hold exactly; it may be off a bit. Here we have negative 30 and positive 30, so we do get zero when we find the sum of the deviations. That is why the sum of the deviations by itself cannot measure spread: we would always get zero!
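The deviation chart and the sum-to-zero check can be reproduced with a few lines of Python. This is a minimal sketch with illustrative variable names, not something shown in the video.

```python
scores = [60, 70, 80, 80, 90, 100]  # the ordered data values from the chart

x_bar = sum(scores) / len(scores)  # the mean, 80 for this data

# Deviation = data value minus the mean (negative means below the mean)
deviations = [x - x_bar for x in scores]

# Print the chart: one row per data value
print(f"{'x':>5} {'x - mean':>10}")
for x, d in zip(scores, deviations):
    print(f"{x:>5} {d:>10}")

# Quick check: with an unrounded mean the deviations sum to zero
print("sum of deviations:", sum(deviations))
```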
So our next step, now that we have found the deviations, is to square each deviation in order to get rid of the negatives. We take each deviation, the data value minus the mean, and square it: negative 20 times negative 20 is positive 400; negative 10 times negative 10 is positive 100; 0 times 0 is 0, twice; 10 times 10 is 100; 20 times 20 is 400. Now we have squared our deviations.
Next, keeping with the numerator, we want the sum of the squared deviations, which simply means adding all these values up. So to find our variance we take 400, plus 100, plus our two 0's, plus 100, plus 400, and divide by n minus 1; n is 6, so we divide by 5. The sum of our squared deviations is 1000, and 1000 divided by 5 is 200. Therefore, our sample variance is 200.
And now our sample standard deviation. The symbol for the standard deviation is simply s, and s is just the square root of s-squared, the variance. So the formula for the standard deviation is the same as the formula for the variance with a square root around it: the SQUARE ROOT of the sum of the squared deviations divided by n minus 1. Again, for the sample standard deviation we divide by n minus 1, whereas for a population standard deviation we would divide by N, the number of data values in the population. The variance was 200, so we just need the square root of 200, which is about 14.14, or 14 if we round to the nearest whole number. So our sample standard deviation is about 14.
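The whole variance and standard deviation calculation can be sketched in Python as follows. The variable names are illustrative only, and the final comparison uses `variance` and `stdev` from Python's standard `statistics` module, both of which divide by n minus 1 as the video does for sample data.

```python
import math
from statistics import stdev, variance

scores = [60, 70, 80, 80, 90, 100]

n = len(scores)
x_bar = sum(scores) / n

# Square each deviation to get rid of the negatives
squared_deviations = [(x - x_bar) ** 2 for x in scores]

# Numerator: the sum of the squared deviations
sum_sq = sum(squared_deviations)

# Sample variance divides by n - 1; population variance would divide by N
s_squared = sum_sq / (n - 1)

# Sample standard deviation is the square root of the sample variance
s = math.sqrt(s_squared)

print("sum of squared deviations:", sum_sq)
print("sample variance:", s_squared)
print("sample standard deviation:", round(s, 2))

# Cross-check against the standard library's sample formulas
assert math.isclose(s_squared, variance(scores))
assert math.isclose(s, stdev(scores))
```

Dividing `sum_sq` by `n` instead of `n - 1` would give the population variance, which is the only change the video describes for population data.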

Video Details

Duration: 11 minutes and 57 seconds
Country: United States
Language: English
Producer: Dianna Cichocki
Director: Dianna Cichocki
Views: 145
Posted by: cichocki on Sep 16, 2010

Computing measures of central tendency & dispersion of sample data
