Wednesday, February 23, 2011

Accuracy and Precision

"Lies, damned lies, and statistics." 
Mark Twain (who attributed the line to Benjamin Disraeli)


I use statistics so often in my work that I forget that others do not. Unfamiliarity is only part of the problem: statistics are also frequently manipulated by those who use them to support one argument or another. When in doubt, ask for the raw data and do the analysis yourself. One of the things my father always reminded me of when I was young was to look at my source. That is a very valuable habit in the age of the internet, where anything and everything is out there.

To help with that, I thought I would describe a few statistical terms that come up in journal articles in my field.

Accuracy: Accuracy is how close a measurement is to the "true" value. In a laboratory setting, this is often judged by how far a measured value is from a standard whose known value was established with a different technology or on a different instrument.

Precision: Precision is how close repeated measurements are to each other. Precision says nothing about a target value; it is simply how tightly multiple measurements cluster together. Reproducibility is key to scientific research, and precision is what makes results reproducible.
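To make the two definitions concrete, here is a small sketch in Python. The readings, the reference value of 10.0, and the function names are all made up for illustration; using the distance of the average from the reference and the range of the readings is just one simple way to put numbers on accuracy and precision, not the only one.

```python
from statistics import mean

# Hypothetical repeated readings of a standard whose accepted value is 10.0
true_value = 10.0
instrument_a = [9.9, 10.1, 10.0, 10.05, 9.95]     # near the true value and near each other
instrument_b = [11.9, 12.1, 12.0, 12.05, 11.95]   # near each other, far from the true value

def accuracy_error(readings, reference):
    """Distance of the average reading from the reference (smaller = more accurate)."""
    return abs(mean(readings) - reference)

def spread(readings):
    """Range of the readings (smaller = more precise)."""
    return max(readings) - min(readings)

for name, readings in (("A", instrument_a), ("B", instrument_b)):
    print(f"instrument {name}: "
          f"error={accuracy_error(readings, true_value):.2f} "
          f"spread={spread(readings):.2f}")
```

Both made-up instruments have the same spread, so they are equally precise, but only instrument A is accurate.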

Now obviously the goal is to have a measurement that is both accurate and precise, but being one doesn't guarantee the other. For example, if we had a bow, arrows, and a target, accurate shots would be ones that hit the bullseye. If our shots were scattered around the target, centered on the bullseye but far from one another, we would be accurate but not precise. If all of our shots landed close together but not near the bullseye, then we would be precise but not accurate. To be accurate and precise, we would need our shots to be close together and in the bullseye.

The target example above is a good reason to look further than just an average. If the average is based on a wide range of values, then that average may tell you very little. For example, if I have the values 1 and 100, the average is 50.5. I could have that same average of 50.5 if my two values were 50 and 51. Both pairs of numbers result in the same average, but in the first example the average is far less meaningful because the two numbers are so far apart.
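The arithmetic is easy to check. Here is a short Python sketch using the two pairs of numbers from the example; the `summarize` helper is just a name I picked for illustration.

```python
from statistics import mean

far_apart = [1, 100]
close_together = [50, 51]

def summarize(values):
    """Return the average of the values and the gap between the largest and smallest."""
    return mean(values), max(values) - min(values)

for values in (far_apart, close_together):
    average, value_range = summarize(values)
    print(f"values={values} average={average} range={value_range}")

# values=[1, 100] average=50.5 range=99
# values=[50, 51] average=50.5 range=1
```

Same average, very different spread.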

This is where standard deviation comes in.

Standard Deviation: Standard deviation is a measure of how spread out a set of numbers is around its average. So if you envision a bell curve of data, the standard deviation describes how far the values typically fall from the average. Our first example above would have a very large standard deviation, whereas our second one would have a small one. The number of standard deviations you go out from the average tells you how many of the values are covered. For data that follows a bell curve (a normal distribution), one standard deviation on either side of the average includes about 68% of all values, two include about 95%, and three include about 99.7%.
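As a rough sketch, the standard deviations of the two example pairs and the bell curve coverage figures can be computed with Python's standard `statistics` module. I am assuming the population standard deviation (`pstdev`) here, and the coverage percentages only hold for normally distributed data; `share_within` is a helper name I made up.

```python
from statistics import NormalDist, pstdev

far_apart = [1, 100]
close_together = [50, 51]

# Population standard deviation: spread of the values around their average
sd_far = pstdev(far_apart)           # 49.5
sd_close = pstdev(close_together)    # 0.5
print(f"far apart: {sd_far}, close together: {sd_close}")

def share_within(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    standard_normal = NormalDist(mu=0, sigma=1)
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")

# within 1 standard deviation(s): 68.3%
# within 2 standard deviation(s): 95.4%
# within 3 standard deviation(s): 99.7%
```

With only two values per pair this is purely illustrative; a real data set would need many more measurements before the bell curve percentages mean much.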



It's always important to look at how the data is being presented and to keep accuracy, precision, and standard deviation in mind. If you are looking at a set of numbers, or at a conclusion drawn from repeated testing, being aware of these terms will help you judge how much credibility to give the data.

