The comments on Dr. Guyenet's latest post included more -Woo-bashing, as usual. In the paleodome, bashing -Woo is about as popular as bashing Dr. Oz and Dr. Kruse. Is it really warranted?
I guess I have always been kind of a weird statistician. There is a typical protocol usually followed and recommended when a pile of data comes in. One of the most comical (and I think most wrong) techniques is to scrub the data by plotting it all and then throwing out the outliers. A typical stupid way to do this is to calculate the mean and standard deviation, and then automatically throw out all the data points that fall more than three standard deviations from the mean (beyond three sigma). This technique pretty much guarantees that the researcher will systematically throw out perfectly good data, and it also ensures that any totally cool thing about what they are studying will be tossed as well. Sort of like cracking the egg, separating it, and then throwing out the yolk if you are Dr. Oz, or throwing out the white if you are one of the lower-protein paleos, or throwing it all out if it tastes good if you are Dr. Guyenet.
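To put a rough number on "perfectly good data", here is a minimal Python sketch of the automatic three-sigma scrub. Everything in it is an assumption for illustration: the function name `three_sigma_flags` is made up, and the data are simulated draws from a single normal distribution, so by construction there is not one typo or bad measurement in the pile.

```python
import random
import statistics


def three_sigma_flags(values):
    """Return the values lying more than three sample standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) > 3 * sd]


# Simulated, hypothetical data: 10,000 draws from one normal distribution.
# Every point is legitimate by construction.
rng = random.Random(0)
clean = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

flagged = three_sigma_flags(clean)
print(f"{len(flagged)} of {len(clean)} perfectly good points would be tossed")
```

For normally distributed data, roughly 0.27% of points land beyond three sigma by chance alone, so with a big enough pile the automatic rule is all but certain to throw out some legitimate observations.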
In my twisted mind, ItstheWoo certainly points to some excellent data. She is the poster child for outliers, and also the poster child for how outliers are usually treated by some really stupid academics and play-acting statisticians. Is it really necessary for commenters to continually remind each other how they flip through -Woo's posts without reading? How many minutes did they waste typing that over and over again on numerous blogs? Wouldn't it just be easier to read some of them, and, you know, maybe learn a little something about an outlier?
So, real researchers... not sure if you are visiting yet, but here's a little advice anyway. Before you just toss out data because it is "beyond three sigma", you need to take a look at it. Taking a look at it doesn't include disparaging and making fun of the data point. "Out! darn point!! Be done with it!" (What research dweeb does that in real life? Obviously there is some other deep-seated hostility going on here.) Data points can't just be removed and discarded because they "mess up the error term" or otherwise make either the analysis or the researcher uncomfortable. I only do data checking to make sure there isn't a typo or other similar problem. If I can't find a reason, it stays. I have had to fight for my position on this for years, and in many situations, even resorting to doing two analyses, one with all the data, and "one with some data points thrown out". And that's what I call it. It's not scrubbing. The data hasn't been cleaned; it has been lobotomized, and I'll have none of that.
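For anyone who wants to see what the two-analysis compromise might look like, here is a rough Python sketch. It is only one way to do it, not a recipe: the function names, the cutoff parameter `k`, and the numbers are all made up for illustration. The flagged points get reported for a look (typo? instrument problem?) rather than silently deleted.

```python
import statistics


def summarize(values):
    """Basic summary of one analysis run."""
    return {
        "n": len(values),
        "mean": statistics.fmean(values),
        "sd": statistics.stdev(values),
    }


def two_analyses(values, k=3.0):
    """Run the summary twice and report which points differ between the runs.

    Hypothetical helper: k is the sigma cutoff used only to pick the points
    that get a second look; nothing is removed from the full analysis.
    """
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    flagged = [v for v in values if abs(v - mean) > k * sd]
    kept = [v for v in values if abs(v - mean) <= k * sd]
    return {
        "all data": summarize(values),
        "some data points thrown out": summarize(kept),
        "flagged for a look": flagged,
    }


# Made-up measurements: nineteen values near 10 and one far-out value.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.3, 9.7, 10.0, 10.1, 9.9,
        10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7, 10.0, 50.0]

result = two_analyses(data)
for label, value in result.items():
    print(label, "->", value)
```

Reporting both runs side by side, under honest labels, lets the reader see exactly how much the conclusion depends on the points that were dropped.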
And on a final note, I got a large spike in readership today. I am drawing out a few more readers, commenters and lurkers, mostly from the low-carb linking sites. More data to follow...