Wikipedia Screws Up (for 10 years straight)


This is the third in a series on the metaphysics of statistics, designed to remind us that statistics is not a science.  Part 1 is here.

On June 2, 2011 (while preparing the long-delayed second part of this series), I looked up “statistics” on Wikipedia.  This is what I found:

“Statistics is the science of the collection, organization, and interpretation of data.[1][2]”

Uh-oh, I thought.  We’re four words in and they’ve called statistics a science. Worried, I clicked on the link to “science” in the hope that somehow they were using the term loosely. What I found was a confused article whose first 1,861 words would mislead anyone not already explicitly aware that statistics is not a science. The article begins as follows:

“This article is about the general term, particularly as it refers to experimental sciences. For the specific topics of study by scientists, see Natural science. For other uses, see Science (disambiguation). Science (from Latin scientia meaning “knowledge”) is a systematic enterprise that builds and organizes knowledge in the form of testable explanations and predictions about the world.”

Hoo-boy, I thought. The link from “statistics” to “science” is going to mislead people into thinking that statistics is an empirical discipline, when in fact it is a rational discipline used as a tool in empirical science.

It got worse before it got better:

“In modern use, science is ‘often treated as synonymous with “natural and physical science”, and thus restricted to those branches of study that relate to the phenomena of the material universe and their laws, sometimes with implied exclusion of pure mathematics. This is now the dominant sense in ordinary use.’”

Right, I thought.  Which means that you, Wikipedia, know that everyone who clicks on the science link is going to think that statistics is a science.

The article “saved itself” (at least for people who were still reading) by finally getting, about halfway down, to the point that statistics is not in fact a science in the “dominant sense in ordinary use.”  However, this did not happen until word #1,861, which is an awfully long time for a reader to wait.  Still, it did happen:

“Mathematics, which is classified as a formal science, has both similarities and differences with the empirical sciences (the natural and social sciences). It is similar to empirical sciences in that it involves an objective, careful and systematic study of an area of knowledge; it is different because of its method of verifying its knowledge, using a priori rather than empirical methods. The formal sciences, which also include statistics and logic, are vital to the empirical sciences. Major advances in formal science have often led to major advances in the empirical sciences. The formal sciences are essential in the formation of hypotheses, theories, and laws, both in discovering and describing how things work (natural sciences) and how people think and act (social sciences).”

So why, I asked myself, wasn’t the statistics entry linking to the formal science entry, rather than this confusing piece on science generally? That entry begins wonderfully:

“The formal sciences are the branches of knowledge that are concerned with formal systems, such as logic, mathematics, theoretical computer science, information theory, systems theory, decision theory, statistics, and some aspects of linguistics. Unlike other sciences, the formal sciences are not concerned with the validity of theories based on observations in the real world, but instead with the properties of formal systems based on definitions and rules. Methods of the formal sciences are, however, applied in constructing and testing scientific models dealing with observable reality.”

As I am not paid to write this blog, my attention turned to other matters.

You may think I’m kidding, but I got nervous that I had misread the statistics entry, and the next day, June 3, I re-checked Wikipedia.  To my shock – I was certainly not the responsible party – the entry read as it does now:

“Statistics is the study of the collection, organization, and interpretation of data.”

What? The word “study” now replaced the word “science.” Had I dreamed it? Where had science gone?

I checked the revision history, and it appeared that someone was on the same track as me.  The word “science” had been removed, and the word “study” inserted, five hours earlier, as this screenshot shows.

I naturally assumed that I had been inadvertently caught in the crossfire of one of those Wikipedia wars where two opponents battle for conceptual territory on a page.  There must be some nut, I thought to myself, who thinks it is useful to talk about statistics as a science who is in constant war with one of those self-appointed Wikipedia-guardians who wants to call statistics a “study,” knowing full well that calling it a “formal science” will make half the readers think statisticians wear tuxedoes to work. (As if! My department’s guy wears a t-shirt, hiking shorts, wool socks, and Birkenstocks.  I believe he is considered “fancy” by his colleagues.)

Figuring I could use what was doubtless an incessant back-and-forth to make a useful point to my readers, I checked the whole damn revision history.  Well, guess what? There is no back and forth.

The statistics entry has called statistics a “science” for a decade, since its inaugural writing on November 3, 2001.

Realizing the universe was handing me the evidence to support my point on a silver platter, and picking up the tab, I went to http://stats.grok.se and looked up the traffic on the page, and it seems to be roughly 14,000 views a month (though of course I don’t know how many are repeat visitors).  So this page has been viewed roughly 10 (years) x 12 (months) x 14,000 (views) = 1,680,000 times, which works out to over a million lifetime viewers even allowing for repeat visitors.  Only on June 3 did one of them change it.
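
For anyone who wants to check my back-of-the-envelope arithmetic, here is a minimal Python sketch. The 14,000 views a month is the rough figure quoted above; the function names are my own, and the idea that unique viewers are simply total views minus repeat views is a simplifying assumption, not something stats.grok.se reports.

```python
# Rough estimate of lifetime views of the Wikipedia "statistics" page.
YEARS = 10
MONTHS_PER_YEAR = 12
VIEWS_PER_MONTH = 14_000  # rough monthly figure quoted in the post


def lifetime_views(years=YEARS, months=MONTHS_PER_YEAR, per_month=VIEWS_PER_MONTH):
    """Total page views over the whole period."""
    return years * months * per_month


def max_repeat_share(total_views, threshold=1_000_000):
    """Largest share of views that could be repeat visits while still
    leaving at least `threshold` unique viewers (simplifying assumption:
    unique viewers = total views * (1 - repeat share))."""
    return 1 - threshold / total_views


total = lifetime_views()
print(f"{total:,} page views")                                   # 1,680,000 page views
print(f"repeat share can be up to {max_repeat_share(total):.0%}")  # about 40%
```

In other words, under that assumption “over a million” holds up as long as no more than about 40% of the views came from repeat visitors.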

Now of course I don’t know how many people read 1,861 words into the article to discover the caveat that statistics is not a science in the word’s “dominant sense in ordinary use.”  But I would guess that a good number of visitors to the statistics page over the past decade haven’t bothered clicking through on the science link, let alone read 1,861 words down, and as a result Wikipedia has left over a million people with the impression that statistics is an empirical science with empirical answers to empirical questions – like when and when not to use the mean, median, and mode.

It’s not.
