Online Poll: On the Merits of Screwing Up


(Poll begins halfway down.)

I practice a type of psychotherapy called Transference Focused Psychotherapy (TFP) that a supervisor of mine once called “screw-up therapy”: I screw up, and my patients tell me about it.

This doesn’t mean that I am actually screwing up – often I am, but screwing up is a complicated idea, and it’s rare for both parties not to have at least something to learn from bad communication.  Rather it means that when my patients feel I have screwed up – by being a minute late, by forgetting their uncle’s name, by thinking they look sad when they feel anxious – I encourage them to air their feelings about my failings directly.  This gives them a chance to understand and practice how they experience, work through, and resolve disappointment. I like this approach because in my view disappointment is the cardinal problem of clinical psychology, and learning to manage it, in all its subtlety, is the main goal of treatment.

Based on some of the comments I have received, my post about Jonah Lehrer’s WSJ article seems to have been my first screw-up moment with the internet, at least on any significant scale.  My little blog used to get 35 visitors a day; that story got 3,000. But a lot of people didn’t like it, particularly my skimpy treatment of medians and geometric means.  Chad Orzel’s post, in particular, summed up many of the critiques and made me realize that my main point had been lost somewhere.

For example, he thought that I thought the arithmetic mean disproved the whole WOC effect; I thought it just iced the cake of my main point, which is that in this particular study the geometric mean data wouldn’t have been very impressive to lay readers of the WSJ, and that this led Lehrer to choose a value from the median column. He felt I didn’t understand medians, while in fact I just can’t stand seeing them cited without skew data, because the statistical outsider doesn’t know that citing a median is, in science, often code for “this is a right-tailed distribution,” let alone that there are different philosophies about how to deal with the outliers that create such distributions. I skipped over that because it’s a topic of the future post I alluded to in the post – but he had no way of knowing that. He thought I didn’t understand that geometric means are similar to medians, as the methods section said, while my concern was simply that the methods section identified the geometric mean as the outcome variable, and I’m an old-school conservative: I think reporters should report the outcome data identified by authors. About all we agreed on was that citing that 10,000 figure was an example of cherry-picking. But where he didn’t think that was my main point, I thought I’d made clear that it was the reason I’d twigged while reading the WSJ essay in the first place! I knew a cherry when I saw one, and believed further that it wouldn’t fly in a neuroscience lab, but would induce singsong demands by the presenter’s friends to “show us the re-eeeeeeeeest” – a teasing tune that implied exactly what I found: the variable in question – the median – was as a whole not as impressive as that one cherry-picked value. By the time I got to arithmetic means in my post I thought I was beating a dead horse; Orzel says he thought I thought I had come to my big reveal.

Most importantly, I worried he might think I was bashing Lehrer (though if anything I got complaints for revering him). I wasn’t. I’m pro-Lehrer per se. However, the piece was really a sociological commentary – I was bashing a reporting system and a set of public expectations that doesn’t acknowledge that the neuroscience beat is exponentially harder to cover than the White House beat – to the point of it being impossible for any individual to know as much as needs to be known about the brain to explain it in a newspaper. What the public thinks they are getting from a “neuroscientist,” regardless of degree or writing ability, is much less than they believe; what is going on in that organ is far more wonderful and shocking than we yet understand. I was trying to point out the mistakes that can get made when too much pressure is put on one person – even as good as Lehrer – to explain a lot of different ideas at once early in his career, and on deadline.

Well, there’s one other thing I think we would agree on: my article seemed to prove my point.  I was pitching it not at statisticians, but non-statisticians, and in the process I purposely skipped over complex ideas that make people’s heads spin – like log normal distributions – while slipping in my dislike of the slippery meaning of medians without owning it.  Or maybe I just don’t understand medians.  In any case, it was bad pop science – and thus was I hoist by my own petard.

Because Orzel was in good and thoughtful company I realized that at the very least I had a screw-up moment on my hands, and wanted some feedback. As in therapy, I figured it was time to stop acting out reverberating disappointments, and have a check-in.

Below is a 2-minute multiple-choice poll. A quiz, really, since you have to have read Lehrer’s article. Each question asks about what turned out to be a disappointing point for at least somebody in my article. One of the answers is a summary of what I said in my post; the rest are answers suggested by critics, or merely plausible alternates. I’ve tried to identify every place critics think that I may have gone wrong, but if you feel I’ve gone easy on myself (a staple complaint in screw-up therapy, btw), please let me know and I will add another question. When the votes slow down (if they even come in fast in the first place) I’ll post the results, but you can also see them if you click on the link in each question. It’s not a WOC study!

If you haven’t already, to answer the questions make sure you’ve at least read Lehrer’s WSJ piece, possibly my post plus/minus comments, and for extra credit and a great single-serving critique, Chad Orzel’s discussion of my post.

— Peter

_________________________________________________________

Issue #1

_________________________________________________________

Issue #2

Wikipedia defines cherry picking (fallacy) as follows: “Cherry picking, suppressing evidence, or the fallacy of incomplete evidence is the act of pointing to individual cases or data that seem to confirm a particular position, while ignoring a significant portion of related cases or data that may contradict that position. It is a kind of fallacy of selective attention, the most common example of which is the confirmation bias. Cherry picking may be committed unintentionally.”

_________________________________________________________

Issue #3

_________________________________________________________

Issue #4

Consider this quote by Chad Orzel, or his entire post if you have the time: “…. this is the point on which the whole argument turns, Freed’s proud ignorance of the underlying statistics completely undermines everything else. His core argument is that the “wisdom of crowds” effect is bunk because the arithmetic mean of the guesses is a lousy estimate of the real value. Which is not surprising, given the nature of the distribution– that’s why the authors prefer the geometric mean. He blasts Lehrer for using a median value as his example, without noting that the median values are generally pretty close to the geometric means– all but one are within 20% of the geometric mean– making the median a not-too-bad (and much easier to explain) characterization of the distribution. He derides the median as the guess of a “single person,” which completely misrepresents the nature of that measure– the median is the central value of a group of numbers, and thus while the digits of the median value come from a single individual guess, the concept would be meaningless without all the other guesses.”
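
For anyone who wants to see the relationship the quote describes between the three summary measures, here is a minimal Python sketch. The guesses are made up for illustration and are not data from the study; they are deliberately constructed to be symmetric on a log scale, roughly the way a log-normal sample would be, with a long right tail. The variable names are my own.

```python
import statistics

# Hypothetical guesses (NOT from the study): symmetric on a log scale
# around 100, so the right tail stretches much further than the left.
guesses = [20, 50, 80, 100, 125, 200, 500]

# The arithmetic mean is pulled upward by the large guesses.
arithmetic_mean = statistics.mean(guesses)

# The geometric mean averages the logarithms, which tames the right tail.
geometric_mean = statistics.geometric_mean(guesses)

# The median is simply the middle guess once they are sorted.
median = statistics.median(guesses)

print(f"arithmetic mean: {arithmetic_mean:.1f}")
print(f"geometric mean:  {geometric_mean:.1f}")
print(f"median:          {median:.1f}")
```

With these invented numbers the median and the geometric mean both land at about 100, while the arithmetic mean comes out near 154. Real data will not line up this neatly, which is why skew information matters when a median is cited.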

_________________________________________________________

Issue #5: 

_________________________________________________________

Issue #6: 

 

15 Comments

  1. It still seems to me like the table is the wrong thing to be looking at for any kind of wisdom of crowds effect because it’s reporting first estimates regardless of information condition. The point of the paper was to identify social effects on wisdom of crowd estimates. It’s that data later on in the figures that should be concentrated on.

    So, why all the references to the table data? It’s meaningless because it contains a mix of aggregated information, full information and control estimates. If that’s not the case, then maybe someone can explain why.

  2. I don’t think these reflect people’s primary beef with the post – you seemed to very proudly refer to yourself as a scientist:

    “as we call it in the biz”; “as we scientists call intros and conclusions”; “But as a scientist – having been on the inside”;

    Then you place this in stark comparison with Lehrer and suggest that his conclusions might come from his status as a non-scientist:

    “the part of me that knew Jonah Lehrer is not a neuroscientist – and therefore might have made a mistake in reading this article”*

    Then you, the scientist, get things wrong.

    You talk up going to the methods quite a bit, but then get some maths wrong. You claim this is just skimming over complex ideas, but skimming over is not the same as being wrong. An example:

    “What are the odds that 144 people are going to guess 144 different numbers and the average of those numbers will end in 0? One in ten – and only if the original number isn’t odd, in which case there should be a decimal.”

    The part from “and only if” onward is not true for the arithmetic mean, geometric mean, or median in the general case, or the case of positive whole numbers.

    Then you misinterpret the Wisdom of the Crowds:

    “There is no wise crowd! Those Swiss students blew it. Blew it! Every single question, the arithmetic mean, and really even the geometric mean, was from a human standpoint wrong, wrong, wrong, wrong, wrong and wrong. The end.”

    WoC means that the mistakes tend to cancel each other out because the average distance between the guess and the truth is larger than the distance of the average from the truth. The key is in the improvement in accuracy, not the accuracy itself.
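
    As a sketch of that claim in symbols (the notation is an assumption for illustration, not taken from the paper or the post: $x_i$ is the $i$-th of $n$ guesses, $\bar{x}$ is their arithmetic mean, and $T$ is the true value), the triangle inequality gives:

    ```latex
    \frac{1}{n}\sum_{i=1}^{n} \lvert x_i - T \rvert
      \;\ge\; \Bigl\lvert \frac{1}{n}\sum_{i=1}^{n} (x_i - T) \Bigr\rvert
      \;=\; \lvert \bar{x} - T \rvert
    ```

    Equality holds only when every guess errs in the same direction (or not at all), so any mix of overestimates and underestimates puts the average strictly closer to the truth than the typical individual guess.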

    To be clear: this isn’t central to your main point. But when you set up your piece as the Experience Teacher who’s run many SPSS analyses, a “David Caruso” figure (your words), to help non-scientists because “Scientists have a huge advantage over their non-scientist friends on this front”, you can’t mess up.

    Cockiness can work okay, but it’s like betting big when you think you have a winning hand – you’d better not be wrong.

    Your points that neuroscience is too complex for any one person to understand and that cherry picking is bad are relatively uncontroversial, so I think it’s unsurprising that they weren’t focused on.

    *While you later suggest that no one is a neuroscientist, I believe the above statement wasn’t really in that vein, given that you’d spent a bit of time branding yourself as a scientist and Lehrer as not one, even in the traditional sense (i.e. your first three paragraphs).

    1. Good point! I would say I am a scientist in that I use empiricism. I would not say I am a neuroscientist, because I don’t know enough about the brain to really “speak” for it. But in terms of setting forth a hypothesis and then attempting to refute it – yes. I will emphasize the difference between science and neuroscience in future posts – I agree that was not clear.

      I don’t agree that the improvement in accuracy is what’s important, not the accuracy itself. Remember, Galton’s original experiment was about original accuracy – guessing that first ox’s weight – not about getting ox #2 right (there was just one.)

    2. I can’t figure out wordpress’s system for comments – I thought I had responded to this but in case it didn’t get through, there is a huge difference between science and neuroscience. I think many people use the scientific method. I feel comfortable saying that about myself. As an fMRI person who never even touches the brain, I am not comfortable saying I am a neuroscientist, and at a more meta level, I think the term means – to the general public – something more than “I apply the scientific method to the brain.” It means something more along the lines of “I understand the brain.” So I avoid the term for that reason too. Sorry if this is a repeat – new to blogging.

  3. When our belief systems/ideology/wishful thinking feel challenged, one response is to critique the methodology, although the most popular is to attack the person: “Kill (or blame) the messenger.”

    Internally, and perhaps socially, this is an effective tactic since it avoids the real topic, which is felt (usually deeply, instantly and unconsciously) as an attack on our beliefs or the pattern projections we throw out to our experience. It also distracts the attention of both the audience and the author of the threatening content. Everybody gets very busy with this other matter over “here.” Everyone feels immediate emotional relief since the perceived attack on the status quo has been deflected.

    At a minimum, the messenger is blocked from extending and deepening their point and perceived “threat.” It ends up diluting the problem-solving immediately.

    These tactics are used because they work. Our brains naturally fall for them.

    They block problem-solving and learning however. Or in this case, we end up learning more about stats than the real topic of the original post, as pointed out. Ho hum.

    Since these patterns are driven by survival reflexes – defending them “feels” a matter of life or death. The web and social media is a great example of these impulses and compulsions, writ large.

    It’s a lot less work for our brains to write about stats than the troublesome topic of the post. Unfortunately, online conversations (and current politics) are dominated by the “loudest voices” which are reactive, fear-based/defensive and polemical (defending ideologies.)

    Our experience is that, without strict boundaries and policing/active moderation, most online conversations will degrade to this level and end up being worthless.

    It is both mental work and social work to problem-solve in open social settings.

    1. Oops, forgot: it is important to engrave (in a tattoo on our bodies somewhere) that the deflection comments and POV are only the “loudest voices” and a (small but persistent) minority. The poll results evidence this.

      However, again, our brains mistake volume and frequency for a central tendency; they’re not. Best to ignore them as noise and stay focused on the real issue.

        1. Matt, I think you are right that Searle was pitching something that can be reduced to behaviorism. His attempt to use a concept of causation to deny it is self-defeating, I think, and if he reflects he will realize that you are right. Funny that a philosopher is telling his audience that the really important stuff is going to be coming from atheist technicians like Crick. Sounds like philosophy of his sort is over. However, I won’t be applauding a complete takeover by a new breed of panpsychists either. That’s still side-stepping the reality of the personal and wanting to make it into a goosed-up material instead of a supervenient spiritual (in my humble opinion).

      1. I’d argue that the poll results (where each person gets a + or – vote, but the strength of their opinion is not considered) are much closer to a median of the group’s opinions than the mean of them. Not a perfect equivalence, but similar.

        Given that, I find it ironic that …sleeprunning… seems to be arguing that median-like poll results are the important measure of the group’s opinions and that Freed at least appears to agree (“You rock!”).

        I agree that poll results are a good measure of group opinion with a few caveats: (1) The n seems awfully small, particularly compared to the traffic of the previous post. (2) Because online polls require an active choice to take them, they have some mean-like behavior (someone with a stronger opinion is probably more likely to take them) (3) I’m not sure how important the group’s opinion is.

        1. Our point includes:
          – Very few people comment online
          – Those that do are overwhelmingly triggered to hostile-aggressive venting, usually hyper-personalized
          – Something is felt, very quickly, as deeply identity threatening
          – So defensive verbal behaviors dominate – the “loudest voices”
          – These are misleading about group sentiment. In fact, the opposite sentiments are usually true, e.g., the survey results. There is no measure of central tendency in the results, just the raw frequencies. Non-parametric yea!

          If you take any idea that disagrees with yours as a personal attack – you should avoid social media. It is “social” media not “personal” media.

          So both readers and writers online need to be critical consumers of the loudest voices. Our experience, and we coach on this, is to speak to the silent majority and ignore the loudest voices, unless you can play off of them.

          Hyper-defensive reactive comments don’t contribute to problem solving and, in fact, drive away more reasoned people. They are symptoms of a deeper individual problem and off-topic usually. They just corrode a blog and comment stream by raising the “acid level.” Think of how we all handle a loud person at a party — we get away.

          However, and here’s the rub, hostile defensive behavior always gets a lot of attention; the media knows this. So the trade-off is attention vs. problem-solving usefulness.

          Now principled, evidence-based arguments and disagreements (even strong) are actually, according to some research, the optimal kind of conversations. Some are here to defend against imagined attacks to their identities (ho hum) but the large majority of us are here to group problem solve and learn, share etc. Defensive behavior always smothers any of the good stuff.

  4. I have long wondered why there isn’t a standard for archiving web pages. If you’re using Wikipedia as a source, you can specify the exact revision used. Which is excellent. You can’t really do this with any other web page on the internet, because they lack a searchable archive, making them useless.

  5. Hiya, I’m really glad I’ve found this info. Nowadays bloggers publish just about gossip and web stuff and this is actually irritating. A good web site with exciting content, this is what I need. Thanks for making this site, and I will be visiting again. Do you do newsletters? I can’t find one.
