Archive for the ‘Science’ Category

Most published research results are false

Thursday, February 7th, 2008

John Ioannidis wrote an article in Chance magazine a couple of years ago with the provocative title Why Most Published Research Findings Are False. Are published results really that bad? If so, what’s going wrong?

Whether “most” published results are false depends on context, but a large percentage of published results are indeed false. Ioannidis published a report in JAMA looking at some of the most highly-cited studies from the most prestigious journals. Of the studies he considered, 32% were found to have either incorrect or exaggerated results. Of those studies with a 0.05 p-value, 74% were incorrect.

The underlying causes of the high false-positive rate are subtle, but one problem is the pervasive use of p-values as measures of evidence.

Folklore has it that a “p-value” is the probability that a study’s conclusion is wrong, and so a p-value of 0.05 would mean the researcher should be 95% sure that the results are correct. In this case, folklore is absolutely wrong: a p-value is actually the probability, computed under the assumption that there is no real effect, of seeing data at least as extreme as the data observed. It says nothing directly about the probability that the conclusion is correct. And yet most journals accept a p-value of 0.05 or smaller as sufficient evidence.

Here’s an example that shows how p-values can be misleading. Suppose you have 1,000 totally ineffective drugs to test. About 1 out of every 20 trials will produce a p-value of 0.05 or smaller by chance, so about 50 trials out of the 1,000 will have a “significant” result, and only those results will be published. The error rate in the lab was indeed 5%, but the error rate in the literature coming out of the lab is 100%!
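Here is a minimal R simulation of the scenario above. The two-sample t-test with 20 subjects per group is an arbitrary choice for illustration; any valid test of a true null hypothesis behaves the same way.

    # 1,000 trials of drugs that truly do nothing: both groups are
    # drawn from the same distribution, so every "significant"
    # result is a false positive.
    set.seed(42)
    p_values <- replicate(1000, t.test(rnorm(20), rnorm(20))$p.value)

    mean(p_values <= 0.05)  # about 0.05: the 5% error rate in the lab
    sum(p_values <= 0.05)   # about 50 "publishable" results, all of them false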

The example above is exaggerated, but look at the JAMA study results again. In a sample of real medical experiments, 32% of those with “significant” results were wrong. And among those that just barely showed significance, 74% were wrong.

See Jim Berger’s criticisms of p-values for more technical depth.

Comet dust looks like asteroid dust

Saturday, January 26th, 2008

Until quite recently, astronomers thought that comets formed in the outer reaches of the solar system and then were drawn into highly elliptical orbits that pass near the sun. But samples collected from comet Wild 2 look as though they came from the inner solar system, as asteroids do. Maybe the outer solar system is more like the inner solar system than we thought, or maybe comets didn’t form where we thought they did.

For more details, listen to yesterday’s 60-Second Science podcast or read the Science Magazine article the podcast is based on.

Shell shock may be physical, not psychological

Friday, January 25th, 2008

Shell shock was identified during World War I as a condition that causes soldiers to become dazed after being near explosions. Symptoms may appear weeks after exposure and there are no outward signs of injury. Naturally, this was regarded as a psychological rather than physical disorder.

But according to a story in today’s Science Magazine podcast, there is increasing evidence that shell shock is caused by physical trauma to the brain. One theory is that compression waves from the explosion hit the torso and transfer pressure to the brain via the circulatory system. If this theory is true, improved head gear will not help but improved body armor might.

Repairing tumors

Saturday, January 19th, 2008

Imagine this conversation with your doctor:

Your poor tumor. It has a chaotic blood supply. Parts of it get too much blood, other parts too little. We’re going to give you a drug to improve your tumor’s blood supply, making it healthier.

Before you run screaming from your doctor’s office, see if there’s a copy of the January 2008 issue of Scientific American in the waiting room. If there is, read the article Taming Vessels to Treat Cancer by Rakesh Jain.

Just as the cells in a tumor are abnormal and growing out of control, so are the blood vessels that feed the tumor. This lack of proper infrastructure inhibits the tumor’s growth, but it also makes it difficult to deliver chemotherapy to the tumor. This led to the radical idea of making tumors healthier in preparation for killing them.

So how would you go about improving a tumor’s circulatory system? By administering a drug that was designed to attack tumor vessels!

A new class of cancer drugs, antiangiogenic agents, has been designed to attack tumors by cutting off their blood supply. These agents haven’t been a complete success. Experience with one such agent, Avastin, shows that while it shuts down some of the blood vessels in tumors, it may make the remaining tumor vessels healthier. That’s bad news if you’re treating patients with Avastin alone. But when used in combination with chemotherapy, it’s just what people like Dr. Jain were looking for: a way to normalize the blood flow in a tumor in order to make it more vulnerable to chemotherapy.

More information, including videos, is available at the web site of Dr. Jain’s lab.

Irreproducible analysis

Tuesday, January 15th, 2008

Journals and granting agencies are prodding scientists to make their data public. Once the data is public, other scientists can verify the conclusions. Or at least that’s how it’s supposed to work. In practice, it can be extremely difficult or impossible to reproduce someone else’s results. I’m not talking here about reproducing experiments, but simply reproducing the statistical analysis of experiments.

It’s understandable that many experiments are not practical to reproduce: the replicator needs the same resources as the original experimenter, and so expensive experiments are seldom reproduced. But in principle the analysis of an experiment’s data should be repeatable by anyone with a computer. And yet this is very often not possible.

Published analyses of complex data sets, such as microarray experiments, are seldom exactly reproducible. Authors inevitably leave out some detail of how they got their numbers. In a complex analysis, it’s difficult to remember everything that was done. And even if authors meticulously documented every step of the analysis, journals would not want to publish that much detail. Often an article provides enough clues that a persistent statistician can approximately reproduce the conclusions. But sometimes the analysis is opaque or just plain wrong.

I attended a talk yesterday where Keith Baggerly explained the extraordinary steps he and his colleagues went through in an attempt to reproduce the results in a medical article published last year by Potti et al. He called this process “forensic bioinformatics,” attempting to reconstruct the process that led to the published conclusions. He showed how he could reproduce parts of the results in the article in question by, among other things, reversing the labels on some of the groups. (For details, see “Microarrays: retracing steps” by Kevin Coombes, Jing Wang, and Keith Baggerly in Nature Medicine, November 2007, pp. 1276–1277.)

While they were able to reverse-engineer many of the mistakes in the paper, some remain a mystery. In any case, they claim that the results of the paper are just wrong. They conclude “The idea … is exciting. Our analysis, however, suggests that it did not work here.”

The authors of the original article replied that there were a few errors but that these have been fixed and didn’t affect the conclusions anyway. Baggerly and his colleagues disagree. So is this just a standoff, with two sides pointing fingers at each other and saying the other guys are wrong? No. There’s an important asymmetry between the two sides: the original analysis is opaque, but the critical analysis is transparent. Baggerly and company have written code to carry out every tiny step of their analysis and made the Sweave code available for anyone to download. In other words, they didn’t just publish their paper; they published the code to write their paper.

Sweave is a program that lets authors mix prose (LaTeX) with code (R) in a single file. Users do not directly paste numbers and graphs into a paper. Instead, they embed the code to produce the numbers and graphs, and Sweave replaces the code with the results of running the code. (Sweave embeds R inside LaTeX the way CGI embeds Perl inside HTML.) Sweave doesn’t guarantee reproducibility, but it is a first step.
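To make this concrete, here is a sketch of what a tiny Sweave document might look like (the analysis itself is invented for illustration). R code lives in chunks delimited by <<>>= and @, and \Sexpr{} splices a computed value into a sentence:

    \documentclass{article}
    \begin{document}

    <<analysis, echo=FALSE>>=
    # R chunk: Sweave runs this code when the document is built
    p <- t.test(rnorm(20), rnorm(20))$p.value
    @

    The p-value for our comparison was \Sexpr{round(p, 3)}.

    \end{document}

Running R CMD Sweave on a file like this produces a LaTeX file with the code replaced by its output, so the numbers in the paper cannot drift away from the analysis that produced them.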

Musicians, drunks, and Oliver Cromwell

Saturday, January 12th, 2008

Jim Berger gives the following example illustrating the difference between frequentist and Bayesian approaches to inference in his book The Likelihood Principle.

Experiment 1:

A fine musician, specializing in classical works, tells us that he is able to tell whether Haydn or Mozart composed a given piece of classical music. Small excerpts of the compositions of both composers are selected at random, and the experiment consists of playing them for identification by the musician. The musician makes 10 correct guesses in exactly 10 trials.

Experiment 2:

A drunken man says he can correctly guess which face of a tossed coin will land up. Again, after 10 trials the man correctly guesses the outcomes of all 10 throws.

A frequentist statistician would have as much confidence in the musician’s ability to identify composers as in the drunk’s ability to predict coin tosses. In both cases the data are 10 successes out of 10 trials. But a Bayesian statistician would combine the data with a prior distribution. Presumably most people would be inclined a priori to have more confidence in the musician’s claim than the drunk’s claim. After applying Bayes’ theorem to analyze the data, the credibility of both claims will have increased, though the musician will continue to have more credibility than the drunk. On the other hand, if you start out believing that it is completely impossible for drunks to predict coin flips, then your posterior probability for the drunk’s claim will continue to be zero, no matter how much evidence you collect.

Dennis Lindley coined the term “Cromwell’s rule” for the advice that nothing should have zero prior probability unless it is logically impossible. The name comes from a statement by Oliver Cromwell addressed to the Church of Scotland:

I beseech you, in the bowels of Christ, think it possible that you may be mistaken.

In probabilistic terms, “think it possible that you may be mistaken” corresponds to “don’t give anything zero prior probability.” If an event has zero prior probability, it will have zero posterior probability, no matter how much evidence is collected. If an event has tiny but non-zero prior probability, enough evidence can eventually increase the posterior probability to a large value.
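A back-of-the-envelope calculation in R makes this concrete. Model each claimant as either having the claimed ability (and so predicting every trial correctly) or merely guessing (probability 1/2 per trial); the model and the prior values below are invented for illustration.

    # Posterior probability of the claim after seeing 10 successes in 10 trials
    posterior <- function(prior) {
      like_able  <- 1        # P(10 of 10 | has the ability)
      like_guess <- 0.5^10   # P(10 of 10 | just guessing) = 1/1024
      prior * like_able / (prior * like_able + (1 - prior) * like_guess)
    }

    posterior(0.5)   # musician: about 0.999 -- near certainty
    posterior(1e-4)  # drunk: about 0.09 -- more credible than before, still doubtful
    posterior(0)     # closed mind: exactly 0, no matter the evidence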

The difference between a small positive prior probability and a zero prior probability is the difference between a skeptical mind and a closed mind.

Galaxies closer together than stars

Thursday, January 10th, 2008

I heard yesterday that relative to their size, galaxies are much closer together than stars. I’d never heard that, so I looked into it. Just using orders of magnitude, the sun is 10^9 meters wide and the nearest star is 10^16 meters away. The Milky Way is 10^21 meters wide, and the Andromeda galaxy is 10^22 meters away. So stars are roughly ten million diameters apart, but galaxies are only tens of diameters apart.
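The arithmetic is easy to check in R with slightly less rounded figures (the numbers below are rough astronomical values, good to within a factor of two or so):

    # Distances and sizes in meters (rough values)
    sun_diameter    <- 1.4e9    # diameter of the sun
    nearest_star    <- 4.0e16   # distance to Proxima Centauri
    milky_way_width <- 9.5e20   # diameter of the Milky Way disk
    andromeda_dist  <- 2.4e22   # distance to the Andromeda galaxy

    nearest_star / sun_diameter       # about 3e7 solar diameters
    andromeda_dist / milky_way_width  # about 25 galactic diameters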