Summary: Efforts to improve scientific integrity must grapple with both questionable research practices that fall within the current "rules of the game" and outright misconduct. Survey and audit data suggest disturbing lower bounds for misconduct, and suggest the possibility of rates high enough to meaningfully distort readings of the scientific literature. The problem could be worse for "null fields" studying nonexistent effects, and for studies that seemingly have top methodological standards. I discuss this analysis in the context of cold fusion and parapsychology, commonly thought to be null fields. These fields may be more at risk of fraud than others, but may also provide a warning about the potential for misconduct in more conventional domains.
Questionable research practices vs fraud
Meta-research studies have documented a number of different issues impairing scientific credibility that are within the standard "rules of the game," biasing results without direct falsification or fabrication of data:
Since QRPs are very common, can easily produce false positives, and may be practiced without outright lies, they have been the focus of much of the attention of reformers. Simmons, Nelson, and Simonsohn (2012) offer a "21 word solution," requiring authors to state:
Another technique, "p-curve" analysis, looks at the distribution of published results passing the p=0.05 significance test, asking whether p-values are evenly distributed below that threshold (suggesting p-hacking) or concentrated at more extreme values (suggesting an effect other than simple publication bias). Again, frauds falsifying data can falsify to whatever p-value they prefer.
In aggregate, reductions in QRPs could make misleading fraudulent results stand out more, since they could no longer hide amongst a forest of merely p-hacked results, and some measures directly complicate fraud, like data sharing (many cases of fraud exposure come from re-analysis of raw data). But addressing the combination of QRPs and fraud, especially in understanding existing literature, does complicate the picture.
How common is fraud?
If the base rate of fraud is sufficiently low, it might be mostly negligible, especially for results publicly replicated by multiple independent groups (rare as that is). There have been a number of attempts to estimate the frequency of fraudulent practices in science. Fanelli (2009) offers us "How Many Scientists Fabricate and Falsify Research? A Systematic Review and Meta-Analysis of Survey Data".
One would expect social desirability bias and fear of punishment to suppress admission rates dramatically, so I would expect the survey results are a substantial underestimate. Theoretically, they could be an overestimate, due to response bias, or a small minority of false confessions along the lines of people who report their religion as Jedi or agree that lizard men rule the United States.
In John et al. (2012), in addition to confession data, psychologists were asked to estimate rates of QRPs and fraud among their peers, as well as to estimate the portion of the guilty who would confess. This table shows the estimated prevalence rates from 1) confession; 2) peer estimates of frequency; 3) the combination of empirical confession rates with peer estimates of the portion of the guilty who would confess. For falsification of data, the latter two are much higher, around 10% and 40%, with a geometric mean of the three estimation procedures at 9%.

For poll estimates near the extremes of the scale, the mean can be heavily influenced by a few outliers, so I asked John for the median data (with and without the "Bayesian truth serum" technique):
So the median psychologist respondent estimates a quite substantial fraud rate, several times higher than the anonymous admission rate, and tremendously higher than the rate of exposure.
There are also fraud data based on audits of samples of research. Fanelli (2009):
How is fraud distributed across fields, types of research, and publication status?
The above overall estimates of the frequency of fraud and fraudsters are troubling, but they need not be evenly distributed. If some areas of research disproportionately attract or retain fraudsters, then the local rates may be even worse.
One selective pressure is the demand for success and positive results. Fraudsters can ensure that their experiments always appear to 'work', and publish more papers with positive significant results. If they can publish all their experiments, fraudsters may make up a larger share of the published literature than their share of the researcher population. Likewise, if hiring and tenure are based on publication record, this would favor fraudsters being over-represented among those graduate students who are able to obtain academic jobs, professors who achieve tenure, and researchers at elite institutions.
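The over-representation argument can be made concrete with a small sketch. It assumes, purely for illustration, that every researcher runs the same number of experiments, that only positive results get published, and that fraudsters report a positive result every time. The function name and the input rates below are hypothetical and are not taken from any of the cited studies.

```python
def fraud_share_of_positives(fraud_fraction, honest_positive_rate,
                             fraud_positive_rate=1.0):
    """Share of published positive results that come from fraudsters.

    Illustrative assumptions (not from the source): equal numbers of
    experiments per researcher, and only positive results are published.
    """
    fraud_positives = fraud_fraction * fraud_positive_rate
    honest_positives = (1 - fraud_fraction) * honest_positive_rate
    return fraud_positives / (fraud_positives + honest_positives)


# Hypothetical inputs: 5% of researchers are fraudsters.
# Honest researchers succeed 25% of the time in a field with some real effects...
print(f"{fraud_share_of_positives(0.05, 0.25):.1%}")
# ...but only 5% of the time (the false positive rate) in a null field.
print(f"{fraud_share_of_positives(0.05, 0.05):.1%}")
```

With these made-up inputs the fraudsters' share of the positive literature rises from about 17% to about 51%, even though they are only 5% of researchers. The exact numbers matter less than the direction: the lower the honest success rate, the larger the share of positive results that fraud can account for.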
Fields and types of research also vary in the feasibility of non-fraudulent work to advance a career. Consider the following routes to producing publications and getting continued funding for career and research:
Parapsychology: control group for science
Parapsychology, which tries to show that psychic powers exist, usually through randomized experiments, has been called a control group for science, I believe originally by Michael Vassar (in the lineage of this post). There are many reasons to think it is a null field, including:
Nonetheless, parapsychologists are able to produce a published research literature with an excess of positive results and meta-analyses with extremely small p-values which they claim show that psychic powers exist after all.
My sense is that the bulk of parapsychology results are the result of p-hacking, publication biases, and reporting biases. Often experiments have odd sample sizes, report results for strange subgroups, or emphasize a significant but unusual analysis without mentioning more standard main effect analyses (which do not show a result). Outright fraudsters don't need to resort to such tactics to generate results.
Following the controversy over tenured psychologist Daryl Bem (who has no direct financial motive to generate positive psi results), his experiments seem p-hacked and subject to QRPs but not fraudulent. As discussed in an earlier post, the published experiments used nonstandard analyses without sufficient correction for multiple comparisons, excluded some previously presented data, got significant results more often than would be predicted even if the effects were real, given the low power, and showed other signs of p-hacking.
However, there are individual parapsychology experiments that purport to have rather high quality methodology, and report very impressive effects with somewhat reasonable sample sizes. For example, consider the "ganzfeld" experiments, which supposedly involve one participant being presented with 1 of 4 stimuli at random and telepathically transmitting its identity to someone in another room. That second participant then discusses his or her imaginations about the target with the experimenter (who is supposed to be blinded as to the target), with that discussion converted into a guess as to the stimulus.
Early versions of the procedure varied in their analysis in accord with p-hacking, and early meta-analysis also seems somewhat p-hacked (the subset of procedures chosen for meta-analysis had higher success rates). Nonetheless, after discussion with skeptics, the procedure was mostly stabilized in a Joint Communique involving a skeptical psychologist (Ray Hyman) and a parapsychologist (Charles Honorton, head of the Psychophysical Research Laboratories or PRL). Honorton's lab undertook a series of experiments purportedly following this protocol, with many but not all experimenter steps automated (so the experiments were known as "autoganzfeld").
Given this setup there is a clear null hypothesis of 25% accuracy, and little freedom to change the analysis. The main options I can see for getting positive results in the absence of psychic powers are:
The first four possibilities rely on failure to publish trials, and are most severe for small studies (and indeed most ganzfeld studies, like most parapsychology studies, are underpowered). For experimenters who claim to have published all of the trials they conducted, and to have conducted a large sample of trials, the last two are the most plausible (hidden trials would be an instance of fraud in such a case), along with "other." And there are at least a few such experiments.
For example, a recent ganzfeld meta-analysis includes one study with 60/128 hits. If each trial had an independent probability of success of 25%, this would be around a 1 in 14 million event. Another claims 57/138, close to 1 in 50,000. The possibility of optional stopping reduces this somewhat, but nonetheless such experiments should not yet have happened absent systematic error, cheating, or psychic powers. Honorton's lab's ganzfeld experiments post-Communique were supposedly published in full with no omitted trials and a hit rate of 119/354, which should happen less than 1 in 5,000 times given a true accuracy of 25%.
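These odds can be checked with an exact one-sided binomial tail. The sketch below assumes, as the text does, independent trials with a 25% chance of a hit and no optional stopping. The function name is my own, and only the three hit counts quoted above are used as inputs.

```python
from math import comb


def binomial_upper_tail(hits, trials, p=0.25):
    """Exact P(X >= hits) for X ~ Binomial(trials, p).

    Assumes independent trials with a fixed hit probability p; optional
    stopping or dependence between trials would change the answer.
    """
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(hits, trials + 1)
    )


# The three hit counts quoted in the text, under the 25% null hypothesis.
for hits, trials in [(60, 128), (57, 138), (119, 354)]:
    tail = binomial_upper_tail(hits, trials)
    print(f"{hits}/{trials}: P = {tail:.3g}, about 1 in {1 / tail:,.0f}")
```

Under these assumptions the results should land near the figures quoted above: on the order of 1 in 14 million, 1 in 50,000, and less than 1 in 5,000 respectively. The calculation says nothing about which explanation (systematic error, cheating, or psychic powers) accounts for the excess, only that chance alone is a poor one.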
Sensory leakage seems a bit wild, but is often hard to rule out (e.g. Honorton's PRL laboratory was shut down, making it hard to test the quality of soundproofing) and there are mechanisms that could allow it in various cases. There is certainly a history of bogus parapsychology results caused by sensory leakage, such as card guessing experiments where subjects could see the cards or markings on them. Also, there is a selection effect: if an apparatus and experimental setup turn out to "work" then they will be used again and again to produce apparently enormous results, while failed experimental setups will be frequently changed. Sensory leakage could be quite conclusively ruled out in a sufficiently vetted physical setup examined by independent quality auditors.
Fraud is naturally an unpleasant suggestion, for both the innocent and the guilty. The usual rebuttal to this possibility is that fraudsters must be too rare to matter much. But the surveys and audit estimates of fraud rates above suggest the baseline is not very rare. If 5% or 10% of psychologists in general have committed fraud, parapsychologists are disproportionately selected for fraud, and we restrict ourselves to experiments with low analysis freedom and large sample sizes (where p-hacking is hard), we might quickly reach quite high rates.
Now, there is uncertainty about how high baseline fraud rates are, about selection for frauds by field, and selection for fraud by experiment type in null fields. These are testable empirical hypotheses, and parapsychologists I have spoken to have argued that in fact parapsychologists are much more honest and rigorous in their methods than conventional scientists. But I think the hypothesis of high fraud rates when we zoom into the relevant area deserves nontrivial credence, before updating on the evidence for parapsychology being a null field. So to convince me that parapsychology is right I would need strong evidence against the fraud hypothesis, like repeated large highly scrutinized replications by independent skeptical researchers, evidence sufficient to make a high fraud rate (or other systematic error) more incredible than psychic powers.
Are parapsychologists more or less at risk for fraud than regular scientists?
If parapsychologists are much more strongly selected for or encouraged to commit fraud than scientists in other disciplines, then we might believe that fraud is a major problem for the most robust-seeming results in parapsychology, but not in other fields. But if the risk levels are not too far apart, then this may support a new angle for scrutinizing other academic fields.
I have encountered a number of arguments from parapsychologists that they are more trustworthy than other scientists, including:
I have a few questions about the reasons that you think that parapsychology is more prone to fraud:
"The number of frauds reported in Kennedy's article is quite high on a per capita basis if the field has had only a few hundred serious researchers"
Would you happen to know what the fraud rate per capita is in the other fields (both conventional psych and other sciences)?
"A single parapsychology laboratory, the Rhine Institute, reported numerous cases of misconduct (without publicly naming the individuals) as a problem under control shortly before its appointed Director (subsidiary to Rhine), W.J. Levy, was also found to have engaged in fraud; this suggests that it may have been unusual not so much in a shocking frequency of frauds but in being willing to report such cases at all (Kennedy, 2014)"
Is the point here that parapsychologists are less willing to make accusations about fraud, which seems to tie into your last point?
"Parapsychology positions appear to be less available than conventional psychology ones"
Would this be simply due to the fact that parapsychology positions are rarer due to parapsychology research itself being rarer, or is there data to suggest that the ratio of positions to parapsychologists is much lower than in other fields' position to researcher ratios?
The other points seem solid. Thanks.
"Would you happen to know what the fraud rate per capita is in the other fields (both conventional psych and other sciences)?"
I linked above to a series of studies for conservative estimates of fraud, and evidence that they are indeed too conservative.
"Is the point here that parapsychologists are less willing to make accusations about fraud, which seems to tie into your last point?"
The Rhine Institute, the flagship parapsychology institution of its time, was rife with fraud (with the mentioned cases said to be only a selected subset of fraud cases) with the frauds not being named and shamed. If that is representative of the field, then the base rate of fraud is enough to explain any sexy results that cannot be replicated reliably by other researchers, or by outsiders.
One might argue that the Rhine could have been exceptional in two ways: it could have been more corrupt than the rest of the field, or it could have been more forthright in exposing corruption.
The "more corrupt" story suffers because of the sheer number of fraud cases. If frauds were randomly distributed, it would be very surprising that so many appeared at that one institution if the base rate is low. That leaves the options that the Rhine Institute differentially attracted or encouraged fraud, and that it was more forthright than usual in exposing it.
The fact that numerous cases of fraud were sat on for decades, until referenced as part of an unusual article towards the end of the lab's history (which might just as well not have been published), and the structure of the laboratory (whereby Rhine was in a position to fire frauds beneath him like Levy, as opposed to many small laboratories which could be run by individual frauds) suggest that Rhine was more unusual in the revelations than in fraud prevalence.
"data to suggest that the ratio of positions to parapsychologists is much lower than in other fields' position to researcher ratios?"
This is based on parapsychologists claiming as much, and the membership of their professional societies vs full-time positions.
I see. Thanks for the clarification.
