January 23, 2008

Something rotten in the state of scientific publishing

By Jonathan M. Gitlin

There is an interesting commentary in this week's Nature [1] that takes a look at the subject of plagiarism within the scientific literature. It's certainly a contentious subject; from day one as an undergraduate it was drilled into us that there could be no greater sin than plagiarism, and I assume most other universities are the same. However, just because it's bad doesn't mean that no one will do it, and, as we know from high-profile fraud cases like that of Woo Suk Hwang, there will always be scientists out there who bend and break the rules.
These days, just about every scientific paper resides in an online database, whether it be something like arXiv or PubMed, and that means it's now much easier to scan them for duplications of results and text. Officially, duplicate papers aren't supposed to be a big problem; PubMed claims fewer than 1,000 instances out of more than 17 million papers, which works out to well under 0.01 percent. But an anonymous survey of scientists suggests that the rate of plagiarism is higher than that; 4.7 percent admitted to submitting the same results more than once, and 1.4 percent to plagiarizing the work of others.
The authors of the article, scientists at UT Southwestern in Texas, have been using a search engine called eTBLAST to search through scientific abstracts in the same way you might search through genome data for specific sequences. Any duplicates are then uploaded to a searchable database, Deja Vu. As might be expected, they managed to find quite a few examples of duplicate work. Out of a preliminary search of 62,000 abstracts, 421 (about 0.7 percent) were flagged. Some of these are papers that have been published in two languages, while others are all but identical, including the same authors, but have been submitted to different journals (a practice that is forbidden by every journal I've ever come across).
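To give a rough feel for how text-based duplicate detection can work, here is a minimal Python sketch. It is not eTBLAST's algorithm, which the commentary doesn't spell out here; it swaps in a much simpler technique, comparing overlapping three-word runs ("shingles") with Jaccard similarity. The function names, the 0.5 threshold, and the sample abstracts are all hypothetical choices made for illustration.

```python
import re
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-word runs found in the text, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(text_a, text_b, k=3):
    """Jaccard similarity of two texts: shared shingles divided by all distinct shingles."""
    set_a, set_b = shingles(text_a, k), shingles(text_b, k)
    if not set_a or not set_b:
        return 0.0  # texts shorter than k words cannot be compared this way
    return len(set_a & set_b) / len(set_a | set_b)


def flag_duplicates(abstracts, threshold=0.5):
    """Compare every pair of abstracts and return those at or above the threshold.

    `abstracts` maps a hypothetical paper ID to its abstract text.
    The threshold is an arbitrary illustrative value, not one taken from the study.
    """
    flagged = []
    for first, second in combinations(sorted(abstracts), 2):
        score = jaccard(abstracts[first], abstracts[second])
        if score >= threshold:
            flagged.append((first, second, score))
    return flagged


if __name__ == "__main__":
    # Made-up abstracts purely for demonstration.
    sample = {
        "paper-a": "We report a novel method for measuring protein folding rates in vivo.",
        "paper-b": "We report a novel method for measuring protein folding rates in vivo.",
        "paper-c": "Impact factors are widely used to rank journals and evaluate researchers.",
    }
    for first, second, score in flag_duplicates(sample):
        print(first, second, round(score, 2))
```

An all-pairs comparison like this grows quadratically with the number of abstracts, so it would only be practical for small collections; anything on the scale of PubMed needs an indexed search to narrow down candidate pairs first. Flagged pairs would also still need a human check, since translations and legitimate updates can look like duplicates.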
The article also looks at the nationalities behind such duplicate work; both China and Japan appear twice as often as their publication output suggests they ought to. This may be in part a language issue, as one of the people involved in the plagiarism cases identified by Turkish academics has claimed (subscription only) that, "For those of us whose mother tongue is not English, using beautiful sentences from other studies on the same subject in our introductions is not unusual." Unfortunately, in most of these cases, the copying goes well beyond individual sentences.
Although plagiarism is inexcusable, it can perhaps be said to be explainable. An academic's career depends upon their publication record: it's used to evaluate their performance for tenure, job applications, and funding, and entire departments are rated on their publications. Much of that evaluation rests on the ranking, or impact factor, of the journal in which each publication appears. That impact factor is calculated by Thomson ISI (the makers of the program EndNote), and the way it is calculated has been criticized in the past.
Now that criticism has been renewed, following the publication of an editorial in the Journal of Cell Biology [2]. The authors of that editorial went as far as buying the data that Thomson uses to calculate impact factors, whereupon they found that they couldn't arrive at the same numbers. Thomson has responded to the editorial, and things have been going back and forth since then. A long time ago, I wrote about a proposed alternative to Thomson's impact factors using Google's PageRank algorithm, but I must confess I've heard nothing more on that subject since then. Perhaps it's time for a renewed interest?
1: Nature, 2008. DOI: 10.1038/451397a

