A number of articles and news items have brought the issue of plagiarism into focus recently. Last week, a short paper in Science provided an update on the research by Harold Garner and his colleagues that was previously reported in Nature News, and has since been commented on in a number of places including SSP’s Scholarly Kitchen blog.
Garner’s team has taken abstracts from Medline and used a piece of software called eTBLAST to compare them against each other for similar and overlapping text. To date, with a combination of machine and human analysis, they have identified 9120 articles with "high levels of citation similarity and no overlapping authors", and 212 pairs of articles "with signs of potential plagiarism". They have gone on to contact authors and editors and (under assurances of anonymity) have received a range of responses, from outrage to apology to denial. As of February 2009 they are aware of their study having triggered 83 internal investigations, leading to 46 retractions.
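To make the idea of machine comparison concrete, here is a minimal sketch in Python of one simple way to flag pairs of abstracts that share a lot of text but no authors. It is not eTBLAST's algorithm, which this post does not describe; it swaps in a basic word-shingle Jaccard similarity instead. The record fields (`id`, `authors`, `abstract`), the function names and the 0.5 threshold are all assumptions made for illustration.

```python
from itertools import combinations


def shingles(text, k=3):
    """Return the set of overlapping k-word sequences in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items over all items."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def flag_pairs(records, threshold=0.5):
    """Flag pairs of records with similar abstracts and no shared authors.

    `records` is a list of dicts with hypothetical keys "id", "authors"
    and "abstract". The threshold is an arbitrary illustrative value.
    """
    sets = {r["id"]: shingles(r["abstract"]) for r in records}
    flagged = []
    for x, y in combinations(records, 2):
        # Overlapping authors suggests re-use by the same people,
        # which is a different question from potential plagiarism.
        if set(x["authors"]) & set(y["authors"]):
            continue
        score = jaccard(sets[x["id"]], sets[y["id"]])
        if score >= threshold:
            flagged.append((x["id"], y["id"], score))
    return flagged
```

A sketch like this only produces candidates. As in Garner's study, a flagged pair still needs a human to look at it before anyone draws a conclusion, and comparing every pair in this way would not scale to millions of abstracts without some form of indexing.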
In The Scientist, Garner explains that technology has a role to play in plagiarism detection because "You can't expect all the editors and reviewers to have all 18,000,000 papers in their head from biomedicine". Technology will never be an adequate substitute for a human domain expert’s knowledge and judgment, but a system such as CrossCheck can scan vast amounts of content and flag up potential issues, saving time and adding a level of reassurance previously unavailable.
The CrossCheck database currently contains almost 11 million content items and is on course to become the most comprehensive resource against which to check scholarly content for plagiarism. Look out for sessions on CrossCheck and plagiarism at the UKSG conference at the end of the month, and also at the Council of Science Editors meeting in May.