February 1, 2007

Plagiarism Detection in arXiv (2007)

Sorokina Daria, Gehrke Johannes, Warner Simeon, Ginsparg Paul

Abstract
We describe a large-scale application of methods for finding plagiarism in research document collections. The methods are applied to a collection of 284,834 documents collected by arXiv.org over a 14-year period, covering a few different research disciplines. The methodology efficiently detects a variety of problematic author behaviors, and heuristics are developed to reduce the number of false positives. The methods are also efficient enough to implement as a real-time submission screen for a collection many times larger.


