Showing posts with label arXiv. Show all posts

December 11, 2014

Breaking news and analysis from the world of science policy: Study of massive preprint archive hints at the geography of plagiarism - ScienceInsider

New analyses of the hundreds of thousands of technical manuscripts submitted to arXiv, the repository of digital preprint articles, are offering some intriguing insights into the consequences—and geography—of scientific plagiarism. It appears that copying text from other papers is more common in some nations than others, but the outcome is generally the same for authors who copy extensively: Their papers don’t get cited much.
Since its founding in 1991, arXiv has become the world's largest venue for sharing findings in physics, math, and other mathematical fields. It publishes hundreds of papers daily and is fast approaching its millionth submission. Anyone can send in a paper, and submissions don’t get full peer review. However, the papers do go through a quality-control process. The final check is a computer program that compares the paper's text with the text of every other paper already published on arXiv. The goal is to flag papers that have a high likelihood of having plagiarized published work.
"Text overlap" is the technical term, and sometimes it turns out to be innocent. For example, a review article might quote generously from a paper the author cites, or the author might recycle and slightly update sentences from their own previous work. The arXiv plagiarism detector gives such papers a pass. "It's a fairly sophisticated machine learning logistic classifier," says arXiv founder Paul Ginsparg, a physicist at Cornell University. "It has special ways of detecting block quotes, italicized text, text in quotation marks, as well as statements of mathematical theorems, to avoid false positives."
Only when there is no obvious reason for an author to have copied significant chunks of text from already published work (particularly if that previous work is not cited and has no overlap in authorship) does the software affix a "flag" to the article, including links to the papers with which it has text overlap. That standard "is much more lenient" than those used by most scientific journals, Ginsparg says.
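To make the idea of "text overlap" concrete, here is a minimal sketch in Python. It is not arXiv's classifier, which the article describes only in outline. It is a generic word n-gram comparison, and the function names, the 7-word window, and the 20% threshold are all assumptions chosen for illustration. It drops text inside quotation marks before comparing, and it treats a citation or a shared author as enough to skip the flag, which simplifies the behavior described above.

```python
import re


def shingles(text, n=7):
    """Return the set of lowercase word n-grams in text (n is an assumed window size)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def strip_quoted(text):
    """Remove passages inside straight or curly double quotes (legitimate quoting)."""
    return re.sub(r'"[^"]*"|“[^”]*”', " ", text)


def overlap_fraction(new_text, old_text, n=7):
    """Fraction of the new paper's unquoted n-grams that also occur in the old paper."""
    new = shingles(strip_quoted(new_text), n)
    if not new:
        return 0.0
    old = shingles(old_text, n)
    return len(new & old) / len(new)


def should_flag(new_text, old_text, cites_source, shares_author, threshold=0.2):
    """Hypothetical flagging rule: skip cited or self-reused text, else apply a threshold."""
    if cites_source or shares_author:
        return False
    return overlap_fraction(new_text, old_text) >= threshold
```

A real screen would also have to compare each submission against the whole archive efficiently. This sketch only compares two texts at a time.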
To explore some of the consequences of "text reuse," Ginsparg and Cornell physics Ph.D. student Daniel Citron compared the text from each of the 757,000 articles submitted to arXiv between 1991 and 2012. The headline from that study, published Monday in the Proceedings of the National Academy of Sciences (PNAS), is that the more text a paper poaches from already published work, the less frequently that paper tends to be cited. (The full paper is also available for free on arXiv.) The study also found that text reuse is surprisingly common. After filtering out review articles and legitimate quoting, about one in 16 arXiv authors were found to have copied long phrases and sentences from their own previously published work that add up to about the same amount of text as this entire article. More worryingly, about one out of every 1000 of the submitting authors copied the equivalent of a paragraph's worth of text from other people's papers without citing them.
So where in the world is all this text reuse happening? Conspicuously missing from the PNAS paper is a global map of potential plagiarism. Whenever an author submits a paper to arXiv, the author declares his or her country of residence. So it should be possible to reveal which countries have the highest proportion of plagiarists. The reason no map was included, Ginsparg told ScienceInsider, is that all the text overlap detected in their study is not necessarily plagiarism.
Ginsparg did agree, however, to share arXiv’s flagging data with ScienceInsider. Since 1 August 2011, when arXiv began systematically flagging for text overlap, 106,262 authors from 151 nations have submitted a total of 301,759 articles. (Each paper can have many more co-authors.) Overall, 3.2% (9591) of the papers were flagged. It's not just papers submitted en masse by a few bad apples, either. Those flagged papers came from 6% (6737) of the submitting authors. Put another way, one out of every 16 researchers who have submitted a paper to arXiv since August 2011 has been flagged by the plagiarism detector at least once.
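The percentages in the paragraph above can be rechecked from the raw counts it quotes. The short Python sketch below does only that arithmetic. The variable names are mine, not anything from arXiv or ScienceInsider.

```python
# Counts quoted in the article for submissions since 1 August 2011.
papers_total = 301_759
papers_flagged = 9_591
authors_total = 106_262
authors_flagged = 6_737

paper_rate = papers_flagged / papers_total      # share of papers flagged
author_rate = authors_flagged / authors_total   # share of submitting authors flagged
one_in = authors_total / authors_flagged        # "one out of every N" authors

print(f"flagged papers:  {paper_rate:.1%}")
print(f"flagged authors: {author_rate:.1%}")
print(f"about one author in {one_in:.0f}")
```

The author share comes out a little above the rounded 6% in the text, at roughly 6.3%, which is consistent with the "one out of every 16" phrasing.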
The map above, prepared by ScienceInsider, takes a conservative approach. It shows only the incidence of flagged authors for the 57 nations with at least 100 submitted papers, to minimize distortion from small sample sizes. (In Ethiopia, for example, there are only three submitting authors and two of them have been flagged.)
Researchers from countries that submit the lion's share of arXiv papers—the United States, Canada, and a small number of industrialized countries in Europe and Asia—tend to plagiarize less often than researchers elsewhere. For example, more than 20% (38 of 186) of authors who submitted papers from Bulgaria were flagged, more than eight times the proportion from New Zealand (five of 207). In Japan, about 6% (269 of 4759) of submitting authors were flagged, compared with over 15% (164 out of 1054) from Iran.
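As a rough check on the country comparisons above, the following Python sketch recomputes the flag rates from the author counts given in the paragraph. Only the four countries named there are included, and the dictionary layout is an illustrative choice, not the format of the data Ginsparg shared.

```python
# country: (flagged authors, submitting authors), as quoted in the article
counts = {
    "Bulgaria": (38, 186),
    "New Zealand": (5, 207),
    "Japan": (269, 4759),
    "Iran": (164, 1054),
}

# Flag rate per country: flagged authors divided by submitting authors.
rates = {country: flagged / total for country, (flagged, total) in counts.items()}

# Print countries from highest to lowest flag rate.
for country, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{country:12s} {rate:.1%}")

# How many times higher Bulgaria's rate is than New Zealand's.
ratio = rates["Bulgaria"] / rates["New Zealand"]
print(f"Bulgaria / New Zealand: {ratio:.1f}x")
```

The ratio lands between eight and nine, in line with the article's "more than eight times."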
Such disparities may be due in part to different academic cultures, Ginsparg and Citron say in their PNAS study. They chalk up scientific plagiarism to "differences in academic infrastructure and mentoring, or incentives that emphasize quantity of publication over quality."
*Correction, 11 December, 4:57 p.m.: The map has been corrected to reflect current national boundaries.

February 23, 2012

How to avoid retractions for plagiarism: Advice from a radiology journal editor (and arXiv) - Retraction Watch

Earlier this month, we highlighted the concerns of the editors of the ACS Nano journal about self-plagiarism, otherwise known as duplication. The editor of the American Journal of Roentgenology (AJR) — that’s radiology, for the uninitiated — has similar concerns, but about plagiarism of others’ work.
"Preliminary data including all article types accepted by AJR show that the amount of duplication varies significantly with different article types. For example, duplication in Original Research articles may be up to 58% and in Memorials, 23%. That is not to say that all duplications are significant or deliberate. For example, most computer software packages pick up words that are the same in a given sentence, pulse sequences that vary with vendors, and other similarities that may be appropriately referenced or quoted."
The ORI, Berquist notes, “has reported that up to 25% of their misconduct allegations involve plagiarism.” So how can authors avoid it?
Start by reading the guidelines, he writes, from the International Committee of Medical Journal Editors (ICMJE) and each journal to which you’re submitting. And it’s OK to paraphrase or summarize someone else’s work, as long as you quote or footnote:
"The corresponding author has the ultimate responsibility to be certain that all coauthors have used the appropriate citations and procedures. There are appropriate methods for paraphrasing or summarizing someone else’s work. These have been summarized by K. Shashok [11]. When this cannot be accomplished, one should use quotations or footnotes. For example, “correct citation and accurate referencing of sources are effective ways to prevent unintentional plagiarism,” as stated by K. Shashok in reference 10 of this editorial. I quoted this work directly and placed it in quotation marks with reference to the author by name and reference number. Using quotation marks is double protection and should be accomplished when statements are taken verbatim from another work. Manuscript reviewers will appreciate this honest approach with appropriate credit for the other author’s work. Permission from the original publisher is required if multiple paragraphs are quoted from another’s research work. One must always acknowledge someone else’s work even if it is paraphrased. Finally, if there are any concerns about previously published content, they should be noted in the cover letter to the editor when submitting the manuscript to the journal [9]."
These seem like eminently sensible suggestions, although we should note that in different settings, fair use provisions mean permission is not always required.
If authors are found to have plagiarized, there are serious consequences:
"Depending on the level of concern and the explanation, the allegations may be dropped and the authors warned to be aware of the potential misconduct when submitting future manuscripts to AJR or any journal. If the authors’ responses are not deemed satisfactory, an independent panel is selected to review the allegations. If the allegations are confirmed, the authors’ institutions are brought into the picture. In several cases we have required an organized remediation of an entire department with documentation of their process and outcomes. The worst-case scenario is multiyear sanctions that prevent the authors from submitting manuscripts for publication. To date, this has not been necessary during my tenure as Editor in Chief."
arXiv, the preprint server, has another good lesson in what happens when authors don’t follow the steps described in the AJR. Take a look at this paper, “Libertarian free will and quantum indeterminism,” submitted earlier this week. The comment field reads:
"Comments: 11 pages. arXiv admin note: text overlap with arXiv:1011.4898"
A search for “text overlap” on arXiv reveals more than 600 cases of such notes. Think those papers have a good chance of being published?

February 9, 2011

Why Cheating is Wrong

Scott Williams & Michael Courtney
Abstract: Mathieu Bouville’s "Why is cheating wrong?" (Studies in Philosophy and Education, 29(1), 67-76, 2010) misses the mark by failing to consider the longer term consequences of cheating on student character development and longer term societal consequences of undermining professional expertise and trust in disciplines where an earned degree is an essential part of professional certification and qualifications. Educators who turn a blind eye to student cheating are cheating the public by failing to deliver on the promise of graduates who genuinely earned their degrees.

Keywords: academic dishonesty, academic integrity, academic misconduct, education, ethics, homework, plagiarism
Intellectual gymnastics (Bouville 2010) do not change the fact that cheating is wrong. In a series of self-limiting arguments, the author repeatedly dismisses the negative effects of cheating, suggests cheating is essentially equivalent to dysfunctional pedagogy, and claims cheating is therefore wrong only to the extent that it has material consequences on learning and assessment. That cheating – to practice fraud or deceit (cheating n.d.) – is wrong independent of academic consequences is dismissed. Yet the objective wrongness of cheating is the central issue. Moreover, while the author offers legitimate criticism of common pedagogical practices, the uses and efficacy of grades, and the sometimes misplaced focus of the academic system, his attempts to dismiss the effects of cheating in light of these concerns ring hollow. Imperfections in sincere efforts to engage and assess student learning cannot be equated with deliberate attempts to defraud the system.
The negative effects of cheating go far beyond immediate issues such as diluting the meaning of grades, creating inequity between students, or undermining the learning environment, even if the effects of these things are less substantial than is generally perceived. Since actions form habits, and habits form character, academic dishonesty builds into the character a propensity for dishonesty. In addition, since academic credentials are criteria for professional certifications, academic dishonesty carries the risk of the unfounded illusion of professional competence. How many readers relish the thought of being treated by doctors who cheated in their anatomy and physiology courses, or having important lab tests performed by technicians who fraudulently secured (via cheating) their required certifications?
An individual’s character and integrity are paramount. Here at the United States Air Force Academy, our mission is to commission leaders of character who are prepared to honorably serve their nation. The nation expects and requires its military officers to uphold their oaths of office, adhere to the highest standards of conduct, effectively lead those in their command, and steward both weapons of war and secrets of national security. Furthermore, earned degrees are required components for certification of professional competence in many areas of military service. Cheating at any stage of officer development and regardless of immediate consequences is therefore absolutely incompatible with the profession of arms. What nation wants military officers in charge of navigation, engineering, or other technical issues related to national security who cheated their way through coursework rather than demonstrating genuine competence in required subject areas? What nation is eager to entrust the lives of its sons and daughters or its weapons of war to those who cannot even demonstrate faithfulness in college coursework?
It is sophistry to argue, "Breaking a rule is illegitimate only if the rule is legitimate. Either the rule has a rational justification and this rather than breaking a rule makes cheating wrong, or the rule is arbitrary and there is no reason to endorse it." (Bouville 2010) The legitimacy of rules rests in the legitimacy of the issuing authority; it is not given to individual students to whimsically decide whether or not the rules apply to them, especially when the student has agreed (implicitly or explicitly) to the rules by virtue of enrollment. To claim otherwise is to promote anarchy – a chaotic system with no real standards as students choose for themselves what feels right to them. How many citizens are eager to live in a country where the police and the military only follow the rules they deem to have adequate rational justification? The human mind has infinite capacity for finding flaws in the "rational justification" of rules that the human heart is inclined to disobey. Due process in the making and enforcing of rules and laws of orderly society does not require "rational justification" to the satisfaction of every individual who has a duty of compliance.
People expect their doctors, their pilots, their engineers, and their military officers to have genuinely earned their professional credentials and to meet rigorous standards in areas of knowledge and conduct necessary for public trust in the performance of their duties. Cheating is wrong because academic dishonesty in the training of these professions undermines both the expected level of expertise and the expected level of trust. Educators have a duty to society to ensure the quality of graduates, and this duty includes good faith efforts to prevent academic dishonesty.
Bouville, M. (2010) Why is cheating wrong? Studies in Philosophy and Education, 29(1), 67-76.
cheating. (n.d.). Dictionary.com Unabridged. Retrieved January 29, 2010, from Dictionary.com website: http://dictionary.reference.com/browse/cheating

August 15, 2010

Articles withdrawn from Open Access Database

Debora Weber-Wulff
I just ran across an article from 2007 about arXiv.org, one of the many Open Access databases, that withdrew 65 papers on General Relativity and Quantum Cosmology by 14 Turkish authors on the basis of the papers containing plagiarized material. One of the authors, a grad student at the Middle East Technical University in Ankara, was listed on 40 (!) of the papers.

March 11, 2008

Plagiarism: Words and ideas

Mathieu Bouville
Science and Engineering Ethics, doi: 10.1007/s11948-008-9057-6


Plagiarism is a crime against academia. It deceives readers, hurts plagiarized authors, and gets the plagiarist undeserved benefits. However, even though these arguments do show that copying other people’s intellectual contribution is wrong, they do not apply to the copying of words. Copying a few sentences that contain no original idea (e.g. in the introduction) is of marginal importance compared to stealing the ideas of others. The two must be clearly distinguished, and the ‘plagiarism’ label should not be used for deeds which are very different in nature and importance.

Plagiarism Accusation About Turkish Physicists

Turkiye Klinikleri J Med Ethics
Year: 2008 Volume: 16 Issue: 1

LETTER TO THE EDITOR
An article published in Nature on Sept 6, 2007 stated that nearly 70 articles by 15 scientists from 18 Mart, Dicle and Mersin universities had been removed from a popular preprint server following allegations of plagiarism.[1]
Some points in the article, such as its value-laden statements, its generalizations, and its failure to take into account a system, orientalist in outlook, that presses academics to publish in a language in which they have not been adequately trained, prompted me to write to the editor of Nature. I would like to share this letter, which was rejected by Nature, with our academic community, and I am sending it to your journal in the hope that it will be accepted for publication.

"Sir
Certain issues raised by Mr. Brumfiel’s article (“Turkish physicists face accusations of plagiarism” Nature 449, 8, 2007) must be addressed. It mustn’t be overlooked that as yet there isn’t enough information to assess the situation thoroughly, and this essentially precludes the ability to make an ethical analysis of the situation. Although they contain some fallacies such as ad populum and non sequitur, some of the arguments made in their own defense by the accused academicians deserve to be considered seriously. For instance, they have publicly declared that some of the articles they have been accused of plagiarising were published after their work. Therefore language such as “allegedly” or “seem to be involved” is correct, not politically but factually. However, the sentence, “There are some cultures in which plagiarism is not even regarded as deplorable” is a counter-example. I’m not aware of any sociological research concerning this premise, perhaps it’s true; nevertheless, its inclusion makes the language value-laden. Since culture includes moral values which have been shaped and changed by various factors, it’s a mistake to discuss the moral atmosphere surrounding a certain scientific community without considering the factors which have shaped it, such as English barriers, as Mr. Smith mentioned (“Need to speak English puts burden on Asian scientists” Nature 445, 256, 2007), and local factors, such as those in Mr. Sarioglu’s formula: “They’re isolated, their English is bad, and they need to publish”. What Mr. Sarioglu didn’t include is ‘their work should interest Western editors’. If scientific work is assessed regarding editors’ interests per se, not the needs of a particular society, then publishing transforms to some kind of a price to pay, and end transforms to means."

[1]. Brumfiel, G. Turkish physicists face accusations of plagiarism. Nature, 2007. 449(7158):8.

September 6, 2007

Turkish Professors Uncover Plagiarism in Papers Posted on Physics Server - The Chronicle of Higher Education

Aisha Labi
Dozens of academic papers containing apparently plagiarized work have been removed by moderators from arXiv, the popular preprint server where many physicists post their work before publication, Nature (subscription required) is reporting. According to the article, 67 papers by 15 physicists at four Turkish universities were pulled after an examination of their content revealed that they “plagiarize the works of others or contain inappropriate levels of overlap with earlier articles.”

August 22, 2007

65 admin withdrawals

65 articles by a group of 14 authors have been withdrawn by the arXiv administration due to excessive reuse of text from articles by other authors. The withdrawn articles were submitted from late 2001 through mid 2007, mainly to gr-qc, and the vast majority (59) were submitted in 2005-2006. (See also this article for additional details.)

The 14 authors, and the number of withdrawn articles co-authored by each, are as follows:

40 M. Salti (Grad Student, METU, Ankara)
29 O. Aydogdu (Grad Student, METU, Ankara)
15 S. Aygun (Grad Student, 18 Mart Univ, Canakkale)
14 M. Korunur (Grad Student, Dicle Univ, Diyarbakir)
13 A. Havare (Assoc. Prof., Mersin Univ, Icel)
13 I. Tarhan (Assoc. Prof., 18 Mart Univ, Canakkale)
10 M. Aygun (Grad Student, 18 Mart Univ, Canakkale)
7 H. Baysal (Assoc. Prof., 18 Mart Univ, Canakkale)
5 I. Acikgoz (Professor, Dicle Univ, Diyarbakir)
4 I. Yilmaz (Professor, Dean, 18 Mart Univ, Canakkale)
3 F. Binbay (Assistant Prof., Dicle Univ, Diyarbakir)
3 N. Pirinccioglu (Grad Student, Dicle Univ, Diyarbakir)
3 T. Yetkin (Instructor with PhD, Mersin Univ, Icel)
1 C. Aktas (Grad Student, Math Dept, 18 Mart Univ, Canakkale)

Note 1: The name K. Sogut (Instructor with PhD, Mersin Univ, Icel), who has co-authored with others on the above list, was erroneously included in an earlier version of this list due to an arXiv administrative error. There are no known problems with arXiv.org articles co-authored by K. Sogut.

Note 2: The name T. Yetkin appears only in conjunction with senior co-authors who have a more systematically problematic record.


The 65 withdrawn articles are listed below (the author submissions remain available as earlier versions):

gr-qc/0110023 gr-qc/0207026 gr-qc/0502031 gr-qc/0502032 gr-qc/0502042
gr-qc/0502043 gr-qc/0502058 gr-qc/0502059 gr-qc/0502060 gr-qc/0502061
gr-qc/0505078 gr-qc/0505079 gr-qc/0506061 gr-qc/0506062 gr-qc/0506135
gr-qc/0508018 gr-qc/0509022 gr-qc/0509023 gr-qc/0509047 gr-qc/0509061
gr-qc/0510037 gr-qc/0510038 gr-qc/0510123 gr-qc/0511030 gr-qc/0511095
gr-qc/0512080 gr-qc/0601070 gr-qc/0601133 gr-qc/0601141 gr-qc/0602012
gr-qc/0602070 gr-qc/0603027 gr-qc/0603044 gr-qc/0603063 gr-qc/0603108
gr-qc/0606022 gr-qc/0606028 gr-qc/0606080 gr-qc/0607011 gr-qc/0607082
gr-qc/0607083 gr-qc/0607089 gr-qc/0607095 gr-qc/0607102 gr-qc/0607103
gr-qc/0607104 gr-qc/0607109 gr-qc/0607110 gr-qc/0607115 gr-qc/0607116
gr-qc/0607117 gr-qc/0607119 gr-qc/0607126 gr-qc/0608014 gr-qc/0608024
gr-qc/0608050 gr-qc/0608111 gr-qc/0609101 gr-qc/0611014 gr-qc/0612016
gr-qc/0702047 astro-ph/0505018 0704.0525 0705.2930 0707.1776

February 1, 2007

Plagiarism Detection in arXiv (2007)

Daria Sorokina, Johannes Gehrke, Simeon Warner, Paul Ginsparg

Abstract
We describe a large-scale application of methods for finding plagiarism in research document collections. The methods are applied to a collection of 284,834 documents collected by arXiv.org over a 14 year period, covering a few different research disciplines. The methodology efficiently detects a variety of problematic author behaviors, and heuristics are developed to reduce the number of false positives. The methods are also efficient enough to implement as a real-time submission screen for a collection many times larger.
