Sandra Jamieson and Rebecca Moore Howard:
Unraveling the Citation Trail
Project Information Literacy, "Smart Talks," no. 8, August 16, 2011
Look up the word plagiarism in a reliable source and you may find it defined as “the act of taking the writings of another person and passing them off as one’s own. The fraudulence is closely related to forgery and piracy—practices generally in violation of copyright laws.”
Two English professors, Sandra Jamieson at Drew University and Rebecca Moore Howard at Syracuse University, argue that definitions like this one may fall short, especially when they are applied in the academy in the digital age.
Moreover, they suggest that such narrow views of plagiarism impede ways in which writing is taught to students and how students come to use sources in papers.
With a team of 21 other writing instructor-researchers, Jamieson and Howard lead The Citation Project. The project is a national study that collects and distributes empirical data about how college students use sources when writing research papers for composition courses.
We interviewed Sandra and Becky in the summer of 2011 as they prepared for another academic year. We discussed their innovative research study, what they had learned about how students integrate sources into their papers, and what their findings tell us about today’s students and how they use information.
PIL: We are intrigued that although you are both professors of writing and rhetoric, you are also conducting what you call an empirical inquiry. How did The Citation Project come about? What are its origins? What influenced your decision to conduct a study like this? Is plagiarism on the rise? What do you hope to accomplish with your research?
Sandra & Becky: We are both pretty amazed by this fact, too! Nothing in our graduate education prepared us for this kind of research; in fact, in our field it is not uncommon for this sort of research to be considered of less value than “theory.” But we felt that we needed to go beyond our training so that we could make arguments that would be evidenced in ways that people outside our field could respect. We were significantly influenced by arguments that Chris Anson made in a plenary address to the Council of Writing Program Administrators in 2006: he pointed out that anecdotally evidenced claims tend to be accepted only by those who share common beliefs and assumptions. So when a compositionist makes a claim that she evidences in her own classroom experiences, other compositionists may readily agree, because they share similar experiences. Outside composition and rhetoric, though, the audience to such claims may react skeptically or indifferently.
Motivating our research was our concern about how much the academic discourse about students’ writing was focused on whether students were plagiarizing, and we were also concerned by the lack of useful data about how students use and engage with sources. There was a public concern, a lot of anecdotal claims that plagiarism was on the rise, and a rush to find ways to catch "cheaters," but the information about what is actually going on when students write source-based papers was pretty slim. We felt—based on our classroom experience!—that what underlay much of what was being interpreted as plagiarism was not based in students’ ethical choices, but rather in their practices and skills in source-based writing. Having given careful consideration to Anson’s argument, though, we wanted to test our beliefs, in such a way that we could be certain of them, and so that others could respect them, as well. Therein was born the Citation Project, which endeavors to describe what students at a variety of institutions are actually doing in their source-based writing. To prepare for our research Sandra took an undergraduate statistics course, and throughout the development of the project she has worked closely with Sara Abramowitz, a statistician at Drew University.
The Citation Project began with a localized study that asked a simple question: “How often do students use summary, paraphrase, and patchwriting in their researched writing?” The results of that preliminary study, by Rebecca Moore Howard, Tanya K. Rodrigue, and Tricia C. Serviss, are described in “Writing from Sources, Writing from Sentences” (Writing and Pedagogy 2.2, Fall 2010, 177-192). This preliminary study piloted the methods used in our subsequent larger study, and it raised questions that the Citation Project has endeavored to answer.
When we developed the Citation Project, we decided to create a larger, multi-school study. One cannot make meaningful generalizations about writing or writing instruction based on statistical data from only one institution, and most institutional studies are more informative when placed in the context of data from other institutions. Analyzing 1,911 citations from 174 student papers produced at 16 different colleges provides a broadly based sample of the kinds of writing of all college students and constitutes a mathematically representative sample. As you’ll see when you look at our Website, we carefully chose our 16 colleges from the entire geography of the country, and we selected a wide variety of types of institutions.
PIL: To date, the project’s research team has employed a systematic content analysis and has studied papers of 160 college students enrolled at 16 U.S. colleges and universities. In many cases, you have found students use sources to patch write. What do you mean by patch writing? What’s the Web’s role in it? Do you have any idea of why some students may be patch writing more than others? Is patch writing a form of plagiarism?
Sandra & Becky: Actually, we have now studied 174 student papers. Some of the samples were shorter than others, so in order to study the same number of pages from each campus we had to include a few more papers from some. As we said above, it is very important to us that our data be as incontrovertible as possible because they are leading us to some pretty extreme realizations about student writing and writing instruction.
As for patchwriting: we use the following generalized definition for the purposes of teaching and policy-making: “patchwriting is restating a phrase, clause, or one or more sentences while staying close to the language or syntax of the source.” We have come to think of patchwriting as an unsuccessful attempt at paraphrase. In the papers we have analyzed, students often toggle back and forth between paraphrase and patchwriting, as they try to answer the question, “How else can I say this?”
We also have a technical definition that the researchers used when they coded the 174 papers; it’s embedded in our coding procedures:
“Highlight in yellow all words in the cited area that restate the ideas or information in a passage using more than 20% of the source. (This 20% does not include accurate synonyms, articles, prepositions, proper names or technical terms; words whose morphology is changed are considered to be source language and counted within the 20%.)
“If you code the majority of the material in the cited area as patchwritten, check this box on the coding sheet.”
We would urge that this definition not be used for anything but research purposes! While it was necessary for us to come up with a quantified rather than impressionistic way for coders to differentiate patchwriting and paraphrase, that quantification is inescapably arbitrary. Patchwriting, paraphrase, and the differences between them are necessarily contextualized by the rhetorical situation, including the task, writer, reader, and occasion. This first stage of the Citation Project is purely textual, analyzing words on a page. The interpretation of these words, however—including decisions about whether they are appropriate, inadequate, or transgressive—must always take place within the rhetorical context. This is why we used human coders for the papers rather than computer programs, as some suggested we should. Using computer programs would have saved us literally years of research time, but it would have removed the interpretive process that is at the heart of engaging with sources, and we would have missed most of the nuances that we believe make our work useful.
You ask whether patchwriting is a form of plagiarism. It could be: a writer could deliberately patchwrite rather than go to the trouble of paraphrasing successfully. In our own experiences as writers, teachers, and adjudicators of plagiarism cases, however, we believe it seldom is. Patchwriting occurs whenever a writer struggles with a source text, and many first-year college students don’t even know that it isn’t “paraphrase.” Patchwriting is underdeveloped writing, not transgressive writing.
PIL: As you have suggested in your latest paper, “writing from sources is a staple of academic inquiry.” What else have your preliminary results told you about how college students use sources? Where do they struggle the most with applying and integrating information sources into the papers they are writing? In a larger sense, what does this tell us about today’s students, their writing styles, and their information literacy competencies?
Sandra & Becky: Our recently completed data confirm much of what the pilot study found, but we are not yet able to address the question of whether what we have found indicates that today’s students are any different from previous students. One fascinating study currently planned is to code researched papers written before the Internet became a cultural staple. We would not be surprised to find significant patchwriting from these sources, but we are looking forward to being able to publish some comparative data on this subject. If anyone out there has access to student researched papers written before 1994 that we might include, please let us know! Our contact information is on our Website.
But to answer your question. What do our data tell us about how students use sources? We need to make a distinction here. In short papers and in researched papers for discipline-specific and even writing intensive classes, students may use sources very differently than they do in first-year composition classes. Analysis of papers written for these different contexts is also planned for our subsequent research. All we are qualified to comment upon right now is how students use sources in researched writing for first-year composition courses. And the news is not good. We found that 42% of the citations are direct quotations, 16% are patchwritten, 32% are paraphrased, and 6% are summarized. A further 4% of the cited material was directly copied with no quotation marks or other indication that this was not the student’s own words. At first glance the relatively low percentages for patchwriting and unmarked but cited copying (misused sources) may seem encouraging, but those concerned with plagiarism need to remember that we did not code unattributed copying. We don’t know whether there was any because we did not analyze material that was not cited; but conversely, we cannot say that there was not any, either.
So, most of the citations were for material that was either quoted or paraphrased. If your focus is on procedure and correct format, these papers are a great success. But if you look at this another way and remember that for most of us, “research” is about the discovery of new information and ideas, and the synthesis of those ideas into deeper understanding, the majority of the papers failed. Only 6% of the citations are to summarized material. It is in summary that writers demonstrate comprehension of the larger arguments of a text, working from ideas rather than sentences. And in the papers we studied, students are not doing that. Further, 46% of the citations are from the first page of the source in question. Yes, that really is 46%, and a full 70% come from somewhere in the first two pages (1,328 citations from a total of 1,911 that we coded). The majority of the sources are cited only once, and only a handful of the papers cite any source in a way that suggests the student was engaging with the entire text.
On the subject of information literacy there is also mixed news. Our study found that 24.4% of the citations are to journal articles, most of which would qualify as academic. A further 17.9% are to books, although this category includes fiction, anthologies (including poetry and plays), and collections of short articles such as the Opposing Viewpoints series. But on the other extreme, 24.5% come from web-based sources, and 26% come from sources that are two pages or less in length (44% of the citations are to sources that are no longer than four pages). What this suggests more than anything else is that students may not know how to distinguish between what instructors would consider “reliable sources” and totally inappropriate sources for college-level (or even high school) papers. The Citation Project needs to conduct follow-up research to help us determine whether the students are actually aware that some sources are less appropriate than others; until then, we can only speculate about what this reveals about their information literacy awareness. This stage of our research focuses exclusively on what students wrote in their papers, not why they did it or what they knew.
PIL: In 2010, PIL conducted a content analysis of 191 research assignment handouts that instructors distributed to students on 28 U.S. college campuses. Overall, we found handouts provided more how-to procedures and conventions for preparing a final product for submission, and not as much guidance about conducting research and finding and using information sources. Further, few handouts provided details about preventing plagiarism. And if they did, the handouts tended to emphasize the disciplinary recourse that instructors would take against students who were caught in acts of academic dishonesty. Why is the topic of plagiarism so frequently couched in the punishment that will be meted out for violators? What are the consequences of admonishing instead of educating students about plagiarism?
Sandra & Becky: Yours is fascinating research, and it seems to reflect what we have found from analyzing the papers in our sample and from reading the results of various local surveys people have conducted at Citation Project participating campuses. As you indicate, it reveals an emphasis on crime and punishment when, we believe, we should be focusing on engagement with source material and the research process as a generative, meaning-making activity. That is, instructors’ focus on the ethics of source use and the fear of plagiarism has obscured the reason that researched writing is assigned in the first place. By focusing on procedures and conventions, instructors render researched writing as stultifying as the five-paragraph theme, and by indicating a concern with form more than content, this pedagogy reduces the process to a mindless exercise. Citation Project researchers were frequently struck by this as we analyzed the sources students selected for their papers. For example, when every paper in a school’s sample included a citation to a book, we could deduce that the curriculum required students to refer to a book. The cited material from books does not, however, always seem to inform the paper—although thankfully there are some exceptions to this norm. We are concerned at our finding that the cited material is most often quoted, and most often drawn from the first few pages of the book or a related chapter of that book. In the latter case, we can assume that students are using the index to find relevant information, but the former suggests that students are just reading the first few pages of the introduction and gleaning workable quotations from the material there. In neither case do students seem to be using the book as a resource from which they might learn about the topic at hand, which is what we assume the instructors intended. Similarly, a vast majority of the papers cite each source only once, and only in one paragraph. A lot of attention does seem to have been focused on the bibliographic entries, though; these are usually close to MLA citation. This, too, suggests an emphasis on procedure rather than
As we said above, we only studied the way students handled cited sources. If the source was not cited, we did not code it. So we very intentionally are not analyzing plagiarism. Yet when we talk about our research, the first thing many people ask is about plagiarism, and that is also the focus of most of the current scholarship about students’ source-based writing. We do concur that academic integrity is an important issue, but it is not the focus of the Citation Project, for a very important reason: If students are not engaging with the research process and are not carefully reading the sources they cite, plagiarism—in one form or another—is likely to occur, because the students lack the source-handling skills and practices that are necessary for responsible, ethical writing. We believe that when students are engaging with their researched sources, they will have a significantly reduced motivation to deliberately plagiarize. When the subtitle of our Website says, “preventing plagiarism, teaching writing,” we are asserting a causal relationship. We would make that relationship explicit by saying, “to prevent plagiarism, teach writing in a way that engages students with research, research sources, and research writing.”
In this first stage, the Citation Project has studied the textual evidence of what students are doing with their sources. Preventing plagiarism is a desired outcome of our research, but as an indirect result of students’ knowing how to work with sources. If instructors know what their students are doing—and that is what our research accomplishes—then they know what the basic instructional tasks are. Correct citation is what one teaches when one is focused on plagiarism prevention. Engaged reading of sources and thoughtful writing about them is what one teaches when one realizes that students may be citing correctly while merely parroting sentences from the first few pages of a source, and nothing more. If all they’re producing is a hollow simulacrum of research, should we be surprised if they sometimes choose to plagiarize?
To answer another part of your question: we know of no reliable way to track whether acts of plagiarism have increased, or whether the Internet has simply made it easier for us to discover (and prove) plagiarism. We do know, however, that instructors and campus policies should remove patchwriting from the category of plagiarism. Treat it as bad writing, partial reading, or unsuccessful paraphrase, but not as academic dishonesty.
Some campus policies still categorize patchwriting as a form of plagiarism, even when the source is cited. So this issue of definitions is crucial. In 2003, the Council of Writing Program Administrators produced a best practices document, “Defining and Avoiding Plagiarism: The WPA Statement on Best Practices,” in which the authors make a distinction between plagiarism and misuse of sources that we find very helpful, and we urge readers to consult that document.
PIL: Lastly, what are three things that instructors can do to integrate plagiarism prevention into their curriculum in more far-reaching ways? Do you have any examples of how best to teach plagiarism prevention in the digital age? What can librarians do to teach students how to avoid plagiarism, especially inadvertent plagiarism?
Sandra & Becky: First, instructors should teach students how to read complex sources critically. In the 174 papers in our sample, there is very scanty evidence of this.
Second, instructors and librarians should teach, at every opportunity, methods of good source selection. This has to start not with “a journal is better than a Website” but with “here’s how you identify the bibliographic elements of a text.” A lot of the sources used by the students in our sample are stunningly cheesy and simplistic, but this may not be evident in the bibliographic entries—not because the students are trying to dissemble, but because they don’t know how to identify bibliographical elements when looking at a text, much less evaluate the quality of that text. Just teaching students to differentiate who publishes a journal from the journal’s name would be a step forward; we have frequently encountered things like “Johns Hopkins University” as the name of a journal, when it is Johns Hopkins that publishes the journal. Or we see “Health and Diet” in the journal position in a bibliographic citation, but when one tracks down the source, one discovers that “Health and Diet” is a section of Web MD. If students can’t identify the bibliographic elements of a text, how can they evaluate the text for quality and authority?
We have been quite surprised by how few of the sources cited in the papers contain works cited pages themselves, and think this might provide a fruitful way to begin a conversation about research and citation. If students are reading essays in class and using sources in papers that do not themselves cite sources, our emphasis on this feature of academic writing might seem a little confusing. More important, when using a bibliography-free source, the students lack models of the kind of source engagement their instructors want, and can’t see the reasons for it. A discussion of citation as part of a larger conversation about the purpose of research and the nature of the academic conversation might be very useful. Students need to understand why we cite sources in the first place, as well as the roles that citations and lists of references can play in gaining new knowledge. They need to know why instructors are assigning a research paper. If the only reason to assign the paper is to teach the procedures of research and citation, as your study of handouts suggests, perhaps instructors could develop better methods to do that. It is reasonable to assume that the more students get the sense that content does not matter, the more they are likely to produce disengaged, patchwritten work.
Third, instructors need to teach students how to work with and summarize extended portions of text. Our analysis reveals that 94% of the 1,911 citations are at the sentence level—either quotation, cited but unmarked copying, patchwriting, or paraphrasing. This, we believe, is a stunning statistic. Our researchers defined summary as the restatement and compression of three or more consecutive sentences. Even then, students summarized only 6% of the time, indicating that they either could not or would not engage with extended passages of text. Teaching students how to summarize and how to integrate that summary into researched writing are compelling pedagogical mandates.
Notice that none of these three recommendations directly addresses plagiarism. That’s because we believe that plagiarism will inevitably occur if students can neither read complex sources critically nor conduct authentic researched inquiry. And our research reveals that they do not, and raises the question of whether that is a matter of students’ choice, or a matter of their being unable to. College instructors have a hard job to do, and fetishizing plagiarism is diverting us from the hard work ahead.
We’d like to thank Project Information Literacy for the opportunity to talk about our research. We’re looking forward to readers’ comments and questions, and to learning what others are doing in this area of inquiry.
For more about their findings from an analysis of 18 student papers, see Rebecca Moore Howard, Tricia C. Serviss, and Tanya K. Rodrigue. "Writing from Sources, Writing from Sentences," Writing and Pedagogy, 2.2 (Fall 2010): 177-192, at http://writing.byu.edu/static/documents/org/1176.pdf (accessed 8 July 2011).
Smart Talks are informal conversations with leading thinkers about the challenges of finding and using information, conducting research, and managing technology in the digital age.
Smart Talks is an occasional series, produced by Project Information Literacy (PIL). PIL is an ongoing research study, based in the University of Washington’s Information School with contributing support from the Berkman Center for Internet and Society at Harvard University, the John D. and Catherine T. MacArthur Foundation, Cengage Learning and Cable in the Classroom.
Smart Talks are open access. No permission for their use is required from PIL, though we ask that this source be cited as: “Unraveling the Citation Trail,” Project Information Literacy Smart Talk, no. 8, Sandra Jamieson and Rebecca Moore Howard, The Citation Project, August 15, 2011.
Alison Head, Lead Researcher for Project Information Literacy, conducted this email-based interview with Sandra Jamieson and Rebecca Moore Howard.
 This definition of plagiarism was found on http://www.britannica.com/EBchecked/topic/462640/plagiarism in the Britannica Encyclopedia on 8 July 2011.