User Experience Research
Plagiarism in Programming: Experts' and Novices' Beliefs About What It Is and Whether It Is Wrong
Plagiarism is an ongoing issue within academia, especially in the increasingly popular discipline of engineering (McCabe, 2005; Parameswaran & Devi, 2006; Stephen, Young, & Calabrese, 2007). Cheating can lead to credentials that misrepresent competence, which has real-world consequences because engineers design the products and systems of everyday life. Despite its urgency, plagiarism has rarely been explored within the engineering majors.
Different communities and disciplines have different norms for what counts as plagiarism (Pennycook, 1996; Yeo, 2007). There is no clear, unanimously agreed-upon definition of what constitutes integrity in coding. For example, many professors have their own expectations about what plagiarism is: some allow collaboration, whereas others condemn it. It is important to study individuals with varying levels of programming experience because exposure affects understanding of discipline-specific standards.
Educators have attempted to curb the issue of cheating in programming by using plagiarism detection software (e.g., MOSS; Schleimer, Wilkerson, & Aiken, 2003). However, this is not foolproof. As mentioned, people disagree as to what constitutes plagiarism in coding. Thus, it is important to assess the extent to which such software aligns with people’s opinions about what counts as plagiarism.
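To make the idea of detection software concrete, the sketch below illustrates one common approach: hashing overlapping k-grams of normalized source text and measuring set overlap. This is a simplified illustration only; MOSS itself uses a more sophisticated winnowing fingerprint scheme (Schleimer et al., 2003), and the class and method names here are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified sketch of how a similarity detector might compare programs:
// strip whitespace, hash overlapping k-character substrings, and compute
// the Jaccard overlap of the two hash sets. Illustration only; MOSS uses
// winnowing fingerprints rather than this naive scheme.
public class SimilaritySketch {
    static Set<Integer> kgrams(String source, int k) {
        // Normalize by removing whitespace so formatting changes don't matter.
        String s = source.replaceAll("\\s+", "");
        Set<Integer> grams = new HashSet<>();
        for (int i = 0; i + k <= s.length(); i++) {
            grams.add(s.substring(i, i + k).hashCode());
        }
        return grams;
    }

    // Jaccard similarity of the k-gram sets: 1.0 = identical, 0.0 = disjoint.
    static double similarity(String a, String b, int k) {
        Set<Integer> ga = kgrams(a, k);
        Set<Integer> gb = kgrams(b, k);
        Set<Integer> inter = new HashSet<>(ga);
        inter.retainAll(gb);
        Set<Integer> union = new HashSet<>(ga);
        union.addAll(gb);
        return union.isEmpty() ? 0.0 : (double) inter.size() / union.size();
    }
}
```

Note that because whitespace is stripped before hashing, purely cosmetic edits (indentation, blank lines) do not reduce the score, while renaming variables does; this gap between surface similarity and semantic similarity is exactly why such tools can disagree with human judges.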
Even if people agree that an act of copying counts as plagiarism, a crucial barrier to integrity is whether students believe plagiarizing is wrong. They may, for example, recognize that collaboration is against the rules but nevertheless see it as justified because it is closer to real-world practices. Very little research has measured whether programmers are concerned with academic integrity.
The present research establishes a new method for studying how people classify plagiarism in computer programming. We built upon prior work (e.g., Roig, 1997, 2001; Waltzer, Hari, Gonzalez, Berman, & Dahl, 2017) that asked people to decide whether concrete passages of text constitute plagiarism, a method which, to our knowledge, has never been applied to computer code. The project addressed three key questions: (1) What do individuals with varying programming experience believe counts as cheating? (2) Do people’s decisions align with those of a plagiarism detector (i.e., MOSS)? (3) Do people believe cheating is wrong? To answer these questions, we asked programming experts and novices to complete an online survey in which they decided whether various examples counted as plagiarism and whether they were wrong.
We recruited 12 participants with over one year of programming experience (“experts”: 18% female, mean age = 26.30 years) and 19 participants with no programming experience (“novices”: 68% female, mean age = 21.80 years) to take an online survey, which took approximately 30 minutes to complete. Participants decided whether five hypothetical scenarios counted as cheating (see Table 1), reported whether they thought cheating was okay, and judged the extent to which six randomly assigned pairs of Java code (ranging from highly similar to highly different) counted as plagiarism. The full survey is permanently available at: https://ucbpsych.qualtrics.com/jfe/form/SV_0q4sIZ9ISM0i1jD
What do individuals with varying programming experience believe counts as cheating?
Though both groups generally agreed about which hypothetical scenarios counted as cheating, there were several striking differences (see Table 1). Experts deemed collaboration to be cheating much more often (70%) than did novices (37%), and novices sometimes thought using professor-given test code was cheating, whereas experts all agreed it was allowed. These group differences highlight the importance of discipline-specific experience in judgments of academic integrity.
Both groups were able to distinguish between the levels of similarity between code pairs (i.e., similar, intermediate, different), though experts more clearly differentiated between all levels (see Figure 1). Both groups identified similar pairs of code (which involved variable name changes only) as plagiarized, but there was substantial disagreement about intermediate cases (which involved, for example, switching between for- and while-loops and changing comments or bracketing styles).
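The actual survey stimuli are not reproduced here; the following hypothetical Java pair illustrates the kind of transformation used in an "intermediate" item, where a for-loop is rewritten as a while-loop with renamed variables and altered comments but identical behavior.

```java
// Hypothetical illustration (not an actual survey item): two versions of
// the same computation, differing in loop construct, variable names, and
// comments: the kinds of changes characteristic of the "intermediate" pairs.
public class LoopPair {
    // Original: sum the integers 1..n with a for-loop.
    static int sumOriginal(int n) {
        int total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    // "Intermediate" rewrite: same logic expressed as a while-loop,
    // with renamed variables and a different comment.
    static int sumRewritten(int limit) {
        int acc = 0;
        int k = 1;
        while (k <= limit) {
            acc += k;
            k++;
        }
        return acc;
    }
}
```

Because the two methods are behaviorally identical but textually dissimilar, reasonable judges can disagree about whether the rewrite constitutes plagiarism, which is precisely what the intermediate items were designed to probe.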
Figure 1. The proportion of participants in each group who judged each pair of code as plagiarized.
Do people’s decisions align with those of a plagiarism detector?
As a comparison, we ran the programs through MOSS (Schleimer et al., 2003), an algorithm that reports a percentage of similarity for each pair of code files. The MOSS ratings were moderately positively correlated with participants’ judgments, though experts aligned more closely with MOSS (r = .65, p = .004) than did novices (r = .56, p = .016). This indicates that MOSS was somewhat more representative of the judgments of those with programming experience.
Do people believe cheating is wrong?
Ninety-three percent of participants said that cheating is not okay. Similarly, when they identified an act as cheating, both experts (r = -.80, p < .001) and novices (r = -.93, p < .0001) nearly always said the act was not okay, though experts were less resolute in this sentiment. There were several cases in which experts, despite identifying an act as cheating, nevertheless thought it was okay, giving reasons such as “they comprehended the lesson”. This did not appear at all in the novice group.
This method of examining plagiarism in programming has merit: it distinguished novices’ responses from experts’, and it showed that MOSS aligns more closely with experts’ judgments. Experts also sometimes separated whether an act counted as cheating from whether it was okay, which suggests a distinct mindset toward plagiarism among those with engineering experience.
