Cheating on labs

Computer Science Educators Asked on May 14, 2021

It is spectacularly easy to cheat on CS labs at the high school level and above (roughly ages 14 and up), particularly on short, early assignments. In fact, there is an article in today’s NY Times about this problem.

For instance, for a very early assignment, a question might be:

Create a small program that asks the user for a word and a number n, and responds by printing the word n times.

This is a rather extreme example, but it could easily be the first part of a lab assignment for a student who has just encountered for loops for the first time.

Detecting cheating here is nonsensical because a huge portion of the resulting code will be nearly identical even if the students didn’t cheat. For that reason, I am seeking ways to discourage students from trying to cheat in the first place.

What are some best practices for staving off the flow of plagiarism?

11 Answers

I am a student assistant for some programming courses at bachelor level. What we do:

  • Automated plagiarism tests that are tailored to the language. The general idea is to strip comments, whitespace, variable and function names, and literals, keeping only the general structure of a program, i.e. parentheses, curly braces, semicolons, etc. (for C-like languages, that is). Then compare this 'hash' across all hand-ins, and when you find a match, manually check whether the programs look alike. For example, the hash of the program below would be something like #<>(,*){(;;--){("",);}}:

    #include <stdio.h>
    void repeat(int n, char *word) {
        for (; n; n--) {
            printf("%s\n", word);
        }
    }

  • When plagiarism is suspected, notify the examination board directly (this is a policy, and it helps to show the severity of the issue).

  • At the beginning of the course, explicitly mention that we expect students to hand in alone or in pairs, and that collaboration between groups is OK as long as it remains on an abstract level.
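
The structure-only comparison described in the first bullet can be sketched roughly as follows, in Python for C-like sources. This is a minimal illustration, not the tool we actually run: the names `structural_fingerprint` and `group_by_fingerprint` are hypothetical, and the regular expression is a simplification that ignores preprocessor tricks and some literal forms. A match should only ever trigger the manual check.

```python
import re
from collections import defaultdict

# One pattern for everything that is either dropped or collapsed.
_TOKEN = re.compile(
    r'"(?:\\.|[^"\\])*"'     # string literal
    r"|'(?:\\.|[^'\\])*'"    # character literal
    r"|/\*.*?\*/"            # block comment
    r"|//[^\n]*"             # line comment
    r"|<[\w./]+>"            # header name such as <stdio.h>
    r"|[A-Za-z_]\w*"         # identifiers and keywords
    r"|\d[\w.]*"             # numeric literals
    r"|\s+",                 # whitespace
    re.DOTALL,
)


def _reduce(match: re.Match) -> str:
    token = match.group(0)
    if token[0] in "\"'":
        return token[0] * 2   # keep only the empty quotes
    if token[0] == "<":
        return "<>"           # keep only the angle brackets
    return ""                 # comments, names, numbers, whitespace


def structural_fingerprint(source: str) -> str:
    """Reduce C-like source to its punctuation skeleton."""
    return _TOKEN.sub(_reduce, source)


def group_by_fingerprint(submissions: dict) -> list:
    """Map {student: source} to lists of students sharing a skeleton."""
    groups = defaultdict(list)
    for student, source in submissions.items():
        groups[structural_fingerprint(source)].append(student)
    return [sorted(names) for names in groups.values() if len(names) > 1]


SAMPLE_ORIGINAL = r'''
#include <stdio.h>
void repeat(int n, char *word) {
    for (; n; n--) {
        printf("%s\n", word);
    }
}
'''

# Same program with renamed identifiers and added comments.
SAMPLE_RENAMED = r'''
#include <stdio.h>
/* prints a word several times */
void echo(int count, char *text) {
    for (; count; count--) {
        printf("%s\n", text); // one line per repetition
    }
}
'''

if __name__ == "__main__":
    print(structural_fingerprint(SAMPLE_ORIGINAL))
    print(group_by_fingerprint({"alice": SAMPLE_ORIGINAL, "bob": SAMPLE_RENAMED}))
```

Renaming variables or adding comments leaves the skeleton unchanged, which is exactly why short programs produce many innocent matches and why the manual check stays necessary.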

There always remain some people who work together, and students often start groups on social media to exchange solutions. This is typical, especially for first-year courses, and I'm not sure something can be done about it. However, by acting fast (i.e. reporting it quickly, before the next assignment has to be handed in), students are scared off and in later weeks it is much less of a problem for us.

For small programs like your example, it is not always possible to justify a plagiarism accusation, because there are not many ways to write such a program. We usually tend to give students larger tasks and more freedom. For example, not "Here is the skeleton of the Car class, finish the methods drive, turn and halt" but "Choose a suitable representation of a car that can drive, turn and halt."

Of course, this is not always possible, and especially in beginners' courses plagiarism is 1) much more common and 2) much harder to check. This is why we don't give the lab assignments a large share in the final mark (most of the time just pass/fail), and rely on the final exam for an accurate grade. The lab sessions are for practice; it is ultimately up to the students to use them wisely.

Answered by user24 on May 14, 2021

I can only speak from a high school perspective as that is what I teach (14-18 year olds), and I truly feel your concern on this question. The biggest issues I have in my CS class are distracting websites (YouTube, games, etc.) and copying code. I have certainly not fully conquered the problem, but here are a few things that I do that seem to help:

  • Don't assign work that is not to be completed in the lab. I know this is probably a debated issue but I, personally, do not assign any work outside of class time, so no homework or projects or anything done outside of the lab. While there are many reasons for this within the specific context of my high school that are irrelevant to this question, a big benefit is that I am always present when students are working on labs. It is much more difficult to copy off of another student when I am monitoring the lab, and even harder when I am watching screens from a program like Apple Remote Desktop. I have found that as long as I actively monitor my classes when they are working on a lab, it cuts way down on the copying.

  • Don't have students turn work in, do screen checks instead. This takes some getting used to but the major benefits:

    • Gets you up walking around the room, which helps with monitoring and making sure students aren't looking at each other's screens
    • Easier to compare two adjacent students' code when both are up on screens right next to each other at the same time and you are right there, so they can't just try to hide or minimize
    • More efficient for grading in that you don't have to individually open up files. Just walk up to student, have them run the code, and check that the output is correct
    • NOTE: This does not really work for more complicated, larger, multi-file, projects but you did state in your question that this question was aimed towards the starting, smaller projects which this is perfect for
  • Assign different assignments. Bear with me, because I know that as soon as someone says "Just create different assignments..." it almost always means more work for the teacher. But it does not have to be, especially for CS. For instance, one of my first Java assignments is to code a basic "Area Calculator" that asks for dimensions of a shape and then calculates the area. What I do is just assign each student a different shape from a pool of about six shapes. Then I make sure that no consecutive students have the same shape. When we do the Java MadLibs assignment, I have a few different MadLibs templates that I use, again making sure no consecutive students have the same template. This really creates no extra work for me other than when I first came up with the assignment, and even then it was totally manageable. I realize that they could still "copy" off each other to some extent, but I have found that this cuts way down on the copying because students realize it would be more work to copy someone else's code and then adapt it, rather than just doing it themselves. You'd be surprised how small tweaks to the assignment can dissuade students from copying.

  • Ask students questions about their code. This works particularly well when doing screen checks but just ask basic questions like "What does this line do?" or "Why did you put that there?". While this doesn't necessarily cut down on cheating it makes it very easy to identify the students who are copying because they will have the code done but have no idea what any of the lines do.

  • Position computers so it's impossible to "screen snipe". I know that this is very dependent on the setup of the lab, but if at all possible, arrange the workstations so that students do not have a direct view of each other's screens. One arrangement that I have found works well is to have four computers at a table, each facing a different direction. Again, I realize this may be impossible (as it is in the lab that I currently teach in), but where it is possible I have found it helps.
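
For the "assign different assignments" idea above, handing out variants so that no two consecutive students share one can be automated. The following is only a sketch under assumptions: `assign_variants` is a hypothetical helper, the roster is assumed to be listed in seating order, and the shape names are placeholders rather than my actual pool.

```python
import random


def assign_variants(roster, variants, seed=None):
    """Pair each student (in seating order) with a variant, never
    repeating the variant given to the previous student."""
    unique = list(dict.fromkeys(variants))  # drop duplicates, keep order
    if len(unique) < 2:
        raise ValueError("need at least two distinct variants")
    rng = random.Random(seed)  # a seed makes the hand-out reproducible
    assignment = []
    previous = None
    for student in roster:
        # Choose among every variant except the neighbour's.
        choice = rng.choice([v for v in unique if v != previous])
        assignment.append((student, choice))
        previous = choice
    return assignment


if __name__ == "__main__":
    # Placeholder shapes and student names, for illustration only.
    shapes = ["circle", "triangle", "rectangle", "trapezoid", "ellipse", "rhombus"]
    roster = [f"student{i:02d}" for i in range(1, 11)]
    for student, shape in assign_variants(roster, shapes, seed=1):
        print(student, shape)
```

The same helper works for the MadLibs templates: pass the template names as `variants`.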

These are the main things I do and I am sure I am forgetting something but I hope these help and can improve the issues you are having.

Answered by celeriko on May 14, 2021

My main tool to prevent plagiarism on short beginner labs is to have a discussion with my students and to under-count the labs in grading.

I explain to my students that cheating is a huge problem on CS labs specifically. I use a news item of a huge number of university freshmen expelled for cheating in a CS class. (Search as I might, I cannot find the example right now - I will add it in later).

I then explain that the largest component of their grade will come from the quizzes and tests, not from their labs. The labs, as I explain it, are designed to help them do well on their quizzes and tests. Therefore, kids who cheat on labs are just going to shoot themselves in the foot.

When I polled students about cheating the year before I adopted this practice, I found that a very tiny portion of my students would admit, on a survey, to having cheated, but 100% reported that others cheated regularly.

By contrast, when I did the same poll the following year, almost none would admit to cheating themselves (this is no change), but only about 1/3 said that others cheated regularly. Needless to say, I kept this change.

Answered by Ben I. on May 14, 2021

For me, labs are worth very little. The district sets them to be only 10% of the student's average. So I don't worry about them working together. In fact, I encourage it. What I tell students is that as long as they understand the lab when they're finished, it's a successful lab.

Yes, I'm sure there are students that straight copy from their friends. At 10% though, it's not worth worrying about. They'll have tests and quizzes based on the labs and will absolutely bomb those.

For larger projects I only count the actual project as part of the grade. Turning in a working project is 70% of the grade, although I'm thinking of dropping that to 60% next year. The rest of the grade comes from an in class, on paper, free response that's based on what they wrote for the project. If they complete the project and understand what they turned in, the FRQs should be relatively easy. If they copied from a friend, they're going to be hurting.
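
To make the arithmetic of that split concrete, here is a minimal sketch. The function name `project_grade` and the example scores are made up for illustration; only the 70/30 and 60/40 weightings come from the paragraph above.

```python
def project_grade(project_score, frq_score, project_weight_pct=70):
    """Combine a project score and an FRQ score (both 0-100) using an
    integer percentage weight for the project part."""
    if not 0 <= project_weight_pct <= 100:
        raise ValueError("weight must be between 0 and 100")
    frq_weight_pct = 100 - project_weight_pct
    return (project_weight_pct * project_score + frq_weight_pct * frq_score) / 100


if __name__ == "__main__":
    # Hypothetical student who copied: perfect project, weak FRQ.
    print(project_grade(100, 20))        # 76.0 with the 70/30 split
    print(project_grade(100, 20, 60))    # 68.0 with a 60/40 split
    # Hypothetical student who understood their own work.
    print(project_grade(90, 85))         # 88.5 with the 70/30 split
```

Lowering the project weight mostly affects students whose FRQ score falls far below their project score, which is the pattern copying tends to produce.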

I also run big projects through MOSS. Usually just the threat of that is enough to limit straight copying.

Answered by Ryan Nutt on May 14, 2021

If you're routinely setting tasks that are vulnerable to this kind of plagiarism, I'd start by questioning the purpose and value of the tasks - can you alter them to be less closed, more open, without losing the value they were providing to you?

If I'm setting something for homework or similar, it's always a creative task: e.g. "Write a python program to draw the flag of a foreign country of your choice".

("Foreign" = prevents them from all doing the most obvious choice: your current country's flag. "Your choice" = empowers them, but also makes it harder for people to hide behind the claim of accidental coincidence if e.g. 4 students all do the flag of the home country of one of the students' parents. Also easy to evaluate verbally whether it's an accident or not ... etc)

Programming shouldn't be too focussed on filling in deterministic, solved problems that are easier to get by copy/pasting off the web. If you set a task that's easy to copy/paste, you should expect students to copy/paste it - and unless an exam board is taking the choice out of my hands, I'd be on the side of the students (assuming they checked it worked, integrated it, and understood it - all those are easy to assess after the fact, but that's a separate question, I think?)

Answered by user31 on May 14, 2021

In addition to all of the great answers here, one further tool to consider is MOSS (Measure of Software Similarity), which has been released for free to educators by Stanford:

It can currently analyze code in C, C++, Java, C#, Python, Visual Basic, Javascript, FORTRAN, ML, Haskell, Lisp, Scheme, Pascal, Modula2, Ada, Perl, TCL, Matlab, VHDL, Verilog, Spice, MIPS assembly, a8086 assembly, and HCL2.

Answered by Ben I. on May 14, 2021

I just got a chance to read the article you linked to. As a teacher of CS50 AP, I can attest to the numerous solutions that are available online for all things CS50. It's almost unfair for a student trying to remain ethical when Google suggests appending "solution" to a query like "cs50 mario." Even a search without "solution" will often return a link to code on GitHub before the actual CS50 pset specs.

A few things I will do next year...

  • Continue to allow students to exercise the "regret clause" as I did this year

  • Utilize revision history - the Cloud9 IDE has a full history of each keystroke with a timestamp as a sort of built-in version control

  • Add just enough variation to assignments to distinguish one section's specs from the other and from CS50 itself

  • Grade the code but weigh more heavily reflection assignments on that code/assignment

As I prepare students for the AP Exam, I have to give them practice with written responses on their code for the Explore Task. I plan on assigning reflection components and looking more closely at comments that document student thinking and trying to grade process over product.

Paramount to all of this though is creating a classroom where students genuinely want to work through programming challenges. They have to know they will receive support when they struggle and hit a road block. I do my best to be available via office hours, especially before a big assignment is due, and sometimes knowing that they have a place to look for help other than Google makes a big difference.

Answered by Peter on May 14, 2021

One option is to have a short closed-book in-class quiz after each lab assignment to test each student's understanding (which is a good thing to do even in the absence of cheating).

For your sample problem, you might ask them to fill in the blanks to complete code to solve a comparable problem or to answer multiple choice questions about a short program you provide. I've used Moodle for these quizzes, which makes grading trivial.

Answered by Ellen Spertus on May 14, 2021

All the work we've done on in-flow peer review (see, for instance, our working group report) is aimed in part at this question. Overall, I believe we should rethink our curricula, pedagogy, and assessments so that we stop viewing plagiarism as a huge problem, but instead creatively think about alternate educational practices.

Answered by Shriram Krishnamurthi on May 14, 2021

At Denison, our intro class has labs designed around real world problems, and involves lab reports. This makes it a lot harder to cheat. We're not saying "implement quicksort", we're saying "write a simulation to check Tom Schelling's Nobel prize winning work, then write a 2-3 page paper explaining what you found."

Of course, students will still try to get help from each other or from the TAs. In my syllabus, I write some very explicit rules about what kinds of help are and are not allowed. Then, I give them a take-home quiz on the syllabus and it includes 2 case studies ("Jill was working on her lab when ...did a violation of the honor code occur?"). After making this change, and discussing the case studies in class after students hand in the quiz, cheating went down. By the way, I really like this syllabus quiz idea. It lets me spend the first day of class doing some active learning and setting the tone for the semester, rather than going over the syllabus. And it gives students a free quiz grade to force them to actually read the syllabus.

Answered by David White on May 14, 2021

I teach at the community college level, in Ontario's Colleges of Applied Arts and Technology system. My students generally study electronics and don't have any burning desire to become programmers. I teach hardware interfacing and realtime responses to hardware or external events with Python, CircuitPython and Arduino, on Windows or Raspberry Pi. For academic integrity, I have several approaches.

  • Minimize the temptation. Generous deadlines, appropriate difficulty, open ended assignments and lots of opportunities to ask questions.
  • Avoid reusing assignments. Last year's assignment was an auto ranging voltmeter, this year was a direct digital synthesis waveform generator, and so on. This gets rid of code reuse from one cohort to the next.
  • In my course policy, I refer to the MIT policy on coding. It's quite reasonable in that giving credit in comments is enough.
  • For my second year class, I use a source code management system (previously Fossil, now Mercurial and TortoiseHg) and award marks for intelligent use of the system, including meaningful commits and commit messages. It looks really bad when a program appears in a single commit on the due date.
  • And yes, I routinely use MOSS, but warily. It works as a front end. I look at the top of the list and examine the code. Most of the similarities are idioms of the language like "for line in input_file:" and can be dismissed. What I look for (as pointed out in the NY Times story) is a pattern of similar mistakes or unusual coding, renamed identifiers, or shuffled order of function definitions.
  • Finally, I will overlook some cases where a pair or trio of students who are known to work together in all their courses have very similar solutions. I observe their interactions in the labs (not just in programming class, but in electronics and so on), and it's pretty obvious that learning is happening. Typically it's one or two of the more eager students pulling along a friend who's also motivated but needs the help to keep up. I can't stop it, so I might as well embrace it.
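
The "single commit on the due date" signal from the source code management bullet can be checked mechanically once commit timestamps are available. The sketch below assumes the timestamps have already been exported from the repository by some other means. The function name `is_last_minute_dump` and the thresholds (three commits, a 24-hour window) are arbitrary illustrations, not values from my marking scheme.

```python
from datetime import datetime, timedelta


def is_last_minute_dump(commit_times, deadline,
                        window=timedelta(hours=24), min_commits=3):
    """Return True when the history suggests the work appeared all at once:
    too few commits, or every commit inside the window before the deadline."""
    if len(commit_times) < min_commits:
        return True
    # If even the earliest commit is within the window, all of them are.
    return deadline - min(commit_times) <= window


if __name__ == "__main__":
    deadline = datetime(2021, 5, 14, 23, 59)
    steady = [datetime(2021, 5, 3, 10, 0), datetime(2021, 5, 7, 15, 30),
              datetime(2021, 5, 12, 9, 0), datetime(2021, 5, 14, 20, 0)]
    dump = [datetime(2021, 5, 14, 23, 30)]
    print(is_last_minute_dump(steady, deadline))  # False
    print(is_last_minute_dump(dump, deadline))    # True
```

A flag here is only a reason to look at the repository and talk to the student, in the same way that a MOSS hit is only a starting point.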

This year it's going to be a lot harder because most of my teaching is remote and I can't get a feel for who's doing what, and who are in the lead peloton drafting the rest.

Hope this helps.

Answered by Louis B. on May 14, 2021
