What would be a good project for teaching big program concepts?

Question

I'm a freelance computer science tutor with junior high and high school students (working outside of school; I give them assignments).
Mostly my assignments are problems we find online, such as Codewars or USACO problems. This isn't the best way to teach them about large project issues, such as "maximize cohesion, minimize coupling" or clear structure and documentation.
When I had fewer students I did unique projects with each one, but that takes way too much time outside the session for me to prepare. So what I'm looking for now is some kind of project I can do with my students that would teach large project concepts.
This project may be fairly complex. But -- it should be the kind of complexity that my students can wrestle with on their own. I need to be able to advise them on it in a one-hour lesson. The problem with most student projects is that they spend the week making a mess of things and I can't sort them out in an hour.
It should be doable in Python or C++. I don't know much about web programming so that's probably out.
It should be fun and hold their attention - which probably means a graphical game. I can't think of a text-based project which would hold their attention enough. If we use Qt then at least that is available in both C++ and Python.
(I don't want to use PyGame - too primitive and frustrating as a game engine.)
A simple game alone probably wouldn't teach "maximize cohesion, minimize coupling" very well. Perhaps a game in which they implement the AI strategy? That might have complex enough algorithms. Maybe we could even play their AIs against each other.
Maybe a turn-based game like Civilization? (A primitive form of it, of course.)

guitarcat · Accepted Answer

You're definitely on the right track with your project scope, but high school students can pay attention to large-scale text-based games. I have had great success having them implement Blackjack. This project is large by nature, but you can place restrictions on how they solve the problem. I've found that using three classes helps cut down on confusion.

The card class creates a single card object for use in the deck class
The deck class creates a deck of 52 cards made of card objects, also includes a shuffle and dealTopCard method
The hand class lets them actually deal a hand to themselves and play against a simple computer (the AI is simple to program)

I give this problem to AP CSA students in two parts, the deck and card classes with methods, and the hand class with functional game parts.
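
As a rough illustration of the three-class split described above, here is a minimal Python sketch. It is only one possible layout, not the exact assignment: the snake_case method name `deal_top_card` (for the `dealTopCard` method mentioned above), the `value` and `total` helpers, and the rule that an ace counts as 11 unless the hand would bust are all assumptions made for the example.

```python
import random


class Card:
    """A single playing card, used by Deck and Hand."""

    def __init__(self, rank, suit):
        self.rank = rank
        self.suit = suit

    def value(self):
        # Face cards count 10; an ace counts 11 here and is adjusted in Hand.
        if self.rank in ("J", "Q", "K"):
            return 10
        if self.rank == "A":
            return 11
        return int(self.rank)

    def __repr__(self):
        return f"{self.rank}{self.suit}"


class Deck:
    """A deck of 52 Card objects."""

    RANKS = [str(n) for n in range(2, 11)] + ["J", "Q", "K", "A"]
    SUITS = ["C", "D", "H", "S"]

    def __init__(self):
        self.cards = [Card(r, s) for s in self.SUITS for r in self.RANKS]

    def shuffle(self, rng=random):
        # rng may be the random module or a seeded random.Random instance.
        rng.shuffle(self.cards)

    def deal_top_card(self):
        return self.cards.pop()


class Hand:
    """The cards held by one player, with Blackjack scoring."""

    def __init__(self):
        self.cards = []

    def add(self, card):
        self.cards.append(card)

    def total(self):
        total = sum(card.value() for card in self.cards)
        aces = sum(1 for card in self.cards if card.rank == "A")
        # Downgrade aces from 11 to 1 while the hand would otherwise bust.
        while total > 21 and aces:
            total -= 10
            aces -= 1
        return total
```

Note that Hand never looks inside Deck and Deck knows nothing about scoring, which is the kind of low coupling the original question asks about.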
For higher-level students (my algorithms & data structures kids), I give an open-ended problem: write an algorithm that procedurally generates a solvable (and interesting) puzzle. It forces them to do their own research while implementing core functionality of a game. Slide puzzles typically work well because they can be represented in a 2D array.
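
One simple way to guarantee solvability, sketched below, is to start from the solved board and apply random legal moves, so the puzzle can always be undone. This is just one approach students might discover, and it does nothing to make the puzzle "interesting". The function names, the use of 0 for the empty cell, and the `moves` and `seed` parameters are assumptions for the example.

```python
import random


def solved_board(size):
    """Return the solved board as a 2D list; 0 marks the empty cell."""
    tiles = list(range(1, size * size)) + [0]
    return [tiles[r * size:(r + 1) * size] for r in range(size)]


def generate_puzzle(size=3, moves=50, seed=None):
    """Scramble a solved board with random legal moves (size must be >= 2)."""
    rng = random.Random(seed)
    board = solved_board(size)
    row, col = size - 1, size - 1  # current position of the empty cell
    previous = None                # where the empty cell was one move ago
    for _ in range(moves):
        neighbours = [
            (row + dr, col + dc)
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= row + dr < size and 0 <= col + dc < size
        ]
        # Do not immediately undo the previous move.
        choices = [cell for cell in neighbours if cell != previous]
        new_row, new_col = rng.choice(choices)
        # Slide the neighbouring tile into the empty cell.
        board[row][col], board[new_row][new_col] = (
            board[new_row][new_col],
            board[row][col],
        )
        previous = (row, col)
        row, col = new_row, new_col
    return board
```

Because every step is a legal slide, reversing the move sequence solves the puzzle. A natural follow-up for students is to measure how hard the result actually is.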
Middle school students tend to lack the attention span to work on and understand a large and complex program. Drawing an animal with Python turtles or something similar can be a good way to show how quickly bad code can get out of hand. Just mentioning a Discord bot is a good way to grab attention. Going through the steps of creating a bot that does something even as simple as outputting a link by chat command goes over very well.
I have plenty more examples, but as a general rule, the more "out there" the program while having clearly defined instructions, the better it holds high school students' attention.

Ben I. · Answer

I think you're on the right track. I've had great luck teaching these sorts of concepts in Unity, where you are trying to get so many different systems to operate together that architecture really starts to demonstrate its value.
And that's the key; you can't push this much beyond where they can see the value in it.
When I work with middle school students, I only gently lean on students for clean code.  I've found that most of what middle school students program is too small in scope for good practice to demonstrate its own value back to the students.
In high school, on the other hand, I take a different tack. I get my initial buy-in to the notion of clean(er) coding by having the kids try to interpret short bits of absolutely abysmal code. (These are very short, and purely meant to demonstrate how very hard code can get to interpret if you don't follow at least some norms. You can see an answer where I have discussed this practice here.)
I will later follow up with code interviews in which they explain code that they created months ago.  By this point, they have mostly forgotten what they did, and we can establish the idea that clean code is usually self-explanatory, and that the person who will most likely be editing their old code is... them.  And that the person who most benefits, then, from being able to read it is also them.
One huge advantage you have by tutoring (as opposed to a classroom environment) is that code reviews become the most natural way to work, so you get a chance to look over their code in great detail and clarify good practices in ways that feel genuine to the student and their experience.  Take full advantage of this, but remember that you will not be able to get buy-in to cleaner coding until they can really feel the benefits.
Good luck!

danbst · Answer

Here is a set of mini-projects I've designed but never used in class. So they are completely untested, though most of these were done either by me or by my peers in the past.

Games:

Nim game. Any variant
Tic-Tac-Toe
Checkers
Tetris
Breakout
Asteroids

Bioinformatics:

all sorts of algorithms on RNA/DNA:

nucleotide frequency analysis
comparing genomes, finding best matches (nice one -- check which of the Coronaviridae RNAs are most similar to SARS-CoV-2)
search for the most common substrings in RNA (those must be interesting by definition, especially if they are long)
search substrings with N-mutations
search for palindromic substrings
more algorithm examples and inspiration at http://rosalind.info/problems/tree-view/ and https://www.bioinformaticsalgorithms.org/bioinformatics-chapter-1

visualize RNA self-assembly. Build either terminal or arc views. This is real world usage of palindrome algorithm
visualize DNA replication: create a simulated "soup" of nucleotides (A, T, C, G) and enzymes (polymerase, primase, ligase, and others) that all move randomly; when primase "randomly" finds a primer DNA combination, the replication process starts. There are tons of YouTube videos on how this works, but it is still nice to see this "soup" in your own implementation
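
To give a feel for the size of the simpler items above, here is a hedged sketch of nucleotide frequency analysis and a palindrome search. It assumes "palindromic" means a reverse palindrome (a string equal to its own reverse complement), which is the usual sense in molecular biology, and it uses the DNA alphabet; the function names and the `min_len` and `max_len` parameters are made up for the example.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def nucleotide_frequency(dna):
    """Count how often each of A, C, G, T occurs in the string."""
    counts = Counter(dna)
    return {base: counts.get(base, 0) for base in "ACGT"}


def reverse_complement(dna):
    """Complement every base and reverse the result."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def reverse_palindromes(dna, min_len=4, max_len=12):
    """Return (start_index, substring) pairs that equal their own reverse complement."""
    found = []
    for start in range(len(dna)):
        for length in range(min_len, max_len + 1):
            piece = dna[start:start + length]
            if len(piece) < length:
                break  # ran past the end of the string
            if piece == reverse_complement(piece):
                found.append((start, piece))
    return found
```

For example, `reverse_palindromes("TTGAATTCAA")` includes `(2, "GAATTC")`, the recognition site of the EcoRI restriction enzyme. This brute-force search is fine for short strings; making it fast on a whole genome is a good follow-up exercise.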

Physics:

simulate ideal billiard -- balls hit balls or walls using Newtonian physics
simulate ideal gas (similar to above), show pressure as total impulse of balls on walls, and temperature as average kinetic energy of balls. Simulate "volume" change, or "wall temperature" and energy transfer
a car moving in 2D: simulate proper steering, sudden braking, drift behavior at high speeds, tire tracks, and window scrolling as the car drives
gravitation simulator. Either Solar system, or just plenty of bodies all entangled with gravity force
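
As a starting point for the ideal gas item, here is a minimal sketch with no graphics: point particles bounce off the walls of a 2D box, the impulse delivered to the walls is accumulated to estimate pressure, and the average kinetic energy stands in for temperature. The `Gas` class, its parameters, and the assumption that the time step is small enough for at most one bounce per axis per step are all choices made for this example.

```python
import random


class Gas:
    """Point particles in a 2D box; particles do not collide with each other."""

    def __init__(self, count, width, height, mass=1.0, seed=None):
        rng = random.Random(seed)
        self.width, self.height, self.mass = width, height, mass
        self.pos = [[rng.uniform(0, width), rng.uniform(0, height)]
                    for _ in range(count)]
        self.vel = [[rng.uniform(-1, 1), rng.uniform(-1, 1)]
                    for _ in range(count)]
        self.wall_impulse = 0.0  # total momentum transferred to the walls

    def step(self, dt):
        """Advance by dt, assuming each particle moves less than the box size."""
        limits = (self.width, self.height)
        for p, v in zip(self.pos, self.vel):
            for axis in (0, 1):
                p[axis] += v[axis] * dt
                if p[axis] < 0 or p[axis] > limits[axis]:
                    # Mirror the overshoot back into the box and flip velocity.
                    p[axis] = -p[axis] if p[axis] < 0 else 2 * limits[axis] - p[axis]
                    v[axis] = -v[axis]
                    # An elastic bounce transfers 2 * m * |v| to the wall.
                    self.wall_impulse += 2 * self.mass * abs(v[axis])

    def average_kinetic_energy(self):
        return sum(0.5 * self.mass * (vx * vx + vy * vy)
                   for vx, vy in self.vel) / len(self.vel)

    def pressure(self, elapsed):
        """Impulse per unit time per unit wall length (the 2D analogue of pressure)."""
        perimeter = 2 * (self.width + self.height)
        return self.wall_impulse / (elapsed * perimeter)
```

Shrinking `width` or `height` between runs and comparing `pressure` values is one way to approach the "volume change" experiment mentioned above.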

Math:

implement: all sorts of averages, factorial, integer power, combination count, etc...
implement: all sorts of irrational functions and computations (sin, cos, sqrt, arctan, arcsin, exp, ln, pi, e, power through exp and ln)
implement: all sorts of operations with vectors and matrices
implement: long arithmetic (yeah, I know Python has Decimal, still nice to know how it is done)
implement: numeric integration (several kinds)
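
For the numeric integration item, two of the "several kinds" could look like the sketch below: the composite trapezoid rule and the composite Simpson rule. The function names and the default of 1000 subintervals are arbitrary choices for the example.

```python
import math


def trapezoid(f, a, b, n=1000):
    """Composite trapezoid rule on n equal subintervals of [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


def simpson(f, a, b, n=1000):
    """Composite Simpson rule; n must be even."""
    if n % 2:
        raise ValueError("n must be even for Simpson's rule")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points alternate between weight 4 (odd i) and 2 (even i).
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


if __name__ == "__main__":
    # The exact integral of sin over [0, pi] is 2.
    print(trapezoid(math.sin, 0, math.pi))
    print(simpson(math.sin, 0, math.pi))
```

Comparing how quickly each rule approaches the exact answer as `n` grows makes a nice small experiment.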

Robo stuff:

cellular automata (actually, I'll point to this topic here: https://natureofcode.com/book/chapter-7-cellular-automata/)
labyrinth creation and solving
implement A* and visualize
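
A compact version of A* on a grid might look like the sketch below, leaving the visualization to the students. It assumes a maze given as a list of strings where `#` is a wall, four-directional movement with unit step cost, and the Manhattan distance as the heuristic; the `astar` name and signature are made up for the example.

```python
import heapq


def astar(grid, start, goal):
    """Return a shortest path as a list of (row, col) cells, or None if unreachable."""

    def heuristic(cell):
        # Manhattan distance never overestimates with 4-way unit moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]  # (estimate, cost so far, cell)
    came_from = {}
    best_cost = {start: 0}
    while open_heap:
        _, cost, current = heapq.heappop(open_heap)
        if current == goal:
            # Walk the parent links back to the start.
            path = [current]
            while current in came_from:
                current = came_from[current]
                path.append(current)
            return path[::-1]
        if cost > best_cost[current]:
            continue  # stale heap entry; a cheaper route was found later
        r, c = current
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            new_cost = cost + 1
            if new_cost < best_cost.get((nr, nc), float("inf")):
                best_cost[(nr, nc)] = new_cost
                came_from[(nr, nc)] = current
                heapq.heappush(
                    open_heap, (new_cost + heuristic((nr, nc)), new_cost, (nr, nc))
                )
    return None
```

The same function can solve the generated labyrinths from the previous item, which is one way to combine two mini-projects into a larger one.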

Coding theory:

all sorts of encodings: number bases, UTF-8 encoding, msgpack integers, etc...
encoding with error detection and correction: random bit errors, duplication as anti-error technique, Hamming code, CRC32
entropy encoding (Shannon-Fano)
math expression parser (with parens and operator priority)
Brainf*ck interpreter
implement x86_64 assembler, which can compile directly to machine codes. Macros support is optional, but of special interest!
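
As an example of the error-correction item, here is a sketch of the Hamming(7,4) code, which protects 4 data bits with 3 parity bits and can correct any single flipped bit. The bit layout (parity bits at positions 1, 2 and 4, counting from 1) is the conventional one; the function names and the list-of-ints representation are choices made for this example.

```python
def hamming_encode(data):
    """Encode 4 data bits as a 7-bit codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming_decode(code):
    """Return (data_bits, error_position); error_position is 0 if no error was seen."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome, read as a binary number, is the 1-based position of the error.
    error_position = s1 + 2 * s2 + 4 * s3
    if error_position:
        c[error_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], error_position


if __name__ == "__main__":
    word = hamming_encode([1, 0, 1, 1])
    word[4] ^= 1  # simulate a single random bit error
    print(hamming_decode(word))
```

Two flipped bits in the same codeword are miscorrected, which students can verify themselves and compare against the duplication technique from the same list.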

Graphics. All of these should be done using two primitives: putpixel() and getpixel():

simple: horizontal line, vertical line, rectangle, filled rectangle, thick line
harder: line between two points, line by one point and angle, circle and filled circle, anti-aliased drawing for line and circle, draw polygon by its vertices, flood-fill, rotate square
complicated: duplicate part of screen with support of alpha channel and color key, mirror part of screen, rotate texture (with arbitrary center), scale texture, pixel blurring after scaling, mirror using guiding line
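
To show what "only putpixel() and getpixel()" can look like in practice, here is a sketch of the "line between two points" item using the integer-only Bresenham algorithm. The in-memory `canvas`, its size, and the exact signatures of `putpixel` and `getpixel` are assumptions for the example; in a real project they would wrap whatever graphics library is in use.

```python
WIDTH, HEIGHT = 16, 8
canvas = [[0] * WIDTH for _ in range(HEIGHT)]  # canvas[y][x], 0 = background


def putpixel(x, y, color=1):
    """Set one pixel, silently ignoring coordinates outside the canvas."""
    if 0 <= x < WIDTH and 0 <= y < HEIGHT:
        canvas[y][x] = color


def getpixel(x, y):
    return canvas[y][x]


def draw_line(x0, y0, x1, y1, color=1):
    """Bresenham's line algorithm for any direction, using integers only."""
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        putpixel(x0, y0, color)
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:   # step along x
            err += dy
            x0 += sx
        if e2 <= dx:   # step along y
            err += dx
            y0 += sy


if __name__ == "__main__":
    draw_line(0, 0, 15, 7)
    for row in canvas:
        print("".join("#" if pixel else "." for pixel in row))
```

Rectangles and polygons can then be built from `draw_line` alone, while flood-fill is the first item that really needs `getpixel`.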

Psychology:

implement SRS tool like Anki or Memrise
implement Schulte test
implement "Organization of Dot" training puzzle generator
implement Corsi block test
implement N-Back

Since you wanted a large project, you can combine several of the above into one big project. You may also add topics like "databases" to some of those (for the psychology ones, store all the data in a shared DB). The "visualize" ones are pretty hard projects too.
The big idea behind these projects is to ignite more ideas once the implementation is done.

Jon Guiton · Answer

There are some very good ideas in the answers, but have you considered asking your learners to perform a simple task on a very large piece of code? For example, adding a new item to the GIMP about menu, or changing the error messages of the AWK compiler to include the date of compilation using __DATE__.
I feel that learners are too often presented with exemplar code and rarely with real-world code that, for example, contains large sections of Fortran 77 that call libraries of routines written in assembler and commented in German.
Many of our lessons concentrate on the need to have well-structured and documented code with helpful comments. It is only when confronted with modifying code that the importance of this becomes evident. There are practical difficulties with large software objects, for which tools such as grep and sed are very handy. Again, the importance of this only comes from maintaining code, not from writing it.
[ As an adjunct you could introduce your learners to the obfuscated C competition https://www.ioccc.org/ and ask them if they fancy maintaining some of that! ]
I find students really like this approach, as they start with something that works (e.g. an alpha-beta pruning draughts-playing program written in Pascal) which they have the right to augment and modify. This gives the learners a sense of empowerment, rather than a feeling that after 10 hours of work they can sort six numbers into order, having written a program they know nobody will ever use.
I also have some code written specially for exercises, which deliberately reflects poor programming practices or code problems that illustrate why techniques for large software projects exist. I have, variously, commented out large parts of the code, used #define statements in slightly misleading ways, used inconsistent interfaces, etc. The aim then is to get the learners to form criticisms of the code.
