TransWikia.com

Reverse engineering a checksum algorithm

Reverse Engineering Asked by user2390115 on February 14, 2021

I am attempting to implement an editor for a discontinued hardware drum synthesizer which uses undocumented system exclusive MIDI messages for communication. I’ve figured out the patch format, but I am hitting a wall with the checksum algorithm(s) being used. I’ve tried a variety of approaches & tools (reveng, for instance) to attempt to figure out the calculation and have reached the limits of my abilities.

This is all complicated by the fact that MIDI is restricted to 7 bits within the message (the header and footer bytes F0/F7 excepted). So there’s the additional question of how overflows are handled.

There are two types of messages I’m attempting to understand:

A voice request:

F0 33 7F 7F 08 03 07 40 00 0E F7

or

F0 33 7F 7F 08 03 05 00 13 F7

in which the penultimate byte (0E, 13 in the examples) is some sort of checksum. I’m assuming that the two messages use the same algorithm, but I’m not even sure which bytes are actually being considered for the checksum calculation (since 0xF0 0x33 0x7F is always there for every message sent from or received by this instrument, it might be excluded, but I can’t be sure). Here are a few more samples:

F0 33 7F 7F 08 03 07 40 1A 57 F7
F0 33 7F 7F 08 03 07 40 1B 5F F7
F0 33 7F 7F 08 03 07 40 1C 67 F7
F0 33 7F 7F 08 03 07 40 1D 6F F7
F0 33 7F 7F 08 03 07 40 1E 77 F7
F0 33 7F 7F 08 03 07 40 1F 7F F7
F0 33 7F 7F 08 03 07 40 20 1C F7
F0 33 7F 7F 08 03 07 40 21 14 F7
F0 33 7F 7F 08 03 07 40 22 0C F7
F0 33 7F 7F 08 03 07 40 23 04 F7
F0 33 7F 7F 08 03 07 40 24 3C F7
F0 33 7F 7F 08 03 07 40 25 34 F7
F0 33 7F 7F 08 03 07 40 26 2C F7
F0 33 7F 7F 08 03 07 40 27 24 F7
F0 33 7F 7F 08 03 07 40 28 5C F7
F0 33 7F 7F 08 03 07 40 29 54 F7

or grouped by checksum:

F0 33 7F 7F 08 03 07 44 0D 00 F7
F0 33 7F 7F 08 03 07 45 1F 00 F7
F0 33 7F 7F 08 03 07 47 2A 00 F7

F0 33 7F 7F 08 03 07 44 1C 01 F7
F0 33 7F 7F 08 03 07 45 0E 01 F7
F0 33 7F 7F 08 03 07 46 29 01 F7

F0 33 7F 7F 08 03 07 44 2F 02 F7
F0 33 7F 7F 08 03 07 46 1A 02 F7
F0 33 7F 7F 08 03 07 47 08 02 F7

F0 33 7F 7F 08 03 07 45 2C 03 F7
F0 33 7F 7F 08 03 07 46 0B 03 F7
F0 33 7F 7F 08 03 07 47 19 03 F7

F0 33 7F 7F 08 03 07 40 23 04 F7
F0 33 7F 7F 08 03 07 41 31 04 F7
F0 33 7F 7F 08 03 07 42 16 04 F7
F0 33 7F 7F 08 03 07 43 04 04 F7

F0 33 7F 7F 08 03 07 41 20 05 F7
F0 33 7F 7F 08 03 07 42 07 05 F7
F0 33 7F 7F 08 03 07 43 15 05 F7

F0 33 7F 7F 08 03 07 40 01 06 F7
F0 33 7F 7F 08 03 07 41 13 06 F7
F0 33 7F 7F 08 03 07 43 26 06 F7

F0 33 7F 7F 08 03 07 40 10 07 F7
F0 33 7F 7F 08 03 07 41 02 07 F7
F0 33 7F 7F 08 03 07 42 25 07 F7

F0 33 7F 7F 08 03 07 44 0F 10 F7
F0 33 7F 7F 08 03 07 45 1D 10 F7
F0 33 7F 7F 08 03 07 47 28 10 F7

The checksums cover the full 7-bit range from 00 to 7F, but the occurrence of the lower 4 bits appears to be related to the lower 4 bits of the 0x4n byte.

The other type of message is the actual voice data. This data uses a 28-bit (4 byte) checksum at the end of a packet (212 bytes total):

F0337F19 08030600 00040000 00000000 03600000 02540070 18001640 17100D19 16010808 60021219 4B103112 4A000000 06600000 003C0070 18001640 17100D19 32652008 34033D48 03102C4F 0A000000 16420000 00580110 18001640 16580D19 1E012C08 7401760A 13102F17 47000000 16420000 00500070 18001640 16580D19 32007408 68033D75 7B102F57 4C000000 06420000 00500070 18001640 16580D19 32007408 7002517A 13102F17 4A000000 16420000 00580070 18001640 16580D19 32005C09 10023202 13102F57 48000001 5D7C65F7

or

F0337F19 08030600 00040000 00000000 03610000 02540070 18001640 17100D19 16010808 60021219 4B103112 4A000000 06600000 003C0070 18001640 17100D19 32652008 34033D48 03102C4F 0A000000 16420000 00580110 18001640 16580D19 1E012C08 7401760A 13102F17 47000000 16420000 00500070 18001640 16580D19 32007408 68033D75 7B102F57 4C000000 06420000 00500070 18001640 16580D19 32007408 7002517A 13102F17 4A000000 16420000 00580070 18001640 16580D19 32005C09 10023202 13102F57 48000000 6A7101F7

or

F0337F19 08030600 00040000 00000000 03620000 02540070 18001640 17100D19 16010808 60021219 4B103112 4A000000 06600000 003C0070 18001640 17100D19 32652008 34033D48 03102C4F 0A000000 16420000 00580110 18001640 16580D19 1E012C08 7401760A 13102F17 47000000 16420000 00500070 18001640 16580D19 32007408 68033D75 7B102F57 4C000000 06420000 00500070 18001640 16580D19 32007408 7002517A 13102F17 4A000000 16420000 00580070 18001640 16580D19 32005C09 10023202 13102F57 48000003 73762BF7

The changes to the 1st byte of the checksum (01, 00, 03) in response to the change to the lower 4 bits of byte 41 (0, 1, 2) is certainly a hint, but I’m not getting it.

Anyway, if anyone has a suggestion about a reasonable way to proceed, I’d be thankful. The company which manufactured this instrument has no interest in supporting their old(er) hardware, and refuses to provide any sort of system exclusive documentation. The company has used CRC algorithms on other instruments for similar purposes, but nothing corresponds to what I’m seeing here.

If there’s any interest, I’m happy to provide additional sample data (I realize that just a few samples isn’t adequate) — I have 256 request messages and more than 160 complete voices available for analysis. I can also generate data to test the effect of changes to different parts of the full message. Thanks for taking a look.

UPDATE 1:

I’ve solved the checksum for the short messages. It’s the CRC16(CCITT) of (full message previous to that byte) >> 9 & 0x7F. I figured this out by hacking on the binary of a computer-based utility program from the company (requests banks of voices) in a disassembler. I’m guessing that the long messages are something similar, just with more bytes, will report back when I know more.

UPDATE 2:

The long messages also use the above formula for the penultimate byte. However, the 3 bytes previous remain a mystery at this point and I’m not finding anything conspicuous in the binary I have — these bytes are generated on the hardware, rather than at the ‘client’. It still looks like a 16-bit value spread across 3 7-bit-limited bytes. I just haven’t determined how that value is generated.

2 Answers

One small discovery on the ND1: In a sys ex dump, byte 6 is 6 if you are dumping a single program, an 8 if you're dumping all. Byte 7 is a 0 if you're dumping one program or the number of the program (-1) if you're dumping all.

Perhaps even more interesting -- this only seems to change the value of the final checksum (byte 110) but not the mysterious other checksum bytes.

Answered by Tom Hoffman on February 14, 2021

The only Nord Sys Ex spec I found was for the Nord Stage, which apparently used a "7-bit checksum (sum of all bytes above - wrapping)." So I yesterday I tried every variation of that I could think of, basically summing and masking to 7-bits. Failing that I came back to this post.

The message headers are similar between the two devices (same and different in the ways you'd expect for two versions of a product line). The Sys Ex dump of one program on the ND1 is 110 bytes long (total).

CRC is new territory for me so I could use a little clarification of your explanation of the short message checksum. Do you mean:

  1. calculate CRC16(CCITT) of the full message;
  2. shift 9 bytes to the right (divide by 512);
  3. take the seven bit mask of that?

OK, that works on the ND1, at least insofar as changing a random bit and regenerating the checksum and sending it to the unit generates the "r.c.u." message indicating it is receiving, unlike what it does with a bad checksum. Doesn't actually change the program, but now perhaps actual reverse engineering of actual settings will work.

Trying a few small changes and comparing dumps, I'm afraid I see the same behavior on the ND1 that you see on the ND2 with bytes 105, 106, and 107 changing along with the last CRC checksum byte.

I'm not going to get any further I'm afraid. I'm impressed you got this far! I'm a high school CS teacher so it was a good exercise for me -- I've learned quite a bit about checksums and data error detection in the past two days!

Answered by Tom Hoffman on February 14, 2021

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP