Fletcher's checksum

The Fletcher checksum is an algorithm for computing a position-dependent checksum devised by John G. Fletcher (1934–2012) at Lawrence Livermore Labs in the late 1970s.^[1] The objective of the Fletcher checksum was to provide error-detection properties approaching those of a cyclic redundancy check but with the lower computational effort associated with summation techniques.

The algorithm

Review of simple checksums

As with simpler checksum algorithms, the Fletcher checksum involves dividing the binary data word to be protected from errors into short "blocks" of bits and computing the modular sum of those blocks. (Note that the terminology used in this domain can be confusing. The data to be protected, in its entirety, is referred to as a "word", and the pieces into which it is divided are referred to as "blocks".)

As an example, the data may be a message to be transmitted consisting of 136 characters, each stored as an 8-bit byte, making a data word of 1088 bits in total. A convenient block size would be 8 bits, although this is not required. Similarly, a convenient modulus would be 255, although, again, others could be chosen. So, the simple checksum is computed by adding together all the 8-bit bytes of the message, dividing by 255 and keeping only the remainder. (In practice, the modulo operation is performed during the summation to control the size of the result.) The checksum value is transmitted with the message, increasing its length to 137 bytes, or 1096 bits. The receiver of the message can re-compute the checksum and compare it to the value received to determine whether the message has been altered by the transmission process.

Weaknesses of simple checksums

The first weakness of the simple checksum is that it is insensitive to the order of the blocks (bytes) in the data word (message). If the order is changed, the checksum value will be the same and the change will not be detected. The second weakness is that the universe of checksum values is small, being equal to the chosen modulus. In our example, there are only 255 possible checksum values, so it is easy to see that even random data has about a 0.4% probability of having the same checksum as our message.

The Fletcher checksum

Fletcher addresses both of these weaknesses by computing a second value along with the simple checksum. This is the modular sum of the values taken by the simple checksum as each block of the data word is added to it. The modulus used is the same. So, for each block of the data word, taken in sequence, the block's value is added to the first sum and the new value of the first sum is then added to the second sum. Both sums start with the value zero (or some other known value). At the end of the data word, the modulus operator is applied and the two values are combined to form the Fletcher checksum value.

Sensitivity to the order of blocks is introduced because once a block is added to the first sum, it is then repeatedly added to the second sum along with every block after it. If, for example, two adjacent blocks become exchanged, the one that was originally first will be added to the second sum one fewer times and the one that was originally second will be added to the second sum one more time. The final value of the first sum will be the same, but the second sum will be different, detecting the change to the message.

The universe of possible checksum values is now the square of the value for the simple checksum. In our example, the two sums, each with 255 possible values, result in 65025 possible values for the combined checksum.

Overview of the different algorithm parameters

While there is an infinity of parameters, the original paper only studies the case K=8 (word length) with modulus 255 and 256.

The 16 and 32 bits versions (Fletcher-32 and -64) have been derived from the original case and studied in subsequent specifications or papers.

Fletcher-16

When the data word is divided into 8-bit blocks, as in the example above, two 8-bit sums result and are combined into a 16-bit Fletcher checksum. Usually, the second sum will be multiplied by 256 and added to the simple checksum, effectively stacking the sums side-by-side in a 16-bit word with the simple checksum at the least significant end. This algorithm is then called the Fletcher-16 checksum. The use of the modulus 2⁸ − 1 = 255 is also generally implied.

Fletcher-32

When the data word is divided into 16-bit blocks, two 16-bit sums result and are combined into a 32-bit Fletcher checksum. Usually, the second sum will be multiplied by 2¹⁶ and added to the simple checksum, effectively stacking the sums side-by-side in a 32-bit word with the simple checksum at the least significant end. This algorithm is then called the Fletcher-32 checksum. The use of the modulus 2¹⁶ − 1 = 65,535 is also generally implied. The rationale for this choice is the same as for Fletcher-16.

Fletcher-64

When the data word is divided into 32-bit blocks, two 32-bit sums result and are combined into a 64-bit Fletcher checksum. Usually, the second sum will be multiplied by 2³² and added to the simple checksum, effectively stacking the sums side-by-side in a 64-bit word with the simple checksum at the least significant end. This algorithm is then called the Fletcher-64 checksum. The use of the modulus 2³² − 1 = 4,294,967,295 is also generally implied. The rationale for this choice is the same as for Fletcher-16 and Fletcher-32.

Comparison with the Adler checksum

The Adler-32 checksum is a specialization of the Fletcher-32 checksum devised by Mark Adler. The modulus selected (for both sums) is the prime number 65,521 (65,535 is divisible by 3, 5, 17 and 257). The first sum also begins with the value 1. The selection of a prime modulus results in improved "mixing" (error patterns are detected with more uniform probability, improving the probability that the least detectable patterns will be detected, which tends to dominate overall performance). However, the reduction in size of the universe of possible checksum values acts against this and reduces performance slightly. One study showed that Fletcher-32 outperforms Adler-32 in both performance and in its ability to detect errors. As modulo-65,535 addition is considerably simpler and faster to implement than modulo-65,521 addition, the Fletcher-32 checksum is generally a faster algorithm.^[2]

Caution on modulus

A modulus of 255 is used above and in examples below for Fletcher-16, however some real-world implementations use 256. The TCP protocol's alternate checksum has Fletcher-16 with a 256 modulus,^[3] as do the checksums of UBX-* messages from a U-blox GPS.^[4] Which modulus is used is dependent on the implementation.

Example calculation of the Fletcher-16 checksum

As an example, a Fletcher-16 checksum shall be calculated and verified for a byte stream of 0x01 0x02.

C0_initial = 0
C1_initial = 0

Byte (B)	C0 = (C0_prev B) mod 255	C1 = (C1_prev C0) mod 255	Description
0x01	0x01	0x01	First byte fed in
0x02	0x03	0x04	Second byte fed in

The checksum is therefore 0x0403. It could be transmitted with the byte stream and be verified as such on the receiving end. Another option is to compute in a second step a pair of check bytes, which can be appended to the byte stream so that the resulting stream has a global Fletcher-16 checksum value of 0.

The values of the checkbytes are computed as follows:

CB0 = 255 − ((C0 C1) mod 255),
CB1 = 255 − ((C0 CB0) mod 255),

where C0 and C1 are the result of the last step in the Fletcher-16 computation.

In our case the checksum bytes are CB0 = 0xF8 and CB1 = 0x04. The transmitted byte stream is 0x01 0x02 0xF8 0x04. The receiver runs the checksum on all four bytes and calculates a passing checksum of 0x00 0x00, as illustrated below:

Byte (B)	C0 = (C0_prev B) mod 255	C1 = (C1_prev C0) mod 255	Description
0x01	0x01	0x01	First byte fed in
0x02	0x03	0x04	Second byte fed in
CB0 = 0xF8	(0x03 0xF8) % 0xFF = 0xFB	(0x04 0xFB) % 0xFF = 0x00	Checksum calculation – byte 1
CB1 = 0x04	(0xFB 0x04) % 0xFF = 0x00	(0x00 0x00) % 0xFF = 0x00	Checksum calculation – byte 2

Weaknesses

The Fletcher checksum cannot distinguish between blocks of all 0 bits and blocks of all 1 bits. For example, if a 16-bit block in the data word changes from 0x0000 to 0xFFFF, the Fletcher-32 checksum remains the same. This also means a sequence of all 00 bytes has the same checksum as a sequence (of the same size) of all FF bytes.

Implementation

These examples assume two's complement arithmetic, as Fletcher's algorithm will be incorrect on ones' complement machines.

Straightforward

The below is a treatment on how to calculate the checksum including the check bytes; i.e., the final result should equal 0, given properly-calculated check bytes. The code by itself, however, will not calculate the check bytes.

An inefficient but straightforward implementation of a C language function to compute the Fletcher-16 checksum of an array of 8-bit data elements follows:

uint16_t Fletcher16( uint8_t *data, int count )
{
   uint16_t sum1 = 0;
   uint16_t sum2 = 0;
   int index;

   for ( index = 0; index < count;   index )
   {
      sum1 = (sum1   data[index]) % 255;
      sum2 = (sum2   sum1) % 255;
   }

   return (sum2 << 8) | sum1;
}

On lines 3 and 4, the sums are 16-bit variables so that the additions on lines 9 and 10 will not overflow. The modulo operation is applied to the first sum on line 9 and to the second sum on line 10. Here, this is done after each addition, so that at the end of the for loop the sums are always reduced to 8 bits. At the end of the input data, the two sums are combined into the 16-bit Fletcher checksum value and returned by the function on line 13.

Each sum is computed modulo 255 and thus remains less than 0xFF at all times. This implementation will thus never produce the checksum results 0x??FF, 0xFF?? or 0xFFFF (i.e., 511 out of the total 65536 possible 16-bit values are never used). It can produce the checksum result 0x0000, which may not be desirable in some circumstances (e.g. when this value has been reserved to mean "no checksum has been computed").

Check bytes

Example source code for calculating the check bytes, using the above function, is as follows. The check bytes may be appended to the end of the data stream, with the c0 coming before the c1.

uint16_t csum;
uint16_t c0,c1,f0,f1;

csum = Fletcher16(data, length);
f0 = csum & 0xff;
f1 = (csum >> 8) & 0xff;
c0 = 0xff - ((f0   f1) % 0xff);
c1 = 0xff - ((f0   c0) % 0xff);

Optimizations

In a 1988 paper, Anastase Nakassis discussed and compared different ways to optimize the algorithm. The most important optimization consists in using larger accumulators and delaying the relatively costly modulo operation for as long as it can be proven that no overflow will occur. Further benefit can be derived from replacing the modulo operator with an equivalent function tailored to this specific case—for instance, a simple compare-and-subtract, since the quotient never exceeds 1.^[5]

Here is a C implementation that applies the first but not the second optimization:

#include <stdlib.h>		/* for size_t */
#include <stdint.h>		/* for uint8_t, uint16_t & uint32_t */

uint16_t fletcher16(const uint8_t *data, size_t len) {
	uint32_t c0, c1;

	/*  Found by solving for c1 overflow: */
	/* n > 0 and n * (n 1) / 2 * (2^8-1) < (2^32-1). */
	for (c0 = c1 = 0; len > 0; ) {
		size_t blocklen = len;
		if (blocklen > 5802) {
			blocklen = 5802;
		}
		len -= blocklen;
		do {
			c0 = c0   *data  ;
			c1 = c1   c0;
		} while (--blocklen);
		c0 = c0 % 255;
		c1 = c1 % 255;
   }
   return (c1 << 8 | c0);
}

uint32_t fletcher32(const uint16_t *data, size_t len) {
	uint32_t c0, c1;
	len = (len   1) & ~1;      /* Round up len to words */

	/* We similarly solve for n > 0 and n * (n 1) / 2 * (2^16-1) < (2^32-1) here. */
	/* On modern computers, using a 64-bit c0/c1 could allow a group size of 23726746. */
	for (c0 = c1 = 0; len > 0; ) {
		size_t blocklen = len;
		if (blocklen > 360*2) {
			blocklen = 360*2;
		}
		len -= blocklen;
		do {
			c0 = c0   *data  ;
			c1 = c1   c0;
		} while ((blocklen -= 2));
		c0 = c0 % 65535;
		c1 = c1 % 65535;
	}
	return (c1 << 16 | c0);
}

// A similar routine could be written for fletcher64. The group size would be 92681.

The second optimization is not used because the "never exceeds 1" assumption only applies when the modulo is calculated naively; applying the first optimization would break it. On the other hand, modulo Mersenne numbers like 255 and 65535 is a quick operation on computers anyway, as tricks are available to do them without the costly division operation.^[6]

Test vectors

8-bit implementation (16-bit checksum)

"abcde" -> 51440 (0xC8F0)
"abcdef" -> 8279 (0x2057)
"abcdefgh" -> 1575 (0x0627)

16-bit implementation (32-bit checksum), with 8-bit ASCII values of the input word assembled into 16-bit blocks in little-endian order, the word padded with zeros as necessary to the next whole block, using modulus 65535 and with the result presented as the sum-of-sums shifted left by 16 bits (multiplied by 65536) plus the simple sum

"abcde" -> 4031760169 (0xF04FC729)
"abcdef" -> 1448095018 (0x56502D2A)
"abcdefgh" -> 3957429649 (0xEBE19591)

32-bit implementation (64-bit checksum)

"abcde" -> 14467467625952928454 (0xC8C6C527646362C6)
"abcdef" -> 14467579776138987718 (0xC8C72B276463C8C6)
"abcdefgh" -> 3543817411021686982 (0x312E2B28CCCAC8C6)

Bit and byte ordering (endianness / network order)

As with any calculation that divides a binary data word into short blocks and treats the blocks as numbers, any two systems expecting to get the same result should preserve the ordering of bits in the data word. In this respect, the Fletcher checksum is not different from other checksum and CRC algorithms and needs no special explanation.

An ordering problem that is easy to envision occurs when the data word is transferred byte-by-byte between a big-endian system and a little-endian system and the Fletcher-32 checksum is computed. If blocks are extracted from the data word in memory by a simple read of a 16-bit unsigned integer, then the values of the blocks will be different in the two systems, due to the reversal of the byte order of 16-bit data elements in memory, and the checksum result will be different as a consequence. The implementation examples, above, do not address ordering issues so as not to obscure the checksum algorithm. Because the Fletcher-16 checksum uses 8-bit blocks, it is not affected by byte endianness.

References

^ Fletcher, J. G. (January 1982). "An Arithmetic Checksum for Serial Transmissions". IEEE Transactions on Communications. COM-30 (1): 247–252. doi:10.1109/tcom.1982.1095369.
^ Theresa C. Maxino, Philip J. Koopman (January 2009). "The Effectiveness of Checksums for Embedded Control Networks" (PDF). IEEE Transactions on Dependable and Secure Computing.
^ J. Zweig, UIUC, C. Partridge, BBN (February 1990). "TCP Alternate Checksum Options". Network Working Group.{{cite web}}: CS1 maint: multiple names: authors list (link)
^ Sartek, Various (June 2018). "Calculating UBX checksum correctly in python". uBlox Support Portal.
^ Anastase Nakassis (October 1988). "Fletcher's error detection algorithm: how to implement it efficiently and how toavoid the most common pitfalls". Newsletter ACM SIGCOMM Computer Communication Review Homepage Archive. 18 (5): 63–88. doi:10.1145/53644.53648. S2CID 17356816.
^ Jones, Douglas W. "Modulus without Division, a tutorial". THE UNIVERSITY OF IOWA Department of Computer Science. Retrieved 9 September 2019.

External links

RFC 905 – ISO Transport Protocol Specification describes the Fletcher checksum algorithm summing to zero (in Appendix B).
RFC 1146 – TCP Alternate Checksum Options describes the Fletcher checksum algorithm for use with TCP.
Stone, Jonathan; Greenwald, Michael; Partridge, Craig; Hughes, Jim (1998). "Performance of Checksums and CRCs over Real Data". IEEE/ACM Transactions on Networking. 6 (5): 529–543. CiteSeerX 10.1.1.55.8520. doi:10.1109/90.731187. S2CID 1750358.
Kodis, John (1992-05-01). "When it comes to high-speed data verification, Fletcher's checksum algorithm can do the job". Dr. Dobb's.
Somerstitle, Alan (2013-04-12). "The Fletcher Checksums in ZFS" (PDF). Spectra Logic.

[1] Fletcher, J. G. (January 1982). "An Arithmetic Checksum for Serial Transmissions". IEEE Transactions on Communications. COM-30 (1): 247–252. doi:10.1109/tcom.1982.1095369.

[2] Theresa C. Maxino, Philip J. Koopman (January 2009). "The Effectiveness of Checksums for Embedded Control Networks" (PDF). IEEE Transactions on Dependable and Secure Computing.

[3] J. Zweig, UIUC, C. Partridge, BBN (February 1990). "TCP Alternate Checksum Options". Network Working Group.{{cite web}}: CS1 maint: multiple names: authors list (link)

[4] Sartek, Various (June 2018). "Calculating UBX checksum correctly in python". uBlox Support Portal.

[5] Anastase Nakassis (October 1988). "Fletcher's error detection algorithm: how to implement it efficiently and how toavoid the most common pitfalls". Newsletter ACM SIGCOMM Computer Communication Review Homepage Archive. 18 (5): 63–88. doi:10.1145/53644.53648. S2CID 17356816.

[6] Jones, Douglas W. "Modulus without Division, a tutorial". THE UNIVERSITY OF IOWA Department of Computer Science. Retrieved 9 September 2019.

[1]

[2]

[3]

[4]

[5]

[6]