SimpleChecksums

SimpleChecksums.additive16
SimpleChecksums.additive32
SimpleChecksums.additive64
SimpleChecksums.additive_checksum
SimpleChecksums.adler32
SimpleChecksums.bsd16
SimpleChecksums.bsd_checksum
SimpleChecksums.fletcher16
SimpleChecksums.fletcher16a
SimpleChecksums.fletcher32
SimpleChecksums.fletcher32a
SimpleChecksums.fletcher64
SimpleChecksums.fletcher64a
SimpleChecksums.fletcher_checksum
SimpleChecksums.sysv16
SimpleChecksums.sysv_checksum

Synopsis

Installation

using Pkg
Pkg.add("SimpleChecksums")

Use

using SimpleChecksums

data = UInt8.(0:255);

# super-simple sums:
@assert additive_checksum(UInt16, data) == additive16(data) == 0x7f80
@assert additive_checksum(UInt32, data) == additive32(data) == 0x00007f80
@assert additive_checksum(UInt64, data) == additive64(data) == 0x0000000000007f80

# BSD and SYS-V checksums from GNU sum utility:
@assert bsd_checksum(UInt16, data) == bsd16(data) == 0x0200
@assert sysv_checksum(UInt16, data) == sysv16(data) == 0x7f80

# Fletcher's and Adler's checksums:
@assert fletcher_checksum(UInt16, data) == fletcher16(data) == 0x5500
@assert fletcher_checksum(UInt32, data) == fletcher32(data) == 0xaaaa7f80
@assert fletcher_checksum(UInt64, data) == fletcher64(data) == 0x002aaa8000007f80

@assert fletcher_checksum(UInt16, data, UInt16(0), 0x100) == fletcher16a(data) == 0x8080
@assert fletcher_checksum(UInt32, data, UInt32(0), 0x10000) == fletcher32a(data) == 0xaa807f80
@assert fletcher_checksum(UInt64, data, UInt64(0), 0x100000000) == fletcher64a(data) == 0x002aaa8000007f80

@assert fletcher_checksum(UInt32, data, UInt32(1), UInt32(65521), 5552) == adler32(data) == 0xadf67f81

Checksums are small summaries of data that can be used to detect errors introduced to the data during transmission or storage. In modern times, checksums have largely been replaced with cryptographic hashes like MD5 and SHA, but these functions and the computational power required to compute them in a reasonable amount of time did not exist when many of the data transmission and storage standards we use today were invented. If we want to use the checksums that appear in these standards in Julia code, it helps to have a standard library to compute them correctly and efficiently.

If you are building a new data transmission or storage standard and need a way to check for errors, consider using a cryptographic hash like SHA, or at least a better error-detecting code like CRC, before choosing any of these functions.

API

SimpleChecksums.additive16 — Function

additive16(data, [init, modulo])

Alias of additive_checksum(UInt16, data, init, modulo).

source

SimpleChecksums.additive32 — Function

additive32(data, [init, modulo])

Alias of additive_checksum(UInt32, data, init, modulo).

source

SimpleChecksums.additive64 — Function

additive64(data, [init, modulo])

Alias of additive_checksum(UInt64, data, init, modulo).

source

SimpleChecksums.additive_checksum — Method

additive_checksum(T, data, init::T = zero(T), modulo::T = typemax(T)) where {T<:Unsigned}

Compute a simple additive hash of the values in data.

Sums all the values in data, then computes modulo modulo, optionally starting from init instead of zero. Sums are allowed to overflow, which resets the sum to zero.

This function is extremely fast but suffers from many collisions. Flaws to keep in mind are:

zero values in data do not affect the checksum value at all;
the order of the values in data can be permuted without changing the checksum value; and
because of the modulo operation, the checksum of all zeros is equal to the checksum of all modulo values.

Arguments

T: An unsinged integer type.
data: A single Unsigned value or an iterator of values.
init::Unsigned = zero(T): Optional starting value.
modulo::T = typemax(T): A number to use as the modulo for addition. Typical choices are typemax(T) (the default) or some prime number close to but less than typemax(T).

Predefined convenience functions are additive16, additive32, and additive64.

See also: sysv16 for a different way of folding the sum into a fixed number of bits.

source

SimpleChecksums.adler32 — Function

adler32(data, init::Unsigned = one(UInt32))

Compute Adler's 32-bit checksum.

Adler's 32-bit checksum is the same as Fletcher's 32-bit checksum with a modulus of 65521 (the largest 16-bit prime) and an inital value of one. This is a strict improvement on fletcher16, but has a higher probability of collision, worse likelihood of error detection, and is slower than fletcher32. Prefer fletcher32 instead.

source

SimpleChecksums.bsd16 — Function

bsd16(data, [init])

Alias of bsd_checksum(UInt16, data, init).

source

SimpleChecksums.bsd_checksum — Function

bsd_checksum(UInt16, data, init::UInt16 = zero(UInt16))

Compute a hash of data using the method defined in BSD's sum utility (and implemented in the GNU sum program).

The BSD checksum computes a 16-bit checksum value by bit-rotating the checksum value from the previous step to the right by 1 bit, then adding the next value from data and repeating. Sums are allowed to overflow, which resets the sum to zero.

This function is fast but suffers from many collisions. Flaws to keep in mind are:

zero values at the beginning of data do not affect the checksum value at all (unless the initial value is set to something other than zero); and
runs of zeros in data that are multiples of 16 in length do not affect the checksum value at all.

Arguments

data: A single Unsigned value or an iterator of values.
init::UInt16 = zero(UInt16): Optional starting value.

Predefined convenience function is bsd16.

source

SimpleChecksums.fletcher16 — Function

fletcher16(data, [init, modulo])

Alias of fletcher_checksum(UInt16, data, init, modulo=0x00ff, blocksize=380368696).

source

SimpleChecksums.fletcher16a — Function

fletcher16a(data, [init, modulo])

Alias of fletcher_checksum(UInt16, data, init, modulo=0x0100, blocksize=379625061).

source

SimpleChecksums.fletcher32 — Function

fletcher32(data, [init, modulo])

Alias of fletcher_checksum(UInt32, data, init, modulo=0x0000ffff, blocksize=23726746).

source

SimpleChecksums.fletcher32a — Function

fletcher32a(data, [init, modulo])

Alias of fletcher_checksum(UInt32, data, init, modulo=0x00010000, blocksize=23726565).

source

SimpleChecksums.fletcher64 — Function

fletcher64(data, [init, modulo])

Alias of fletcher_checksum(UInt64, data, init, modulo=0x00000000ffffffff, blocksize=92681).

source

SimpleChecksums.fletcher64a — Function

fletcher64a(data, [init, modulo])

Alias of fletcher_checksum(UInt64, data, init, modulo=0x0000000100000000, blocksize=92681).

source

SimpleChecksums.fletcher_checksum — Method

fletcher_checksum(T, data, init::T = zero(T), modulo::T = typemax(T), blocksize::Integer = 1)

Compute a hash of data using Fletcher's checksum.

Fletcher's checksum computes two values: c0, a running sum of values modulo some number, and c1, a running sum of c0 values modulo some number. The two values are contatenated bitwise into a single unsigned integer.

This function is fast and good at detecting small errors. Flaws to keep in mind are:

zero values at the beginning of data do not affect the checksum value at all (unless the initial value is set to something other than zero); and
the worst-case performance for this algorithm occurs when data are truly random, but all errors of up to 2 bits are detected on all runs of data of length modulo * (sizeof(T)*4) bits.

Arguments

T: An unsinged integer type.
data: A single Unsigned value or an iterator of values.
init::Unsigned = zero(T): Optional starting value.
modulo::T = typemax(T): Optional modulo value, applied independently to the sum and the sum-of-sums after each block is summed.
blocksize::Integer = 1: Optional block size for summing before applying the modulo operation. blocksize should be chosen with modulo, T, and eltype(data) in mind to make sure the sum operation does not overflow.

Predefined convenience functions are fletcher16, fletcher32, and fletcher64 for the standard modulo value of typemax(T) shifted right half the number of bits in the checksum, and fletcher16a, fletcher32a, and fletcher64a for the alternate modulo value of 1 left shifted half the number of bits in the checksum.

SimpleChecksums

Synopsis

Installation

Use

Why?

API