ComputeNonCryptHash

version 2.2.0 (54.9 KB) by Rik
Compute a non-cryptographic hash

21 Downloads

Updated 12 Feb 2022

From GitHub


This function is intended to be fast, but without requiring a Java or mex implementation to do the actual hashing. It was not checked for any security flaws and is therefore probably vulnerable to most attacks.

Non-cryptographic hashes should only be used as a checksum. Don't use this to do things like storing passwords.

This function will transform most common data types to a uint16 vector to apply the hash in an array operation. Changing the data type should change the hash. The allowed data types are uint*, int*, char, cell, struct, double, single, and string (which is cast to cell array of chars). The contents of the nested data types (i.e. cell and struct) must also be one of the mentioned data types.
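To illustrate why the data type is made part of the hashed input, here is a minimal Python sketch. It is not the algorithm used by ComputeNonCryptHash: the helper names `tagged_bytes` and `tagged_checksum` are hypothetical, the type tags are arbitrary, and `zlib.crc32` is only a stand-in checksum. The idea shown is that the numeric value 1 stored as an integer, as a double, and as the character '1' each produce a different byte sequence once a type tag is prepended.

```python
import struct
import zlib


def tagged_bytes(value):
    """Hypothetical helper: serialise a value with a leading type tag."""
    if isinstance(value, bool):
        raise TypeError("unsupported type in this sketch")
    if isinstance(value, int):
        tag, payload = b"int64", struct.pack("<q", value)
    elif isinstance(value, float):
        tag, payload = b"double", struct.pack("<d", value)
    elif isinstance(value, str):
        # UTF-16 code units, mirroring how Matlab stores char data.
        tag, payload = b"char", value.encode("utf-16-le")
    else:
        raise TypeError("unsupported type in this sketch")
    # A zero byte separates the tag from the payload.
    return tag + b"\x00" + payload


def tagged_checksum(value):
    """Stand-in checksum (CRC-32) over the tagged bytes; an assumption, not the real hash."""
    return zlib.crc32(tagged_bytes(value))


if __name__ == "__main__":
    for item in (1, 1.0, "1"):
        print(type(item).__name__, format(tagged_checksum(item), "08x"))
```

Without the tag, an int64 zero and a double zero would serialise to the same eight zero bytes, so any checksum over the raw bytes alone could not tell them apart.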

Version 1.x of this algorithm attempts to cast string to char, instead of to a cell array of chars. Version 1.x also has many hash collisions for scalar doubles. Version 2 transcodes the UTF-8 chars on Octave to UTF-16 (the Matlab standard), which ensures that the same Unicode code points as input return the same hash.
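The reason transcoding matters can be shown with a short Python sketch (an illustration only, not the function's own code; the helper names are hypothetical). The same code point is stored as different numeric values in UTF-8 and UTF-16, so hashing the raw stored values would give different results on Octave and Matlab unless one encoding is converted to the other first.

```python
def utf8_units(text):
    """Return the UTF-8 byte values of a string."""
    return list(text.encode("utf-8"))


def utf16_units(text):
    """Return the UTF-16 code units (16-bit values) of a string."""
    raw = text.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def transcode_utf8_to_utf16(byte_values):
    """Decode UTF-8 byte values and re-express them as UTF-16 code units."""
    return utf16_units(bytes(byte_values).decode("utf-8"))


if __name__ == "__main__":
    print(utf8_units("é"))                      # [195, 169]
    print(utf16_units("é"))                     # [233]
    print(transcode_utf8_to_utf16([195, 169]))  # [233]
```

Code points outside the Basic Multilingual Plane become a surrogate pair in UTF-16, so U+1F600 turns into the two code units 55357 and 56832.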

Performance was tested with an English word list of 216553 items in both upper and lower case (this list with the two duplicates removed) and with the numbers 0-1e6 as char and as double. An additional test used the images from the Stanford Dog Dataset, which contains 20580 images (the 89 duplicates were removed from this tar file before running the test). The timings below were measured on R2020b on Windows 10. For a comparison with other hash functions, see this SE thread. Note that these tests are different from the relative performance comparison.

| Hash length | English words | Numbers (in char) | Numbers (in double) | Images |
|---|---|---|---|---|
| 16 bits | 50 μs/hash, 391 630 collisions | 50 μs/hash, 958 488 collisions | 56 μs/hash, 958 487 collisions | 106 129 μs/hash, 4 976 collisions |
| 32 bits | 58 μs/hash, 305 collisions | 55 μs/hash, 31 056 collisions | 61 μs/hash, 120 collisions | 105 571 μs/hash, 0 collisions |
| 48 bits | 64 μs/hash, 144 collisions | 63 μs/hash, 0 collisions | 67 μs/hash, 16 033 collisions | 105 288 μs/hash, 0 collisions |
| 64 bits | 60 μs/hash, 1 collision | 56 μs/hash, 0 collisions | 62 μs/hash, 0 collisions | 105 366 μs/hash, 0 collisions |
| 128 bits | 65 μs/hash, 0 collisions | 58 μs/hash, 0 collisions | 69 μs/hash, 0 collisions | 106 289 μs/hash, 0 collisions |
| 192 bits | 70 μs/hash, 0 collisions | 68 μs/hash, 0 collisions | 72 μs/hash, 0 collisions | 105 932 μs/hash, 0 collisions |
| 256 bits | 83 μs/hash, 0 collisions | 85 μs/hash, 0 collisions | 92 μs/hash, 0 collisions | 106 187 μs/hash, 0 collisions |
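For context when reading the collision counts, the following Python sketch computes what an ideal, uniformly random hash would be expected to produce (the birthday problem). This is a general estimate and not part of ComputeNonCryptHash. The function names are hypothetical, the item count of 1 000 001 is an assumed reading of "the numbers 0-1e6", and how the table counts a collision is not stated here, so any comparison is only rough.

```python
import math


def expected_colliding_pairs(n_items, hash_bits):
    """Expected number of colliding pairs for an ideal hash: C(n, 2) / 2^bits."""
    return n_items * (n_items - 1) / 2 / 2 ** hash_bits


def probability_of_any_collision(n_items, hash_bits):
    """Birthday approximation: 1 - exp(-n(n-1) / 2^(bits+1))."""
    return -math.expm1(-expected_colliding_pairs(n_items, hash_bits))


if __name__ == "__main__":
    n = 1_000_001  # assumed item count for the numbers 0 to 1e6
    for bits in (32, 48, 64):
        pairs = expected_colliding_pairs(n, bits)
        prob = probability_of_any_collision(n, bits)
        print(f"{bits} bits: {pairs:.3g} expected pairs, P(any) = {prob:.3g}")
```

Under these assumptions an ideal 32-bit hash would give roughly 116 colliding pairs for a million inputs, and an ideal 64-bit hash would almost never collide at this scale.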

Licence: CC by-nc-sa 4.0

Cite As

Rik (2022). ComputeNonCryptHash (https://github.com/thrynae/ComputeNonCryptHash/releases/tag/v2.2.0), GitHub.

MATLAB Release Compatibility
Created with R2021b
Compatible with R13SP1 and later releases
Platform Compatibility
Windows macOS Linux
To view or report issues in this GitHub add-on, visit the GitHub Repository.