Выбор некриптографического алгоритма хеширования?

Question

Maxim Avanov @Ghostwriter

Выбор некриптографического алгоритма хеширования?

Для одного самописного балансировщика нагрузки необходимо выбрать алгоритм (не криптографический) для подсчёта контрольных сумм. Входные данные (строковые ключи) всегда точно больше 32 байт.

Очень хотелось бы услышать мнение разработчиков с глубоким пониманием темы. Какой из следующих алгоритмов вы бы порекомендовали или предостерегли от использования?

То, что удалось найти самому:

— fnv-1a — самый распространенный по описанию в сети;
— lookup3 — то, к чему склоняюсь сам; на мой взгляд — наиболее оптимальный, но смущает отсутствие ссылок на успешное применение в проектах, как у fnv;
— MurmurHash2 — судя по доступным тестам — самый быстрый, есть истории применения (libmemcached, Hadoop); но какая-то неадекватная ситуация с коллизиями на определенных тестовых наборах — simon-says.vox.com/library/post/murmur-hash-very-f... — "...one pathological input sequence of 2^32 values causes the algorithm to suffer from a rate of hash collision as high as 97.6%"

Вопрос задан более трёх лет назад
4332 просмотра

3 комментария

Подписаться 4 Оценить 3 комментария

Maxim Avanov @Ghostwriter Автор вопроса

только что заметил, что ссылка больше не действительна. Она была найдена через Википедию — en.wikipedia.org/wiki/MurmurHash#cite_note-14
Там был отличный тестовый набор с результатами. К сожалению и в кэше гугла не сохранилось…

Написано более трёх лет назад
Maxim Avanov @Ghostwriter Автор вопроса

Кэш всё-таки нашёлся!
webcache.googleusercontent.com/search?sourceid=chrome&ie=UTF-8&q=cache:http://simon-says.vox.com/

Написано более трёх лет назад
Maxim Avanov @Ghostwriter Автор вопроса

Чтобы иметь возможность ссылаться на этот тест впоследствии, продублирую статью из кеша.

— Murmur Hash: very fast and collision resistant?

* Nov 1, 2008
* Post a comment

Being interested in everything related to hashes then I was very pleased to recently come across MurmurHash 2.0 which is said to be faster and more collision resistant than all other publically available 32bit hashing algorithms.

The author, Austin Appleby, also makes an impressive claim:

«Murmur is close enough to random to be suitable for virtually all
keysets, though it does have the interesting property that it produces
0 collisions if the keyset is all 4-byte integers from 0 to 2^32-1 — this is a side effect of it doing mixes on the full 4-byte hash state
using only reversible operations.»

This claim is so impressive that I decided to test it. And here is the source code of the test program:

# cat murmurhash2.c
// Code taken from murmurhash.googlepages.com/MurmurHash2.cpp

//-----------------------------------------------------------------------------
// MurmurHash2, by Austin Appleby

// Note — This code makes a few assumptions about how your machine behaves — // 1. We can read a 4-byte value from any address without crashing
// 2. sizeof(int) == 4

// And it has a few limitations — // 1. It will not work incrementally.
// 2. It will not produce the same results on little-endian and big-endian
// machines.

unsigned int MurmurHash2 ( const void * key, int len, unsigned int seed )
{
// 'm' and 'r' are mixing constants generated offline.
// They're not really 'magic', they just happen to work well.

const unsigned int m = 0x5bd1e995;
const int r = 24;

// Initialize the hash to a 'random' value

unsigned int h = seed ^ len;

// Mix 4 bytes at a time into the hash

const unsigned char * data = (const unsigned char *)key;

while(len >= 4)
{
unsigned int k = *(unsigned int *)data;

k *= m;
k ^= k >> r;
k *= m;

h *= m;
h ^= k;

data += 4;
len -= 4;
}

// Handle the last few bytes of the input array

switch(len)
{
case 3: h ^= data[2] << 16;
case 2: h ^= data[1] << 8;
case 1: h ^= data[0];
h *= m;
};

// Do a few final mixes of the hash to ensure the last few
// bytes are well-incorporated.

h ^= h >> 13;
h *= m;
h ^= h >> 15;

return h;
}

# cat test-collisions-for-2pow32-032bit-keys.c
#include <stdio.h>
#include <malloc.h>
#include «murmurhash2.c»

/**
* Copyright © 2008 Simon Hardy-Francis
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation
* files (the «Software»), to deal in the Software without
* restriction, including without limitation the rights to use,
* copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following
* conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED «AS IS», WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*
* Except as contained in this notice, the name(s) of the above
* copyright holders shall not be used in advertising or otherwise
* to promote the sale, use or other dealings in this Software
* without prior written authorization.
*/

#define TWO_POW_32 4294967296
#define TWO_POW_32_DIV_8 536870912
#define BYTE_STATS_MAX 4096

unsigned char bit_masks[8] = {1,2,4,8,16,32,64,128};
unsigned int byte_stats[BYTE_STATS_MAX];

int main(void)
{
union {
char as_u08[24];
char as_s08[24];
unsigned int as_u32[6];
} key;
unsigned int hash32;
unsigned int i,j,k,l,m,n;
unsigned int byte_index;
unsigned int bit_index;
unsigned int bit_mask;
unsigned int collisions_so_far;
double dcollisions_so_far;
double di;
char * bits = malloc(TWO_POW_32_DIV_8);

collisions_so_far = 0;

for ( i = 0; i < BYTE_STATS_MAX; ++ i )
{
byte_stats[i] = 0;
}

for ( j = 0; j < (TWO_POW_32_DIV_8); ++ j )
{
bits[j] = 0;
}

for ( i = 0; i < (TWO_POW_32-1); ++ i )
//for ( i = 0; i < (500000001); ++ i )
{
key.as_u32[0] = i;
key.as_u32[1] = i;
key.as_u32[2] = i;
key.as_u32[3] = i;
hash32 = MurmurHash2 ( &key.as_s08[0], 4, (0xdeadbeef * 16) );
if ( hash32 < BYTE_STATS_MAX )
{
byte_stats[hash32] ++;
}
byte_index = hash32;
bit_index = hash32;
byte_index >>= 3;
bit_index &= 7;
bit_mask = bit_masks[bit_index];
if(((i % 10000000) == 0)
&& (i != 0))
{
di = i;
dcollisions_so_far = collisions_so_far;
printf("%10u collisions_so_far or %4.1f%% of %10u ", collisions_so_far, (dcollisions_so_far/di*100), i);
/* draw graph ;-) */
k = (dcollisions_so_far/di*100);
for ( j = 0; j < (k/2); ++ j )
{
printf(" ");
}
printf("*\n");
}
if((bits[byte_index] & bit_mask ) != 0)
{
collisions_so_far ++;
} else {
bits[byte_index] |= bit_mask;
}
}

for ( i = 0; i < BYTE_STATS_MAX; ++ i )
{
if ( byte_stats[i] == 0 )
{
printf("-");
}
else
{
if ( byte_stats[i] >= 10 )
{
printf("(%d)", byte_stats[i]);
}
else
{
printf("%d", byte_stats[i]);
}
}
}
}

And here's the output:

# gcc test-collisions-for-2pow32-032bit-keys.c -o test-collisions-for-2pow32-032bit-keys && ./test-collisions-for-2pow32-032bit-keys
0 collisions_so_far or 0.0% of 10000000 *
0 collisions_so_far or 0.0% of 20000000 *
0 collisions_so_far or 0.0% of 30000000 *
0 collisions_so_far or 0.0% of 40000000 *
0 collisions_so_far or 0.0% of 50000000 *
0 collisions_so_far or 0.0% of 60000000 *
0 collisions_so_far or 0.0% of 70000000 *
0 collisions_so_far or 0.0% of 80000000 *
0 collisions_so_far or 0.0% of 90000000 *
0 collisions_so_far or 0.0% of 100000000 *
…

0 collisions_so_far or 0.0% of 4000000000 *
0 collisions_so_far or 0.0% of 4010000000 *
0 collisions_so_far or 0.0% of 4020000000 *
0 collisions_so_far or 0.0% of 4030000000 *
0 collisions_so_far or 0.0% of 4040000000 *
0 collisions_so_far or 0.0% of 4050000000 *
0 collisions_so_far or 0.0% of 4060000000 *
0 collisions_so_far or 0.0% of 4070000000 *
0 collisions_so_far or 0.0% of 4080000000 *
0 collisions_so_far or 0.0% of 4090000000 *
0 collisions_so_far or 0.0% of 4100000000 *
0 collisions_so_far or 0.0% of 4110000000 *
0 collisions_so_far or 0.0% of 4120000000 *
0 collisions_so_far or 0.0% of 4130000000 *
0 collisions_so_far or 0.0% of 4140000000 *
0 collisions_so_far or 0.0% of 4150000000 *
0 collisions_so_far or 0.0% of 4160000000 *
0 collisions_so_far or 0.0% of 4170000000 *
0 collisions_so_far or 0.0% of 4180000000 *
0 collisions_so_far or 0.0% of 4190000000 *
0 collisions_so_far or 0.0% of 4200000000 *
0 collisions_so_far or 0.0% of 4210000000 *
0 collisions_so_far or 0.0% of 4220000000 *
0 collisions_so_far or 0.0% of 4230000000 *
0 collisions_so_far or 0.0% of 4240000000 *
0 collisions_so_far or 0.0% of 4250000000 *
0 collisions_so_far or 0.0% of 4260000000 *
0 collisions_so_far or 0.0% of 4270000000 *
0 collisions_so_far or 0.0% of 4280000000 *
0 collisions_so_far or 0.0% of 4290000000 *
1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111

So the test program hashes 2^32 unique 32 bit input strings while detecting collisions and outputs results every 10 million iterations. We also capture a sample of 32bit hash value frequencies in the form of the first 4k of hash values and display these frequencies when the 2^32 loop is finished. If a hash value frequency is 0 (because it was never returned) then it is displayed as a dash '-', otherwise the frequency of the hash value is output. As expected, the output above shows 0 collisions during the 2^32 iterations and all 1s for the hash value frequencies.

Very nice I thought. So the cool thing about this hash algorithm is that it works on 32 bits of the input string to hash at a time. This is also one of the factors which makes it so fast. Most hashing algorithms only work on 8 bits of the input string at a time. A minority work on 16 bits of the input string at a time. Based on these initially very positive results I modified by test program to use bigger input strings. However, because I'm a very lazy developer I just duplicated the first 32 bits of the input string several times. Here is the diff showing how this new test program differs from the one above:

# diff test-collisions-for-2pow32-032bit-keys.c test-collisions-for-2pow32-128bit-keys.c
78c78
< hash32 = MurmurHash2 ( &key.as_s08[0], 4, (0xdeadbeef * 16) );
— > hash32 = MurmurHash2 ( &key.as_s08[0], 16, (0xdeadbeef * 16) );

So now our input string to hash is 128 bits instead of 32 bits. The initial 32 bits of the input string to hash has been duplicated 4 times. And here is the output of the second test program:

# gcc test-collisions-for-2pow32-128bit-keys.c -o test-collisions-for-2pow32-128bit-keys && ./test-collisions-for-2pow32-128bit-keys
746316 collisions_so_far or 7.5% of 10000000 *
2758812 collisions_so_far or 13.8% of 20000000 *
5913827 collisions_so_far or 19.7% of 30000000 *
9966267 collisions_so_far or 24.9% of 40000000 *
14697559 collisions_so_far or 29.4% of 50000000 *
20078662 collisions_so_far or 33.5% of 60000000 *
25950773 collisions_so_far or 37.1% of 70000000 *
32254750 collisions_so_far or 40.3% of 80000000 *
38906168 collisions_so_far or 43.2% of 90000000 *
45898801 collisions_so_far or 45.9% of 100000000 *
…

3898620227 collisions_so_far or 97.5% of 4000000000 *
3908620227 collisions_so_far or 97.5% of 4010000000 *
3918620227 collisions_so_far or 97.5% of 4020000000 *
3928620227 collisions_so_far or 97.5% of 4030000000 *
3938620227 collisions_so_far or 97.5% of 4040000000 *
3948620227 collisions_so_far or 97.5% of 4050000000 *
3958620227 collisions_so_far or 97.5% of 4060000000 *
3968620227 collisions_so_far or 97.5% of 4070000000 *
3978620227 collisions_so_far or 97.5% of 4080000000 *
3988620227 collisions_so_far or 97.5% of 4090000000 *
3998620227 collisions_so_far or 97.5% of 4100000000 *
4008620227 collisions_so_far or 97.5% of 4110000000 *
4018620227 collisions_so_far or 97.5% of 4120000000 *
4028620227 collisions_so_far or 97.5% of 4130000000 *
4038620227 collisions_so_far or 97.6% of 4140000000 *
4048620227 collisions_so_far or 97.6% of 4150000000 *
4058620227 collisions_so_far or 97.6% of 4160000000 *
4068620227 collisions_so_far or 97.6% of 4170000000 *
4078620227 collisions_so_far or 97.6% of 4180000000 *
4088620227 collisions_so_far or 97.6% of 4190000000 *
4098620227 collisions_so_far or 97.6% of 4200000000 *
4108620227 collisions_so_far or 97.6% of 4210000000 *
4118620227 collisions_so_far or 97.6% of 4220000000 *
4128620227 collisions_so_far or 97.6% of 4230000000 *
4138620227 collisions_so_far or 97.6% of 4240000000 *
4148620227 collisions_so_far or 97.6% of 4250000000 *
4158620227 collisions_so_far or 97.6% of 4260000000 *
4168620227 collisions_so_far or 97.6% of 4270000000 *
4178620227 collisions_so_far or 97.6% of 4280000000 *
4188620227 collisions_so_far or 97.6% of 4290000000 *
--------------------(24)----------(40)----------------------------(20)-----------------------------------------------------------------------------------------------------------------------(88)--(32)---------4-----------------------------------------------------------------------------------------------8------8---------------------------------------(24)---------(52)-----------------------------(32)------------4---------------------------------------------------------------------------------------------------------(64)--------------------(180)--------------------------------------(36)----------------------4------(32)-----------------------------(12)------------------------------(26)--------(32)----------------------------------------8-----------------------------(16)----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------(32)-----------------------------(48)-----------------------------4--------------------------------(16)------8------------(36)----------------(148)--(64)-----------------------------8---------(68)-------------------8---------(40)-------8----------------------------------(52)-----(24)---------(96)---(44)--------------------------------------------(48)--------------------(32)---------(16)-------------------------------------------------8---------(16)---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------(16)--------------------------------------------------------------(40)-(12)--------------(112)----(16)-----------------(36)------------------------------------------------------------------------------------------------------------------------------------------------------------------------------(28)-------------(68)-----(16)----(12)--------(68)----------8----------------------------------(76)--------------------------------------------(56)-----------------(88)-(80)---------------------------------------8--------------------------------------------------------------------------------------------------------------------------------------------------(36)-------------------(36)-------------------------------------------------------------------------------(24)-----(32)-----------------------------------------------------4-------------------(12)----------------------------------------------------------------------------------------(40)-------------------(20)-------------------4-----------------(140)-------------------(56)---------8-----------------------------(16)-(124)--------(24)--------(32)-------------------------------------------------8-------------------(40)-----------------------------------------------------------------------------------------(68)---------------------------------------------------------------------------------------------------(32)------------------------------------------------------------------------(96)-----------------------------------------------------------(88)-------------------8--(24)-----------------------------2-----------------------------(32)---------(60)------4--(28)------4--------------------------------------------------------------------------------------------------------------------------------------(128)--------------------------(112)----------------------------------------(24)---------------------------------------8-----------------------------(88)-------------------8-------------------------------------------------(32)-------------------(28)---------(24)--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------(84)-------------------------------------------------(16)---------(76)-------------------(16)--------------------------------------------------------------------------------(12)-(40)-----------------(16)------------------------------(68)---------------------------------------------------------------(40)------------------------------

To my surprise the second program shows a 97.6% collision rate. The hash value frequency also reflects that many hash values in the first 4k range were never returned as a result. This result is soooo bad that I'm wondering if I made a mistake in the test programs somewhere. Maybe somebody else can re-write the short test and confirm the results?

Написано более трёх лет назад

Помогут разобраться в теме Все курсы

Нетология

Go-разработчик с нуля + нейросети

9 месяцев

Далее
Академия Эдюсон

Python-разработчик + ИИ

9 месяцев

Далее
ProductStar × РБК

Профессия: Python-разработчик + ИИ

8 месяцев

Далее

Пригласить эксперта

Ответы на вопрос 2

3 комментария

bit @bit

«Эмпирически установлено, что значения 31 и 37 являются хорошими множителями в хэш-функции
для строк ASCII.» — цитата из книги к данному алгоритму.

Написано более трёх лет назад
Maxim Avanov @Ghostwriter Автор вопроса

К слову, алгоритм MurmurHash не сильно сложнее — tanjent.livejournal.com/756623.html :)

А у книжного алгоритма были указаны какие-то сводные характеристики? Вот у того же Murmur:
«This version passes Jenkin's frog.c torture-test up to 2^25 keypairs (beats Hsieh's 2^17; previous Murmur failed after 2^9), distribution is still excellent (only 3 collisions on the SCOWL english-words.95 list vs. Hsieh's 41), and still runs faster than Hsieh and Jenkins at 1300 mb/sec vs. their ~900 mb/sec.»

Написано более трёх лет назад
bit @bit

Никаких характеристик. Просто приводится как элемент примера работы с хэшами.
По-моему определяющим является длина получаемого хэша. Ведь как ни мудри, а если на выходе 32 бита — ими ты и ограничен. И что касается тестов — на одной последовательности лучший результат у одного, на другой — у другого. Хэширование строк — процесс в некотором роде вероятностный.

Написано более трёх лет назад

3 комментария

Maxim Avanov @Ghostwriter Автор вопроса

А среди его вариаций есть что-то лучше, чем CRC32? CRC32 не годится из-за этого:
«CRC32 produced relatively poorly distributed results, apparently by design: it is not supposed to produce uniformly distributed results, but to detect bursts of bit errors.» — www.usenix.org/publications/library/proceedings/als01/full_papers/phillips/phillips_html/

Да и сам алгоритм у него медленнее, чем три вышеприведённых — tanjent.livejournal.com/756623.html

Написано более трёх лет назад
MikhailEdoshin @MikhailEdoshin

Меня смутило, что вы пишете о контрольных суммах :) т. е. я не совсем понял вопрос. Увы, со знанием дела ничего своего сказать не смогу. Но вот нашел интересную статью вполне по теме, там как раз есть мотивированная оценка fnv (негативная) и дается некий Bob Jenkins hash, судя по тестам, куда более симпатичный.

Написано более трёх лет назад
Maxim Avanov @Ghostwriter Автор вопроса

О, спасибо! Весьма ценно. Jenkin's hash это как раз lookup3 :)

>
> Меня смутило, что вы пишете о контрольных суммах :)
>

Я всегда думал, что «hash value» == «checksum». Но в терминологии тоже не силён :)

Написано более трёх лет назад

Ваш ответ на вопрос

Войдите, чтобы написать ответ

Похожие вопросы

Алгоритмы

Простой
Стоит ли читать книгу «Грокаем алгоритмы»?
- 1 подписчик
- 27 мая
- 325 просмотров
0

ответов
Алгоритмы

Простой
Как реализовать поиск слов в файлах?
- 2 подписчика
- 06 апр.
- 389 просмотров
3

ответа
Алгоритмы

Простой
Как распознавать полосу для робота на шахматной доске?
- 1 подписчик
- 19 мар.
- 220 просмотров
1

ответ
Хеширование

Простой
Какие есть совместимые алгоритмы хеширования между C# и Java?
- 1 подписчик
- 11 мар.
- 162 просмотра
3

ответа
Алгоритмы

Простой
Какие данные берет функция для генерации случайного числа?
- 1 подписчик
- 18 февр.
- 281 просмотр
4

ответа
C++

+1 ещё

Простой
Почему не решает задачу?
- 1 подписчик
- 05 февр.
- 427 просмотров
1

ответ
Алгоритмы

Средний
Как создать алгоритм для ракеты в игре?
- 2 подписчика
- 30 окт. 2025
- 405 просмотров
1

ответ
Алгоритмы

Простой
Как решать алгоритмические хакатоны и учится?
- 1 подписчик
- 27 окт. 2025
- 295 просмотров
2

ответа
Базы данных

+1 ещё

Простой
Как правильно реализовать обновление порядка (поле order) в бд?
- 1 подписчик
- 19 окт. 2025
- 252 просмотра
2

ответа
Алгоритмы

Простой
Как открыть файл сжатый с помощью алгоритма Brotli на пк?
- 1 подписчик
- 19 окт. 2025
- 235 просмотров
0

ответов
Показать ещё Загружается…

только что заметил, что ссылка больше не действительна. Она была найдена через Википедию — en.wikipedia.org/wiki/MurmurHash#cite_note-14
Там был отличный тестовый набор с результатами. К сожалению и в кэше гугла не сохранилось…
Кэш всё-таки нашёлся!
webcache.googleusercontent.com/search?sourceid=chrome&ie=UTF-8&q=cache:http://simon-says.vox.com/

Answer 1 · 2010-11-12 13:44:53

Если не сильно заморачиваться, то в книге Кернигана и Пайка «Практика программирования» есть довольно простой алгоритм:
enum { MULTIPLIER = 31 };
/* hash: вычислить
хэш-функцию строки */
unsigned int hash(char *str)
{
unsigned int h;
unsigned char *p;
h = 0;
for (p = (unsigned char *)
str; *p != '\0'; p++)
h = MULTIPLIER * h +
*p; return h % NHASH; }

Answer 2 · 2010-11-12 18:13:10

MikhailEdoshin @MikhailEdoshin

А многочисленные вариации CRC почему не годятся?

Ответ написан более трёх лет назад

3 комментария

Выбор некриптографического алгоритма хеширования?

Войдите, чтобы написать ответ

Минуточку внимания

Войдите на сайт