From: Stefan Sperling Subject: Re: MurmurHash2 unaligned access To: Christian Weisgerber Cc: gameoftrees@openbsd.org Date: Thu, 14 Oct 2021 12:27:35 +0200 On Thu, Oct 14, 2021 at 12:03:57PM +0200, Christian Weisgerber wrote: > I'm surprised this hasn't blown up on sparc64: > > uint32_t > murmurhash2(const void * key, int len, uint32_t seed) > { > ... > const unsigned char *data = (const unsigned char *)key; > ... > uint32_t k = *(uint32_t*)data; > ... > The patch below eliminates the cast to uint32_t. Our hash input (SHA1 has object ID) is an uint8_t array type. Should we change the type of the key argument to uint8_t * or is the cast from void * to unsigned char * safe? It seems 'data' will be accessed in multiples of 4. I don't know sparc64 alignement requirements off the top of my head. Is 4 or 8 required? diff 1d19226a8a09c02a94d6ccee03f964fd413f2fe1 /home/stsp/src/got blob - ac869f866068f2b1050c52779f2539890a8c877f file + lib/murmurhash2.c --- lib/murmurhash2.c +++ lib/murmurhash2.c @@ -5,6 +5,7 @@ /* Obtained from https://github.com/aappleby/smhasher */ #include +#include #include "murmurhash2.h" @@ -27,8 +28,10 @@ murmurhash2(const void * key, int len, uint32_t seed) while(len >= 4) { - uint32_t k = *(uint32_t*)data; + uint32_t k; + memcpy(&k, data, sizeof(k)); + k *= m; k ^= k >> r; k *= m;