"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Stefan Sperling <stsp@stsp.name>
Subject:
Re: MurmurHash2 unaligned access
To:
Christian Weisgerber <naddy@mips.inka.de>
Cc:
gameoftrees@openbsd.org
Date:
Thu, 14 Oct 2021 12:27:35 +0200

Download raw body.

Thread
On Thu, Oct 14, 2021 at 12:03:57PM +0200, Christian Weisgerber wrote:
> I'm surprised this hasn't blown up on sparc64:
> 
>   uint32_t
>   murmurhash2(const void * key, int len, uint32_t seed)
>   {
>   ...
>     const unsigned char *data = (const unsigned char *)key;
>   ...
>       uint32_t k = *(uint32_t*)data;
>   ...
> 

The patch below eliminates the cast to uint32_t.

Our hash input (SHA1 has object ID) is an uint8_t array type.
Should we change the type of the key argument to uint8_t * or is the
cast from void * to unsigned char * safe? It seems 'data' will be
accessed in multiples of 4. I don't know sparc64 alignement
requirements off the top of my head. Is 4 or 8 required?

diff 1d19226a8a09c02a94d6ccee03f964fd413f2fe1 /home/stsp/src/got
blob - ac869f866068f2b1050c52779f2539890a8c877f
file + lib/murmurhash2.c
--- lib/murmurhash2.c
+++ lib/murmurhash2.c
@@ -5,6 +5,7 @@
 /* Obtained from https://github.com/aappleby/smhasher */
 
 #include <stdint.h>
+#include <string.h>
 
 #include "murmurhash2.h"
 
@@ -27,8 +28,10 @@ murmurhash2(const void * key, int len, uint32_t seed)
 
   while(len >= 4)
   {
-    uint32_t k = *(uint32_t*)data;
+    uint32_t k;
 
+    memcpy(&k, data, sizeof(k));
+
     k *= m;
     k ^= k >> r;
     k *= m;