"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Stefan Sperling <stsp@stsp.name>
Subject:
Re: MurmurHash2 unaligned access
To:
Christian Weisgerber <naddy@mips.inka.de>
Cc:
gameoftrees@openbsd.org
Date:
Thu, 14 Oct 2021 14:36:24 +0200

Download raw body.

Thread
On Thu, Oct 14, 2021 at 01:12:58PM +0200, Christian Weisgerber wrote:
> Stefan Sperling:
> 
> > Our hash input (SHA1 has object ID) is an uint8_t array type.
> > Should we change the type of the key argument to uint8_t * or is the
> > cast from void * to unsigned char * safe?
> 
> I feel confident that it's safe:
> 
>   void *
>   memcpy(void *dst0, const void *src0, size_t length)
>   {
>           char *dst = dst0;
>           const char *src = src0;
>   ...

I have tried a few things but I cannot get the current in-tree code
to fault on sparc64. Neither my 'got he -l' test and a toy program I wrote
managed to trigger the issue. Just out of curiousity, can you produce a short
program that calls murmurhash2 and triggers an alignment fault on sparc64,
octeon, or similar platforms?

In any case, given how the SHA1Update() function in libc works, this
patch should avoid any alignment issues.

ok?

diff 1d19226a8a09c02a94d6ccee03f964fd413f2fe1 /home/stsp/src/got
blob - ac869f866068f2b1050c52779f2539890a8c877f
file + lib/murmurhash2.c
--- lib/murmurhash2.c
+++ lib/murmurhash2.c
@@ -5,11 +5,12 @@
 /* Obtained from https://github.com/aappleby/smhasher */
 
 #include <stdint.h>
+#include <string.h>
 
 #include "murmurhash2.h"
 
 uint32_t
-murmurhash2(const void * key, int len, uint32_t seed)
+murmurhash2(const unsigned char * key, int len, uint32_t seed)
 {
   // 'm' and 'r' are mixing constants generated offline.
   // They're not really 'magic', they just happen to work well.
@@ -23,12 +24,14 @@ murmurhash2(const void * key, int len, uint32_t seed)
 
   // Mix 4 bytes at a time into the hash
 
-  const unsigned char *data = (const unsigned char *)key;
+  const unsigned char *data = key;
 
   while(len >= 4)
   {
-    uint32_t k = *(uint32_t*)data;
+    uint32_t k;
 
+    memcpy(&k, data, sizeof(k));
+
     k *= m;
     k ^= k >> r;
     k *= m;
@@ -58,4 +61,4 @@ murmurhash2(const void * key, int len, uint32_t seed)
   h ^= h >> 15;
 
   return h;
-} 
+}
blob - bbd2bf3ef0309efebdfbb79dd75aa100d4fb0b74
file + lib/murmurhash2.h
--- lib/murmurhash2.h
+++ lib/murmurhash2.h
@@ -4,4 +4,4 @@
 
 /* Obtained from https://github.com/aappleby/smhasher */
 
-uint32_t murmurhash2(const void *key, int len, uint32_t seed);
+uint32_t murmurhash2(const unsigned char *key, int len, uint32_t seed);