Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> so it is obvious a hash table will work well for them, whereas a trie will waste space trying to find common prefixes between random keys.)

It is obvious, but it is wrong. I observed Judy beating out many hash tables in specifically this situation, and comparing favorably to google's sparsehash. Given its much smaller code size and fewer dependencies this has demonstrated itself a huge win at my organization.

Link to my benchmark: http://reddit.com/r/programming/comments/bhxwh/hash_table_be...



Benchmarking beats random ideas, no doubt about that! Thanks for pointing out my mistake with integers. However, I would say Judy doesn't look so hot for string keys - slower at most things and similar memory requirements.

I'd be interested to see 10 million 64-bit integers in Judy and DenseHash though. Since ten million is 0.2% of 2^32, in some sense lots of the potential key space is actually in use, so I'd expect a lot of sharing to be possible and a sparse trie to perform well.

Is there a 64-bit integer key Judy array you could test on?


> Is there a 64-bit integer key Judy array you could test on?

Simply using the 64-bit integer as a 8-byte key is what I use.

Treating the 32-bit integer as 4-byte keys performs similarly to using the integer-API directly.

> I'd be interested to see 10 million 64-bit integers in Judy and DenseHash though

It's easy enough to check :)


If you're talking about performance, you should compare to the dense hash part of the sparsehash package.


I did.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: