Posts: 12
Threads: 2
Joined: Aug 2025
09-05-2025, 02:20 AM
(This post was last modified: 09-05-2025, 02:24 AM by Stanley Durham.)
One thing I learned from this, for absolute speed, you can’t use normal WString comparison. Also, Equal is faster than Comparison if that’s all you need, which is the case with a hash table. Comparison requires two comparisons to determine if it’s less than or greater than. For absolute speed, you can’t use; If A$ = B$ Then.
In the Equal callback, testing the lengths first immediately eliminates some values. Passing a pointer and the length to the callback eliminates that overhead in the callback.
I could get this faster, about 0.095 seconds to find 1,000,000 keys, by writing the code inline instead of using callbacks, but the complexity isn’t worth the benefit.
Code:
Very Fast WString /Object Hash Table
Posts: 47
Threads: 4
Joined: May 2024
09-05-2025, 05:59 AM
(This post was last modified: 09-05-2025, 06:04 AM by Stuart McLachlan.)
Already on forum.powerbasic.com and in gbThreads? Why repeat it here?
Posts: 12
Threads: 2
Joined: Aug 2025
(09-05-2025, 05:59 AM)Stuart McLachlan Wrote: Already on forum.powerbasic.com and in gbThreads? Why repeat it here?
Because it isn’t on PB. This is an object hash table. It’s tweaked, extremely fast and simple to use. This is entirely new code, the result of much experimentation and testing.
It was a personal goal to build a hash table that can find 1,000,000 keys in less than a 10th of a second. This will almost do that. I did build one that will do that, but this is a lot simpler and only slightly slower.
It’s traditional that if you build a better version, you have an update.
Posts: 6
Threads: 0
Joined: Jan 2025
Hi Stanley,
The following code
Code: Class Method LongFromObj(o As IUnknown) As Long
If IsObject(o) Then
o.AddRef()
Method = Peek(Long, VarPtr(o))
End If
End Method
wouldn't it be better to use ObjPtr?
Code: Class Method LongFromObj(o As IUnknown) As Long
If IsObject(o) Then
o.AddRef()
Method = ObjPtr(o) 'Peek(Long, VarPtr(o))
End If
End Method
Posts: 12
Threads: 2
Joined: Aug 2025
09-05-2025, 09:49 AM
(This post was last modified: 09-05-2025, 10:15 AM by Stanley Durham.)
There are no other options for PB users. It will find 10,000,000 random keys in 1.936 seconds.
Asking Google AI: “How long will it take the best hash tables to find 10,000,000 wide string keys” = one second
“This is a very rough estimate …”
Based on “rough estimate” about a second slower than the very best on 10,000,000 keys. Almost world class code.
It’s worthy of an update.
(09-05-2025, 09:28 AM)Steven Pringels Wrote: Hi Stanley,
The following code
Code: Class Method LongFromObj(o As IUnknown) As Long
If IsObject(o) Then
o.AddRef()
Method = Peek(Long, VarPtr(o))
End If
End Method
wouldn't it be better to use ObjPtr?
Code: Class Method LongFromObj(o As IUnknown) As Long
If IsObject(o) Then
o.AddRef()
Method = ObjPtr(o) 'Peek(Long, VarPtr(o))
End If
End Method
Probably. Will require some testing. It might make a big difference. This is fast of finding key but will be dramatically slower getting the object due to overhead.
Didn’t work, it crashed. Not sure why, but it could be that the reference count must be increased before calling that, it may not increase the count. In which case, it might only be slightly faster.
I’m afraid to mess with it.
Posts: 12
Threads: 2
Joined: Aug 2025
09-05-2025, 11:06 AM
(This post was last modified: 09-05-2025, 11:21 AM by Stanley Durham.)
Asked Google AI: “How long will it take the best hash tables to find 1,000,000 string keys”
“A C++ benchmark showed highly optimized implementations like ClickHouse's HashMap performing 1 million lookups in about 200 milliseconds, significantly outperforming a standard library implementation.”
Asked Google AI: “how fast is ClickHouse HashMap on 1,000,000 strings”
“For one million strings, ClickHouse's highly optimized hash table can operate in as little as 0.201 seconds, making it extremely fast for aggregation and lookup tasks.”
“In a 2023 benchmark, ClickHouse's custom hash map implementation was faster than several well-known C++ alternatives: Google DenseMap: 0.261s, Abseil HashMap: 0.307s, std::unordered_map: 0.466s”
This is faster than all those, 0.112 seconds, and I can get it down to 0.095 seconds with a lot of inline code.
This might be the world’s fastest hash table. I tried random keys 10 to 20 characters long, still faster, 0.127 seconds. And I can make it faster with inline code.
You can change the test count: Local testcount As Long : testcount = 1000000 'can change
You can change the string size: Function RandomString() As String, For i = 1 To Rnd(5, 8)
If i cheat and use this to build the keys:
Code: ReDim a(1 To testcount)
For i = 1 To testcount
'a(i) = RandomString()
a(i) = Format$(i)
Next i
instead of this:
Code: ReDim a(1 To testcount)
For i = 1 To testcount
a(i) = RandomString()
'a(i) = format$(i)
Next i
then I can get it down to 0.035 seconds for 1,000,000 keys
Posts: 6
Threads: 0
Joined: Jan 2025
09-05-2025, 04:32 PM
(This post was last modified: 09-05-2025, 04:48 PM by Steven Pringels.)
Phenomenal speed Stanley.
On my overclocked AMD Ryzen 9 3900X (4.25Ghz) which is quite old I get:
add 100,000 random key/value items
Time = 000.078
Count = 100,000 duplicate keys not allowed
find 100,000 random keys
Time = 000.010
Hi Stanley,
The reason why you have the crash is because of the ObjFromLong should not use ObjPtr because the object doesn't exist and therefore the pointer isn't allocated (null pointer).
This works
Code: Class Method LongFromObj(o As IUnknown) As dword
If IsObject(o) Then
o.AddRef()
Method = ObjPtr(o)
'Method = Peek(Dword, VarPtr(o))
End If
End Method
Class Method ObjFromLong(ByVal h As dword) As IUnknown
Local o As IUnknown
If h Then
Poke Dword, VarPtr(o), h
o.AddRef()
Method = o
End If
End Method
1 million items gives me:
add 1,000,000 random key/value items
Time = 000.853
Count = 999,909 duplicate keys not allowed
find 1,000,000 random keys
Time = 000.172
Posts: 12
Threads: 2
Joined: Aug 2025
It works, but the reference count still has to be increased. It affects the add time.
Can't tell the difference.
I made the change and will use it in the future but not worthy of an update.
Thanks.
Code: Class Method LongFromObj(o As IUnknown) As Long
If IsObject(o) Then
o.AddRef()
'Method = Peek(Long, VarPtr(o))
Method = ObjPtr(o)
End If
End Method
|