QuickJS JSAtom and `JSRuntime->atom

{

.date= 2021-01-01

`.title=` Quickjs Jsatom And `jsruntime->atom_array` Part 2

It's now 2021, and I'm still struggling to understand some of the nuance behind the hashing/bucket finding algorithm. I understand how/what hash tables are, but I think with QuickJS's atom hash table, there's some optimizations going on that I haven't quite figured out.

Let's look through it, continuing from last time. The atom hash table is made of two primary data structures: atom_hash and atom_array. atom_hash is the list of indexes into atom_array for a given hash for a given table size. atom_hash also always has a length that is a power of two so that for each "resize" of the table, the hash can be masked. The basic usage seems to be:

we have key string K, and atom value V
we produce a hash of K hash(K) producing H which is a 32bit integer
since the initial hash size is say 256 (mask of 0xff) entries, we mask H to produce H1 by using H & 0ff. (H is then stored on V so that atom->hash is H and we can reuse it later).
H1 is used as the index into atom_hash so that atom_hash[H1] gives us I.
I is the index into atom_array, such that atom_array[I] gives us a container C (linked list like object that contains V).

The difficulties I'm having, is that there is some pointer compression going on with C so that the lowest(?) bit of &C says if the entry is empty or not. C also has the field next_hash which returns an I for the next entry, which is where I start to get confused. Does this mean there are multiple I that map to the same hash? I think so, but it's not stored as a linked list like I would expect, instead the values all seem to be stored linearly in atom_array, whereas I expected atom_array to be buckets of linked lists.

What I want to answer for myself is if the whole datastructure in QuickJS is needed in Rust, or if I could just use some standard library datastructures that Rust provides and still maintain all the memory/speed benefits of the handwritten C version. At the moment, I feel like there's still too much escaping my understanding to be able to evaluate it.

I'm still planning/working on doing a port of the algorithm to Rust so I can understand it better, but all the indirection and bit/address manipulation is definitely pushing my abilities in Rust. It's a good thing in the long run, as it helps me udnerstand Rust's type/data system better, but it's quite the steep curve.

}