Tomer Aberbach

The Making of Keyalesce

Updated Published ·

Have you ever wanted to use tuples or objects for the keys of a Map or the values of a Set? It’s a very common question because the following code doesn’t do what you might expect:

const map = new Map()
map.set([1, 2], 3)
console.log(map.get([1, 2]))
//=> undefined

const set = new Set()
set.add([1, 2])
console.log(set.has([1, 2]))
//=> false

The code behaves this way because a Map’s keys and a Set’s values are compared using reference equality, as specified by the SameValueZero algorithm. Two deeply equal arrays are not referentially equal:

console.log([1, 2] === [1, 2])
//=> false

The following code behaves more predictably:

// A reference to a single array instance!
const tuple = [1, 2]

const map = new Map()
map.set(tuple, 3)
console.log(map.get(tuple))
//=> 3

const set = new Set()
set.add(tuple)
console.log(set.has(tuple))
//=> true

But that isn’t particularly useful because typically you’re constructing tuples on the fly using values from some external source:

const f = (x, y) => {
  // This will return undefined because `[x, y]` is a new array!
  const value = map.get([x, y])

  // ...
}

A common solution is to use JSON.stringify:

const map = new Map()
map.set(JSON.stringify([1, 2]), 3)
console.log(map.get(JSON.stringify([1, 2])))
//=> 3

const set = new Set()
set.add(JSON.stringify([1, 2]))
console.log(set.has(JSON.stringify([1, 2])))
//=> true

This works because strings are primitives, which are compared by value rather than by reference. Unfortunately, using JSON.stringify requires that the inner values are stringifiable. If they’re not, then you’re forced to write a custom serialization function.

Plus, sometimes you do want reference equality, per inner value, but serialization doesn’t preserve reference equality:

const person1 = { name: `Tomer`, occupation: `Software Engineer` }
const person2 = { name: `Tomer`, occupation: `Software Engineer` }

const salaryApplications = new Map()
salaryApplications.set(JSON.stringify([person1, Number.MAX_VALUE]), `approved`)

// Oh no! Two different software engineers named Tomer are considered the same due to stringifying!
console.log(salaryApplications.get(JSON.stringify([person2, Number.MAX_VALUE])))
//=> approved

Surely there’s a better way!

The solution: <code>keyalesce</code> permalinkThe solution: keyalesce

keyalesce is a module that returns the same unique key for the same value sequence1. It’s perfect for this use case:

const person1 = { name: `Tomer`, occupation: `Software Engineer` }
const person2 = { name: `Tomer`, occupation: `Software Engineer` }

const map = new Map()
map.set(keyalesce([1, 2]), 3)
map.set(keyalesce([2, `b`, 3]), 4)
map.set(keyalesce([person1, Number.MAX_VALUE]), `approved`)
map.set(keyalesce([person2, Number.MAX_VALUE]), `totally approved`)

console.log(map.get(keyalesce([1, 2])))
//=> 3

console.log(map.get(keyalesce([2, `b`, 3])))
//=> 4

console.log(map.get(keyalesce([person1, Number.MAX_VALUE])))
//=> approved

console.log(map.get(keyalesce([person2, Number.MAX_VALUE])))
//=> totally approved

const set = new Set()
set.add(keyalesce([1, 2, 3, 4, 5]))
set.add(keyalesce([3, 3, 2, 2, 1]))

console.log(set.has(keyalesce([1, 2, 3, 4, 5])))
//=> true

console.log(set.has(keyalesce([3, 3, 2, 2, 1])))
//=> true

console.log(keyalesce([1, 2, 3, 4, 5]) === keyalesce([1, 2, 3, 4, 5]))
//=> true

How does it work? permalinkHow does it work?

keyalesce internally maintains a trie containing the sequences of values it has been called with. It creates and returns new keys for new sequences and returns previously created keys for known sequences!

For example, the following code:

const key1 = keyalesce([1, 2, 3, 4])
const key2 = keyalesce([1, 2, 3])
const key3 = keyalesce([1, 2, 7, 8])
const key4 = keyalesce([`a`, `b`, `c`])

Would result in the following trie:

1

2

3

4

7

8

root

key1

key2

key3

key4

a

b

c

And calling keyalesce([1, 2, 3, 4]) again would return key1 after traversing nodes 1, 2, 3, and 4 in the trie.

Does <code>keyalesce</code> cause memory leaks? permalinkDoes keyalesce cause memory leaks?

A long running program using a naive implementation of keyalesce would have a memory leak due to unbounded growth of the trie. How are unreachable nodes pruned?

Sequence value reachability permalinkSequence value reachability

Consider the following code:

let object1 = {}
let object2 = {}

const key1 = keyalesce([1, object1, 4])
const key2 = keyalesce([1, 2, object2, 3])

// ...

Which would result in the following trie:

1

2

3

4

root

key1

key2

object1

object2

If the code continues like so:

object2 = null

Then object2’s original value is only reachable from the trie. keyalesce can now prune the associated sequence and its key from the trie because it has become impossible for keyalesce to be called with that sequence ever again.

I made the trie hold only weak references to objects passed to keyalesce and pruned the trie when the objects are garbage-collected using FinalizationRegistry.

After pruning in this case the trie would look like so:

1

4

root

key 1

object1

Note

Although 1 was in the sequence it was not pruned because it is still needed for another sequence.

Created key reachability permalinkCreated key reachability

Consider the following code:

let key1 = keyalesce([1, 2, 4])
let key2 = keyalesce([1, 2, 5, 7])

// ...

Which would result in the following trie:

1

2

4

5

7

root

key1

key2

The previous section’s logic does not apply because all of the sequence values are primitives, which are always reachable and not eligible for garbage collection. So how is the trie pruned in this case?

If the code continues like so:

key2 = null

Then key2’s original value is only reachable from the trie. Although it’s still possible to call keyalesce with [1, 2, 5, 7], keyalesce can actually prune the key and its associated value sequence because there isn’t any code that depends on receiving that specific key anymore. keyalesce doesn’t need to return the same unique key for a given value sequence. It only needs to prevent multiple keys existing simultaneously for the same value sequence.

Similarly to the handling of object sequence values, I made the trie hold only weak references to created keys and pruned the trie when the keys are garbage-collected using FinalizationRegistry.

After pruning in this case the trie would look like so:

1

2

4

root

key1

Note

Although 1 was in the sequence it was not pruned because it is still needed for another sequence.

In summary, the trie is pruned whenever object sequence values or keys have only weak references to them.

Is it actually safe to clean up unreachable keys? permalinkIs it actually safe to clean up unreachable keys?

You might think up the following scenario, which shows we can’t really know whether “any code depends on receiving that specific key anymore”:

  1. Use keyalesce to create a key from a sequence of primitive values
  2. Add some properties to the created key
  3. Depend on those added properties existing throughout your code
  4. Let the key become unreachable
  5. Get confused when keyalesce returns a key without those properties for the same sequence of primitive values from before, causing your code to misbehave

keyalesce prevents this scenario by freezing the created key, which makes it impossible to depend on the contents of the key. The key is a completely opaque object.

The correct alternative to setting properties on the created key is to, well, use it as a key! You can use it as a key in a:

  • Map; if you want the key’s associated value to exist as long as the Map is reachable. Beware of memory leaks in this case!
  • WeakMap; if you want the key’s associated value to exist only as long as the key is reachable by the rules explained before

Go use it! permalinkGo use it!

Install keyalesce:

$ npm i keyalesce

And import it:

import keyalesce from 'keyalesce'

const hangouts = new Set()

const createHangoutKey = (person1, person2) =>
  keyalesce([person1, person2].sort())
const hangOut = (person1, person2) =>
  hangouts.add(createHangoutKey(person1, person2))
const didTheyHangOut = (person1, person2) =>
  hangouts.has(createHangoutKey(person1, person2))

hangOut(`Tomer`, `Samuel`)
hangOut(`Tomer`, `Amanda`)

console.log(didTheyHangOut(`Tomer`, `Samuel`))
console.log(didTheyHangOut(`Samuel`, `Tomer`))
//=> true
//=> true

console.log(didTheyHangOut(`Tomer`, `Amanda`))
console.log(didTheyHangOut(`Amanda`, `Tomer`))
//=> true
//=> true

console.log(didTheyHangOut(`Samuel`, `Amanda`))
console.log(didTheyHangOut(`Amanda`, `Samuel`))
//=> false
//=> false

Footnotes

  1. Two value sequences are considered equal if each of their values are equal using the SameValueZero algorithm.