Bikey

TL;DR: in this post I present Bikey, a Java library to create Maps and Sets whose elements have two keys, consuming from 70%-85% to 99% less memory than usual data structures. It is Open Source, published in https://github.com/jerolba/bikey and available in Maven Central.

Bikey map memory consumption comparison

In two of my last articles talking about HashMaps I pointed out the problems that exist when we manage information with a key composed by two values: Map<Pair<K1, K2>, V>. If we analyze the available options we discovered that memory consumed per unit of information is high.

Why is so much memory consumed when 60% of the information is repeated?

Is there a Java library that efficiently implements composite key maps and sets?

Requirements

Collection libraries that you can find in Internet do not usually implement this data structure, and those that implement it, none pays special attention to memory consumption.

What does the library I’m looking for must accomplish?

It must support a variable number of values (with a bounded cardinality) within each key element. It must allow objects whose key is based on discrete values (names, codes, integers, dates, etc), in range and values not known a priori. By its nature, I will exclude continuous values (i.e. doubles).
It must preserve the properties of a HashMap, with O(1) cost in insertion and query operations.
The performance adding or consulting an element must be in the same order of magnitude as solutions already explored.
The number of elements limit is the available memory.
The memory cost of adding new elements should decrease with the number of elements, tending to 4 bytes (the cost of storing the pointer to the associated value).

State of the art

The solutions I have found are based on one of these three options:

A nested map: Map<R, Map<C, V>>, a technique that uses the HashBasedTable implementation from Google Guava.
A map with a tuple of elements: Map<Tuple, V>, which is the Apache Commons Collection proposal in its MultiKeyMap or CQEngine in CompoundIndex. They have the flexibility to allow more than two elements in the composite key, but it is not part of the requirements since I have narrowed to tuples of two elements (also seen as Tuple2 or Pair).
A two-dimensional matrix: (Object[][]), used in the ArrayTable version from Google Guava. This option forces you to configure its size and possible key values from the constructor, and does not allow it to grow in any of the two dimensions.

The first two options consume too much memory: an average of 40 bytes per register. Very far from those theoretical minimum 4 bytes used by the pointer to the associated value. They waste resources storing redundant information (key elements).

In the CQEngine case, it is even worse because memory consumption is almost an order of magnitude greater. CQEngine is a very powerful tool which can implement complex indexes and queries, but it is not efficient in memory.

Guava’s ArrayTable solution with a two-dimensional matrix is the one that best solves the memory problem, but only if you have a high percentage of fill rate: if you only have 1/4 of all possible elements present, you will be wasting 75% of the matrix entries.

Besides, as I have already mentioned, in the matrix version you need to configure its elements in the constructor, reducing flexibility because in your business logic, when you don’t know which entities are going to be used.

In a matrix, given n possible values of the first part of the key, and given m possible values of the second part of the key, adding a new value to n would increase the total size of the matrix by m.

Therefore, the memory consumption in the matrix has order of n * m, while in the other two cases the consumption has linear order, and depends only on the number of elements present in the map.

The ideal solution would be a data structure that consumes few bytes per additional record, but does not grow with the Cartesian product of its keys.

In the case of persisting a set of keys, the situation gets worse: I have not found anything that helps us to efficiently implement a Set<Tuple<R, C>>, apart from using a HashSet<Tuple<R,C>>> or a HashMap<R, HashSet<C>>.

Bikey

After a lot of research in different libraries, reading papers, thinking and testing, I came up with a way to implement an efficient double key map fulfilling all my requirements!

There are two map implementations:

TableBikeyMap: optimized for memory consumption, and with a performance similar to implementations we have already seen.
MatrixBikeyMap: optimized for performance, but with the disadvantage of consuming more memory at low fill rates. It does not meet my criterion of not growing in consumption with the number of keys (n * m) because it behaves like a matrix (without becoming one), but it has its use cases where it can be cost-effective.

Both implement the same BikeyMap<R, C, V> interface, so they are interchangeable as your needs change.

There is only one implementation of the Set, TableBikeySet, which is distinguished by having a ridiculous memory consumption compared to versions based on HashSet.

Although you can find the source code published in GitHub and find out how it is implemented, I will tell you the details in a next post . Don’t expect rocket science, or anything as complicated as F14, the latest and fastest implementation of Hash Table from Facebook people.

Examples of use

The library is available in Maven Central and you can add the dependency to your project by adding those lines to your pom.xml file:

<dependency>
  <groupId>com.jerolba</groupId>
  <artifactId>bikey</artifactId>
  <version>0.9.0</version>
</dependency>

or to your Gradle file:

implementation 'com.jerolba:bikey:0.9.0'

Although the source code is published in GitHub (https://github.com/jerolba/bikey) with Apache 2.0 license, you can find the JavaDoc published here.

If you look at the API, you will see that it is “identical” to the one defined by Map and Set, only that where the key is referred to using a single parameter, it becomes two. Because it has adopted the metaphor that the map is like a two-dimensional table, the first value of the key is the row and the second one is the column.

To simplify the example I’m going to use String for key references, but in real code it could be objects like Product and Store, and its identifiers could be the ones used in their hashCode and equals methods.

Also, instead of storing as value an object with several attributes for each product/store (stock, sales, stockout, availability date, etc), I have used only its stock.

//Given the stock of some products and stores
//BikeyMap<Product, Store, Stock> stock = new TableBikeyMap<>();
BikeyMap<String, String, Integer> stock = new TableBikeyMap<>();
stock.put("shirt-ref-123", "store-76", 10);
stock.put("pants-ref-456", "store-12", 24);
...
stock.put("tie-ref-789", "store-23", 2);

//Get the stock of a product/store
Integer inStock = stock.get("shirt-ref-1234", "store-45");

//Total Stock in store-123
stock.entrySet().stream()
     .filter(entry -> entry.getColumn().equals("store-123"))
     .mapToInt(entry -> entry.getValue())
     .sum();

//Total Stock of pants-ref-457
stock.entrySet().stream()
     .filter(entry -> entry.getRow().equals("pants-ref-457"))
     .mapToInt(entry -> entry.getValue())
     .sum();

//All existing products
Set<String> products = stock.rowKeySet();

//All existing stores
Set<String> stores = stock.columnKeySet();

//If it contains a product/store
if (stock.containsKey("tie-ref-789", "store-23")) {
    ....
}

//Get all products/stores present in the map
BikeySet<String, String> productStores = map.bikeySet();

//BikeySet<R, C> implements also Set<Bikey<R, C>>
Set<Bikey<String, String>> productStoresSet = map.bikeySet();

//Get all products/stores with stock
BikeySet<String, String> withStock = stock.entrySet().stream()
    .filter(entry -> entry.getValue() > 0)
    .map(BikeyEntry::getKey)
    .collect(BikeyCollectors.toSet());

//Do something with each map element
stock.forEach((product, store, units) -> {
    System.out.println("Product " + product + " has " + units + " in store " + store);
});

TableBikeyMap and MatrixBikeyMap implementations are interchangeable, and you can use one or the other depending on your requirements. I recommend using the TableBikeyMap by default.

From a BikeyMap you can extract the set of all its keys calling the bikeySet method, but you can also build and use a BikeySet directly.

If we talk about Marvel movies and their protagonists:

//BikeySet<Avenger, Film> avengerFilms = new TableBikeySet<>();
BikeySet<String, String> avengerFilms = new TableBikeySet<>();
avengerFilms.add("Hulk", "The Avengers");
avengerFilms.add("Iron Man", "The Avengers");
avengerFilms.add("Thor", "Avengers: Age of Ultron");
avengerFilms.add("Thor", "Thor: Ragnarok");
avengerFilms.add("Captain America", "Avengers: Infinity War");
....

if (avengerFilms.contains("Iron Man", "Black Panther")) {
    ....
}

//Films in the Set
Set<String> filmsInSet = avengerFilms.columnKeySet();

//Avengers in the Set
Set<String> avengersInSet = avengerFilms.rowKeySet();

//Iron Man Movies
List<String> ironManFilms = avengerFilms.stream()
    .filter(entry -> entry.getRow().equals("Iron Man"))
    .map(Bikey::getColumn)
    .collect(toList());

//Call to a BiFunction for each entry
bikeySet.forEach(this::doSomething);

public void doSomething(String avenger, String film) {
  ....
}

Performance analysis

Without going deeply, following the example of Products and Stores I used in my previous post, and with the same testing methodology, the results are:

Memory: Map

For a domain of 10,000 products and 1,000 stores, if we randomly add elements to the map, the evolution of memory consumption is:

Map memory consumption

If we look at the TableBikeyMap numbers, in a map where only a quarter of all possible combinations of values are present, the memory consumption is almost 4 times lower, while if we reach half it is 5 times lower. In the case of filling 100% of the combinations, the consumption is 7 times lower.

In the MatrixBikeyMap version, memory consumption is constant. Until it reaches a 10% fill rate it does not compensate in consumption over classic implementation, and 70% over the TableBikeyMap version.

What happens when the fill rate is very low? What happens when the number of possible key values is high but the number of present combinations is low? If we zoom in the previous graph we can see what happens when few elements have been inserted, but each different value of the key has been used few times:

Low fill rate memory consumption

With few elements the memory consumption is similar, but the graphs diverge when they reach 5% of fill rate. While the slope of TableBikeyMap decreases, the other two remain unchanged.

As I mentioned before, MatrixBikeyMap has a flat line that doesn’t change because it behaves like a matrix (without being one).

Memory: Set

For the same domain, if we randomly add those 10 million elements to the Set, the evolution of memory consumption is: Set memory consumption

In this case it consumes between 1% and 2% of memory.

In all benchmarks that use Tuple, I have implemented a hash function that does not generate collisions, avoiding even greater memory consumption. Generously, I’m not taking into account the consumption of Tuple instances of two elements or Pair.

Indexing performance: Map

Following the same methodology I used in the last post, the time it takes to index collections of different sizes is:

Map index performance

One of my main concerns while implementing the library was that the added complexity would penalize performance, but it remained in the same order of magnitude. It is equals or even improves significantly in the case of MatrixBikeyMap.

Access performance: Map

Given a Map of a certain size, how long does it take to randomly access to all its elements? Map access time

TableBikeyMap has a similar or slightly better performance again. In addition, the MatrixBikeyMap version is twice as fast most of the times.

Writing Performance: Set

The time spent creating a Set containing 10,000 products and 1,000 stores evolves in this way:

Write set performance

The time required is 2 to 3 times less in TableBikeySet because it has to create and manage fewer objects in memory, as it is based on the use of a BitSet.

Reading performance: Set

Given a Set containing 10,000 products and 1,000 stores, how long does it take to check whether each element exists randomly?

Set contains operation performance

In this case it is 20% slower than HashSet<Pair>, although it is still in the same order of magnitude.

Conclusion

I don’t know if I haven’t been able to research it properly, or if it is a data structure rarely used and with a low interest, but I was surprised not to find anything (neither at library nor paper level) on how to implement efficiently in memory an index composed by two values.

While developing Bikey, I learned some very interesting data structures that were not taught to me at the University. By replicating Map and Set API I have learned a lot about Java collections API (it is highly recommendable to build at least once something that implements Collection and an Iterator API).

In a next post I will explain you the implementation details, and the basic ideas upon which it is built.

This library allows us, by changing the implementation of the Map in a transparent way, to reduce the size of the machines that we currently use in Nextail in the execution of some heavy processes, reducing, besides the AWS invoice, the ecological footprint.

As I said, the library has Open Source license, and any comment, fix or contribution will be welcome.