Saturday, 10 September 2016

Building a faceted search using Redis and - part 3: Sorted sets for range queries

Storing and combining sets and strings in Redis will get us a nice filtered search. The first three rows of filtering options in the demo at are using only sets holding the keys to the hotels. If one or more buttons are clicked in one category, e g Countries, we do a union of those sets and store the new set in Redis. The same goes for all categories, clicked options of the category are unioned and stored as new keys, then intersected against the other categories.

With the possibility to sort the final set using external keys, we have built quite a cool feature with not that much work. But to make it awesome, we want to add some range filters to be able to filter out for instance all hotels in this facet within a certain price range. Not only does it look impressive, it's also easy to achieve with Redis.

Sorted sets

Sorted sets in Redis are like ordinary sets, but with one major difference. Whereas sets can hold only string values, typically the key to some other entity, sorted sets also give each item in the set a numeric score. If the score is the same for all items, the set is sorted and ranged lexically instead. There are some very interesting things that can be done with the lexical part of the sorted sets, but for this demo, we're going to look at the numeric score instead.
Hotels:Prices = [
   1000 "Hotels:1",
   2000 "Hotels:33",
   5000 "Hotels:194",
   3000 "Hotels:233",
    750 "Hotels:299",
   8000 "Hotels:45",
The set is always sorted by the score as a default. To get the items of a set, the command ZRANGE is used. ZRANGE takes the name of the sorted set and the indexes of where to start and end. To get all items without knowing how big the set is, use -1 as the ending index.
ZRANGE Hotels:Prices 0 -1
  1) "Hotels:299"
  2) "Hotels:1"
  3) "Hotels:33"
  4) "Hotels:233"
  5) "Hotels:194"
  6) "Hotels:45"
To view the score and make sure it's sorted correctly, add WITHSCORES to the command. Here we fetch the items between index 0 and 3.
  1) "Hotels:299"
  2) "750"
  3) "Hotels:1"
  4) "1000"
  5) "Hotels:33"
  6) "2000"
  7) "Hotels:233"
  8) "3000"
Getting a range of items from the sorted set by their index is not enough though. We want to be able to fetch all items between, say 1000 and 2200 SEK. Easy peasy using ZRANGEBYSCORE instead of ZRANGE!
  1) "Hotels:1"
  2) "1000"
  3) "Hotels:33"
  4) "2000"
And now things start to fall into place. We have a way of getting the id's of all hotels with a price between 1000 and 2200 SEK. Now we need to create a new set out of this range to intersect this result with the other sets.

Combining sorted sets

Creating this set containing only a certain range of prices is different from the sets we created before. There's no single command that will create the set for us. It's not a union, intersection or difference operation we are looking at. We need a subset of the data in the sorted set.

The way to do this is to create a copy of the original sorted set by using the ZUNIONSTORE command. In this case, we don't want to do a union with another set, we just want to copy the whole Hotels:Prices-set. If only one set is given in a union, this is precisely what happens. To define the set in the db, we name it Hotels:Prices:1000:2200 to show which range of prices it will eventually contain.
ZUNIONSTORE Hotels:Prices:1000:2200 1 Hotels:Prices
Now, we can remove the range we're not interested in from this new set using the command ZREMRANGEBYSCORE. All rangeby-commands are inclusive as default, meaning that the score we provide will be included in the range. This was fine in the latter case where we wanted to include both 1000 and 2200 in our range, but here we want to remove all items with a score less than 1000 and greater than 2200. Luckily this is not a problem since we have the option to make the numbers exclusive by adding a parenthesis.

So, first we want to remove all items with a score lower than 1000. Since we don't know the lowest score in the set, we use negative infinity (-inf) as the starting point. Then we remove everything greater than 2200 up to positive infinity (inf).
ZREMRANGEBYSCORE Hotels:Prices:1000:2200 -inf (1000
ZREMRANGEBYSCORE Hotels:Prices:1000:2200 (2200 inf

ZRANGE Hotels:Prices:1000:2200 0 -1
  1) "Hotels:1"
  2) "1000"
  3) "Hotels:33"
  4) "2000"
Success! We have a new set, containing only the hotels within the given price range. Now we can intersect this set with the other sets.

Combining sets and sorted sets

If we try to do the same kind of intersection like before between an ordinary set and a sorted set, SINTER, we'll get a big no-no. A union, intersection or diff that involves a sorted set will have to use the special sorted set commands; ZINTERSTORE, ZDIFFSTORE and ZUNIONSTORE. All of these commands store a new set in the db. The reason the command is different is that the contents of these types of sets are different.

A sorted set does not only contain the string value, it also has the numeric score. When doing an intersection, we have to decide how to treat the scores of the two different sets. Should the two scores of intersected items be added, or should we use the minimum or maximum value? If we intersect between a regular set and a sorted set, the sorted set's items gets treated like they have a score of 0.

Next up - using Redis in .Net

Now that we hopefully understand Redis a bit better, it's time to get it up and running in the MVC-app.

1 comment: