Friday 9 September 2016

Building a faceted search using Redis and MVC.net - part 2: Combining and sorting sets in Redis

So far we have used the set and string data types in Redis. The Countries-set holds the keys to all country entities. Each country then has a set of all hotel keys in that country. The hotel key holds a string, which is the JSON-representation of the hotel entity. By setting up keys like this, we can slice our hotels in many ways. Want to fetch all hotels with one star? Just get the members of the Stars:1:Hotels-set using the command SMEMBERS.
Countries = ["Countries:1", "Countries:2", "Countries:3"]
Countries:1 = "Germany"
Countries:2 = "Sweden"
Countries:3 = "Denmark"
Countries:1:Hotels = ["Hotels:1", "Hotels:33", "Hotels:194"]
Hotels:1 = '{\"Name\":\"Hotel 1\",\"Stars\":3,\"PricePerNight\":1000}'
Hotels:33 = '{\"Name\":\"Hotel 33\",\"Stars\":5,\"PricePerNight\":2700}'
Hotels:194 = '{\"Name\":\"Hotel 194\",\"Stars\":1,\"PricePerNight\":235}'
Stars:1:Hotels = ["Hotels:194", "Hotels:200"]

SMEMBERS Stars:1:Hotels

Combining sets

Just getting the different sets one by one won't help us filter our hotel list. What we need to do is combine the sets in different ways. In Redis, you can perform combination operations on sets and get the resulting set as a return value, but you can also store the resulting set as a new set in the database. The lifetime scope of these sets can be either temporary by setting an expiration, or permanent if that suits your needs better.

Redis performs intersections, unions and difference operations extremely fast, which makes storing the sets and performing these data manipulations in Redis a much better idea than doing them in your application code. These operations are very powerful and can be combined in a multitude of interesting ways.

Union

Union is the operation that returns all unique members of the given sets. If we want to get all hotels in Germany and Sweden, but not Denmark, we do a union of Countries:1:Hotels and Countries:2:Hotels.
SUNION Countries:1:Hotels Countries:2:Hotels
To store the resulting set instead of immediately retrieving it, we use SUNIONSTORE and as the first parameter to the operation provide a name for the new key.
SUNIONSTORE Countries:1:Countries:2:Hotels Countries:1:Hotels Countries:2:Hotels
If the same value exists in both sets, it will only be included once in the resulting set.

Intersect

Intersect combines two or more sets by taking the values only existing in all sets. An intersection between Countries:1:Hotels and Countries:2:Hotels won't give us anything, unless a hotel can be located in both countries. But doing an intersection between the sets Countries:1:Hotels and Stars:1:Hotels will give us all hotels in Germany holding 1 star.
SINTER Countries:1:Hotels Stars:1:Hotels
Here we can of course use that previously stored union set with both German and Swedish hotels.
SINTER Countries:1:Countries:2:Hotels Stars:1:Hotels
If we want to keep on doing combination operations on the result of this operation, we store the set, creating a new compounded key describing the set's content.
SINTERSTORE Countries:1:Countries:2:Stars:1:Hotels 
   Countries:1:Countries:2:Hotels Stars:1:Hotels

Diff

The final combination operation is diff, which as the name implies, returns the difference between sets. The diff operation is a bit different (haha) than the other combination operations. While union and intersect performs a union/intersection of all given sets, diff performs a difference operation between the first given set and one or more other sets. If we want to see which Swedish and German hotels that don't have 2 or 3 stars, we can do a diff operation.
SDIFF Countries:1:Countries:2:Hotels Stars:2:Hotels Stars:3:Hotels
Now, this could be done with an intersect operation as well, but then you would have to first store the union of the sets of 1, 4 and 5 stars and then do the intersection between that union and the countries union.
SUNIONSTORE Stars:1:Stars:4:Stars:5:Hotels Stars:1:Hotels Stars:4:Hotels Stars:5:Hotels
SINTER Countries:1:Countries:2:Hotels Stars:1:Stars:4:Stars:5:Hotels

Sorting sets

Once we have performed the unions, intersections, diffs and whatnots of our sets, we want to get the result of the final stored key. Even though the actual set only contains one value, the keys of the hotels, they can be sorted in different ways by using external keys and pattern matching.

If we for instance want to sort all hotels in the final set according to their price, we create a new key for that. The key name needs to contain the string value held in the set, i e the hotel key and the value used for sorting. The sort command then takes a pattern and sorts the set according to the values in the external key.
Countries:1:Countries:2:Stars:1:Hotels = ["Hotels:194", "Hotels:200"]
SortByPrice_Hotels:194 = 1000
SortByPrice_Hotels:200 = 1200

SORT Countries:1:Countries:2:Stars:1:Hotels BY SortByPrice_*
We can use a limit to decide how many items we want to fetch in the sorted list. The limit takes the offset and the size, in this case we start at item 0 and take 4 items.
SORT Countries:1:Countries:2:Stars:1:Hotels BY SortByPrice_* LIMIT 0 4
And finally, if the set contains keys to other entities, we can also use the same pattern as for sorting by external keys to actually get that JSON-blob in the same operation. Very clever. Think of the asterisk as being replaced with the individual values in the set. :)
SORT Countries:1:Countries:2:Stars:1:Hotels BY SortByPrice_* LIMIT 0 4 GET *

Next step

These operations will get us a long way. The basics of filtering with Redis is to just keep on storing the result of unions, intersections and diffs based on choices made in the GUI until there is a final set to be sorted and fetched. But we need to use one more datatype in Redis, sorted sets. The sorted set will help us fetch hotels within a certain price interval, distance to beach or distance to shopping.

No comments:

Post a Comment