Monday 26 September 2016

Building a faceted search using Redis and MVC.net - part 4: Using Redis in an MVC-app

There's a number of .Net clients available as Nuget-packages. I've chosen to use StackExchange's Redis, github.com/StackExchange/StackExchange.Redis. It maps well against the commands available in the Redis Client, it has a good documentation and, well, Stack Overflow uses it so it really ought to cover my needs... And of course, it is free.

The demo web for the faceted search is available at hotelweb.azurewebsites.net and code can be found on github.com/asalilje/redisfacets.

Connecting to Redis

Once the StackExchange.Redis nuget package is installed in the .Net-solution, we can try a simple Redis query. We want all Hotels that have one star, i e all members of the set Stars:1:Hotels.
  var connection = ConnectionMultiplexer.Connect("redishost");
  var db = connection.GetDatabase();
  var list = db.SetMembers("Stars:1:Hotels");
The list returned is the JSON-blobs we stored for each hotel, so we need to deserialize it to a C#-entity using Newtonsoft.
  var hotels = hotels.Select((x, i) =>
  {
    var hotel = JsonConvert.DeserializeObject(x);
    hotel.Index = i;
    return hotel;
  });
Now, the ConnectionMultiplexer is the central object of this Redis Client. It is expensive, does a lot of work hiding away the inner workings of talking to multiple servers and it is completely threadsafe. So it's designed to be shared and reused between callers. It should not be created per call, as in the code above.

The database object that you get from the multiplexer is a cheap pass through object on the other hand. It does not need to be stored, and it is your access to all parts of the Redis API. One way to handle this is to wrap the connection and Redis calls in a class that uses lazy loading to create the connection.
  private static ConnectionMultiplexer Connection => LazyConnection.Value;
  private static readonly Lazy LazyConnection =
    new Lazy(() => ConnectionMultiplexer.Connect("redishost"));

  private static IDatabase GetDb()  {
    return Connection.GetDatabase(Database);
  }

  public static string GetString(string key)  {
    return GetDb().StringGet(key);
  }

Fine tuning the queries

Let's return to the concepts from the earlier parts of this blog series, combinations of sets. Say we want to get all hotels in Germany that has a bar. Just send in an array of the keys that should be intersected.
  var db = GetDb();
  return db.SetCombine(SetOperation.Intersect, 
    new []{"Countries:1:Hotels", "Bar:False"};
The chosen keys in the same category should be unioned before they are intersected with another category. As we did before, we union them and store them in the db to be able to do the intersection directly in Redis. In this case, we also send in the name of the new key to store, compounded from the data it contains.
  var db = GetDb();
  db.SetCombineAndStore(SetOperation.Union, "Countries:1:Countries:2:Hotels", 
    new []{"Countries:1:Hotels", "Countries:2:Hotels"});
  return db.SetCombine(SetOperation.Intersect, 
    new []{"Countries:1:Countries:2:Hotels", "Bar:False"};
If we want to sort the list according to an external key, we just add the by-keyword in the sort-command to point to the correct key, using the asterisk-pattern.
  var db = GetDb();
  db.Sort("Countries:1:Hotels", by: "SortByPrice_*", get: new RedisValue[] {"*"}));

Putting it all together

Now we have the concepts and data modelling of Redis and the Redis client in place. And the rest is basically just putting the things together. The filtering buttons are created dynamically according to what options are available in the db. Each time a filter or sorting option is clicked, or a slider is pulled, an event is triggered in javascript that creates an url based on which buttons are chosen.

The call goes via AJAX to the MVC-app that does all the filtering using unions and intersections, fetches and sorts the final list, and disables or enables any affected filter buttons.

All this, as you know, can be done in a number of ways. If you need inspiration or some coding examples, take a look at the code on github.com/asalilje/redisfacets. :)

Friday 23 September 2016

Leader Election with Consul.Net

Microservices are great and all that, but you know those old fashioned batch services, like a data processing service or a cache loader service that should run with regular intervals? They're still around. These kind of services often end up on one machine where they keep running their batch jobs until someone notices they've stopped working. Maybe a machine where it runs for both stage and production purposes, or maybe it doesn't even run in stage cause no one can be bothered. Easier to just copy the database from production.

But we can do better, right? One way to solve this is to deploy the services to multiple machines, as you would with a web application. Use Octopus, deploy the package, install and start the service, then promote the same package to production, doing config transforms along the way. Problem then is that we have a service that runs on multiple machines, doing the same job multiple times. Unnecessary and, if there's a third party API involved, probably unwanted.

Leader election to the rescue

Leader election is really quite a simple concept. The service nodes register against a host using a specific common key. One of the nodes is elected leader and performs the job, while the other ones are idle. This lock to a specific node is held as long as the node's session remains in the host's store. When the node's session is gone, the leadership is open for taking by the next node that checks for it. Every time the nodes are scheduled to run their task, this check is performed.

Using this approach, we have one node doing the job while the others are standing by. At the same time, we get rid of our single point of failure. If a node goes down, another will take over. And we can incorporate this in our ordinary build chain and treat these services like we do with other types of applications. Big win!

An example with Consul.io

Consul is a tool for handling services in your infrastructure. It's good at doing many things and you can read all about it at consul.io. Consul is installed as an agent on your servers, which syncs with one or many hosts. But you can run it locally to try it out.

Running Consul locally

To play around with Consul, just download it here, unpack it and create a new config file in the extracted folder. Name the file local_config.json and paste in the config below.
{
    "log_level": "TRACE",
    "bind_addr": "127.0.0.1",
    "server": true,
    "bootstrap": true,
    "acl_datacenter": "dc1",
    "acl_master_token": "yep",
    "acl_default_policy": "allow",
    "leave_on_terminate": true
}
This will allow you to run Consul and see the logs of calls coming in. Run it by opening a command prompt, moving to the extracted folder and typing:
consul.exe agent -dev -config-file local_config.json

Consul.net client

For a .Net-solution, a nice client is available as a Nuget-package, https://github.com/PlayFab/consuldotnet. With that, we just create a ConsulClient and have access to all the API's provided by Consul. For leader election, we need the different Lock-methods in the client. Basically, CreateLock is creating the node session in Consul, AcquireLock is trying to assume leadership if no leader exists, and the session property IsHeld is true if the node is elected leader and should do the job.
var consulClient = new ConsulClient();
var session = consulClient.CreateLock(serviceKey);
await session.AcquireLock();
if (session.IsHeld)
    DoWork();

A demo service

Here's a small service running a timer updating every 3 seconds. On construction, the service instance creates a session in Consul. Every time the CallTime-function is triggered, we check if we hold the lock. If we do, we display the time, otherwise we print "Not the leader". When the service is stopped, we destroy the session so the other nodes won't have to wait for the session TTL to end.
using System;
using System.Threading;
using System.Threading.Tasks;
using Consul;
using Topshelf;
using Timer = System.Timers.Timer;

namespace ClockService
{
    class Program
    {
        static void Main(string[] args)
        {
            HostFactory.Run(x =>
            {
                x.Service(s =>
                {
                    s.ConstructUsing(name => new Clock());
                    s.WhenStarted(c => c.Start());
                    s.WhenStopped(c => c.Stop());
                });
                x.RunAsLocalSystem();
                x.SetDisplayName("Clock");
                x.SetServiceName("Clock");
            });
        }
    }

    class Clock
    {
        readonly Timer _timer;
        private IDistributedLock _session;

        public Clock()
        {
            var consulClient = new ConsulClient();
            _session = consulClient.CreateLock("service/clock");
            _timer = new Timer(3000);
            _timer.Elapsed += (sender, eventArgs) => CallTime();
        }

        private void CallTime()
        {
            Task.Run(() =>
            {
                _session.Acquire(CancellationToken.None);
            }).GetAwaiter().GetResult();

            Console.WriteLine(_session.IsHeld 
                ? $"It is {DateTime.Now}" 
                : "Not the leader");
        }

        public void Start() { _timer.Start(); }

        public void Stop()
        {
            _timer.Stop();
            Task.WaitAll(
                Task.Run(() =>
                {
                    _session.Release();
                }),
                Task.Run(() =>
                {
                    _session.Destroy();
                }));
        }
    }
}

When two instances of this service are started, we get this result. One node is active and the other one is idle.


When the previous leader is stopped, the second node automatically takes over the leadership and starts working.


All in all, quite a nice solution for securing the running of those necessary batch services. :)

Saturday 10 September 2016

Building a faceted search using Redis and MVC.net - part 3: Sorted sets for range queries

Storing and combining sets and strings in Redis will get us a nice filtered search. The first three rows of filtering options in the demo at http://hotelweb.azurewebsites.net/ are using only sets holding the keys to the hotels. If one or more buttons are clicked in one category, e g Countries, we do a union of those sets and store the new set in Redis. The same goes for all categories, clicked options of the category are unioned and stored as new keys, then intersected against the other categories.


With the possibility to sort the final set using external keys, we have built quite a cool feature with not that much work. But to make it awesome, we want to add some range filters to be able to filter out for instance all hotels in this facet within a certain price range. Not only does it look impressive, it's also easy to achieve with Redis.



Sorted sets

Sorted sets in Redis are like ordinary sets, but with one major difference. Whereas sets can hold only string values, typically the key to some other entity, sorted sets also give each item in the set a numeric score. If the score is the same for all items, the set is sorted and ranged lexically instead. There are some very interesting things that can be done with the lexical part of the sorted sets, but for this demo, we're going to look at the numeric score instead.
Hotels:Prices = [
   1000 "Hotels:1",
   2000 "Hotels:33",
   5000 "Hotels:194",
   3000 "Hotels:233",
    750 "Hotels:299",
   8000 "Hotels:45",
]
The set is always sorted by the score as a default. To get the items of a set, the command ZRANGE is used. ZRANGE takes the name of the sorted set and the indexes of where to start and end. To get all items without knowing how big the set is, use -1 as the ending index.
ZRANGE Hotels:Prices 0 -1
  1) "Hotels:299"
  2) "Hotels:1"
  3) "Hotels:33"
  4) "Hotels:233"
  5) "Hotels:194"
  6) "Hotels:45"
To view the score and make sure it's sorted correctly, add WITHSCORES to the command. Here we fetch the items between index 0 and 3.
ZRANGE Hotels:Prices 0 3 WITHSCORES
  1) "Hotels:299"
  2) "750"
  3) "Hotels:1"
  4) "1000"
  5) "Hotels:33"
  6) "2000"
  7) "Hotels:233"
  8) "3000"
Getting a range of items from the sorted set by their index is not enough though. We want to be able to fetch all items between, say 1000 and 2200 SEK. Easy peasy using ZRANGEBYSCORE instead of ZRANGE!
ZRANGEBYSCORE Hotels:Prices 1000 2200 WITHSCORES
  1) "Hotels:1"
  2) "1000"
  3) "Hotels:33"
  4) "2000"
And now things start to fall into place. We have a way of getting the id's of all hotels with a price between 1000 and 2200 SEK. Now we need to create a new set out of this range to intersect this result with the other sets.

Combining sorted sets

Creating this set containing only a certain range of prices is different from the sets we created before. There's no single command that will create the set for us. It's not a union, intersection or difference operation we are looking at. We need a subset of the data in the sorted set.

The way to do this is to create a copy of the original sorted set by using the ZUNIONSTORE command. In this case, we don't want to do a union with another set, we just want to copy the whole Hotels:Prices-set. If only one set is given in a union, this is precisely what happens. To define the set in the db, we name it Hotels:Prices:1000:2200 to show which range of prices it will eventually contain.
ZUNIONSTORE Hotels:Prices:1000:2200 1 Hotels:Prices
Now, we can remove the range we're not interested in from this new set using the command ZREMRANGEBYSCORE. All rangeby-commands are inclusive as default, meaning that the score we provide will be included in the range. This was fine in the latter case where we wanted to include both 1000 and 2200 in our range, but here we want to remove all items with a score less than 1000 and greater than 2200. Luckily this is not a problem since we have the option to make the numbers exclusive by adding a parenthesis.

So, first we want to remove all items with a score lower than 1000. Since we don't know the lowest score in the set, we use negative infinity (-inf) as the starting point. Then we remove everything greater than 2200 up to positive infinity (inf).
ZREMRANGEBYSCORE Hotels:Prices:1000:2200 -inf (1000
ZREMRANGEBYSCORE Hotels:Prices:1000:2200 (2200 inf

ZRANGE Hotels:Prices:1000:2200 0 -1
  1) "Hotels:1"
  2) "1000"
  3) "Hotels:33"
  4) "2000"
Success! We have a new set, containing only the hotels within the given price range. Now we can intersect this set with the other sets.

Combining sets and sorted sets

If we try to do the same kind of intersection like before between an ordinary set and a sorted set, SINTER, we'll get a big no-no. A union, intersection or diff that involves a sorted set will have to use the special sorted set commands; ZINTERSTORE, ZDIFFSTORE and ZUNIONSTORE. All of these commands store a new set in the db. The reason the command is different is that the contents of these types of sets are different.

A sorted set does not only contain the string value, it also has the numeric score. When doing an intersection, we have to decide how to treat the scores of the two different sets. Should the two scores of intersected items be added, or should we use the minimum or maximum value? If we intersect between a regular set and a sorted set, the sorted set's items gets treated like they have a score of 0.

Next up - using Redis in .Net

Now that we hopefully understand Redis a bit better, it's time to get it up and running in the MVC-app.

Friday 9 September 2016

Building a faceted search using Redis and MVC.net - part 2: Combining and sorting sets in Redis

So far we have used the set and string data types in Redis. The Countries-set holds the keys to all country entities. Each country then has a set of all hotel keys in that country. The hotel key holds a string, which is the JSON-representation of the hotel entity. By setting up keys like this, we can slice our hotels in many ways. Want to fetch all hotels with one star? Just get the members of the Stars:1:Hotels-set using the command SMEMBERS.
Countries = ["Countries:1", "Countries:2", "Countries:3"]
Countries:1 = "Germany"
Countries:2 = "Sweden"
Countries:3 = "Denmark"
Countries:1:Hotels = ["Hotels:1", "Hotels:33", "Hotels:194"]
Hotels:1 = '{\"Name\":\"Hotel 1\",\"Stars\":3,\"PricePerNight\":1000}'
Hotels:33 = '{\"Name\":\"Hotel 33\",\"Stars\":5,\"PricePerNight\":2700}'
Hotels:194 = '{\"Name\":\"Hotel 194\",\"Stars\":1,\"PricePerNight\":235}'
Stars:1:Hotels = ["Hotels:194", "Hotels:200"]

SMEMBERS Stars:1:Hotels

Combining sets

Just getting the different sets one by one won't help us filter our hotel list. What we need to do is combine the sets in different ways. In Redis, you can perform combination operations on sets and get the resulting set as a return value, but you can also store the resulting set as a new set in the database. The lifetime scope of these sets can be either temporary by setting an expiration, or permanent if that suits your needs better.

Redis performs intersections, unions and difference operations extremely fast, which makes storing the sets and performing these data manipulations in Redis a much better idea than doing them in your application code. These operations are very powerful and can be combined in a multitude of interesting ways.

Union

Union is the operation that returns all unique members of the given sets. If we want to get all hotels in Germany and Sweden, but not Denmark, we do a union of Countries:1:Hotels and Countries:2:Hotels.
SUNION Countries:1:Hotels Countries:2:Hotels
To store the resulting set instead of immediately retrieving it, we use SUNIONSTORE and as the first parameter to the operation provide a name for the new key.
SUNIONSTORE Countries:1:Countries:2:Hotels Countries:1:Hotels Countries:2:Hotels
If the same value exists in both sets, it will only be included once in the resulting set.

Intersect

Intersect combines two or more sets by taking the values only existing in all sets. An intersection between Countries:1:Hotels and Countries:2:Hotels won't give us anything, unless a hotel can be located in both countries. But doing an intersection between the sets Countries:1:Hotels and Stars:1:Hotels will give us all hotels in Germany holding 1 star.
SINTER Countries:1:Hotels Stars:1:Hotels
Here we can of course use that previously stored union set with both German and Swedish hotels.
SINTER Countries:1:Countries:2:Hotels Stars:1:Hotels
If we want to keep on doing combination operations on the result of this operation, we store the set, creating a new compounded key describing the set's content.
SINTERSTORE Countries:1:Countries:2:Stars:1:Hotels 
   Countries:1:Countries:2:Hotels Stars:1:Hotels

Diff

The final combination operation is diff, which as the name implies, returns the difference between sets. The diff operation is a bit different (haha) than the other combination operations. While union and intersect performs a union/intersection of all given sets, diff performs a difference operation between the first given set and one or more other sets. If we want to see which Swedish and German hotels that don't have 2 or 3 stars, we can do a diff operation.
SDIFF Countries:1:Countries:2:Hotels Stars:2:Hotels Stars:3:Hotels
Now, this could be done with an intersect operation as well, but then you would have to first store the union of the sets of 1, 4 and 5 stars and then do the intersection between that union and the countries union.
SUNIONSTORE Stars:1:Stars:4:Stars:5:Hotels Stars:1:Hotels Stars:4:Hotels Stars:5:Hotels
SINTER Countries:1:Countries:2:Hotels Stars:1:Stars:4:Stars:5:Hotels

Sorting sets

Once we have performed the unions, intersections, diffs and whatnots of our sets, we want to get the result of the final stored key. Even though the actual set only contains one value, the keys of the hotels, they can be sorted in different ways by using external keys and pattern matching.

If we for instance want to sort all hotels in the final set according to their price, we create a new key for that. The key name needs to contain the string value held in the set, i e the hotel key and the value used for sorting. The sort command then takes a pattern and sorts the set according to the values in the external key.
Countries:1:Countries:2:Stars:1:Hotels = ["Hotels:194", "Hotels:200"]
SortByPrice_Hotels:194 = 1000
SortByPrice_Hotels:200 = 1200

SORT Countries:1:Countries:2:Stars:1:Hotels BY SortByPrice_*
We can use a limit to decide how many items we want to fetch in the sorted list. The limit takes the offset and the size, in this case we start at item 0 and take 4 items.
SORT Countries:1:Countries:2:Stars:1:Hotels BY SortByPrice_* LIMIT 0 4
And finally, if the set contains keys to other entities, we can also use the same pattern as for sorting by external keys to actually get that JSON-blob in the same operation. Very clever. Think of the asterisk as being replaced with the individual values in the set. :)
SORT Countries:1:Countries:2:Stars:1:Hotels BY SortByPrice_* LIMIT 0 4 GET *

Next step

These operations will get us a long way. The basics of filtering with Redis is to just keep on storing the result of unions, intersections and diffs based on choices made in the GUI until there is a final set to be sorted and fetched. But we need to use one more datatype in Redis, sorted sets. The sorted set will help us fetch hotels within a certain price interval, distance to beach or distance to shopping.

Wednesday 7 September 2016

Building a faceted search using Redis and MVC.net - part 1: The key is key

Faceted searches on web pages come in many forms, both technically and UX-wise. This one is built using Redis and MVC.net. The data is stored and modelled in Redis to suit this particular need. The web site uses plain javascript to build the calls to the MVC app, which serves the resulting list as an HTML-blob.


Basic concept

A demo can be found on http://hotelweb.azurewebsites.net/. The basic concept is: 2000 hotels in a list, possible to filter by different qualities and sort by column. The filtering facets are updated with the number of hits each option has. All calls for filtering and sorting are server calls going to the Redis DB. The demo is hosted on Azure and the Redis DB on RedisLabs, which makes the demo not quite as fast as it could be. Sorry about that. And it's not responsive. At all.

Data in Redis

Redis is a key-value store that gives you the possibility to use a number of different data types as the value. In this case, we make use of:
  • Strings, a plain string value: color = "blue".
  • Sets, an unordered collection of unique strings: allcolors = "blue"; "red"; "green".
  • Sorted sets, a set where each element has a score for sorting them: bestcolors = "blue", 1; "green", 2; "red", 3.

Being able to use sets and stored sets is great, but you still have to wrap your head around how to think about data in Redis. It's a bit different compared to working with the relational or document database you might be used to. Let's look at the first example: how to store and connect countries and hotels.

Compounding keys

In Redis, it's all about working with the data, creating all the keys necessary for what you want to do. There is a great pattern for doing this. First, let's look at what we want to be able to do with the countries in this scenario.
  • We want to fetch all the countries available. This sounds like a set.
  • We want to fetch all hotels for a country. This also sounds like a set.

So if we create the set "Countries", holding the values "Germany", "Sweden", "Denmark", that would solve our first task. But how will we solve the second one? Should we have a set called "Germany" that holds the hotel ids? Then, how would we know that the set "Germany" is a country? We won't, and maybe that wouldn't be a problem in a small data model, but it will definitely quickly get messy and hard to understand the data structure.

The solution to this are compound keys (yay!). Let the key reflect the data structure and let the values be other keys leading down to more data.
Countries = ["Countries:1", "Countries:2", "Countries:3"]
Countries:1 = "Germany"
Countries:2 = "Sweden"
Countries:3 = "Denmark"
Countries:1:Hotels = ["Hotels:1", "Hotels:33", "Hotels:194"]
Hotels:1 = '{\"Name\":\"Hotel 1\",\"Stars\":3,\"PricePerNight\":1000}'
Hotels:33 = '{\"Name\":\"Hotel 33\",\"Stars\":5,\"PricePerNight\":2700}'
Hotels:194 = '{\"Name\":\"Hotel 194\",\"Stars\":1,\"PricePerNight\":235}'

Using this pattern, the data structure is fairly simple to grasp just by looking at the key. The set Countries holds the keys Countries:1, Countries:2, Countries:3. These keys lead to the name of the country. By adding another segment to the key you can connect the countries to other sets. Countries:1:Hotels holds the keys to all hotels in that particular country. That hotel key in turn uses the string data type to store a json blob containing the hotel information.

Compound keys can of course be used in any direction. Hotels:1:Beaches can hold a set of all the beaches close to a certain hotel, in the same way as Beaches:1:Hotels can hold a set of all hotels close to a particular beach. Slicing and dicing the data in advance is key (no pun intended) when working with Redis.

Next step

Now we have a start to getting the hotels we want when clicking countries and stars (you guessed it, Stars:1:Hotels) in our faceted search. Next up we'll see how we can combine and sort sets.