How Simulated Annealing Can Improve Your Lunch

Zesty serves thousands of meals per week from local restaurants to hundreds of San Francisco's fastest-growing companies. Before Zesty, most of these companies had their office manager order from restaurants, leaving them with the near-impossible task of balancing every employee's conflicting preferences.

Zesty looked at this task and saw an optimization problem better solved by technology, so we've automated the process of planning optimal meals for companies. Doing this turned out to require a lot more math than you usually do before lunch.

The high-level problem is this: for each meal to be served, we need to select a restaurant partner to provide that meal, and that choice needs to keep both the client and the restaurant happy. Take our client Heyzap, for example. They love variety and will get bored and unhappy if we serve them the same cuisine two days in a row; they're particularly keen on trying out new restaurants; and they dislike Chinese food. Matching all of these preferences is hugely important for keeping our clients happy, but dealing with everything manually would be a huge bottleneck for our growth.

Our first task is to encode all of these requirements. We can define an objective function that describes how good our choice is, something like:

HeyzapHappiness(restaurants) = variety + new restaurants + no Chinese food + within budget

Its value is maximized when they have maximum variety in the cuisine they eat, get to try out all of our best restaurant partners, and don't get served food they don't like, all whilst staying within their budget.
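To make that concrete, here is a minimal sketch of such a per-client objective, with a toy data model and made-up weights (none of this is our production code):

```python
from dataclasses import dataclass

# Toy data model; the weights below are illustrative, not tuned values.
@dataclass
class Restaurant:
    name: str
    cuisine: str
    price: float               # price per meal

@dataclass
class Client:
    name: str
    disliked_cuisines: set     # cuisines this client doesn't want to see
    tried_before: set          # restaurant names already served to this client
    budget: float              # weekly budget

def client_happiness(week_plan, client):
    """Score one client's week of restaurants (higher is better)."""
    variety = len({r.cuisine for r in week_plan})                        # distinct cuisines served
    novelty = sum(r.name not in client.tried_before for r in week_plan)  # new restaurants tried
    dislikes = sum(r.cuisine in client.disliked_cuisines for r in week_plan)
    over_budget = max(0.0, sum(r.price for r in week_plan) - client.budget)
    return 2.0 * variety + 1.5 * novelty - 5.0 * dislikes - 10.0 * over_budget
```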

So far, this problem is easy to solve, because we can just optimize for each client individually. But there's another side to the story - the restaurants. Zesty's restaurant partners want a consistent volume of orders from week to week, and they have capacity constraints, so we need to make sure we don't overload them at a single meal time. Now the optimization challenge becomes a much bigger problem, because it's no longer separable into distinct parts. Restaurants have constraints, clients have constraints, and the two are coupled, which means we have to solve the entire optimization problem globally to find a good solution.
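Building on the per-client sketch above, a hypothetical global objective adds restaurant-side terms for volume consistency and capacity, all evaluated over one shared schedule (again, the weights and structure are illustrative):

```python
from collections import Counter

# A schedule maps (client_name, meal_slot) -> Restaurant for every meal we need to plan.
def week_plan_for(schedule, client_name):
    return [rest for (name, _), rest in schedule.items() if name == client_name]

def restaurant_happiness(schedule, restaurant, target_volume, capacity):
    """Reward steady weekly volume; penalize overloading any single meal slot."""
    slots = [slot for (_, slot), rest in schedule.items() if rest.name == restaurant.name]
    volume_gap = abs(len(slots) - target_volume)                           # deviation from desired volume
    overload = sum(max(0, n - capacity) for n in Counter(slots).values())  # too many orders at one time
    return -1.0 * volume_gap - 8.0 * overload

def global_happiness(schedule, clients, restaurants, targets, capacities):
    """Clients and restaurants share one schedule, so the terms can't be optimized separately."""
    total = sum(client_happiness(week_plan_for(schedule, c.name), c) for c in clients)
    total += sum(restaurant_happiness(schedule, r, targets[r.name], capacities[r.name])
                 for r in restaurants)
    return total
```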

It’s easy to see that a brute-force approach is not going to get us anywhere very quickly… or even very slowly. If we had 100 clients, each requesting 5 meals per week, and 100 possible restaurants, we would face 100^500 possible configurations to check to find the globally best configuration. That’s an awful lot more than the estimated number of atoms in the universe.
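A quick sanity check on that count (the roughly 10^80 figure for atoms in the observable universe is the commonly quoted estimate):

```python
# 100 clients x 5 meals per week = 500 meal slots, each with 100 candidate restaurants.
configurations = 100 ** 500
print(len(str(configurations)) - 1)   # 1000 -- i.e. 10^1000 configurations
# The observable universe holds roughly 10^80 atoms, so brute force is hopeless.
```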

But that number obviously includes a lot of really stupid configurations, like giving everyone the same restaurant for every meal, or Heyzap getting nothing but Chinese food every day... So the first thing that we can and should do is filter out the really bad suggestions to leave a significantly reduced set of possibilities to optimize over. That said, even if we were to select a list of just 10 possible restaurants per client, and assume that the meal order in the week doesn’t matter, we’d still be left with around 250^100 configurations to test!
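One way to arrive at a number of that size: picking 5 distinct restaurants from a shortlist of 10, ignoring order, gives C(10, 5) = 252 (roughly 250) weekly menus per client, for each of 100 clients:

```python
import math

menus_per_client = math.comb(10, 5)   # 252 ways to pick 5 of 10 restaurants, order ignored
filtered = menus_per_client ** 100    # one weekly menu choice per client
print(round(math.log10(filtered)))    # ~240 -- still about 10^240 configurations after filtering
```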

Solution filtering

Methodically searching the state space is out. Instead, we take the approach of a directed random search to look for as good a solution as we can find. Starting from some initial configuration, we make a random change to the state, switching one restaurant for another, and then check whether it is better or worse than the previous state. This is all well and good, but the challenge in doing this (as is often the case with optimization in very high-dimensional spaces) is getting stuck in local optima. A simple hill-climbing algorithm, where we only ever switch to a state which is better than the previous one, will have very little chance of finding a good solution.
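Here is a sketch of that greedy baseline (propose_swap and hill_climb are illustrative names, and score stands in for the global happiness above); because it never accepts a worse state, it stalls in the first local optimum it finds:

```python
import random

def propose_swap(schedule, candidates):
    """Random neighbour: change one meal slot to another candidate restaurant."""
    new_schedule = dict(schedule)
    slot = random.choice(list(new_schedule))
    new_schedule[slot] = random.choice(candidates[slot])
    return new_schedule

def hill_climb(schedule, score, candidates, steps=10_000):
    """Greedy search: only ever accept an improvement."""
    current = score(schedule)
    for _ in range(steps):
        candidate = propose_swap(schedule, candidates)
        candidate_score = score(candidate)
        if candidate_score > current:          # never accept a worse state
            schedule, current = candidate, candidate_score
    return schedule, current
```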

We can use a simulated annealing algorithm to get around this problem, because it's great at hopping out of those local optima and searching for better solutions. Rather than deterministically moving to only better solutions, at each step the algorithm randomly decides whether to move to the new solution or stay with the old one. The important feature is that this can happen even if the new solution is worse than the previous one, with some probability that depends on how much worse it is:

Probability of switch = f(new_happiness, old_happiness)

By choosing an exponential switching function, this system ends up looking very much like an evolving thermal system from statistical physics. As the algorithm progresses we effectively ‘cool’ the system, making it less likely to hop to much worse configurations. By the end, the state should have evolved into a good optimum.
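A common exponential choice is the Metropolis criterion: since we are maximizing happiness, a worse state is accepted with probability exp((new_happiness − old_happiness) / T), where T is a temperature that we lower as the run goes on. Here is a sketch of that loop, reusing propose_swap from above (the step count, starting temperature, and cooling rate are illustrative, not our production values):

```python
import math
import random

def simulated_annealing(schedule, score, candidates,
                        steps=100_000, start_temp=10.0, cooling=0.9999):
    """Like hill_climb, but sometimes accept a worse state, with probability exp(delta / T)."""
    current = score(schedule)
    best, best_score = schedule, current
    temperature = start_temp
    for _ in range(steps):
        candidate = propose_swap(schedule, candidates)
        candidate_score = score(candidate)
        delta = candidate_score - current               # > 0 means the candidate is better
        if delta >= 0 or random.random() < math.exp(delta / temperature):
            schedule, current = candidate, candidate_score
            if current > best_score:
                best, best_score = schedule, current
        temperature *= cooling                          # geometric cooling: downhill hops get rarer
    return best, best_score
```

And to tie the sketches together, a toy end-to-end run (with entirely made-up clients, restaurants, targets, and capacities) might look like:

```python
clients = [Client(name="Heyzap", disliked_cuisines={"Chinese"},
                  tried_before={"Taqueria X"}, budget=500.0)]
restaurants = [Restaurant("Taqueria X", "Mexican", 12.0),
               Restaurant("Pho Y", "Vietnamese", 11.0),
               Restaurant("Deli Z", "Sandwiches", 10.0)]
slots = [(c.name, day) for c in clients for day in range(5)]     # 5 meals per client per week
candidates = {slot: restaurants for slot in slots}
initial = {slot: random.choice(restaurants) for slot in slots}   # random starting configuration

score = lambda s: global_happiness(s, clients, restaurants,
                                   targets={r.name: 2 for r in restaurants},
                                   capacities={r.name: 3 for r in restaurants})
best_schedule, best_score = simulated_annealing(initial, score, candidates, steps=20_000)
```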

Global optima

There are a number of ways this algorithm can be honed to give as good a solution as possible within a reasonable amount of processing time. The switching function, the annealing schedule, the choice of initial configuration - making smart choices about each of these makes an enormous difference to how well this approach works. There’s still a lot of room for improving our solution, but we’re excited about the results that we’ve seen so far. Ultimately, solving this problem well means a lot of happy employees eating healthy and delicious food!

If you found this interesting, we’re hiring! https://www.zesty.com/jobs