Tuesday, September 20, 2011

Netflix Splitting is a silly move

Here I am in Italy on my honeymoon, and while taking a brief pass through my email looking for something important, I see an email from Netflix about the separation of their businesses into a physical DVD division (Quickster) and a streaming video division (Netflix).

From a business perspective, I understand this completely. They want to isolate that business from the streaming business so that they can optimize the businesses separately. It will help ensure profitability on both halves (incidentally, that's also part of the reason they split up billing a month or two ago, it was a telegraphing of their moves). It may also help them limit and isolate each business from licensing terms so that they can only affect one and not both.

But from the perspective of the consumer? Well, as a consumer who didn't much care that I was paying an extra $5/month for the service (I definitely get my money's worth of 2 dvds out with unlimited streaming even at $20/month), not having the sites integrated actually pisses me off. When browsing through videos, I would usually check whether the streaming version has subtitles (my wife speaks English as a 2nd language, and finds subtitles easier to follow than spoken dialog in movies, especially British films), and if not, request the DVD. Or, when streaming wasn't available, directly request the DVD.

My interaction with Netflix is defined by video availability and feaures in the two formats in which I consume content. To separate them into two different sites makes it *more difficult* for me to choose media in a reasonable way. Three years ago, I wished for a web site that knew about my memberships to all the different sites I wanted to consume from, knew what I watched, what I wanted to watch, and told me where I could get it. Netflix is going in the opposite direction.

Remember back when they tried to remove profiles (that little feature where different people in a family could have different viewing preferences, recommendations, and number of DVDs they could have out)? You know, a practically useful feature? They ultimately kept it because so many people spoke out about it. I don't know if enough people are going to recognize how awful this recent move is and complain, but it would sure be nice to be able to know whether a movie is available in DVD and streaming without visiting two sites and jumping between them.

I've heard the stories about Netflix, about how awesome their benefits, pay, etc. are. Heck, I've even received recruiting emails from them. But you know what they need? They need to hire people who tell the upper management "no". One of the greatest bits of engineer and business productivity that we've cultivated at Adly is the ability for people to say no. We have management with great vision and ideas, a founder who can get us new designs overnight, and an engineering team that can do it almost as fast. But when you move too fast, when you are only concerned about the *right now*, you lose sight of where you are going. By saying no to daily updates, we focused on longer-term goals and company progress. Netflix needs people to say no, not because they are moving too fast (as was our problem), but because they are making decisions that make their customers angry. They need an actual customer advocate working for Netflix. Someone who talks with real customers.

I know, customers don't like change. People don't like change. We saw it continually at YouTube. But there are some people, who you have to contact and cultivate, who understand change is necessary, and who will tell you that your changes are good when they are good, and suck when they suck.

Netflix, I'm willing to be that guy. I'll do it for free: this multi-site thing sucks. Stop it.

Thursday, September 15, 2011

Improving Performance by 1000x

Whenever possible at Adly, we like to use existing software to solve as many problems as we can. From PostgreSQL for our database, to syslog-ng and Flask for our logging, to Redis for search, caching, and other fast-data operations, ActiveMQ for our message broker, etc. But there are times when existing software just isn't fast enough, takes too much memory, or just isn't suited for the task. This is one of those times.

In our Adly Analytics product, we calculate a number for pairs of Twitter users called "Also Follows". For two twitter users, it is the number of followers that follow both of those users. We had tried using Redis for this directly, but we were looking at roughly 700 megs of memory to store one set of 7.5 million followers. Ugh! Straight Python did a bit better at 500 megs, but that was still far more than we could reasonably use to store the data. Combine that with Redis intersections taking multiple seconds to run, and the fact that we needed to intersect 1,000 sets against 10,000 sets every week (these were due to the nature of the kinds of updates we were performing), and we'd need to have 33 processors plus roughly 300 gigs of memory to just perform this operation fast enough. That's just too expensive (over $6k/month just for machines in EC2 for the memory).

When faced with seemingly insurmountable costs, we did what any tech-directed company would do; we wrote custom software to solve our problem. What did we do?

First, we got rid of hash tables. They are great for random lookups, but in the case for fixed-size data (we had 32 bit integers for Twitter user ids), a sorted array would be roughly 1/20th the size of the set in Redis, while offering a method for random lookup (if necessary).

When we are fetching the follower lists, we get them in chunks of 5000 at a time, unsorted. We write them to temporary files on disk while we are reading them, then when done, we read the list into memory, then sort it using the C standard library quicksort. After sorting, we scan the sequence for any duplicate values and discard them. Once we have a sorted sequence of unique integers, we write them to disk.

Once they are on disk, we signal our intersection daemon to memory map the file*, letting the operating system cache the file as necessary. When we need to perform the "also follows" calculation, we run a custom intersection algorithm that is tuned to rely on memory bandwidth, which allows us to intersect a set of 5 million and 7.5 million integers with roughly 2 million shared integers in roughly 6 milliseconds on a 3ghz Core 2 Duo (using one processor). That is roughly 2 billion integers examined every second. Compare that to Redis taking roughly 7 seconds to perform that same operation (due to the random hash table lookups, allocating and copying data, etc.), and we've improved our performance by 1000x.

Our results:
For the data we need to store in memory, we went from over 300 gigs to under 15 gigs. We also went from taking around 45 minutes to intersect 1 item against 10,000, to taking 3 seconds**.  While we still believe in using existing software as much as possible, sometimes it is just worth it to write what you need to in C/C++ to get the speed you need.

Along the way, we tried to use scipy.weave.inline to inline C into Python. Sadly, due to some threading issues, we got some nasty behavior and segfaults. I didn't dig too far into it and just wrote what we needed as a plain Python C extension. It's pretty easy.

* Memory mapped files are an interesting and crazy thing, available in Posix (Linux, BSD, OS X, etc.), Windows, and just about any other mature system out there. They allow you to open a file in such a way that you can access the contents of the file as if it were a C array. You can map the entire file or parts of the file. The operating system caches the data as necessary, and if you have multiple processes memory mapping the same file, the memory is shared between them.

** Theoretically it was dropped to 3 seconds, but due to processor caching behavior, Python overhead, etc., this is actually around 45-50 seconds. In an hour, we sustain roughly 2.4 trillion integers examined, or roughly 670 million integers intersected against each other every second on average.

If you want to see more posts like this, you can buy my book, Redis in Action from Manning Publications today!

Update: You can read why we didn't use a bloom filter in a followup post.

Friday, September 2, 2011

Building for Uptime, Expecting Failure talk now available

Way back in June, I posted an article on building Adly's real-time ad network. Well, on August 1, presented at the Los Angeles Dev Ops meetup, and I just now got it uploaded to YouTube. You can view Building for Uptime, Expecting Failure now.

The full slides are available via Google Docs.