Wednesday, November 26, 2014

Introduction to rate limiting with Redis [Part 2]

This article first appeared on November 3, 2014 over on Binpress at this link. I am reposting it here so my readers can find it easily.

In Introduction to rate limiting with Redis [Part 1], I described some motivations for rate limiting and provided some Python and Lua code offering basic and intermediate rate limiting functionality. If you haven’t already read it, you should, because I’m going to discuss several points from that article. In this post, I will talk about and address some problems with the previous methods, while also introducing sliding window functionality and variable-cost requests.

Problems with previous methods

The last rate limiting function that we wrote was over_limit_multi_lua(), which used server-side Lua scripting in Redis to do the heavy lifting of actually performing the rate limiting calculations. It is included below with the Python wrapper as a reference.

def over_limit_multi_lua(conn, limits=[(1, 10), (60, 120), (3600, 240)]):
    if not hasattr(conn, 'over_limit_lua'):
        conn.over_limit_lua = conn.register_script(over_limit_multi_lua_)

    return conn.over_limit_lua(
        keys=get_identifiers(), args=[json.dumps(limits), time.time()])

over_limit_multi_lua_ = '''
local limits = cjson.decode(ARGV[1])
local now = tonumber(ARGV[2])
for i, limit in ipairs(limits) do
    local duration = limit[1]

    local bucket = ':' .. duration .. ':' .. math.floor(now / duration)
    for j, id in ipairs(KEYS) do
        local key = id .. bucket

        local count = redis.call('INCR', key)
        redis.call('EXPIRE', key, duration)
        if tonumber(count) > limit[2] then
            return 1
        end
    end
end
return 0
'''

Hidden inside this code are several problems that can limit its usefulness and correctness when used for its intended purpose. These problems and their solutions are listed below.

Generating keys in the script

One of the first problems you might notice was mentioned in a comment by a commenter named Tobias on the previous post, which is that we are constructing keys inside the Lua script. If you’ve read the Redis documentation about Lua scripting, you should know that we are supposed to be passing all keys to be used in the script from outside when calling it.

The requirement to pass keys into the script is how Redis attempts to future-proof Lua scripts that are being written, as Redis Cluster (currently in beta) distributes keys across multiple servers. By having your keys known in advance, you can calculate which Redis Cluster server the script should run on, and you can detect when the keys live on multiple Cluster servers, in which case the script can’t run properly.

Our first problem is that generating keys inside the script can make the script violate Redis Cluster assumptions, which makes it incompatible with Redis Cluster, and generally makes it incompatible with most key-based sharding techniques for Redis.

To address this issue for Redis Cluster and other client-sharded Redis setups, we must use a method that handles rate limiting with a single key. Unfortunately, this can prevent atomic execution for multiple identifiers for Redis Cluster, but you can either rely on a single identifier (user id OR IP address, instead of both), or stick with non-clustered and non-sharded Redis in those cases.
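To see why script-generated keys are a problem, it can help to compute cluster hash slots by hand. The sketch below is not part of the original code. It assumes the slot rule described in the Redis Cluster specification (CRC16 of the key modulo 16384, hashing only a non-empty `{...}` hash tag when one is present), and `key_slot` is a hypothetical helper name used only for illustration.

```python
import binascii

def key_slot(key):
    """Approximate the Redis Cluster hash slot for a key (illustrative helper)."""
    data = key.encode('utf-8')
    start = data.find(b'{')
    if start != -1:
        end = data.find(b'}', start + 1)
        # Only a non-empty {tag} replaces the whole key for hashing.
        if end > start + 1:
            data = data[start + 1:end]
    # crc_hqx with an initial value of 0 is the CRC16/XMODEM variant.
    return binascii.crc_hqx(data, 0) % 16384

# Keys the old script would build internally: one per limit bucket.
# Nothing forces these to share a slot, so they can land on different servers.
old_keys = ['user:17:1:1415000000', 'user:17:60:23583333', 'user:17:3600:393055']
print([key_slot(k) for k in old_keys])

# A single key per identifier always maps to exactly one slot.
print(key_slot('user:17'))
```

Because the slot depends on the whole key name, only a key that the client passes in (and can hash in advance) lets the client pick the right server before the script runs.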

What we count matters

Looking at our function definition, we can see that our default limits were 10 requests per second, 120 requests per minute, and 240 requests per hour. If you remember from the “Counting correctly” section, in order for our rate limiter to complete successfully, we needed to only increment one counter at a time, and we needed to stop counting if that counter went over the limit.

But if we were to reverse the order that the limits were defined, resulting in us checking our per-hour, then per-minute, then per-second limits (instead of per-second, minute, then hour), we would have our original counting problem all over again. Unfortunately, due to details too involved to explain here, just sorting by bucket size (smallest to largest) doesn’t actually solve the problem, and even the original order could result in requests failing that should have succeeded. Ultimately our problem is that we are counting all requests, both successful and unsuccessful (those that were prevented due to being over the limit).

To address the issue with what we count, we must perform two passes while rate limiting. Our first pass checks to see if the request would succeed (cleaning out old data as necessary), and the second pass increments the counters. In previous rate limiters, we were basically counting requests (successful and unsuccessful). With this new version, we are going to only count successful requests.
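The two-pass idea can be sketched in plain Python before moving it into Lua. This is a simplified illustration, not the final implementation: it ignores time entirely, and the `allow_request` name and the dictionary-based counters are assumptions made for the example.

```python
def allow_request(counts, limits):
    """Two-pass check: verify every limit first, then count the request.

    counts maps a limit name to the number of successful requests so far;
    limits maps the same names to the maximum allowed.
    """
    # Pass 1: would this request push any counter over its limit?
    for name, limit in limits.items():
        if counts.get(name, 0) + 1 > limit:
            return False  # rejected requests leave every counter untouched
    # Pass 2: the request is allowed, so record it against every limit.
    for name in limits:
        counts[name] = counts.get(name, 0) + 1
    return True

counts = {}
limits = {'per_second': 2, 'per_hour': 5}
results = [allow_request(counts, limits) for _ in range(4)]
print(results)               # [True, True, False, False]
print(counts['per_hour'])    # 2
```

Note that the per-hour counter stays at 2 even though 4 requests were attempted, which is exactly the behavior the earlier rate limiters lacked.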

Stampeding elephants

One of the most consistent behaviors that can be seen among APIs or services that have been built with rate limiting in mind is that usually request counts get reset at the beginning of the rate limiter’s largest (and sometimes only) time slice. In our example, at every hour on the hour, every counter that had been incremented is reset.

One common result for APIs with these types of limits and limit resets is what’s sometimes referred to as the “stampeding elephants” problem. Because every user has their counts reset at the same time, when an API offers access to in-demand data, many requests will occur almost immediately after limits are reset. Similarly, if the user knows that they have outstanding requests that they can make near the end of a time slice, they will make those requests in order to “use up” their request credit that they would otherwise lose.

We partially addressed this issue by introducing multiple bucket sizes for our counters, specifically our per-second and per-minute buckets. But to fully address the issue, we need to implement a sliding-window rate limiter, where the counts for requests that come in at 6:01PM and 6:59PM aren’t reset until roughly an hour later at 7:01PM and 7:59PM, respectively, not at 7:00PM. Further details about sliding windows come a little later.

Bonus feature: variable-cost requests

Because we are checking our limits before incrementing our counts, we can actually allow for variable-cost requests. The change to our algorithm will be minor, adding an increment for a variable weight instead of 1.

Sliding Windows

The biggest change to our rate limiting is actually the process of changing our rate limiting from individual buckets into sliding windows. One way of understanding sliding window rate limiting is that each user is given a number of tokens that can be used over a period of time. When you run out of tokens, you don't get to make any more requests. And when a token is used, that token is restored (and can be used again) after the time period has elapsed.

As an example, if you have 240 tokens that can be used in an hour, and you used 20 tokens at 6:05PM, you would only be able to make up to another 220 requests until 7:04PM. At 7:05PM, you would get those 20 tokens back (and if you made any other requests between 6:06PM and 7:05PM, those tokens would be restored later).
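The token description above can be modeled directly by remembering when each batch of tokens was spent. The sketch below is only a mental model, assuming an exact per-request history (the implementation later in this post uses coarser sub-buckets instead); `TokenWindow` and its method names are hypothetical.

```python
from collections import deque

class TokenWindow(object):
    """Exact sliding window: every spend is remembered until it ages out."""

    def __init__(self, tokens, duration):
        self.tokens = tokens
        self.duration = duration
        self.used = deque()   # (timestamp, cost) pairs, oldest first
        self.in_use = 0

    def available(self, now):
        # Restore tokens whose full time period has elapsed.
        while self.used and self.used[0][0] + self.duration <= now:
            _, cost = self.used.popleft()
            self.in_use -= cost
        return self.tokens - self.in_use

    def spend(self, now, cost=1):
        if cost > self.available(now):
            return False
        self.used.append((now, cost))
        self.in_use += cost
        return True

# Times are minutes after 6:00PM, so 5 is 6:05PM and 65 is 7:05PM.
window = TokenWindow(tokens=240, duration=60)
window.spend(5, 20)
print(window.available(64))  # 220
print(window.available(65))  # 240
```

Keeping one entry per request is precise but can use a lot of memory for large limits, which is the motivation for grouping requests into smaller buckets.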

With our earlier rate limiting, we basically incremented counters, set an expiration time, and compared our counters to our limits. With sliding window rate limiting, incrementing a counter isn’t enough; we must also keep history about requests that came in so that we can properly restore request tokens.

One way of keeping a history, which is the method that we will use, is to imagine the whole window as being one large bucket with a single count (the window has a ‘duration’), similar to what we had before, with a bunch of smaller buckets inside it, each of which has its own individual count. As an example, if we have a 1-hour window, we could use smaller buckets of 1 minute, 5 minutes, or even 15 minutes, depending on how precise we wanted to be, and how much memory and time we wanted to dedicate (more smaller buckets = more memory + more cleanup work). We will call the size of the smaller buckets their “precision.” You should notice that when duration is the same as precision, we have regular rate limits. You can see a picture of various precision buckets in a 1 hour window below.


As before, we can consider the smaller buckets to be labeled with individual times, say 6:00PM, 6:01PM, 6:02PM, etc. But as the current time becomes 7:00PM, what we want to do is to reset the count on the 6:00PM bucket to 0, adjust the whole window’s count, and re-label the bucket to 7:00PM. We would do the same thing to the 6:01PM bucket at 7:01PM, etc.
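The relabeling described above comes down to a little integer arithmetic. The following sketch mirrors the variable names used in the Lua script later in this post (`blocks`, `block_id`, `trim_before`), but the `window_blocks` function itself is a hypothetical helper written only to make the numbers concrete.

```python
import math

def window_blocks(now, duration, precision):
    """Return (blocks, block_id, trim_before) for a sliding window."""
    precision = min(precision, duration)
    # Number of sub-buckets that make up one full window.
    blocks = int(math.ceil(duration / float(precision)))
    # Label of the sub-bucket that the current time falls into.
    block_id = int(now // precision)
    # Sub-buckets with a label lower than this have left the window.
    trim_before = block_id - blocks + 1
    return blocks, block_id, trim_before

# 1-hour window with 1-minute precision, at 7:00PM (seconds since midnight).
seven_pm = 19 * 3600
print(window_blocks(seven_pm, 3600, 60))  # (60, 1140, 1081)
```

Bucket 1080 is the 6:00PM minute (1080 * 60 seconds after midnight), so at 7:00PM it falls just below `trim_before` and is the one that gets cleared.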

Data representation

We’ve now gotten to the point where we need to start talking about data representation. We didn’t really worry about representation before simply because we were storing a handful of counters per identifier. But now, we are no longer just storing 1 count for a 1 hour time slice, we could store 60 counts for a 1 hour time slice (or more if you wanted more precision), plus a timestamp that represents our oldest mini-bucket label.

For a simpler version of sliding windows, I had previously used a Redis LIST to represent the whole window, with each item in the LIST including both a time label, as well as the count for the smaller buckets. This can work for limited sliding windows, but restricts our flexibility when we want to use multiple rate limits (Redis LISTs have slow random access speeds).

Instead, we will use a Redis HASH as a miniature keyspace, which will store all count information related to rate limits for an identifier in a single HASH. Generally, for a sliding window of a specified duration and precision for an identifier, we will have the HASH stored at the key named by the identifier, with contents of the form:

<duration>:<precision>:o --> <timestamp of oldest entry>
<duration>:<precision>: --> <count of successful requests in this window>
<duration>:<precision>:<ts> --> <count of successful requests in this bucket>

For sliding windows where more than one sub-bucket has had successful requests, there can be multiple <duration>:<precision>:<ts> entries that would each represent one of the smaller buckets. For regular rate limits (not sliding window), the in-Redis schema is the same, though there will be at most one <duration>:<precision>:<ts> key, and duration is equal to precision for regular rate limits (as we mentioned before).

Because of the way we named the keys in our HASH, a single HASH can contain an arbitrary number of rate limits, both regular and windowed, without colliding with one another.
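As a quick illustration of that naming scheme, the sketch below builds the field names for one sliding window limit and one regular limit. The `hash_fields` helper and the dictionary labels are assumptions made for this example; only the `<duration>:<precision>:...` field format comes from the layout described above.

```python
def hash_fields(duration, precision=None, block_id=None):
    """Build the HASH field names used for one rate limit (illustrative helper)."""
    precision = min(precision or duration, duration)
    prefix = '%s:%s:' % (duration, precision)
    fields = {'oldest': prefix + 'o', 'window_count': prefix}
    if block_id is not None:
        fields['bucket_count'] = prefix + str(block_id)
    return fields

now = 1415000000
sliding = hash_fields(3600, 60, now // 60)      # 1 hour window, 1 minute buckets
regular = hash_fields(60, block_id=now // 60)   # plain per-minute limit

print(sliding['oldest'])        # 3600:60:o
print(sliding['bucket_count'])  # 3600:60:23583333
print(regular['window_count'])  # 60:60:
```

Since every field name starts with its own duration and precision, the two limits never touch each other's fields even though they live in the same HASH.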

Putting it all together

And finally, we are at the fun part; actually putting all of these ideas together. First off, we are going to use a specification for our rate limits to simultaneously support regular and sliding window rate limits, which looks a lot like our old specification.

One limit is: [duration, limit, precision], with precision being optional. If you omit the precision option, you get regular rate limits (same reset semantics as before). If you include the precision option, then you get sliding window rate limits. To pass one or more rate limits to the Lua script, we just wrap the series of individual limits in a list: [[duration 1, limit 1], [duration 2, limit 2, precision 2], ...], then encode it as JSON and pass it to the script.
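A small sketch of that specification in Python may help. The `normalize` helper is hypothetical and exists only to show how a missing precision is interpreted; the script performs the equivalent step itself.

```python
import json

# Two regular limits and one sliding window limit (1 hour, 1 minute precision).
limits = [(1, 10), (60, 120), (3600, 240, 60)]

encoded = json.dumps(limits)
print(encoded)  # [[1, 10], [60, 120], [3600, 240, 60]]

def normalize(limit):
    """Fill in the optional precision the same way the script does."""
    duration, maximum = limit[0], limit[1]
    precision = limit[2] if len(limit) > 2 else duration
    return duration, maximum, min(precision, duration)

print([normalize(l) for l in json.loads(encoded)])
# [(1, 10, 1), (60, 120, 60), (3600, 240, 60)]
```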

Inside the script we need to make two passes over our limits and data. Our first pass cleans up old data while checking whether this request would put the user over their limit; the second pass increments all of the bucket counters to represent that the request was allowed.

To explain the implementation details, I will be including blocks of Lua that can be logically considered together, describing generally what each section does after. Our first block of Lua script will include argument decoding, and cleaning up regular rate limits:

local limits = cjson.decode(ARGV[1])
local now = tonumber(ARGV[2])
local weight = tonumber(ARGV[3] or '1')
local longest_duration = limits[1][1] or 0
local saved_keys = {}
-- handle cleanup and limit checks
for i, limit in ipairs(limits) do

    local duration = limit[1]
    longest_duration = math.max(longest_duration, duration)
    local precision = limit[3] or duration
    precision = math.min(precision, duration)
    local blocks = math.ceil(duration / precision)
    local saved = {}
    table.insert(saved_keys, saved)
    saved.block_id = math.floor(now / precision)
    saved.trim_before = saved.block_id - blocks + 1
    saved.count_key = duration .. ':' .. precision .. ':'
    saved.ts_key = saved.count_key .. 'o'
    for j, key in ipairs(KEYS) do

        local old_ts = redis.call('HGET', key, saved.ts_key)
        old_ts = old_ts and tonumber(old_ts) or saved.trim_before
        if old_ts > now then
            -- don't write in the past
            return 1
        end

        -- discover what needs to be cleaned up
        local decr = 0
        local dele = {}
        local trim = math.min(saved.trim_before, old_ts + blocks)
        for old_block = old_ts, trim - 1 do
            local bkey = saved.count_key .. old_block
            local bcount = redis.call('HGET', key, bkey)
            if bcount then
                decr = decr + tonumber(bcount)
                table.insert(dele, bkey)
            end
        end

        -- handle cleanup
        local cur
        if #dele > 0 then
            redis.call('HDEL', key, unpack(dele))
            cur = redis.call('HINCRBY', key, saved.count_key, -decr)
        else
            cur = redis.call('HGET', key, saved.count_key)
        end

        -- check our limits
        if tonumber(cur or '0') + weight > limit[2] then
            return 1
        end
    end
end

Going section by section through the code visually, where a blank line distinguishes individual sections, we can see 6 sections in the above code:
  1. Argument decoding, and starting the for loop that iterates over all rate limits
  2. Prepare our local variables, prepare and save our hash keys, then start iterating over the provided user identifiers (yes, we still support multiple identifiers for non-clustered cases, but you should only pass one identifier for Redis Cluster)
  3. Make sure that we aren’t writing data in the past
  4. Find those sub-buckets that need to be cleaned up
  5. Handle sub-bucket cleanup and window count updating
  6. Finally check the limit, returning 1 if the limit would have been exceeded

Our second and last block of Lua operates under the precondition that the request should succeed, so we only need to increment a few counters and set a few timestamps:

-- there are enough resources, update the counts
for i, limit in ipairs(limits) do
    local saved = saved_keys[i]

    for j, key in ipairs(KEYS) do
        -- update the current timestamp, count, and bucket count
        redis.call('HSET', key, saved.ts_key, saved.trim_before)
        redis.call('HINCRBY', key, saved.count_key, weight)
        redis.call('HINCRBY', key, saved.count_key .. saved.block_id, weight)
    end
end

-- We calculated the longest-duration limit so we can EXPIRE
-- the whole HASH for quick and easy idle-time cleanup :)
if longest_duration > 0 then
    for _, key in ipairs(KEYS) do
        redis.call('EXPIRE', key, longest_duration)
    end
end

return 0

Going section by section one last time gets us:
  1. Start iterating over the limits and grab our saved hash keys
  2. Set the oldest data timestamp, and update both the window and buckets counts for all identifiers passed
  3. To ensure that our data is automatically cleaned up if requests stop coming in, set an EXPIRE time on the keys where our hash(es) are stored
  4. Return 0, signifying that the user is not over the limit
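
For readers who want to experiment without a Redis server, the two passes can be approximated in pure Python. This is a rough model under stated assumptions, not a replacement for the Lua: a plain dictionary stands in for the Redis HASHes, the "writing in the past" check and the final EXPIRE are left out, and `over_limit_model` is a hypothetical name.

```python
import math

def over_limit_model(store, identifiers, limits, now, weight=1):
    """In-memory model of the two-pass script; store maps identifier -> dict."""
    saved_keys = []
    # Pass 1: clean up sub-buckets that left the window, then check limits.
    for limit in limits:
        duration = limit[0]
        precision = min(limit[2] if len(limit) > 2 else duration, duration)
        blocks = int(math.ceil(duration / float(precision)))
        block_id = int(now // precision)
        trim_before = block_id - blocks + 1
        count_key = '%s:%s:' % (duration, precision)
        ts_key = count_key + 'o'
        saved_keys.append((block_id, trim_before, count_key, ts_key))
        for ident in identifiers:
            h = store.setdefault(ident, {})
            old_ts = h.get(ts_key, trim_before)
            trim = min(trim_before, old_ts + blocks)
            for old_block in range(old_ts, trim):
                bcount = h.pop(count_key + str(old_block), None)
                if bcount:
                    h[count_key] = h.get(count_key, 0) - bcount
            if h.get(count_key, 0) + weight > limit[1]:
                return True  # over the limit, nothing is counted
    # Pass 2: the request is allowed, so record it everywhere.
    for block_id, trim_before, count_key, ts_key in saved_keys:
        for ident in identifiers:
            h = store[ident]
            h[ts_key] = trim_before
            h[count_key] = h.get(count_key, 0) + weight
            bkey = count_key + str(block_id)
            h[bkey] = h.get(bkey, 0) + weight
    return False

# 3 requests per minute, tracked in 10 second sub-buckets.
store = {}
times = [0, 10, 20, 30, 59, 60]
print([over_limit_model(store, ['user:17'], [(60, 3, 10)], t) for t in times])
# [False, False, False, True, True, False]
```

The request at second 60 succeeds because the sub-bucket holding the request from second 0 has just slid out of the window, while the rejected requests at seconds 30 and 59 were never counted.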

Optional fix: use Redis time

As part of our process for checking limits, we fetch the current unix timestamp in seconds. We use this timestamp as part of the sliding window start and end times and which sub-bucket to update. If clients are running on servers with reasonably correct clocks (within 1 second of each other at least, within 1 second of the true time optimally), then there isn’t much to worry about. But if your clients are running on servers with drastically different system clocks, or on systems where you can’t necessarily fix the system clock, we need to use a more consistent clock.

While we can’t always be certain that the system clock on our Redis server is necessarily correct (just like we can’t for our other clients), if every client uses the time returned by the TIME command from the same Redis server, then we can be reasonably assured that clients will have fairly consistent behavior, limited to the latency of a Redis round trip with command execution.

As part of our function definition, we will offer the option to use the result of the TIME command instead of system time. This will result in one additional round trip between the client and Redis to fetch the time before passing it to the Lua script.
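
A minimal sketch of that choice is shown below. It assumes a redis-py style client whose `time()` method returns a `(seconds, microseconds)` pair; `current_time` and `FakeConn` are hypothetical names, and `FakeConn` only exists so the example runs without a server. Unlike the wrapper in this post, which keeps only the seconds component, this sketch folds in the microseconds as well.

```python
import time

def current_time(conn=None, redis_time=False):
    """Return the timestamp (in seconds) to hand to the rate limiting script."""
    if redis_time and conn is not None:
        # One extra round trip, but every client shares the same clock.
        seconds, microseconds = conn.time()
        return seconds + microseconds / 1000000.0
    return time.time()

class FakeConn(object):
    """Stand-in for a Redis client so the sketch runs without a server."""
    def time(self):
        return (1415000000, 250000)

print(current_time(FakeConn(), redis_time=True))  # 1415000000.25
```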

Add in our Python wrapper, which handles the optional Redis time and request weight parameters, and we are done:

def over_limit_sliding_window(conn, weight=1, limits=[(1, 10), (60, 120), (3600, 240, 60)], redis_time=False):
    if not hasattr(conn, 'over_limit_sliding_window_lua'):
        conn.over_limit_sliding_window_lua = conn.register_script(over_limit_sliding_window_lua_)

    now = conn.time()[0] if redis_time else time.time()
    return conn.over_limit_sliding_window_lua(
        keys=get_identifiers(), args=[json.dumps(limits), now, weight])

If you would like to see all of the rate limit functions and code in one place, including the over_limit_sliding_window() Lua script with wrapper, you can visit this Github gist.

Wrap up and conclusion

Congratulations on getting this far! I know, it was a slog through problems and solutions, followed by a lot of code, and now after seeing all of it I get to tell you what you should learn after reading through all of this.

Obviously, the first thing you should get out of this article is an implementation of sliding window rate limiting in Python, which is trivially ported to other languages -- all you need to do is handle the wrapper. Just be careful when sending timestamps, durations, and precision values to the script, as the EXPIRE call at the end expects all timestamp values to be in seconds, but some languages natively return timestamps as milliseconds instead of seconds.

You should also have learned that performing rate limiting with Redis can range from trivial (see our first example in part 1) to surprisingly complex, depending on the features required, and how technically correct you want your rate limiting to be. It also turns out that the problems that were outlined at the beginning of this article aren’t necessarily deal-breakers for many users, and I have seen many implementations similar to the over_limit_multi_lua() method from part 1 that are perfectly fine for even heavy users*. Really it just means that you have a choice about how you want to rate limit.

And finally, you may also have learned that you can use Redis hashes as miniature keyspaces to collect data together. This can be used for rate limiting as we just did, as well as a DB row work-alike (the hash keys are like named columns, with values the row content), unique (but unsorted) indexes (i.e. email to user id lookup table, id to encoded data lookup table, ...), sharded data holders, and more.

For more from me on Redis and Python, you can check out the rest of my blog at dr-josiah.com.

* When Twitter first released their API, they had a per-hour rate limit that was reset at the beginning of every hour, just like our most basic rate limiter from part 1. The current Twitter API has a per-15 minute rate limit, reset at the beginning of every 15 minute interval (on the hour, then 15, 30, and 45 minutes after the hour) for many of their APIs. (I have no information on whether Twitter may or may not be using Redis for rate limiting, but they have admitted to using Redis in some capacity by virtue of their release of Twemproxy/Nutcracker).

Monday, November 3, 2014

Introduction to rate limiting with Redis [Part 1]

This article first appeared on October 9, 2014 over on Binpress at this link. I am reposting it here so my readers can find it easily.

Over the years, I've written several different rate limiting methods using Redis for both commercial and personal projects. This two-part tutorial intends to cover two different but related methods of performing rate limiting in Redis using standard Redis commands and Lua scripting. Each method expands the number of use-cases for rate limiting, and cleans up some of the rougher edges of previous rate limiters.

This post assumes some experience with Python and Redis, and to a lesser extent Lua, but new users still reading docs should be okay.

Why rate limit?


Most uses of rate limiting on the web today are generally intended to limit the effect that someone can have on a given platform. Whether it is API limits at Twitter, posting limits at Reddit, or posting limits at StackOverflow, some limit resource utilization, and others limit the effect a spammer account can have. Whatever the reason, let's start with saying that we need to count actions as they happen, and we need to prevent an action from happening if the user has reached or gone over their limit. Let's start with the plan of building a rate limiter for an API where we need to restrict each user to 240 requests per hour.

We know that we need to count and limit a user, so let's get some utility code out of the way. First, we need to have a function that gives us one or more identifiers for the user performing an action. Sometimes that is just a user id, other times it's the remote IP address; I usually use both when available, and at least IP address if the user hasn't logged in yet. Below is a function that gets the IP address and user id (when available) using Flask with the Flask-Login plugin.

from flask import g, request

def get_identifiers():
    ret = ['ip:' + request.remote_addr]
    if g.user.is_authenticated():
        ret.append('user:' + g.user.get_id())
    return ret

Just use a counter

Now that we have a function that returns a list of identifiers for an action, let's start counting and limiting. One of the simplest rate limiting methods available in Redis starts by taking the times of the actions as they happen, and buckets actions into ranges of times, counting them as they occur. If the number of actions in a bucket exceeds the limit, we don't allow the action. Below is a function that performs the rate limiting using an automatically-expiring counter that uses 1 hour buckets.

import time

def over_limit(conn, duration=3600, limit=240):
    bucket = ':%i:%i'%(duration, time.time() // duration)
    for id in get_identifiers():
        key = id + bucket

        count = conn.incr(key)
        conn.expire(key, duration)
        if count > limit:
            return True

    return False

This function shouldn't be too hard to understand; for each identifier we increment the appropriate key in Redis, set the key to expire in an hour, and if the count is more than the limit, we return True, signifying that we are over the limit. Otherwise we return False.

And that's it. Well, sort of. This gets us past our initial goal of having a basic rate limiter to limit each user to 240 requests per hour. But reality has a tendency to catch us when we aren't looking, and clients using the API have noticed that their limit is reset at the top of every hour. Now users have started making all 240 requests in the first few seconds they can, so all of our work limiting requests is wasted, right?
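
To make the top-of-the-hour problem concrete, here is a small in-memory simulation of the same fixed-bucket counting, with no Redis involved. The `fixed_window_allowed` function is a hypothetical stand-in for over_limit() that takes explicit timestamps instead of reading the clock.

```python
def fixed_window_allowed(request_times, duration=3600, limit=240):
    """Return the request times that a fixed-bucket limiter would allow."""
    counts = {}
    allowed = []
    for t in request_times:
        bucket = int(t // duration)
        # Like over_limit(), count every request, then compare to the limit.
        counts[bucket] = counts.get(bucket, 0) + 1
        if counts[bucket] <= limit:
            allowed.append(t)
    return allowed

# 240 requests in the last second of one hour, 240 in the first second of the next.
burst = ([3599 + i / 240.0 for i in range(240)] +
         [3600 + i / 240.0 for i in range(240)])
allowed = fixed_window_allowed(burst)
print(len(allowed))                  # 480
print(allowed[-1] - allowed[0] < 2)  # True
```

All 480 requests are allowed within two seconds, even though the limit is nominally 240 per hour.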

Multiple bucket sizes

Our initial rate limiting on a per-hour basis was successful in that it limited users on an hourly basis, but users started using all of their API requests as soon as they could (at the beginning of the hour). Looking at the problem, it seems almost obvious that in addition to a per-hour rate limit, we should probably also have a per-second and/or per-minute rate limit to smooth out peak request rates.

Let's say that we determined that 10 requests per second, 120 requests per minute, and 240 requests per hour were fair enough to our users, and let us better distribute requests over time. We could simply re-use our earlier over_limit() function to offer this functionality.

def over_limit_multi(conn, limits=[(1, 10), (60, 120), (3600, 240)]):
    for duration, limit in limits:
        if over_limit(conn, duration, limit):
            return True
    return False

This will work for our intended use, but we make 3 rate limit calls, each of which can result in two counter updates and two expire calls (one for each of the IP and user keys), so we may need to perform 12 total round trips to Redis just to say whether someone is over their limit. One common method of minimizing the number of round trips to Redis is to use what is called 'pipelining'. Pipelining in the Redis context will send multiple commands to Redis in a single round trip, which can reduce overall latency.

Coincidentally, our over_limit() function is written in such a way that we could easily replace our INCR and EXPIRE calls with a single pipelined request to increment the count and update the key expiration. The updated function can be seen below, and cuts our number of round trips from 12 to 6 when combined with over_limit_multi().

def over_limit(conn, duration=3600, limit=240):
    # Replaces the earlier over_limit() function and reduces round trips with
    # pipelining.
    pipe = conn.pipeline(transaction=True)
    bucket = ':%i:%i'%(duration, time.time() // duration)
    for id in get_identifiers():
        key = id + bucket

        pipe.incr(key)
        pipe.expire(key, duration)
        if pipe.execute()[0] > limit:
            return True

    return False
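
The round trip counts quoted here are worst-case figures (no early return because a limit was hit), and they follow from simple multiplication. The identifiers below are made-up examples.

```python
limits = [(1, 10), (60, 120), (3600, 240)]
identifiers = ['ip:203.0.113.7', 'user:17']  # example values only
commands_per_key = 2                          # INCR + EXPIRE

# One round trip per command when nothing is pipelined.
unpipelined = len(limits) * len(identifiers) * commands_per_key
# One execute() per key when INCR and EXPIRE are pipelined together.
pipelined = len(limits) * len(identifiers)

print('%d round trips without pipelining, %d with' % (unpipelined, pipelined))
# 12 round trips without pipelining, 6 with
```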

Halving the number of round trips to Redis is great, but we are still performing 6 round trips just to say whether a user can make an API call. We could write a replacement over_limit_multi() that makes all increment and expire operations at once, checking the limits after, but the obvious implementation actually has a counting bug that can prevent users from being able to make 240 successful requests in an hour (in the worst case, a client may experience 10 successful requests in an hour, despite making over 100 requests per second for the entire hour). This counting bug can be fixed with a second round trip to Redis, but let's instead shift our logic into Redis.
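
The worst case mentioned above can be reproduced with a small in-memory simulation. This is a sketch of the "increment everything, then check" ordering under simplifying assumptions (requests arrive at a steady rate and counters are kept in a dictionary instead of Redis); `naive_successes` is a hypothetical name.

```python
def naive_successes(rate, seconds, limits):
    """Count allowed requests when every counter is incremented before checking."""
    counts = {}
    successes = 0
    for i in range(rate * seconds):
        second = i // rate  # whole second in which request i arrives
        over = False
        for duration, limit in limits:
            key = (duration, second // duration)
            # The bug: rejected requests still bump every counter.
            counts[key] = counts.get(key, 0) + 1
            if counts[key] > limit:
                over = True
        if not over:
            successes += 1
    return successes

# 150 requests per second for a full hour against the default limits.
print(naive_successes(150, 3600, [(1, 10), (60, 120), (3600, 240)]))  # 10
```

The first 10 requests succeed, but the rejected ones push the per-minute counter past 120 within the first second and the per-hour counter past 240 shortly after, so nothing else gets through for the rest of the hour.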

Counting correctly

Instead of trying to fix a fully pipelined version, we can use the ability to execute Lua scripts inside Redis to perform the same operation while also keeping to one round trip. The specific operations we are going to perform in Lua are almost the exact same operations as we were originally performing in Python. We are going to iterate over the limits themselves, and for each identifier, we are going to increment a counter, update the expiration time of the updated counter, then check to see if we are over the limit. We will also use a small Python wrapper around our Lua to handle argument conversion and to hide the details of script loading.

import json

def over_limit_multi_lua(conn, limits=[(1, 10), (60, 120), (3600, 240)]):
    if not hasattr(conn, 'over_limit_lua'):
        conn.over_limit_lua = conn.register_script(over_limit_multi_lua_)

    return conn.over_limit_lua(
        keys=get_identifiers(), args=[json.dumps(limits), time.time()])

over_limit_multi_lua_ = '''
local limits = cjson.decode(ARGV[1])
local now = tonumber(ARGV[2])
for i, limit in ipairs(limits) do
    local duration = limit[1]

    local bucket = ':' .. duration .. ':' .. math.floor(now / duration)
    for j, id in ipairs(KEYS) do
        local key = id .. bucket

        local count = redis.call('INCR', key)
        redis.call('EXPIRE', key, duration)
        if tonumber(count) > limit[2] then
            return 1
        end
    end
end
return 0
'''

With the section of code starting with 'local bucket', you will notice that our Lua looks very much like and performs the same operations as our original over_limit() function, with the remaining code handling argument unpacking and iterating over the individual limits.

Conclusion

At this point we have built a rate limiting method that handles multiple levels of timing granularity, can handle multiple identifiers for a single user, and can be performed in a single round trip between the client and Redis. We went from a single-bucket rate limiter to a rate limiter that can evaluate multiple limits simultaneously.

Any of the rate limiting functions discussed in this post are usable for many different applications. In part two, I'll cover a different way of approaching rate limiting, which rounds out the remaining rough edges in our rate limiter. Read it over on Binpress.

More detailed information on Lua scripting can be found in the help for the EVAL command at Redis.io.

Friday, June 20, 2014

Why thorium reactors are important

This post is the first of a series of posts that I've been wanting to write for a long time, but I haven't been able to pick from among the collection of topics that I wanted to write about. After taking a quick vote on what people actually want me to write about, this one got the most votes, so you get to read about why I think that we (as a society) should make Herculean efforts to get liquid fluoride thorium reactors available as quickly as possible.

Disclaimer: before you invest $1 on what I say below, think, read, and verify.

Doom and gloom

Climate change is happening: 97% of climate scientists agree. The single best thing that we can do to address climate change is to substantially reduce our production of greenhouse gasses, which at present is primarily in the form of exhaust from the burning of fossil fuels. The problem is that fossil fuels are used as an energy source in our transport and electric grid, the combination of which allows for the efficient transport of people and goods, and allows for the supporting of a modern energy-consuming society (lights, heat, computers, communication, ...).

In order to truly replace fossil fuels, we must find a power generation technology that can be put in locations where both small to large power generation facilities already exist, and where we can replace the most polluting of vehicular transport (large transport ships). Cars like the Tesla, and those evolved from Tesla's patents will be available in the coming years, and we can put highways underground, keep people out of them, and use automated vehicular transport. Think automated Uber. Then you get 4-6x the vehicle density per lane, and adding lanes doesn't make traffic worse. :D

This requires power generation in the 10 megawatt - 10 gigawatt range, if we are looking to cover up to the current largest fossil fuel and current technology nuclear reactors. I'll explain why replacing current nuclear technology is important later.

Now enters a challenger

Liquid fluoride thorium reactor technology was first developed in the 1950's by a group at General Electric that were tasked with coming up with a nuclear generator that could be placed in an airplane, which was followed up by mostly the same group of people working at the Oak Ridge National Lab in the mid/late 60's. They ended up with a thorium-based technology that uses molten fluoride salts to carry the nuclear material (addressing partial burns, structural breakdown of fuel, ...), which allows for a 96%+ burning of the fuel - compared to the 1-5% efficiency of the best traditional nuclear reactors available today (quoted numbers vary from 1-2% to 3-5% for current tech, so I'm using the full range). One other feature is that you can introduce other spent nuclear fuels that are currently stored at hundreds of sites all over the world, and you can burn it. Yep, thorium plants can recycle nuclear waste.

One convenient win is that, with proper work, there is no reason why liquid fluoride thorium reactor technology can't be scaled: from smaller 10-20 megawatt reactors that can be sealed and buried for zero maintenance for a decade or more for small municipalities (though this wouldn't happen for a while, the economics need to make sense), to replacements for the horribly polluting 90 megawatt-equivalent diesel engines in large ships (we'll need more nuclear engineers), up to arrays of reactors for 10+ gigawatt power generation stations near our mega cities.

Why replace current reactor technology?

While "modern" reactors are generally safe (though there are several unsafe reactors currently being used around the world), the ugly part of current reactor technology is a combination of fuel efficiency and waste. There are immense amounts of energy stored in refined nuclear fuels, and with only 1-5% of that fuel being burned with current nuclear technology, you are left with a massive amount of waste that will be dangerously radioactive for the next 10,000 years or more. It is a scary problem, and thorium is a solution.

This nuclear waste storage problem is not going away. The only way we can really address the issue is to stop using current nuclear technology and find another solution. I claim that liquid fluoride thorium reactor technology is the best option we have right now, as not only can it actually generate the power we need economically (we need roughly 400 tons of thorium ore to power all of the energy needs of the US for one year, and there are 160,000 tons of ore that are economically accessible at current thorium prices), but it can burn the wastes that we need to get rid of in the long term anyway.
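To put those two numbers next to each other, here is a quick back-of-the-envelope sketch. It only divides the figures quoted above; the function name is made up for illustration, and it assumes US demand stays flat and that no new ore becomes economical, neither of which is likely to hold exactly.

```python
def years_of_supply(accessible_tons: float, tons_per_year: float) -> float:
    """Years the accessible ore would last at a constant yearly burn rate.

    Assumption: flat demand and a fixed reserve, purely for illustration.
    """
    if tons_per_year <= 0:
        raise ValueError("tons_per_year must be positive")
    return accessible_tons / tons_per_year


# Figures quoted in the paragraph above.
US_TONS_PER_YEAR = 400
ACCESSIBLE_TONS = 160_000

print(years_of_supply(ACCESSIBLE_TONS, US_TONS_PER_YEAR))
```

In other words, taking the quoted figures at face value, the economically accessible ore alone would cover roughly four centuries of current US energy use.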

And the vast majority of the waste is generally considered safe in 300 years, compared to 10,000+ for typical reactors.

Why not solar/wind/hydro?

The technologies behind solar, wind, and hydro generation are all great, and wonderful, and they work. And it even turns out that the largest power generation stations in the world are hydro-electric plants. The major problem with these technologies is that they are difficult to scale. Specifically, we've more or less tapped the majority of the reasonably usable hydro power available, and the environmental fallout of flooding land is not always worth the power generated.

Solar is great (I was just looking into the cost viability of installing solar panels on my house), but it requires a large amount of land to be viable, and the cost of production is still high. Solar also has a nasty cousin, typically coal or natural gas power plants that support the grid when the sun isn't shining. When looking at thermal solar plants (instead of photovoltaics), you get the option for storing the heat for release at night, but thermal solar plants are still rare, and my research suggests that only one solar plant in the US has heat storage (in molten salt). Finally, the land and financial cost make solar difficult to scale up to utility-level requirements (currently maxing out at roughly 500 megawatts), and make it essentially unusable as a mobile power generation method for commercial transport ships (you can check out The Guardian's article on shipping pollution for why this is necessary).

Wind is also great, but there are limited locations where it makes sense, and there are political challenges with locating windmills near occupied land. It's also not a mobile technology, so is not viable for addressing the container ship problem. Wind power also has all of the same problems with consistency as solar without the simple heat storage method of thermal solar.

Ultimately, what we don't need is more "bursting" power sources that require secondary generation technologies, and which partially compromise the overall environmental benefit of using renewable sources like wind and solar.

What about fusion?

There have been recent reports that hydrogen-fusion based nuclear plants are now to the point where there are obvious paths of research and development that will lead directly to commercially viable power generation in the next 10-20 years. This is wonderful, but it is still all theoretical. The research isn't done, so the opportunity doesn't yet exist. These types of plants also only make sense at huge scale, which not only makes building them a difficult investment, but also prevents the reactors from being mobile.

That's not to say that we couldn't end up in a future world where fusion power is available as a standard on-land reactor technology, while huge ships (shipping, navies, etc.) run thorium plants. But given the current state of technology, and how little money (relatively speaking) is being spent on thorium generator technology, there are some early and obvious wins that can be had by investing in thorium today, while the long-term benefits of fusion power are still attainable and can be developed on a parallel path.

But cars!

Cars will go electric with batteries. Think an entry-level Tesla. Also think battery swap technology for sub 5-minute recharges. We just need to make the technology cheap enough to be available to everyone. With Elon Musk releasing Tesla's patents, this is possible!

Long term destination

One of the reasons why I really like the idea of liquid fluoride thorium reactor technology is that it is the first obvious step towards reduction/elimination of fossil fuel use. That solves our short/mid-term global warming problem. But it also solves our mid-long term power problem when we finally get around to putting people on other planets in our solar system.

I know, I'm getting a few decades ahead of things here, but if we have any chance in hell of having a viable settlement on *any* other planet in this solar system (never mind in another solar system), we need a compact and efficient power generation technology whose fuel can be found on other planets. Thorium is available on the Moon, Mars, and Mercury, which are the three most likely locations for an off-world human settlement with current or soon-available technology. No, solar isn't enough (especially on Mars, which gets significantly less sunlight than we get here on Earth).

To make our lives even easier, we can actually detect whether a planet has available thorium, how much, and where it is located. And we can do this from space. Yes, we can scan for thorium from space. With current technology. Could thorium get any better?

Why isn't it done yet?

The simple answer is that insufficient time and money have been spent to make it commercially viable. There are various historical, political, and strictly human reasons for this: from it not being usable for generating materials usable in nuclear weapons, to it not being preferred by the guy who decided to use uranium-based reactors in the US Navy's largest ships, to it being a different technology than the currently understood standard nuclear technology. These are all valid excuses, but that doesn't mean that we can't go beyond these excuses and make it happen.

It also doesn't mean that we can't move past these excuses and realize that the ultimate destination of a clean and readily available power source is within our grasp! Not in 50 years, but easily within 5 years for a test reactor, and 10 years for a serious commercial reactor in the gigawatt+ range. The challenges for the liquid fluoride thorium reactor design are strictly of the materials science, chemistry, regulatory, and investment kind. And guess what? The materials science and chemistry parts have been worked on for the last half-dozen+ years, and are effectively solved. Whether or not regulators and investors are willing to make it happen is the real question.

Some detractors will claim it's still 40-70 years out, but to them I would just point out that the original nuclear reactors didn't take 40-70 years from conception through design, test reactors, and final construction. From first commissioning of the design for the USS Nautilus in December 1949 until it sailed under nuclear power in January 1955, there were barely more than 5 years of concentrated effort to go from the order to design it to it being built and used to travel farther and faster underwater than any other submarine before.

There are also several world governments that are working on thorium-based liquid salt reactor technology, including the USA, China, Australia, the Czech Republic, Russia, India, and the UK. It's going to happen, and it can't happen soon enough. What is $10 billion in the next 5-10 years if it means that everyone would have access to an inexpensive, clean, life-altering electric supply? With electricity, you have water. With water, you have food, trees, and can bring more rain for more water. It is an amazing virtuous cycle, and can start reversing the CO2 levels in our atmosphere *now*.

Which, incidentally, might very well be the most important thing that we can do if we want to avoid a global drought. It is scary stuff, and we need to reforest the planet yesterday. This gets us there globally, and will teach us what we need for when we try to terraform Mars in the next 30-40 years. If we make an effort, we could seriously be living on Mars in our lifetimes.

Where can I get more information?

First off, watch this hour-long video: Dr. Joe Bonometti explains Thorium. That should give a pretty broad overview of what is going on with the history, economics, etc., of the reactor technology, and it includes a link to Energy From Thorium. Generally, I like most of the information provided on the site, but there are several articles that I've read there that are a long way from being professional enough from an advocacy perspective. Also, there seems to be more interest in raising awareness rather than actually doing things (like getting a reactor built and used).

A group of MIT graduates at Transatomic Power have been working on liquid fluoride thorium reactor technology for a few years now and have released a whitepaper describing their design in detail. I don't know their funding situation, or what sorts of regulatory challenges they are facing, but they do have a collection of experienced advisors helping them.

Another group has been working for and advocating the design at Flibe Energy for several years now, and are targeting the military base market, which seems to offer a potential bypass of both the investment and regulatory issues that may be blocking other efforts. They've also got a collection of great advisors and top-notch founding team members. They haven't released their design, but I think it would be foolish to believe that they haven't at least looked at Transatomic Power's design for ideas and possible improvements.

There are also links and information on a dozen or so different Wikipedia pages, covering individual thorium reactors (including existing molten salt thorium reactors), companies, etc. Start from the page on Liquid fluoride thorium reactor and follow the links.

What can you do?

Unfortunately, I don't have a list of things that you can do to help in the advancement of liquid fluoride thorium reactor technology. Heck, in writing this article I'm hard-pressed to find something for *me* to do beyond just writing a general advocacy and quick overview of why I think that thorium is the future. But as a start, we can become educated about what is going on, and attempt to understand and spread the fact that there are no other technologies that could offer such a change in energy generation technologies in such a short timespan. Nothing.

Update:

Okay, thanks to Eric Snow, from the comment, I have just learned about a potentially viable fusion reactor technology. This is what could get us there too (thorium is still viable for helping to burn up old nuclear waste, so keep looking and reading). First, read this article on Gizmag, and then go here to donate. I donated last night. And then I just read this article, which was written May 31, 2014: Record setting temperatures of roughly 1.0 - 1.8 billion degrees kelvin. This could happen, people.

Wednesday, June 18, 2014

A message to (almost) everyone

Just so that I don't get caught out on this later in life, thinking that I might not get to say it later, I just wanted to tell you, Thank You. By reading these words, you have in some small way contributed to some part of me living on beyond my death. And all someone can really hope for is to be remembered in the future.

I wish that everyone would record a bit of themselves every day, and that future generations be required to read/listen to these auto-biographies as a matter of cultural heritage and learning. Because to be forgotten is such a great shame. Yet billions are already forgotten.

To the point: Thank You. By interacting with me through comments, conversations over the phone, conversations in person, arguments, discussions, anything and everything, you have helped make me who I am today. Your kindness, love, contempt, frustration, and all other such things have helped me along the way to become who I am. And honestly, I like me right now. So, thank you for helping me like me.

I have an amazing life. My wife is amazing, and my first child, Mikela, is a wonderful little girl. So sweet, playful, outgoing, ... everything a dad could want. And we are happy to report that we are expecting another little girl later this year! I am sure she will also be a wonderful little girl, especially with Mikela as an older sister. I am not sure I could be happier. :D

My work is going well, the boss-man just had a major insight into what will really drive Zeconomy's adoption, and I can't wait to build it.

I also am having an amazing burst of creativity in the last several months since starting to lift weights and run again, and I can't help but remember that I dream and am more creative when I work out regularly. On the creativity side of things, vote on some stories. On the physical side of things, I don't look bad for my age. Just need a haircut and a shave.

So yeah, thank you everyone for helping me get here, helping me be who I am. Oh... except for bullies when I was in school: you get no thanks from me.

Wednesday, April 30, 2014

Soylent and me

Like thousands of others, I saw the Soylent crowdfunding campaign last summer and the concept rang true to me. Sometimes you just need to eat. And if you are just going to eat to address your biological need to eat, it doesn't make sense to spend more time, money, and effort than absolutely necessary. Enter Soylent, which is intended to be a nutritionally-balanced meal replacement powder. How balanced?

Look at that nutrition information!

Why Soylent?

Over the last year or so, my wife and I have been fortunate enough to first have her mother stay with us to help take care of our child, then to find a nanny within a few days of the mother-in-law going home. Both my mother-in-law and our nanny are ethnic Chinese, but lived their lives in Malaysia (our nanny having come to America about 30 years ago). Also fortunately, both are very good at cooking authentic Malaysian-Chinese food (the nanny having owned and operated a Chinese restaurant for 15 years), so my wife has been very happy.

Unfortunately for me, I'm sort-of a picky eater. Which isn't to say that I don't eat many different types of food (I like at least a half-dozen dishes from just about every type of cuisine that has a restaurant in Los Angeles), it's just that I'm very opinionated about the food that I eat. And when I'm confronted with flavors that are different enough from what I enjoy eating, I acknowledge that the food is not for me, and I find something else to eat.

When you take a picky eater (me) and add authentic Malaysian-Chinese home cooking 5-6 nights a week, you get a predictable outcome: me making my own dinner. But I am also very lazy, so most nights (4-5/week) I end up making myself a peanut butter and honey sandwich. It was getting a bit old last summer when I saw the Kickstarter for Soylent, and my thoughts were: if it's any good, I've at least got something balanced enough to mix it up a bit with the PB+H sandwich.

How is it?

I received my 1-month supply of Soylent yesterday, after the pitcher and measuring scoop arrived last week. It would seem that for some reason or another I chose the "vegan" version, which differs from the non-vegan version by a lack of a supplementary oil included in the package. Not a huge deal one way or another, but great for me because it arrived yesterday as opposed to mid-May or later.

For some taste context, over the years I've bought and consumed a variety of different types of protein powders (mixed with water, watered-down fruit juice, or a watered-down sports drink), typically just before or after going to the gym. If you've ever had one of those protein shakes, you probably already know that they are pretty awful.

Long story short: Soylent is better than any protein powder I've ever eaten. (note that Soylent is not a protein powder)

Overall, the taste is fairly neutral, but still has a small tinge of what I'm going to call "whey uck". If you've had one of the whey protein powders compared with a non-whey powder, you will know what I mean. I won't say that it's a deal-breaker, perhaps being on the order of 5-10% as strong as the weakest-tasting uck I've had in a whey protein powder, but it's still there. Note: the 'whey uck' taste goes away when the drink is cold.

Texture-wise, it's a bit grittier than I would have preferred, but it's not unpleasant. Note that if you let it sit in the fridge (or anywhere else), it will start separating in a few hours, so be prepared to shake or stir before drinking if you are going to let it sit.

So far I've had two 12-ounce glasses. A glass and a half yesterday for dinner, and half a glass today with a steak+salad+glass of milk the wife decided to make for us. Surprisingly, I wasn't particularly hungry yesterday evening after drinking the Soylent (roughly 6:45PM), and I ate a typical-sized snack for me (11:45PM-midnight) before going to bed (about 1AM).

Being that I've not been eating it as my only food source, and I've only been eating it for the last 2 days, I can't say that I'm experiencing any physical or mental changes that some others have reported after long-term consumption, but it's also entirely possible that my diet (which normally includes 24-36+ ounces of milk/day) was balanced enough for my particular physiology and metabolism for it not to make a huge difference. Time will tell.

Final thoughts

As a very balanced meal replacement, it's definitely passable in terms of taste and texture. The dump + add water + shake process makes it difficult to beat in terms of preparation. Cost-wise, I'm not sure you could cover all of the nutritional bases that Soylent does with regular food at the ~$3 per meal price.

So far I've had a positive experience, and I hope that I don't grow to hate Soylent like I've grown to hate protein powders.

Update November 18, 2014:
For roughly the last 5-6 months, Soylent has actually replaced my breakfast of raisin bran + milk in the mornings during the week. I can get a little over 5 glasses of Soylent out of one package/pitcher mix, which gets me M-F on one pitcher. I know, the packaging says "consume within 48 hours", but I don't and I'm fine. It gets thicker later in the week, which gives me a little extra for my tough gym day during Friday lunch.

On the occasion where I have had cereal in the mornings since switching, I will usually get halfway through the bowl, drink the milk, and toss the remaining cereal. I don't know if it is the taste, but it feels like my body is telling my brain, "this crap isn't worth your time".

In terms of dinners, I will still sometimes have my PB + H sandwich (along with my requisite glass of milk), and even occasionally some Soylent. But the wife has started to push for a bit more variety in her diet as well, so a few nights a week I'll get something more Americanized at home and/or pick up some takeout on my way.

I opened my last box a couple weeks ago, and I've got a few more packages left, so I need to buy more. I still don't feel substantially different in my energy or otherwise that couldn't be attributed to my going to the gym consistently for the last 10+ months, and on the occasion where I don't eat Soylent in the mornings for one reason or another, replacing it with something else (leftover pizza, ham + egg + cheese sandwich from the Trimana next to my work, 20 ounces of milk, cheesy scrambled eggs, ...) doesn't leave me feeling any worse than when I eat the Soylent.

Thursday, April 17, 2014

Heartbleed and false equivalence

This morning I had the opportunity to listen to a bit of NPR, where a piece on Heartbleed was being discussed by Larry Mantle on Air Talk. You can read and download the piece: New report: The internet is too interconnected to fail. Medium story short: the segment and Larry have the apparent opinion that the way the collective technology industry handles these kinds of issues is "ineffective", comparing it to the financial meltdown and its effects. I disagree; here's why.

Heartbleed is a huge "hair on fire" situation; no competent person on the tech side of things disagrees. Easily tens of thousands, perhaps hundreds of thousands of IT professionals of one kind or another spent hours or days each dealing with the fallout of Heartbleed. The total financial cost of updating/replacing every piece of software/hardware that suffers from this bug globally is easily into the $billions, spread out over millions of organizations. Some fixes were as simple as updating and restarting a few pieces of core software, others required what can only be described as major surgery. And what is the result?

The internet is recovering pretty quickly. Unlike the financial crisis, the direct cause of Heartbleed can be narrowed down to literally a half dozen lines of code or less. That code has been fixed, in some cases the "heartbeat" feature that was the source of the bug has been removed completely, millions of copies of OpenSSL have been updated or replaced, and a variety of professionals who build safe and secure software have come together to make the situation better - to try and prevent this kind of thing from ever happening again. Are there still vulnerable systems? Yes. Will there be vulnerable systems in 6 months? Probably, just like there are still people using Windows XP with Internet Explorer 6. But thousands of technologically adept professionals are pushing as hard as they can to get as much fixed as absolutely possible.
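To give a feel for how small that "half dozen lines" really is, here is a deliberately simplified model of the bug in Python. This is not OpenSSL's code: the record layout (a 2-byte length followed by the payload), the function name, and the `checked` flag are all made up for illustration. The idea is only that the server trusted a length supplied by the sender, and the fix was to compare that length against what was actually received.

```python
import struct


def handle_heartbeat(memory: bytes, record_len: int, checked: bool = True) -> bytes:
    """Echo back the heartbeat payload stored at the start of `memory`.

    memory     -- stand-in for process memory: the received record
                  followed by unrelated (possibly secret) data
    record_len -- number of bytes that were actually received
    checked    -- True models the fixed behavior, False the vulnerable one
    """
    # The sender tells us how long the payload supposedly is.
    (claimed_len,) = struct.unpack(">H", memory[:2])

    # The fix: refuse requests that claim more payload than was sent.
    if checked and 2 + claimed_len > record_len:
        return b""

    # Without the check, this reads past the end of the record.
    return memory[2:2 + claimed_len]


# A request that sends 3 bytes of payload but claims 20.
malicious = struct.pack(">H", 20) + b"hat"
memory = malicious + b"SECRET-PRIVATE-KEY"

leaked = handle_heartbeat(memory, len(malicious), checked=False)
safe = handle_heartbeat(memory, len(malicious), checked=True)
```

With the check disabled, `leaked` contains bytes that sit after the record in memory; with it enabled, the malformed request gets nothing back.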

Thinking back to what went on after the financial meltdown, can you see any difference? I can. The biggest that I can see is that none of the fundamental causes of the meltdown have ever been addressed. Banks are still extraordinarily under-regulated (as the Libor scandal showed), banks are still too large, banks are still trying to maximize profit via borrowing at low rates from the fed, banks are still leveraging, ... The only thing that has sort-of changed is that now the commoditization of consumer loans doesn't seem to happen anymore, and credit reporting companies are now paying better attention in those situations where commoditization occurs. But all of the fundamental things that banks do to maximize profit and to grow larger are still allowed.

While Larry may see parallels, I see almost none. The technologies underlying the internet that keeps us all connected to one another are supported and developed by passionate people who have every possible reason to create and maintain a safe and secure environment for everyone. This point cannot be overstated. The people working on this do it because they know how important it is for it to be available to everyone. Some volunteer and some are underpaid. That's right, some people who design and build the core technology behind securing the majority of web servers in the world volunteer their time to make it better for everyone. If you haven't read this post from the finance guy at OpenSSL, you should.

On the other side of the coin, banks and the finance industry have no good reason to change anything, as they are still extracting immense value from the way the system is being run. Incomes among those in the finance industry are as high as they've ever been. And nothing has fundamentally changed in the finance industry to prevent anything like the meltdown from happening again.

If there is one thing to be learned from Heartbleed, it's that the global technology industry is pretty well equipped to deal with catastrophic failure. We (those of us in tech) fix the problem, prepare a variety of mitigation strategies to prevent a similar problem from happening again, and we move on to bigger and better things. We aren't done yet, but we'll get there.

Can you imagine if the finance industry was willing to fix itself to the same extent? I can't, but I'm a realist.

Thursday, March 6, 2014

Simplifying Postgres permissions

If you are anything like me, you would prefer to deal with as few headaches when developing software as absolutely possible. One recent headache that decides to revisit me occasionally is Postgres permissions. For users/platforms where you need a single user (that isn't the "postgres" user), while offering password authentication with both local and remote access, keep reading to find out how you can ensure that your permissions and table ownership are sane-ish. This is my gathering together of other information from docs, blogs, and tech articles.

First, you need to install Postgres. I'll assume you've already done that.

Next, you need to create the user that will be performing all of your operations (schema updates, data manipulation, etc.).
$ sudo -u postgres psql
# create user DBUSER with password 'PASSWORD';
# create database DATABASE;
# grant all privileges on database DATABASE to DBUSER;
# \q
$ 
After creating and updating your initial database, you now need to update your Postgres configuration to ensure that you can actually log in with your user.
$ sudo vim /etc/postgresql/9.1/main/pg_hba.conf

Scroll to the bottom of your configuration file. About 10 lines up (at least in Postgres 9.1), you should see a line that reads:

local   all             all                               peer

Change the last column to read "md5":

local   all             all                               md5

Then save and quit the editor.
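If you would rather script that edit than do it by hand, the change can be sketched in a few lines of Python. This is only a sketch under some assumptions: the function name is made up, it only touches the catch-all "local all all peer" line shown above, and it deliberately leaves any other rule alone (for example a separate peer rule for the "postgres" user, if your file has one, so that "sudo -u postgres psql" keeps working).

```python
import re

# Matches the catch-all local rule, e.g. "local   all   all   peer".
# Only spaces and tabs are allowed between fields, so a match never
# spans more than one line.
_LOCAL_PEER = re.compile(
    r"^(local[ \t]+all[ \t]+all[ \t]+)peer([ \t]*)$", re.MULTILINE
)


def peer_to_md5(hba_text: str) -> str:
    """Return the pg_hba.conf text with 'local all all peer' switched to md5.

    Column spacing is preserved; every other line is returned unchanged.
    """
    return _LOCAL_PEER.sub(r"\1md5\2", hba_text)


sample = (
    "# TYPE  DATABASE        USER            ADDRESS                 METHOD\n"
    "local   all             postgres                                peer\n"
    "local   all             all                               peer\n"
    "host    all             all             127.0.0.1/32            md5\n"
)
updated = peer_to_md5(sample)
```

To apply it for real you would read the file, pass its contents through peer_to_md5(), and write the result back as root. Keep a backup copy of the original first, since a broken pg_hba.conf can lock you out.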

After your configuration is updated, you need to restart Postgres:
$ sudo /etc/init.d/postgresql restart

From here, if you ever want to modify your schema, use data from a client, etc., you only need to log in with the user you created earlier, and everything will just work. You will be able to alter your schema as necessary, select, update, insert, and delete. In many ways, this behaves like a just-installed MySQL configuration after you've set a username/password for the root user (and only ever use the root user), but where the user *only* has access to manipulate the one database. This can be very useful when you've got your one Postgres install for personal projects, but want to silo your different projects into different named databases.

If you've already got some tables created with the Postgres user, and you need to change ownership, the accepted answer over on StackOverflow can get you all fixed up.

Long-term, you may want to create additional read/write users (no DDL updates), read-only users, etc., but if you just want to get a new application going, this can get you there.

Please upvote on Reddit and Hacker News.