Friday, June 20, 2014

Why thorium reactors are important

This post is the first in a series that I've been wanting to write for a long time, but I hadn't been able to pick from among the collection of topics I wanted to cover. After taking a quick vote on what people actually want me to write about, this one got the most votes, so you get to read about why I think that we (as a society) should make herculean efforts to get liquid fluoride thorium reactors available as quickly as possible.

Disclaimer: before you invest $1 on what I say below, think, read, and verify.

Doom and gloom

Climate change is happening: 97% of climate scientists agree. The single best thing we can do to address climate change is to substantially reduce our production of greenhouse gases, which at present come primarily from the exhaust of burning fossil fuels. The problem is that fossil fuels are the energy source for both our transport and our electric grid, the combination of which allows for the efficient movement of people and goods, and supports a modern energy-consuming society (lights, heat, computers, communication, ...).

In order to truly replace fossil fuels, we must find a power generation technology that can be deployed everywhere from small to large existing generation facilities, and that can replace the most polluting vehicular transport (large transport ships). Cars like the Tesla, and those evolved from Tesla's patents, will be available in the coming years; we can put highways underground, keep people out of them, and use automated vehicular transport. Think automated Uber. Then you get 4-6x the vehicle density per lane, and adding lanes doesn't make traffic worse. :D

This requires power generation in the 10 megawatt to 10 gigawatt range, if we are looking to cover everything up to the largest current fossil fuel plants and current-technology nuclear reactors. I'll explain later why replacing current nuclear technology is important.

Now enters a challenger

Liquid fluoride thorium reactor technology was first developed in the 1950s by a group at General Electric that was tasked with coming up with a nuclear generator that could be placed in an airplane; this was followed up by mostly the same group of people working at the Oak Ridge National Lab in the mid-to-late 60s. They ended up with a thorium-based technology that uses molten fluoride salts to carry the nuclear material (addressing partial burns, structural breakdown of fuel, ...), which allows for 96%+ burn-up of the fuel - compared to the 1-5% efficiency of the best traditional nuclear reactors available today (quoted numbers vary from 1-2% to 3-5% for current tech, so I'm using the full range). One other feature is that you can introduce other spent nuclear fuels, currently stored at hundreds of sites all over the world, and burn them too. Yep, thorium plants can recycle nuclear waste.

One convenient win is that, with proper work, there is no reason why liquid fluoride thorium reactor technology can't scale across the whole range: from smaller 10-20 megawatt reactors that can be sealed and buried for zero maintenance for a decade or more for small municipalities (though this wouldn't happen for a while; the economics need to make sense), to replacements for the horribly polluting 90 megawatt-equivalent diesel engines in large ships (we'll need more nuclear engineers), to arrays of reactors for 10+ gigawatt power generation stations near our megacities.

Why replace current reactor technology?

While "modern" reactors are generally safe (though there are several unsafe reactors currently in use around the world), the ugly part of current reactor technology is the combination of fuel efficiency and waste. There are immense amounts of energy stored in refined nuclear fuels, and with only 1-5% of that fuel being burned with current nuclear technology, you are left with a massive amount of waste that will be dangerously radioactive for the next 10,000 years or more. It is a scary problem, and thorium is a solution.

This nuclear waste storage problem is not going away. The only way we can really address the issue is to stop using current nuclear technology and find another solution. I claim that liquid fluoride thorium reactor technology is the best option we have right now, as not only can it actually generate the power we need economically (we need roughly 400 tons of thorium ore to power all of the energy needs of the US for one year, and there are 160,000 tons of ore that is economically accessible at current thorium prices), but it can burn the wastes that we need to get rid of in the long term anyway.
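Taking the quoted figures at face value, a quick back-of-envelope check shows what they imply about supply:

```python
# Back-of-envelope check of the ore figures quoted above (both are the
# post's numbers, taken at face value, not independently verified)
us_need_tons_per_year = 400      # thorium ore to cover all US energy needs for a year
accessible_ore_tons = 160_000    # economically accessible at current thorium prices

years_of_us_supply = accessible_ore_tons / us_need_tons_per_year
print(years_of_us_supply)  # 400.0
```

In other words, the known economically accessible ore alone would cover on the order of four centuries of US energy demand at current consumption.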

And the vast majority of the waste is generally considered safe in 300 years, compared to 10,000+ for typical reactors.

Why not solar/wind/hydro?

The technologies behind solar, wind, and hydro generation are all great, and wonderful, and they work. And it even turns out that the largest power generation stations in the world are hydro-electric plants. The major problem with these technologies is that they are difficult to scale. Specifically, we've more or less tapped the majority of the reasonably usable hydro power available, and the environmental fallout of flooding land is not always worth the power generated.

Solar is great (I was just looking into the cost viability of installing solar panels on my house), but it requires a large amount of land to be viable, and the cost of production is still high. Solar also has a nasty cousin, typically coal or natural gas power plants that support the grid when the sun isn't shining. When looking at thermal solar plants (instead of photovoltaics), you get the option for storing the heat for release at night, but thermal solar plants are still rare, and my research suggests that only one solar plant in the US has heat storage (in molten salt). Finally, the land and financial cost make solar difficult to scale up to utility-level requirements (currently maxing out at roughly 500 megawatts), and make it essentially unusable as a mobile power generation method for commercial transport ships (you can check out The Guardian's article on shipping pollution for why this is necessary).

Wind is also great, but there are limited locations where it makes sense, and there are political challenges with locating windmills near occupied land. It's also not a mobile technology, so is not viable for addressing the container ship problem. Wind power also has all of the same problems with consistency as solar without the simple heat storage method of thermal solar.

Ultimately, what we don't need is more "bursting" power sources that require secondary generation technologies, and which partially compromise the overall environmental benefit of using renewable sources like wind and solar.

What about fusion?

There have been recent reports that hydrogen-fusion nuclear plants are now to the point where there are obvious paths of research and development leading directly to commercially viable power generation in the next 10-20 years. This is wonderful, but it is still all theoretical. The research isn't done, so the opportunity doesn't yet exist. These designs also only make economic sense at enormous scale, which makes manufacturing the plants a difficult investment, and prevents the reactors from being mobile.

That's not to say that we couldn't end up in a future world where fusion power is available as a standard on-land reactor technology, while huge ships (shipping, navies, etc.) run thorium plants. But given the current state of technology, and how little money (relatively speaking) is being spent on thorium generator technology, there are some early and obvious wins that can be had by investing in thorium today, while the long-term benefits of fusion power are still attainable and can be developed on a parallel path.

But cars!

Cars will go electric with batteries. Think an entry-level Tesla. Also think battery swap technology for sub 5-minute recharges. We just need to make the technology cheap enough to be available to everyone. With Elon Musk releasing Tesla's patents, this is possible!

Long term destination

One of the reasons why I really like the idea of liquid fluoride thorium reactor technology is that it is the first obvious step towards reduction/elimination of fossil fuel use. That solves our short/mid-term global warming problem. But it also solves our mid-long term power problem when we finally get around to putting people on other planets in our solar system.

I know, I'm getting a few decades ahead of things here, but if we have any chance in hell of having a viable settlement on *any* other planet in this solar system (never mind in another solar system), we need a compact and efficient power generation technology whose fuel can be found on other planets. Thorium is available on the Moon, Mars, and Mercury, which are the three most likely locations for an off-world human settlement with current or soon-available technology. No, solar isn't enough (especially on Mars, which gets significantly less sunlight than we get here on Earth).

To make our lives even easier, we can actually detect whether a planet has available thorium, how much, and where it is located. And we can do this from space. Yes, we can scan for thorium from space. With current technology. Could thorium get any better?

Why isn't it done yet?

The simple answer is that insufficient time and money have been spent to make it commercially viable. There are various historical, political, and strictly human reasons for this: thorium not being usable for producing nuclear weapons material, it not being preferred by the man who decided to use uranium-based reactors in the US Navy's largest ships, and it being a different technology from the currently understood standard nuclear technology. These are all valid excuses, but that doesn't mean we can't go beyond them and make it happen.

It also doesn't mean that we can't move past these excuses and realize that the ultimate destination of a clean and readily available power source is within our grasp! Not in 50 years, but easily within 5 years for a test reactor, and 10 years for a serious commercial reactor in the gigawatt+ range. The challenges for the liquid fluoride thorium reactor design are strictly of the materials science, chemistry, regulatory, and investment kind. And guess what? The materials science and chemistry parts have been worked on for the last half-dozen+ years, and are effectively solved. Whether regulators and investors are willing to make it happen is the real question.

Some detractors will claim it's still 40-70 years out, but to them I would just point out that the original nuclear reactors didn't take 40-70 years from conception through design, test reactors, and final construction. From the first commissioning of the design for the USS Nautilus in December 1949 until it sailed under nuclear power in January 1955, barely more than 5 years of concentrated effort took it from the order to design to being built and used to travel farther and faster underwater than any submarine before.

There are also several world governments working on thorium-based liquid salt reactor technology, including the USA, China, Australia, the Czech Republic, Russia, India, and the UK. It's going to happen, and it can't happen soon enough. What is $10 billion over the next 5-10 years if it means that everyone would have access to an inexpensive, clean, life-altering electricity supply? With electricity, you have water. With water, you have food, trees, and can bring more rain for more water. It is an amazing virtuous cycle, and can start reversing the CO2 levels in our atmosphere *now*.

Which, incidentally, might very well be the most important thing that we can do if we want to avoid a global drought. It is scary stuff, and we need to reforest the planet yesterday. This gets us there globally, and will teach us what we need for when we try to terraform Mars in the next 30-40 years. If we make an effort, we could seriously be living on Mars in our lifetimes.

Where can I get more information?

First off, watch this hour-long video: Dr. Joe Bonometti explains Thorium. That should give a pretty broad overview of the history, economics, etc., of the reactor technology, and it includes a link to Energy From Thorium. Generally, I like most of the information provided on that site, but several articles I've read there are a long way from professional from an advocacy perspective. Also, there seems to be more interest in raising awareness than in actually doing things (like getting a reactor built and used).

A group of MIT graduates at Transatomic Power have been working on liquid fluoride thorium reactor technology for a few years now and have released a whitepaper describing their design in detail. I don't know their funding situation, or what sorts of regulatory challenges they are facing, but they do have a collection of experienced advisors helping them.

Another group at Flibe Energy has been working on and advocating for the design for several years now, and is targeting the military base market, which seems to offer a potential bypass of both the investment and regulatory issues that may be blocking other efforts. They've also got a collection of great advisors and top-notch founding team members. They haven't released their design, but I think it would be foolish to believe that they haven't at least looked at Transatomic Power's design for ideas and possible improvements.

There are also links and information on any of a dozen different Wikipedia pages covering individual thorium reactors, existing molten salt thorium reactors, companies, etc. Start from the page on Liquid fluoride thorium reactor and follow the links.

What can you do?

Unfortunately, I don't have a list of things that you can do to help in the advancement of liquid fluoride thorium reactor technology. Heck, in writing this article I'm hard-pressed to find something for *me* to do beyond just writing a general advocacy and quick overview of why I think that thorium is the future. But as a start, we can become educated about what is going on, and attempt to understand and spread the fact that there are no other technologies that could offer such a change in energy generation technologies in such a short timespan. Nothing.


Okay, thanks to Eric Snow, from the comment, I have just learned about a potentially viable fusion reactor technology. This is what could get us there too (thorium is still viable for helping to burn up old nuclear waste, so keep looking and reading). First, read this article on Gizmag, and then go here to donate. And then read this article, written May 31, 2014, on record-setting temperatures of roughly 1.0-1.8 billion kelvin. This could happen, people. I donated last night.

Wednesday, June 18, 2014

A message to (almost) everyone

Just so that I don't get caught out on this later in life, thinking that I might not get to say it later, I just wanted to tell you, Thank You. By reading these words, you have in some small way contributed to some part of me living on beyond my death. And all someone can really hope for is to be remembered in the future.

I wish that everyone would record a bit of themselves every day, and that future generations be required to read/listen to these auto-biographies as a matter of cultural heritage and learning. Because to be forgotten is such a great shame. Yet billions are already forgotten.

To the point: Thank You. By interacting with me through comments, conversations over the phone, conversations in person, arguments, discussions, anything and everything, you have helped make me who I am today. Your kindness, love, contempt, frustration, and all other such things have helped me along the way to become who I am. And honestly, I like me right now. So, thank you for helping me like me.

I have an amazing life. My wife is amazing, and my first child, Mikela, is a wonderful little girl. So sweet, playful, outgoing, ... everything a dad could want. And we are happy to report that we are expecting another little girl later this year! I am sure she will also be a wonderful little girl, especially with Mikela as an older sister. I am not sure I could be happier. :D

My work is going well, the boss-man just had a major insight into what will really drive Zeconomy's adoption, and I can't wait to build it.

I also am having an amazing burst of creativity in the last several months since starting to lift weights and run again, and I can't help but remember that I dream and am more creative when I work out regularly. On the creativity side of things, vote on some stories. On the physical side of things, I don't look bad for my age. Just need a haircut and a shave.

So yeah, thank you everyone for helping me get here, helping me be who I am. Oh... except for bullies when I was in school: you get no thanks from me.

Wednesday, April 30, 2014

Soylent and me

Like thousands of others, I saw the Soylent crowdfunding campaign last summer, and the concept rang true to me. Sometimes you just need to eat. And if you are just going to eat to address your biological need to eat, it doesn't make sense to spend more time, money, and effort than absolutely necessary. Enter Soylent, which is intended to be a nutritionally-balanced meal replacement powder. How balanced?

Look at that nutrition information!

Why Soylent?

Over the last year or so, my wife and I have been fortunate enough first to have her mother stay with us to help take care of our child, and then to find a nanny within a few days of my mother-in-law going home. Both my mother-in-law and our nanny are ethnic Chinese who lived their lives in Malaysia (our nanny having come to America about 30 years ago). Also fortunately, both are very good at cooking authentic Malaysian-Chinese food (the nanny having owned and operated a Chinese restaurant for 15 years), so my wife has been very happy.

Unfortunately for me, I'm sort-of a picky eater. Which isn't to say that I don't eat many different types of food (I like at least a half-dozen dishes from just about every type of cuisine that has a restaurant in Los Angeles), it's just that I'm very opinionated about the food that I eat. And when I'm confronted with flavors that are different enough from what I enjoy eating, I acknowledge that the food is not for me, and I find something else to eat.

When you take a picky eater (me) and add authentic Malaysian-Chinese home cooking 5-6 nights a week, you get a predictable outcome: me making my own dinner. But I am also very lazy, so most nights (4-5/week) I end up making myself a peanut butter and honey sandwich. It was getting a bit old last summer when I saw the Kickstarter for Soylent, and my thoughts were: if it's any good, I've at least got something balanced enough to mix it up a bit with the PB+H sandwich.

How is it?

I received my 1-month supply of Soylent yesterday, after the pitcher and measuring scoop arrived last week. It would seem that for some reason or another I chose the "vegan" version, which differs from the non-vegan version by a lack of a supplementary oil included in the package. Not a huge deal one way or another, but great for me because it arrived yesterday as opposed to mid-May or later.

For some taste context, over the years I've bought and consumed a variety of different types of protein powders (mixed with water, watered-down fruit juice, or a watered-down sports drink), typically just before or after going to the gym. If you've ever had one of those protein shakes, you probably already know that they are pretty awful.

Long story short: Soylent is better than any protein powder I've ever eaten. (Note that Soylent is not a protein powder.)

Overall, the taste is fairly neutral, but still has a small tinge of what I'm going to call "whey uck". If you've had one of the whey protein powders compared with a non-whey powder, you will know what I mean. I won't say that it's a deal-breaker, perhaps being on the order of 5-10% as strong as the weakest-tasting uck I've had in a whey protein powder, but it's still there.

Texture-wise, it's a bit grittier than I would have preferred, but it's not unpleasant. Note that if you let it sit in the fridge (or anywhere else), it will start separating fairly quickly, so be prepared to shake or stir before drinking if you are going to let it sit.

So far I've had two 12-ounce glasses. A glass and a half yesterday for dinner, and half a glass today with a steak+salad+glass of milk the wife decided to make for us. Surprisingly, I wasn't particularly hungry yesterday evening after drinking the Soylent (roughly 6:45PM), and I ate a typical-sized snack for me (11:45PM-midnight) before going to bed (about 1AM).

Since I haven't been eating it as my only food source, and I've only been eating it for the last 2 days, I can't say that I'm experiencing any physical or mental changes that some others have reported after long-term consumption. It's also entirely possible that my diet (which normally includes 24-36+ ounces of milk/day) was balanced enough for my particular physiology and metabolism for it not to make a huge difference. Time will tell.

Final thoughts

As a very balanced meal replacement, it's definitely passable in terms of taste and texture. The dump + add water + shake process makes it difficult to beat in terms of preparation. Cost-wise, I'm not sure you could cover all of the nutritional bases that Soylent does with regular food at the ~$3 per meal price.

So far I've had a positive experience, and I hope that I don't grow to hate Soylent like I've grown to hate protein powders.

Thursday, April 17, 2014

Heartbleed and false equivalence

This morning I had the opportunity to listen to a bit of NPR, where a piece on Heartbleed was being discussed by Larry Mantle on AirTalk. You can read and download the piece: New report: The internet is too interconnected to fail. Medium story short: the segment and Larry have the apparent opinion that the way the collective technology industry handles these kinds of issues is "ineffective", comparing it to the financial meltdown and its effects. I disagree; here's why.

Heartbleed is a huge "hair on fire" situation; no competent person on the tech side of things disagrees. Easily tens of thousands, perhaps hundreds of thousands of IT professionals of one kind or another spent hours or days each dealing with the fallout of Heartbleed. The total financial cost of updating/replacing every piece of software/hardware that suffers from this bug globally is easily in the billions of dollars, spread out over millions of organizations. Some fixes were as simple as updating and restarting a few pieces of core software; others required what can only be described as major surgery. And what is the result?

The internet is recovering pretty quickly. Unlike the financial crisis, the direct cause of Heartbleed can be narrowed down to literally a half dozen lines of code or less. That code has been fixed; in some cases the "heartbeat" feature that was the source of the bug has been removed entirely; millions of copies of OpenSSL have been updated or replaced; and a variety of professionals who build safe and secure software have come together to make the situation better - to try to prevent this kind of thing from ever happening again. Are there still vulnerable systems? Yes. Will there be vulnerable systems in 6 months? Probably, just like there are still people using Windows XP with Internet Explorer 6. But thousands of technologically adept professionals are pushing as hard as they can to get as much fixed as absolutely possible.

Thinking back to what went on after the financial meltdown, can you see any difference? I can. The biggest that I can see is that none of the fundamental causes of the meltdown have ever been addressed. Banks are still extraordinarily under-regulated (as the Libor scandal showed), banks are still too large, banks are still trying to maximize profit via borrowing at low rates from the fed, banks are still leveraging, ... The only thing that has sort-of changed is that now the commoditization of consumer loans doesn't seem to happen anymore, and credit reporting companies are now paying better attention in those situations where commoditization occurs. But all of the fundamental things that banks do to maximize profit and to grow larger are still allowed.

While Larry may see parallels, I see almost none. The technologies underlying the internet that keep us all connected to one another are supported and developed by passionate people who have every possible reason to create and maintain a safe and secure environment for everyone. This point cannot be overstated. The people working on this do it because they know how important it is for it to be available to everyone. Some volunteer and some are underpaid. That's right: some of the people who design and build the core technology behind securing the majority of web servers in the world volunteer their time to make it better for everyone. If you haven't read this post from the finance guy at OpenSSL, you should.

On the other side of the coin, banks and the finance industry have no good reason to change anything, as they are still extracting immense value from the way the system is being run. Incomes among those in the finance industry are as high as they've ever been. And nothing has fundamentally changed in the finance industry to prevent anything like the meltdown from happening again.

If there is one thing to be learned from Heartbleed, it's that the global technology industry is pretty well equipped to deal with catastrophic failure. We (those of us in tech) fix the problem, prepare a variety of mitigation strategies to prevent a similar problem from happening again, and we move on to bigger and better things. We aren't done yet, but we'll get there.

Can you imagine if the finance industry was willing to fix itself to the same extent? I can't, but I'm a realist.

Thursday, March 6, 2014

Simplifying Postgres permissions

If you are anything like me, you would prefer to deal with as few headaches when developing software as absolutely possible. One recent headache that decides to revisit me occasionally is Postgres permissions. For users/platforms where you need a single user (that isn't the "postgres" user), while offering password authentication with both local and remote access, keep reading to find out how you can ensure that your permissions and table ownership are sane-ish. This is my gathering together of other information from docs, blogs, and tech articles.

First, you need to install Postgres. I'll assume you've already done that.

Next, you need to create the user that will be performing all of your operations (schema updates, data manipulation, etc.).
$ sudo -u postgres psql
# create user DBUSER with password 'PASSWORD';
# create database DATABASE;
# grant all privileges on database DATABASE to DBUSER;
# \q
After creating and updating your initial database, you now need to update your Postgres configuration to ensure that you can actually log in with your user.
$ sudo vim /etc/postgresql/9.1/main/pg_hba.conf

Scroll to the bottom of your configuration file. About 10 lines up (at least in Postgres 9.1), you should see a line that reads:

local   all             all                               peer

Change the last column to read "md5":

local   all             all                               md5

Then save and quit the editor.

After your configuration is updated, you need to restart Postgres:
$ sudo /etc/init.d/postgresql restart

From here, if you ever want to modify your schema, use data from a client, etc., you only need to log in with the user you created earlier, and everything will just work. You will be able to alter your schema as necessary, select, update, insert, and delete. In a lot of ways, this behaves a lot like a just-installed MySQL configuration after you've set a username/password for the root user (and only ever use the root user), but where the user *only* has access to manipulate the one database. This can be very useful when you've got your one Postgres install for personal projects, but want to silo your different projects into different named databases.

If you've already got some tables created with the Postgres user, and you need to change ownership, the accepted answer over on StackOverflow can get you all fixed up.

Long-term, you may want to create additional read/write users (no DDL updates), read-only users, etc., but if you just want to get a new application going, this can get you there.
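As a sketch of what those longer-term users might look like, the standard Postgres GRANT machinery covers it; the names below (readonly_user, DATABASE) are placeholders, not anything created earlier in this post:

```sql
-- Hypothetical read-only user; run these as the postgres superuser.
CREATE USER readonly_user WITH PASSWORD 'PASSWORD';
GRANT CONNECT ON DATABASE "DATABASE" TO readonly_user;
GRANT USAGE ON SCHEMA public TO readonly_user;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly_user;
-- Also cover tables created in the future, not just existing ones:
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO readonly_user;
```

Without the ALTER DEFAULT PRIVILEGES line, any table you create later would be invisible to the read-only user until you re-ran the GRANT SELECT.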

Please upvote on Reddit and Hacker News.

Friday, December 6, 2013

Properties on Python modules

Python properties let programmers abstract away get, set, and delete methods using simple attribute access without exposing getValue(), setValue(), and delValue() methods to the user. Normally, properties can only be added to classes as either an instance-level property, or a class-level property (few people use class-level properties; I've only used them once, and I had to build a ClassProperty object to have it). But in this post, you will find out how to create module-level properties, and where to get a package that offers transparent module-level property creation and use from Python 2.4 all the way up to early versions of 3.4.

Why do I want module-level properties in Python?

For many people, the desire to have module-level properties boils down to flexibility. In this case, I came up with this solution while talking with Giampaolo Rodola (author of pyftpdlib and psutil, current maintainer of asyncore and related Python libraries, ...) and hearing about psutil's use of module-level constants. Now, module-level constants aren't usually a big deal, but in this case, some of psutil's constants are actually relatively expensive to compute - expensive enough that Giampaolo was planning on deprecating the constants, deferring computation until the library user explicitly called the relevant compute_value() functions.

But with module properties, Giampaolo can defer calculation of those values until a program accesses the module's attributes, at which point the values can be computed (and cached as necessary). Even more useful is the fact that if you don't use those attributes, you don't need to compute them, so most people will get a convenient but unexpected performance improvement any time they import psutil.

What doesn't work and why

The first time someone wants to use a module property, they will try to decorate a function in their module with the property() decorator, as they are used to doing on methods of classes. Unfortunately, when trying to access that "property", they discover that none of the descriptor magic was applied to it, and they are left with a property object that doesn't do anything particularly useful.

The reason it doesn't do anything useful is because properties are attached to classes, not instances. And during import/execution of a module, your definitions are being executed in the context of an instance dictionary of the module with no substantive post-processing. On the other hand, typical Python class definition results in the body of the class being executed, then the results passed to type() (via the type(name, bases, dict) form) for class creation.
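A tiny sketch makes the distinction concrete: the same property object is inert on its own, but springs to life once it lives on a class and is accessed through an instance.

```python
# A property object on its own is inert; the descriptor protocol only
# fires when the property is an attribute of a class and is accessed
# through an instance of that class.
prop = property(lambda self: "computed")

class OnClass(object):
    attr = prop  # now the descriptor machinery applies

standalone = prop           # just a property object, no magic
via_class = OnClass().attr  # descriptor __get__ is invoked here

print(isinstance(standalone, property))  # True
print(via_class)                         # computed
```

This is exactly why decorating a module-level function with @property appears to do nothing: the module body is executed into a plain dict, and no class ever claims the property.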

Making it work

Someone who knows a bit more about how Python's internals are put together knows that you can muck with the contents of sys.modules, and that doing so during module import will let you replace the module object itself. So that's what we are going to do. Along the way, we're going to be doing a bit of deep magic, so don't be scared if you see something that you don't quite understand.

There are 5 major steps to make module properties work:
  1. Define your property
  2. Create a new type to offer unique properties for the module
  3. Ensure that the replacement module has access to the module namespace
  4. Fix up the module namespace and handle property definitions
  5. Replace the module in sys.modules

Our first two steps are easy. We can simply use the standard @property decorator (which we'll make work later) to create a property, and we define an empty class that subclasses object.

@property
def module_property(module):
    return "I work!", module

class Module(object):
    pass

Our third step is also easy: we just need to instantiate our replacement module and replace its __dict__ with the globals from the module we are replacing.

module = Module()
module.__dict__ = globals()

Our fourth step also isn't all that difficult: we just need to go through the module's globals and extract any properties that are defined. Generally speaking, we really want to pull out *any* descriptors, not just properties, but for this version we'll extract only property instances.

for k, v in list(module.__dict__.items()):
    if isinstance(v, property):
        setattr(Module, k, v)
        del module.__dict__[k]

Note that when we move the properties from the module globals to the replacement module, we have to assign to the replacement module class, not the instance of the replacement module. Generally speaking, this kind of class-level function/descriptor assignment is frowned upon, but in some cases (like this), it is necessary in order to get the functionality that we want.
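A quick demonstration of why the assignment has to target the class rather than the instance:

```python
# Descriptors (including properties) are only invoked when found on the
# class; stored in an instance's __dict__, a property is inert.
class C(object):
    pass

c = C()
c.on_instance = property(lambda self: 42)   # inert attribute
C.on_class = property(lambda self: 42)      # descriptor protocol applies

print(type(c.on_instance))   # <class 'property'>
print(c.on_class)            # 42
```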

And our final step is actually pretty easy, but we have to remember to keep a reference to the original module, as standard module destruction includes the resetting of all values in the module to be equal to None.

module._module = sys.modules[module.__name__]
module._pmodule = module
sys.modules[module.__name__] = module

And that is it. If you copy and paste all of the above code into a module with all of your module properties defined before our fourth step executes, then after the module is imported you will be able to reference any of your defined properties as attributes of the module. Note that if you want to access the properties from within the module, you need to reference them from the _pmodule global we injected.
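Putting the five steps together, the whole recipe fits in one screenful. This is the code from the steps above, assembled; save it as its own module, import it, and access module_property as an attribute of the imported module:

```python
import sys

# Step 1: define the property
@property
def module_property(module):
    return "I work!", module

# Step 2: the replacement module type
class Module(object):
    pass

# Step 3: instantiate it and adopt the module's namespace
module = Module()
module.__dict__ = globals()

# Step 4: move property instances from the globals onto the class
for k, v in list(module.__dict__.items()):
    if isinstance(v, property):
        setattr(Module, k, v)
        del module.__dict__[k]

# Step 5: keep the original module alive, then replace it in sys.modules
module._module = sys.modules[module.__name__]
module._pmodule = module
sys.modules[module.__name__] = module

print(module.module_property[0])   # I work!
```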

Where can I get a pre-packaged copy of this magic?

To save you (and me) from needing to copy/paste the above into every module that wants module properties, I've gone ahead and built a Python package for module properties. You can find it on GitHub, or on the Python Package Index. Usage is very similar to what I defined above:

@property
def module_property(module):
    return "I work!", module

# after all properties are defined (put this at the end of the file)
import mprop; mprop.init()

Alternatively, if you don't want to remember to throw an mprop.init() call at the end, I've got a property work-alike that handles all of the magic:

from mprop import mproperty

@mproperty
def module_property(module):
    return "I also work!", module

And that's it. Module properties in Python. Enjoy :)

Hacker news thread here. Reddit thread here.

Thursday, October 24, 2013

Multi-column (SQL-like) sorting in Redis

Recently, I received an email from a wayward Redis user asking about using Redis Sets and Sorted Sets to sort multiple columns of data, with as close as possible to the same semantics as a traditional SQL-style "order by" clause. Well, it is possible, with limitations; keep reading to find out how.

What is Redis?

For those people who don't quite know what Redis is already, the TLDR version is: an in-memory data structure server that maps from string keys to one of 5 different data structures, providing high-speed remote access to shared data, and optional on-disk persistence. In a lot of ways, you can think of Redis like a version of Memcached where your data doesn't disappear if your machine restarts, and which supports a wider array of commands to store, retrieve, and manipulate data in different ways.

The setup

With that out of the way, our intrepid Redis user had come to me with a pretty reasonable problem: he needed to build an application to display a listing of businesses, sorted by several criteria. In his case, he had "price", "distance[1]", and "rating". As we have all seen in recent years with individual retailer searches, never mind restaurant searches on Yelp and similar applications, when searching for something in the physical world there are a few things you care about most. These usually break down, in order of preference, as lowest distance, lowest price, highest rating. In a relational database/SQL world, these fields would all be columns in a table (or spread out over several tables, or calculated in real-time), so we are going to refer to them as "sort columns" from here on.

Now, depending on preferences, you can sometimes get column preference and ascending/descending changes, which is why we need to build a system that can support reordering columns *and* switching the order of each individual column. Say that we really want the highest rating, lowest distance, lowest price? We need to support that too, and we can.

The concept

Because we are dealing with sort orders, we have two options. We can either use the Redis SORT command, or we can use sorted sets. There are ways of building this using the SORT command, but it is much more complicated and requires quite a bit of precomputation, so we'll instead use sorted sets.

We will start by making sure that every business has an entry in each of 3 different sorted sets representing price, distance, and rating. If a business has an "id" of 5, has a price of 20, distance of 4, and a rating of 8, then at some point the commands "ZADD price 20 5", "ZADD distance 4 5", and "ZADD rating 8 5" will have been called.

Once all of our data is in Redis, we then need to determine the maximum value of each of our sort columns. If you have ranges that you know are fixed, like say that you know that all of your prices and distance will all be 0 to 100, and your rating will always be 0 to 10, then you can save yourself a round-trip. We'll go ahead and build this assuming that you don't know your ranges in advance.

We are trying to gather our data range information in advance in order to carve up the 53 bits of precision [2] available in the floating-point doubles that are available in the sorted set scores. If we know our data ranges, then we know how many "bits" to dedicate to each column, and we know whether we can actually sort our data exactly, without losing precision.

If you remember our price, distance, and rating information, you can imagine that (borrowing our earlier data) if we have price=20, distance=4, rating=8, and we want to sort by price, distance, -rating, we want to construct a "score" that will sort the same as the "tuple" comparison (20, 4, -8). By gathering range information, we could (for example) translate that tuple into a score of "20042", which you can see is basically the concatenation of "20", "04", and 10-8 (we subtract from 10 here because the rating column is reversed, and it helps to understand how we got the values).
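The "20042" packing above can be sketched numerically; the digit widths (two for distance, one for the reversed rating) are assumptions from the example ranges:

```python
# Pack (price, distance, reversed rating) into a single sortable number,
# assuming distance fits in 0-99 and rating in 0-10.
def pack_score(price, distance, rating):
    return (price * 100 + distance) * 10 + (10 - rating)

print(pack_score(20, 4, 8))                          # 20042
# A higher rating packs to a lower score, so it sorts first ascending:
print(pack_score(20, 4, 9) < pack_score(20, 4, 8))   # True
```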

Note: because of our construction, scores that are not whole numbers may not produce completely correct sorts.

The code

Stepping away from the abstract and into actual code, we are going to perform computationally what I just did above with string manipulation. We are going to numerically shift our data into columns, accounting for the magnitude of the data, as well as for negative values in the columns (which won't affect our results). As a bonus, this method will even tell you if it believes that you could get a lower-quality sort because your data range is too wide[3].

import math
import warnings

def sort_zset_cols(conn, result_key, sort=('dist', 'price', '-score')):
    current_multiplier = 1
    keys = {result_key: 0}
    # Make a list so the sequence can be iterated over twice
    sort = list(reversed(sort))

    # Gets the max/min values in a sort column
    pipe = conn.pipeline(True)
    for sort_col in sort:
        pipe.zrange(sort_col.lstrip('-'), 0, 0, withscores=True)
        pipe.zrange(sort_col.lstrip('-'), -1, -1, withscores=True)
    ranges = pipe.execute()

    for i, sort_col in enumerate(sort):
        # Auto-scaling for negative values
        low, high = ranges[i*2][0][1], ranges[i*2+1][0][1]
        maxv = int(math.ceil(high - low if low < 0 else high))

        # Adjusts the weights based on the magnitude and sort order of the
        # column
        old_multiplier = current_multiplier
        desc = sort_col.startswith('-')
        sort_col = sort_col.lstrip('-')
        current_multiplier *= maxv

        # Assign the sort key a weight based on all of the lower-priority
        # sort columns
        keys[sort_col] = -old_multiplier if desc else old_multiplier

    if current_multiplier >= 2**53:
        warnings.warn("The total range of your values is outside the "
            "available score precision, and your sort may not be precise")

    # The sort results are available in the passed result_key
    return conn.zinterstore(result_key, keys)

If you prefer to check the code out on GitHub, here is the gist. Two notes about this code:
  • If the maximum or minimum values in any of the indexed columns become more extreme between the data range check and the actual query execution, some entries may have an incorrect ordering (this can be fixed by translating the above to Lua and using the Lua scripting support available in Redis 2.6 and later)
  • If any of your data is missing in any of the indexes, then that entry will not appear in the results
Within the next few weeks, I'll be adding this functionality to rom, my Python Redis object mapper.

Interested in more tips and tricks with Redis? My book, Redis in Action (Amazon link), has dozens of other examples for new and seasoned users alike.

[1] For most applications, the distance criteria is something that would need to be computed on a per-query basis, and our questioning developer already built that part, so we'll assume that is available already.
[2] Except for certain types of extreme-valued doubles, you get 52 bits of actual precision, and 1 bit of implied precision. We'll operate under the assumption that we'll always be within the standard range, so we'll always get the full 53 bits.
[3] There are ways of adjusting the precision of certain columns of the data (by scaling values), but that can (and very likely would) result in scores with fractional components, which may break our sort order (as mentioned in the notes).