Archive for the ‘Biology’ Category

The Expert Engineer’s Fallacy

Wednesday, September 14th, 2011

Today’s news was abuzz with yet another Google attempt to engineer a better Internet, this time in the form of an alternative to JavaScript. There’s been some interesting talk about how Google is doing just what Microsoft used to do, namely proposing its own technologies as better alternatives to current practice. One argument I saw basically says:

  • Google is attempting to replace all major web technologies with alternatives developed in-house.
  • Microsoft attempted to replace all major web technologies with alternatives developed in-house.
  • Microsoft is (or at least was) evil.
  • Google is now becoming evil.

(Sorry, can’t remember where I saw this.)  I think there is actually some truth to this argument, but it has nothing to do with being “evil.”  Rather, it is simply a symptom of having too many very good engineers all working together in the same environment.

Engineers always want to make better pieces of technology.  The better the engineer, the greater their ambition.  Get a bunch of them together in the same place, and they’ll try to remake the world in their image.

Unfortunately, at a certain scale their efforts are almost guaranteed to fail.  It isn’t because they aren’t good.  It is because they don’t know their limits.  And their limits are also simple to explain, but hard for many to believe.

We cannot engineer novel complex systems.  Full stop.

If something is very complicated and you’ve built one before, you can copy the past design and do a reasonable job – as long as you don’t change too many things.  If you are engineering something on a smaller scale, it is possible to design it from scratch and have it be quite good.  But once a system gets complicated enough, nobody understands the whole problem.  And if you don’t understand the whole problem, you can’t engineer a solution.

But we build big systems all the time, right?  Yes, we do, but what we do is engineer parts and then put them together in a process that is, more or less, trial and error.  We try things, we see what happens, and inevitably tweak/hack things to get them to work.

Google’s effort with Dash may succeed on its own in some niche.  But JavaScript as a platform is simply too big to just be replaced.  It is big and complex and solves more problems than any human – or even any moderately sized group of humans – understands.  We can compete with it, but we won’t be able to just engineer it away.  The rules of evolution now apply, not those of design.

Google is doing what Microsoft tried before simply because they are both places with engineers who don’t know their limits.  Well, Microsoft now knows its limits – it hit a wall, and is now a much more humble company.  Eventually this will happen to Google as well.

Such is the way of evolution.

Statistical Mirages

Wednesday, March 3rd, 2010

I have to admit that I’ve always been suspicious of statistics as they are used in computer security.  Oddly enough, I also should have been suspicious of statistics in other sciences as well.  Turns out that when you examine large datasets with lots of tests, simply by random chance you are likely to find something, i.e., one of the tests is likely to come up with something “significant.”  Hence you get results that suggest breathing in bus exhaust is good for you.  In other words, you get false positives.
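To put a rough number on this effect, here is a minimal sketch in Python. It assumes the tests are independent, that no real effect exists in any of them, and that each uses the conventional 0.05 significance threshold; none of these assumptions come from a particular study, and the function names are made up for illustration. Under those assumptions the chance of at least one false positive among m tests is 1 - (1 - alpha)^m, and a small simulation can be used to check the formula.

```python
import random


def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of num_tests independent tests looks
    'significant' when there is no real effect in any of them."""
    return 1.0 - (1.0 - alpha) ** num_tests


def simulate(num_tests: int, alpha: float = 0.05,
             trials: int = 10_000, seed: int = 0) -> float:
    """Estimate the same probability by simulation.

    Assumption: with no real effect, each test's p-value is uniform
    on [0, 1], so a single random draw stands in for one test.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if any(rng.random() < alpha for _ in range(num_tests)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for m in (1, 10, 50, 100):
        print(m, round(familywise_error_rate(m), 3), round(simulate(m), 3))
```

Under these assumptions, ten tests already give about a 40% chance of at least one spurious hit, and fifty tests give about 92%. Real tests on one dataset are rarely independent, so treat the numbers as an illustration of the trend rather than a prediction.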

If anomaly detection is ever to become a fundamental defense technology, it will have to move beyond statistics to being grounded in the mechanisms of computers and the real behaviors of users.  This is going to take a while, because this is a lot harder than just running a bunch of tests on datasets.  Of course, given the current disrepute of anomaly detection in security circles, perhaps the door is wide open for better approaches.

RIGorous Cell-level Intrusion Detection

Monday, January 18th, 2010

When looking to biology for inspiration on computer security, it is natural to first look at mammalian immune systems.  While they are amazingly effective, they are also mind-numbingly complex.  As a result, many researchers get seduced by the immune system’s architecture when there is so much to learn from its lower-level workings.

Case in point: every cell in the human body can detect many kinds of viral infection on its own, i.e., with no assistance from the cells of the immune system.  As this recent article from Science shows, we are still far from understanding how such mechanisms actually work.  My high-level take on this article, as a computer security researcher, is that:

  • Basically all cells in mammals (and, I think, most animals in general) can generate immune system signals that trigger responses from internal and external mechanisms.  A key source for such signals is foreign RNA (code) inside the cytoplasm of a cell.  Of course, there is a lot of other, “self”-RNA in that cytoplasm as well – so how does the cell tell the difference between them?
  • A key heuristic is that native RNA is only copied in the nucleus of a cell; RNA-based viruses, however, need to make RNA copies in the cytoplasm (that’s where they end up after getting injected and it isn’t easy to get into the nucleus – code basically only goes out, it hardly ever goes in).  RNA polymerases (RNA copiers) all use the same basic patterns to mark where copying should start.  Receptors such as RIG-I detect RNA with “copy me” signals (5′-PPP) in places where no copying should occur (the cytoplasm).
  • Of course, this is biology, so the picture isn’t so clear-cut.  A simple “copy me” signal won’t trigger a response; there must also be some base pairing – the RNA molecule must fold back on itself or be bound to another (partially complementary) RNA molecule.  I’d guess this additional constraint is there because normal messenger RNA is strictly single-stranded.  (Indeed, kinks or pairing in messenger RNA are bad in general because they’ll interfere with the creation of proteins.)
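For readers who think in code, the heuristic in the list above can be written as a toy rule. This is only a sketch of my reading of it, not a model of how RIG-I actually works: the `RnaMolecule` type, its fields, and `looks_foreign` are all hypothetical names, and real receptors respond to more conditions than the three shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RnaMolecule:
    """Hypothetical, highly simplified description of one RNA molecule."""
    location: str           # "nucleus" or "cytoplasm"
    has_5ppp: bool          # carries the "copy me" start marker (5'-PPP)
    has_base_pairing: bool  # folded back on itself or bound to another RNA


def looks_foreign(rna: RnaMolecule) -> bool:
    """Toy version of the rule described above: flag RNA only when a
    'copy me' marker shows up where no copying should occur AND the
    molecule shows some base pairing. All three conditions must hold."""
    return (
        rna.location == "cytoplasm"
        and rna.has_5ppp
        and rna.has_base_pairing
    )


if __name__ == "__main__":
    messenger = RnaMolecule("cytoplasm", has_5ppp=False, has_base_pairing=False)
    viral_copy = RnaMolecule("cytoplasm", has_5ppp=True, has_base_pairing=True)
    print(looks_foreign(messenger), looks_foreign(viral_copy))
```

The point of writing it as a conjunction is that each extra condition removes a class of false alarms: a marker in the nucleus is normal, and an unpaired molecule in the cytoplasm is not flagged by this rule.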

Of course, all of this is partial information – there’s evidence that these foreign RNA-detecting molecules (the RLR family) trigger only under additional constraints.  This doesn’t surprise me either, as this mechanism must operate with extremely low false positives; one or two matching rules aren’t up to the task given the complexity of cellular machinery and (more importantly) given the evolution of viruses to circumvent these protections.  Viruses have evolved ways to shut down or suppress RLR-related receptors.  Although cells will be pushed to evolve anti-circumvention mechanisms, in practice this is limited in the cellular environment – make the detectors too sensitive and a cell will kill itself spontaneously!  The solution has been to keep a circumventable but highly accurate detector in place; the arms race instead has moved to optimizing the larger immune system.

I leave any conclusions regarding the design of computer defenses as an exercise for the reader. :-)

