Dynamic languages, the universe and everything


From Planet Perl I somehow ended up at a transcript of a talk about dynamic languages. It just so happens that at the same time I was reading the paper Eudaemon: Involuntary and On-Demand Emulation Against Zero-Day Exploits.

The paper is an extension of the Argos project, which tries to identify zero-days by correlating network traffic with executed code. Their basic idea is that if code derived from network traffic gets executed, that is a sign of a zero-day. While it is an interesting idea, and it is remarkable that they got it to work at a reasonable speed, in my opinion it has very limited use, since:

  • People holding undisclosed vulnerabilities will use them in targeted attacks (hence the probability of one hitting a honeypot is fairly small)
  • Most vulnerabilities these days are exploited by pull rather than push (the victim's browser has to pull the malicious content, instead of the attacker pushing it to an open port; firewalls did have some effect on the ecosystem). This means that you have to visit the right URLs at the right time with the right browser (although these criteria are relatively easy to satisfy: just use IE).
  • The method can have false-positive issues with technologies that use JIT compilation to speed up execution (Java, and, if I recall correctly, the next version of the Flash player and even some JavaScript interpreters), since they legitimately execute code generated from network data

This analysis is done using Dynamic Code Translation (DCT): disassembling the code about to be executed and generating equivalent code that includes instrumentation. The difference between the Eudaemon and Argos projects is that the former applies the method on demand to individual processes running inside an OS, while Argos virtualizes the whole machine.
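To make this concrete, here is a toy sketch in C of what such instrumentation amounts to. The mini-API and names below are invented for illustration (Argos/Eudaemon operate on real x86 code, not this toy): every byte of memory gets a shadow taint bit, bytes arriving from the network are marked tainted, and the translator-emitted accessors propagate taint and check, at each control transfer, whether the jump target was derived from network data.

```c
/* Toy illustration of DCT-style taint tracking, in the spirit of
 * Argos/Eudaemon (not their actual implementation). */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MEM_SIZE 256

static unsigned char mem[MEM_SIZE];   /* guest memory        */
static unsigned char taint[MEM_SIZE]; /* shadow taint bitmap */

/* Pretend this buffer just came in off the wire: mark it tainted. */
static void recv_from_network(int addr, const unsigned char *buf, int len)
{
    memcpy(&mem[addr], buf, len);
    memset(&taint[addr], 1, len);
}

/* Instrumented accessors -- the code the translator would emit
 * instead of raw loads/stores. Taint travels along with the data. */
static unsigned char load(int addr, int *t)
{
    *t = taint[addr];
    return mem[addr];
}

static void store(int addr, unsigned char v, int t)
{
    mem[addr] = v;
    taint[addr] = t;
}

/* The check the instrumentation adds before control flow: alert if
 * the jump target value was computed from network data. */
static void check_jump(int target, int t)
{
    if (t) {
        fprintf(stderr, "ALERT: jump to %#x derived from network data\n",
                target);
        exit(1);
    }
}

int main(void)
{
    unsigned char payload[] = { 0x80 }; /* attacker-controlled byte */
    int t;

    recv_from_network(16, payload, sizeof payload);

    /* Emulate guest code: copy the byte around, then use it as a
     * jump target. The copy keeps the taint, so the check fires. */
    unsigned char v = load(16, &t);
    store(32, v, t);
    unsigned char target = load(32, &t);
    check_jump(target, t);              /* alert triggers here */
    return 0;
}
```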

This (and the point from the talk that dynamic languages can be made fast) got me thinking: what would be the advantages of an OS built entirely on a VM? One would be verifiability: given the small(ish) size of the VM, one could audit it, or even use mathematical proofs to show that certain things, such as one process writing to the address space of another, cannot happen. Another would be the possibility of rapid change: OSs wouldn't have to wait for CPU vendors to include hardware support for virtualization, for example, since everything could be changed by simply changing the VM.
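To illustrate what I mean by the isolation being provable, here is a hypothetical sketch (not any real OS design): if the VM funnels every guest store through a single bounds-checked primitive, then "process A cannot write into process B's memory" holds for any guest code whatsoever, and the piece you have to verify is a few lines long.

```c
/* Minimal sketch: a VM whose only write primitive bounds-checks
 * every access, making cross-process writes impossible by
 * construction. Names and layout are invented for illustration. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define PHYS_SIZE 4096

static uint8_t phys[PHYS_SIZE];

typedef struct {
    uint32_t base;  /* start of this process's region in phys[] */
    uint32_t limit; /* size of the region                       */
} vm_process;

/* Every guest store goes through here; there is no other path to
 * phys[], so the isolation property reduces to this one check. */
static void vm_store(const vm_process *p, uint32_t vaddr, uint8_t val)
{
    if (vaddr >= p->limit) {
        fprintf(stderr, "VM fault: store at %u outside limit %u\n",
                vaddr, p->limit);
        exit(1);
    }
    phys[p->base + vaddr] = val;
}

int main(void)
{
    vm_process a = { .base = 0,    .limit = 1024 };
    vm_process b = { .base = 1024, .limit = 1024 };

    vm_store(&a, 100, 42);   /* fine: inside A's region           */
    vm_store(&b, 0,   7);    /* fine: inside B's region           */
    vm_store(&a, 2000, 13);  /* faults: A cannot reach B's memory */
    return 0;
}
```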

Then again, there is nothing new under the sun: a mathematically sound VM would be very similar to a CPU, in the design of which there is a lot of mathematics involved. Binary translators have existed for a long time (just take the Qemu project as an example). There is the Singularity project, which aims to implement an entire OS in MSIL on top of a thin VM. And tools like Valgrind and Dynamo already do user-mode-only DCT (not to mention the user-mode version of Qemu).

Using DCT to implement an alternative security model is an interesting idea (and certainly viable with today's processor power). The only problem is that the original code and the translated code have to co-exist in the same address space, which on 32-bit processors can lead to memory exhaustion. This could be solved by creating a separate memory space for the translated code, by moving to 64 bit, or even by declaring that this technology is not meant for memory-hungry applications like SQL Server.
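For what it's worth, existing binary translators already bound this overhead by keeping all translated code in a fixed-size translation cache that is simply flushed when it fills up (Qemu and Valgrind both do something along these lines). A simplified sketch of the idea, not their actual code:

```c
/* Hypothetical sketch: a fixed-size code cache, reserved up front,
 * flushed wholesale when full, so translated code can never grow
 * past CACHE_SIZE no matter how much guest code is seen. */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define CACHE_SIZE (16 * 1024 * 1024) /* hard cap on translated code */

static unsigned char *cache;
static size_t cache_used;

static int cache_init(void)
{
    cache = mmap(NULL, CACHE_SIZE, PROT_READ | PROT_WRITE | PROT_EXEC,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return cache == MAP_FAILED ? -1 : 0;
}

/* Reserve room for one translated block; flush everything when full
 * (a real flush would also invalidate the block lookup tables,
 * omitted here). */
static unsigned char *cache_alloc(size_t len)
{
    if (cache_used + len > CACHE_SIZE) {
        fprintf(stderr, "code cache full, flushing\n");
        cache_used = 0;
    }
    unsigned char *p = cache + cache_used;
    cache_used += len;
    return p;
}

int main(void)
{
    if (cache_init() != 0)
        return 1;
    unsigned char *block = cache_alloc(64);
    memset(block, 0x90, 64); /* pretend this is emitted code (NOPs) */
    printf("emitted block at %p, %zu bytes used\n",
           (void *)block, cache_used);
    return 0;
}
```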

Then again, maybe I'm overcomplicating things, since in current OSs processes by definition must turn to the kernel to do anything, and monitoring the user-kernel communication channel should be enough to implement any kind of security system. Although MS is currently against this (see Patchguard), it is not impossible that they will introduce a supported way to filter all such calls in the future.
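On Windows this filtering point is exactly what Patchguard locks down, but on Linux the user-kernel channel is already observable from user mode with ptrace. A minimal x86-64-only sketch of a syscall monitor (error handling and signal-stop handling omitted): the parent stops the child at every syscall entry and could log or veto it right there.

```c
/* Minimal ptrace-based syscall monitor (Linux, x86-64 only). */
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t child = fork();
    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execlp("ls", "ls", (char *)NULL); /* the monitored process */
        return 1;
    }

    int status, in_syscall = 0;
    waitpid(child, &status, 0);           /* child stops at exec */
    while (1) {
        /* Run until the next syscall entry or exit stop. */
        ptrace(PTRACE_SYSCALL, child, NULL, NULL);
        waitpid(child, &status, 0);
        if (WIFEXITED(status))
            break;

        if (!in_syscall) {                /* this stop is an entry */
            struct user_regs_struct regs;
            ptrace(PTRACE_GETREGS, child, NULL, &regs);
            /* orig_rax holds the syscall number on x86-64; a
             * security monitor would decide here whether to allow
             * the call to proceed. */
            printf("syscall %llu\n", (unsigned long long)regs.orig_rax);
        }
        in_syscall = !in_syscall;         /* next stop is the exit */
    }
    return 0;
}
```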

