I love interpreted languages. I love PHP, Perl, Java, C# and all the others. The liberty they give you is incredible! However, there is a security aspect to them: because the actual machine code is shared by all the programs written in one particular language, security features and products which depend on the executable image to uniquely identify processes fail on them.
Some security features which may break in some cases are: personal firewalls (they get confused and allow / deny the wrong application) and scripts running under different user accounts (they may all end up running under a different account than intended).
The methods used by the runtimes can be divided into four categories:
- A single runtime process shared by all the scripts – this is very rarely used, in fact I don’t know of any widely used interpreted language using this approach (there is an experimental project for Java which loads all the files in a single VM). The advantage of such an approach would be (slightly) better loading speed and a (slightly) smaller memory footprint (depending on how big the private structures allocated for an instance are), but from the security standpoint there would be many disadvantages:
- There would be no easy way to tell how many processes are running inside the same VM.
- If one were to take the SHA1 hash of all the running executables (in a forensic examination for example), one would get the same hash for each running script (the one which corresponds to the interpreter), and even worse, one would get only one hash regardless of how many scripts are running inside the interpreter.
- Security products which depend on the process to uniquely identify the code trying to do certain things (like a personal firewall filtering internet access) would be unable to distinguish between the different scripts and they would allow or deny the rights to all of the scripts.
- Scripts would have to be run under the same user account (the account under which the interpreter was first started). There are facilities in Windows, for example, for different threads of the same application to run under different user privileges, but (a) it takes additional effort to implement them and (b) if there are facilities (functions or exploits) in the language which allow scripts to execute binary code directly, a script can jump threads and execute at the privilege level of another thread. Also, there would be a potential for information disclosure vulnerabilities between different scripts.
Luckily there are no widely used interpreters employing this approach.
- A second method is to have multiple instances of the interpreter executable running, one for each script. This is a very widely employed technique (Perl, PHP, Ruby, Python, Java and many others use it). This is slightly better because processes can easily be run under different user accounts and the issue of information disclosure between scripts is resolved. Also, one can determine which script corresponds to which instance fairly easily by looking at the command line of the process (which can be determined with a tool like Process Explorer). However, the problem with security products like personal firewalls or HIPS still remains, since from their point of view all instances of the interpreter are the same.
- The third method is a variation of the second. In this case there are small executable files corresponding to each script which, when launched, load the interpreter (by loading a DLL or COM object for example) and then execute the script. This is the approach taken by the Adobe Apollo project for example. The method solves all but one of the problems enumerated above (the one it doesn’t solve is the identical hash for running executables) and is also good from a usability standpoint (because it gives the user something they are probably more familiar with: an executable file rather than a script file which needs an interpreter). There is some grey area between cases two and three, represented by things like script compilers which bundle the script and the interpreter in one executable, or the small executables used to bootstrap programs like Eclipse, which in turn run the installed Java VM (so that you are back to case two).
- Finally there is the solution which is the best from a security standpoint: the executable contains the code to be interpreted and it loads the interpreter in its own address space (as a DLL or COM object for example). This means that:
- Different executables are different (and thus have different hashes – hopefully – unless you discovered a hash collision)
- Programs which rely on the executable to determine what the process can / can’t do will function correctly
- The program can easily be run under different user accounts.
This approach is taken by the .NET runtime and by the programs out there which produce executables from SWF (Flash) files for example.
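To make the second category's problem concrete, here is a minimal sketch in Python. It starts two different scripts, each in its own interpreter process, and compares the SHA1 hash of the executable image behind each process with the command line used to start it. The names `sha1_of_file`, `run_script`, `compare_scripts`, `SCRIPT_A` and `SCRIPT_B` are made up for this illustration, and the sketch assumes that `sys.executable` points at the interpreter binary on disk. It only illustrates the idea and is not how a firewall or forensic tool actually inspects processes.

```python
import hashlib
import subprocess
import sys


def sha1_of_file(path, chunk_size=65536):
    """Return the SHA1 hex digest of the file at `path`."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def run_script(script_source):
    """Start a new interpreter process for `script_source`.

    Returns (command_line, image_path), where image_path is the executable
    the child process reports for itself via sys.executable.
    """
    command_line = [sys.executable, "-c", script_source]
    result = subprocess.run(
        command_line, capture_output=True, text=True, check=True
    )
    return command_line, result.stdout.strip()


# Two hypothetical scripts: they do different work, but each one prints
# the path of the executable image it is running in.
SCRIPT_A = "import sys; print(sys.executable)"
SCRIPT_B = "import sys; total = sum(range(10)); print(sys.executable)"


def compare_scripts(script_a, script_b):
    """Compare what an image-based check and a command-line check would see."""
    cmd_a, image_a = run_script(script_a)
    cmd_b, image_b = run_script(script_b)
    return {
        # What a product keyed on the executable image compares.
        "same_image_hash": sha1_of_file(image_a) == sha1_of_file(image_b),
        # What a human looking at the process command line compares.
        "same_command_line": cmd_a == cmd_b,
    }


if __name__ == "__main__":
    print(compare_scripts(SCRIPT_A, SCRIPT_B))
```

Under these assumptions the two processes should share one image hash while their command lines differ, which is why looking at the command line helps a human but does not help a product that keys its rules on the executable.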