authentication – Grey Panthers Savannah

Two channel authentication

gpanther — Fri, 27 Jul 2007 16:09:00 +0000

I’m no Bruce Schneier, so I welcome the comments of any more informed and/or more intelligent readers (which shouldn’t be too hard ;-)).

Two factor authentication is the buzz these days; it’s the silver bullet of the security industry. To provide a short explanation (which will almost certainly leave out essential facts and get others wrong :-p):

To prove your identity to the other party you can use several factors. Usually factors are categorized in three groups:

Something you know (and hopefully only you know :)) – like a password, date of birth, SSN, etc
Something you have – like an RSA token, smart card, etc
Something you are, also known as biometrics – fingerprint readers, iris scanner, etc.

The basic premise of two factor authentication is that providing two factors from different groups (so a user name and password is not two factor!) increases the trust.

The thing I want to discuss is the way the authentication data goes from the user to the server, the channel through which the information flows. Why is this important? Because communication channels have their own weaknesses, like replay attacks (where the attacker captures the data – without necessarily being able to interpret it – and resends the exact sequence later) or man-in-the-middle attacks. The most frequently used communication channel (HTTP over TCP/IP) is very vulnerable, and even protocols which protect data in transit (HTTPS) cannot (and are not designed to) deal with endpoint security. To give a possible attack scenario:

The user goes to a website (let’s say over HTTPS)
She enters her username and password
She uses her finger with a fingerprint reader to create a hash that uniquely identifies her finger
She sends all the data to the server for authentication
However, there is information-collecting software (a trojan) running on her computer, which captures and relays all the data before it enters the HTTPS tunnel
Now all the attacker has to do is to replay all the data, including the fingerprint hash, without even needing a fingerprint reader (the existence of which the remote server cannot check; it can only check that it was supplied the correct hash)
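
The scenario above can be sketched in a few lines of Python (chosen purely for illustration). Everything here is an assumption made for the example: the stored record, the `authenticate` function and the sample values are hypothetical, and a real server would do far more. The point is only that the server compares bytes, so a captured request is as good as the original.

```python
import hashlib

def digest(value: bytes) -> str:
    return hashlib.sha256(value).hexdigest()

# Hypothetical server-side record: digests of the two expected factors.
EXPECTED = {
    "alice": {
        "password": digest(b"correct horse"),
        "fingerprint": digest(b"fingerprint-template-bytes"),
    }
}

def authenticate(username: str, password: bytes, fingerprint_hash: str) -> bool:
    """Accept the login if both submitted factors match the stored values."""
    record = EXPECTED.get(username)
    if record is None:
        return False
    return (digest(password) == record["password"]
            and fingerprint_hash == record["fingerprint"])

# What the legitimate client sends (the reader output is hashed locally).
legit_request = ("alice", b"correct horse", digest(b"fingerprint-template-bytes"))

# A trojan on the client copies the request before it enters the HTTPS tunnel...
captured = legit_request

# ...and the attacker resends it later, with no fingerprint reader at all.
attacker_logged_in = authenticate(*captured)
```

The server has no way to tell `captured` from `legit_request`, which is exactly the weakness of sending both factors through the same compromised endpoint.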

One of the problems in this scenario was that the channel (or more precisely one of its ends) got compromised. PCs (and Macs too :-)) are relatively easy to compromise because they were built as general purpose computers. However, if one of the factors was transported over another channel, the attacker would need to listen in on two channels. This is what I would call Two factor / Two channel authentication.

What attributes would such a channel need? It should be as widely accessible as possible and should be safe (to the extent possible) against spoofing. The (cell) phone network is something that comes close to this. So here is a possible implementation of such a system using the cell phone network:

When the user registers on a site, she gives her cell phone number
When she needs to log in, she supplies her username and password
After a very short time her cellphone rings, she picks it up and presses #
Now she has authenticated using two factors (something she knows – the username and password – and something she has – her cellphone) and two channels!
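
A minimal sketch of this flow, assuming a hypothetical `confirm_by_phone` callback that stands in for whatever telephony service places the call (it is not a real API). The user table and its contents are invented for the example, and the password is kept in plain text only to keep the sketch short.

```python
from typing import Callable, Dict

# Hypothetical user store; the phone number was supplied at registration.
USERS: Dict[str, Dict[str, str]] = {
    "alice": {"password": "s3cret", "phone": "+15550100"},
}

def login(username: str, password: str,
          confirm_by_phone: Callable[[str], bool]) -> bool:
    """Two factor / two channel login.

    confirm_by_phone calls the given number and returns True only if the
    person who answered pressed '#'.
    """
    user = USERS.get(username)
    if user is None or user["password"] != password:
        return False  # first factor (web channel) failed, so no call is placed
    # The number comes from the server's own records, never from the login form.
    return confirm_by_phone(user["phone"])
```

Note that the login form never carries the phone number: the second channel is opened by the server towards a number it already trusts.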

Something like this already exists and can be used for free by anyone who needs to add a second factor to their authentication: PhoneFactor. (Disclaimer: I have no relations with PhoneFactor or anyone involved with it, I just think that it is a very cool idea).

Some issues with this method:

Can’t the attacker just specify another phone number and authenticate using that? – No, if the system is built properly. That is, the user supplies the phone number at the registration phase (which is supposed to be done in a safe environment) and it can later be changed only after logging in. If there is no safe environment when the user registers for the first time with the website, all bets are off anyway.

But phone numbers can be spoofed! – Yes, but to fool this system, the attacker would have to (a) know the phone number where the call will be placed and (b) compromise the (cell) phone network to re-route calls. While both of these are possible, they are (hopefully) much more difficult than installing a trojan on the user’s machine.

Isn’t it a privacy concern that my phone number is stored in a database? – Yes it is, but if the database gets compromised, there is probably more valuable information in it than your phone number. Unfortunately it’s not possible to store your phone number as a one-way hash, because (a) the system needs to be able to call it and (b) phone numbers are relatively short (10 numeric characters), so a brute-force attack is very feasible. This could be mitigated if PhoneFactor offered a service to store the phone numbers, which could work as follows (I don’t know if they have such a system):

When a new user registers, the site asks PhoneFactor for a UID (Unique Identifier) to be associated with that user
The user gives her phone number to the site, which in turn forwards it to PhoneFactor together with the UID and doesn’t store it.
From here on, the website only needs to send the UID whenever it wants to ask PhoneFactor to verify the user, and it will know which number to call
When the user wants to change her phone number, the website would send the UID and the new phone number to be associated with it, without actually storing the phone number

In this case the information would be split up between two places (the website and PhoneFactor) and unless both are compromised, the complete identity / phone number pairing cannot be reconstructed.
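
The split described above could look roughly like the following sketch. Both classes are hypothetical: `PhoneService` stands in for the third party (this is not PhoneFactor's actual interface, which I don't know), and `place_call` is an assumed callback that rings a number and reports whether the user confirmed.

```python
import secrets
from typing import Callable, Dict

class PhoneService:
    """Stand-in for the third party: knows phone numbers, but not user names."""

    def __init__(self, place_call: Callable[[str], bool]) -> None:
        self._numbers: Dict[str, str] = {}
        self._place_call = place_call

    def new_uid(self) -> str:
        uid = secrets.token_hex(16)
        self._numbers[uid] = ""          # no number associated yet
        return uid

    def set_number(self, uid: str, phone: str) -> None:
        if uid not in self._numbers:
            raise KeyError("unknown UID")
        self._numbers[uid] = phone

    def verify(self, uid: str) -> bool:
        return self._place_call(self._numbers[uid])

class Website:
    """Knows user names and UIDs, but never stores a phone number."""

    def __init__(self, service: PhoneService) -> None:
        self._service = service
        self.uids: Dict[str, str] = {}

    def register(self, username: str, phone: str) -> None:
        uid = self._service.new_uid()
        self._service.set_number(uid, phone)   # forwarded, not stored
        self.uids[username] = uid

    def change_phone(self, username: str, new_phone: str) -> None:
        self._service.set_number(self.uids[username], new_phone)

    def second_factor(self, username: str) -> bool:
        return self._service.verify(self.uids[username])
```

An attacker who dumps `Website.uids` gets user names and opaque identifiers; one who dumps the service's table gets phone numbers without names.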

What if I’m at a place where there is no coverage for my phone or I’m unable to use it for other reasons (battery died, etc)? – then you are SOL, pardon my language. But then again, what if you are in a place where there is no internet access? Or no fingerprint reader?

PS. I argue that the RSA-style one-time password generators also fall in this category, because one could say that there is a virtual channel between the RSA token and the authentication server, through the clock built into both of them and the serial number of the token.
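
To make the "virtual channel through the clock" idea concrete, here is a small sketch of a time-based code generator. It deliberately uses the openly documented HOTP/TOTP construction (RFC 4226 / RFC 6238), not RSA's own algorithm; the `seed` parameter plays the role of the secret tied to the token's serial number, and that mapping is an assumption of this example.

```python
import hashlib
import hmac
import struct

def one_time_code(seed: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Derive a short numeric code from a shared seed and the current time."""
    counter = unix_time // step          # token and server compute the same time slot
    mac = hmac.new(seed, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F              # dynamic truncation: pick 4 bytes of the MAC
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)
```

The token and the server never talk to each other; they agree on the code only because they share the seed and roughly the same clock.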

Serving up authenticated static files

gpanther — Wed, 25 Jul 2007 13:14:00 +0000

Two components which are usually found in web applications are authentication and static files. In this post I will try to show how these two interact. The post will refer to PHP and Apache specifically, since these are the platforms I’m familiar with; however, the ideas are generally applicable.

The advantages of static files are: cacheability out of the box (with a dynamically generated result this is very hard to get right) and less overhead when serving up (even more so if something specialized is used like tiny httpd). However you might feel the need to apply authentication to the static files also (that is only users with proper privileges can have access to them). Of course you want to retain the advantages of caching and low overhead as much as possible.

One option (and probably the one with less overhead and ultimately simpler to implement) is to use mod_auth_mysql on the directory hosting the static files and generate a long (!) random username and password for each user session, insert them into the authentication table, and modify the links to the resources to include these credentials. For example, a link in this case might look like this:


http://w7PLTHUDxK:xLaLGkku8O@example.com/static/image.jpg

The advantage of this approach is that we get all those wonderful things like content type or cache headers (or even zlib compression if we configured it) for free. The main pitfall is choosing where to do the cleanup (removing the temporary user from the table). The session destroy handler is not good enough, since it won’t be called if the user doesn’t properly log out. One solution would be to do repeated “garbage collections” on the table (in this case care must be taken to set the garbage collection interval equal to or larger than the session timeout interval, since otherwise the access might “go away” from under the users’ feet while they are still logged on). Another option would be to add a user id column to the table and use the “REPLACE INTO” SQL command (which is AFAIK unique to MySQL, not standard) to ensure that the temporary user table has at most as many users as the main user table.
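
A rough sketch of the bookkeeping described above, in Python with an in-memory SQLite database so that it runs without a server (SQLite happens to accept REPLACE INTO as well). The table layout, the function names and the 30 minute timeout are all assumptions for illustration; how mod_auth_mysql expects the password column to be stored depends on its configuration and is left out here.

```python
import secrets
import sqlite3
from typing import Tuple

SESSION_TIMEOUT = 30 * 60  # seconds; assumed to match the session timeout

def open_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE temp_users ("
        " user_id INTEGER PRIMARY KEY,"   # at most one row per real user
        " username TEXT NOT NULL,"
        " password TEXT NOT NULL,"
        " created_at REAL NOT NULL)"
    )
    return db

def issue_credentials(db: sqlite3.Connection, user_id: int, now: float) -> Tuple[str, str]:
    """Create (or overwrite) the temporary login for this user."""
    username = secrets.token_urlsafe(16)
    password = secrets.token_urlsafe(16)
    db.execute("REPLACE INTO temp_users VALUES (?, ?, ?, ?)",
               (user_id, username, password, now))
    return username, password

def static_url(username: str, password: str, path: str) -> str:
    """Build a link that carries the temporary credentials."""
    return f"http://{username}:{password}@example.com/static/{path}"

def garbage_collect(db: sqlite3.Connection, now: float) -> int:
    """Remove temporary users older than the session timeout; returns the count."""
    cur = db.execute("DELETE FROM temp_users WHERE created_at < ?",
                     (now - SESSION_TIMEOUT,))
    return cur.rowcount
```

Because `user_id` is the primary key, logging in twice replaces the old row instead of adding a second one, which is the "at most as many users as the main table" property.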

A quick note: all the above can of course be done with static authentication also (that is, a hardcoded username and password in the .htaccess file). This is a very simple solution (and easier to apply, since mod_auth_mysql might not be installed/enabled on all webservers, but mod_auth is on most of them), but it is insecure: it cannot be used to separate users (ie. to have files which only certain users can access), and because it does not expire automatically, one link is enough for search engines / other crawlers to find it.

This is all well and good, but what if you don’t have control over the server configuration? While I strongly recommend against using shared PHP hosting, some people might be in this situation. The solution is to recreate (at least some of) Apache’s functionality.

The first step is to put the actual static files outside your web root (preferably) or to deny access to the folder where the files are placed with .htaccess (less preferable). If the files were to reside in a public folder, this system would provide obfuscation at best and be equivalent to a 302 or 301 redirect at worst.

The next step is to decide on the method of referencing your static file. You have three options:

Put the file name directly in as a GET parameter (for example get_static.php?fn=image.jpg)
Use mod_rewrite to simulate a directory structure (static/image.jpg which will be rewritten by a rule into the form shown at the previous point)
Use the fact that Apache walks up the path until it finds the first file / directory, so you can do something like get_static.php/image.jpg

The second and third options are the ones I recommend. The reason behind this is that they give the browser the illusion that it is dealing with different files, which can help it do proper caching without relying on the ETag mechanism discussed later.

I would like to pause for a moment and remind everybody that security is a big concern in the web world, since you are practically putting your code out for everybody, meaning that anybody can come and try to break it. One particular concern with these types of scripts (which read and echo back to you arbitrary files) is path traversal. This attack is easy to demonstrate with the following example:

Let’s say that the script works by taking the filename given, concatenating it with the directory (which for this example is /home/abcd/static/) and echoing back the given file. Now if I supply in the filename something like ../../../etc/passwd, the resulting path will be /home/abcd/static/../../../etc/passwd, meaning that I can read any file the web server has access to. And before the Windows guys start jumping up and down saying that this is a *nix problem, the example is very easy to translate to Windows.
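
The attack is easy to reproduce in a couple of lines. This is an illustrative Python sketch (the vulnerable `naive_path` helper is hypothetical); `posixpath.normpath` is used only to show which file the operating system would end up opening.

```python
import posixpath

STATIC_DIR = "/home/abcd/static/"

def naive_path(filename: str) -> str:
    """Vulnerable: trusts the client-supplied name completely."""
    return STATIC_DIR + filename

requested = naive_path("../../../etc/passwd")
resolved = posixpath.normpath(requested)   # collapses the '..' components
```

Each `..` climbs one directory (static, abcd, home), so `resolved` no longer points anywhere inside the static directory.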

Now your first reaction would be to disallow (blacklist) the usage of the . character in the path, but don’t go this way. Rather, define the rule which your files will follow and verify that the supplied parameters follow that rule. For example the filenames will contain one or more alphanumeric, underscore or dash characters and will have a png, jpg, css or js extension. This translates into the regular expression ^[a-z0-9_-]+\.(png|jpg|css|js)$ (note the escaped dot: an unescaped . would match any character). Be sure to include the start and end anchors (otherwise the input only has to contain a substring matching the rule, the whole string doesn’t have to match it) and watch out for other regular expression gotchas. As an added security measure use the realpath function (which resolves things like symbolic links or .. sequences) before performing any further verification.
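
A minimal sketch of the whitelist-plus-realpath check, written in Python for illustration (`os.path.realpath` plays the role of PHP's realpath here). The function name and the decision to return None on rejection are assumptions of the example. `fullmatch` does the job of the start and end anchors, and unlike `$` in Python it does not tolerate a trailing newline, which is one of those regular expression gotchas.

```python
import os
import re
from typing import Optional

# The whole name must follow the rule: lowercase letters, digits, _ or -,
# then a literal dot and one of the allowed extensions.
ALLOWED_NAME = re.compile(r"[a-z0-9_-]+\.(png|jpg|css|js)")

def safe_static_path(base_dir: str, filename: str) -> Optional[str]:
    """Return the absolute path to serve, or None if the request is rejected."""
    if ALLOWED_NAME.fullmatch(filename) is None:
        return None
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, filename))
    # After resolving symbolic links the file must still live inside base_dir.
    if os.path.commonpath([base, candidate]) != base:
        return None
    return candidate if os.path.isfile(candidate) else None
```

The second check looks redundant after the whitelist, but it still catches a symbolic link inside the static directory that points somewhere else.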

Now we have the file, and need to generate the headers. The important headers are:

Content-Length – this is very straightforward, it is the size of the response body in bytes. (HTTP theoretically allows other units for ranges, but in practice bytes are always used.)
Content-Type – this can be obtained using the mime_content_type function, however be aware that sometimes it fails to identify the correct type and action must be taken to correct it (for example a CSS file might be identified as text/plain, but it must be served as text/css to work in all browsers)
Cache headers – depending on how long you think the clients / intermediate proxies should cache your content, these must be set accordingly.
ETag – this header identifies a specific version of the content behind a URL, which lets the browser (and any intermediate cache) check whether its cached copy is still the right one. This is most useful when a single script serves many files. For example if the link to one image is http://example.com/image.php?id=1 and to a second one http://example.com/image.php?id=2, a cache that treated these as the same entry could display the second image instead of the first or vice-versa; with a distinct ETag per file the mismatch is detected when the cached copy is checked. An ETag is an arbitrary quoted string, so for example you could use the MD5 hash of the file (and no, there is no information disclosure vulnerability here which would warrant the usage of salted hashes, because the user is already getting the file! S/he can recalculate the MD5 of it if s/he wishes!)
Content-Encoding – if you wish to compress your content and it makes sense to do so, be sure to output the proper Content-Encoding header. Also make sure to adjust the Content-Length header, otherwise you could have some serious breakage.
Accept-Ranges – if you wish to enable resume support for the file (that is, for the client to be able to start downloading from the middle of the file for example), you need to provide this header (and handle the corresponding request header, as described below).
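
Putting the response headers together might look like the sketch below (Python for illustration, uncompressed content only). The override table, the `private` cache policy and the one hour `max_age` default are assumptions of this example, not recommendations from the list above.

```python
import hashlib
import mimetypes
import os
from email.utils import formatdate
from typing import Dict

# Extensions whose detected type is replaced with a known-good value (assumed list).
TYPE_OVERRIDES = {".css": "text/css", ".js": "application/javascript"}

def response_headers(path: str, max_age: int = 3600) -> Dict[str, str]:
    """Build the headers for serving the file at 'path' in full, uncompressed."""
    with open(path, "rb") as handle:
        body = handle.read()
    ext = os.path.splitext(path)[1].lower()
    content_type = (TYPE_OVERRIDES.get(ext)
                    or mimetypes.guess_type(path)[0]
                    or "application/octet-stream")
    return {
        "Content-Length": str(len(body)),
        "Content-Type": content_type,
        # 'private' because the file is only meant for the authenticated user.
        "Cache-Control": f"private, max-age={max_age}",
        "Last-Modified": formatdate(os.path.getmtime(path), usegmt=True),
        "ETag": '"' + hashlib.md5(body).hexdigest() + '"',
        "Accept-Ranges": "bytes",
    }
```

Reading the whole file to hash it is fine for small assets; for large downloads one would probably cache the hash or derive the ETag from size and modification time instead.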

The script also needs to take into account the request headers:

If-Modified-Since – the browser is checking the validity of the cached object, so the script should return a 304 (Not Modified) status with no content body if the content didn’t change.
Accept-Encoding – this should be checked before providing compressed (gzipped) content. Also, beware that some older browsers falsely claim to support gzipped content.
Range – if you specified that you handle ranges, you must look out for this header and send only what was requested. This of course can further be complicated with compression, in which case you need to take the specified chunk, compress it, make sure to output the correct Content-Length, and then send it
If-None-Match – if you supplied an ETag when serving up the content, it will (should) be returned to you in this header when the browser checks its cache
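
Handling the request side could be sketched as follows (again Python, purely for illustration). All three helpers are hypothetical and simplified: only a single ETag in If-None-Match and a single byte range are understood, and anything the sketch cannot parse falls back to sending the whole file.

```python
import calendar
from email.utils import parsedate
from typing import Dict, Optional, Tuple

def header(headers: Dict[str, str], name: str) -> Optional[str]:
    """Case-insensitive lookup of a request header."""
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() == wanted:
            return value
    return None

def is_not_modified(headers: Dict[str, str], etag: str, mtime: float) -> bool:
    """True if the client's cached copy is still valid (answer 304, no body)."""
    client_etag = header(headers, "If-None-Match")
    if client_etag is not None:
        return client_etag.strip() == etag       # the ETag check takes precedence
    since = header(headers, "If-Modified-Since")
    if since is not None:
        parsed = parsedate(since)                # HTTP dates are always in GMT
        return parsed is not None and int(mtime) <= calendar.timegm(parsed)
    return False

def byte_range(headers: Dict[str, str], size: int) -> Optional[Tuple[int, int]]:
    """Parse one 'bytes=start-end' range; None means send the whole file."""
    value = header(headers, "Range")
    if value is None or not value.startswith("bytes=") or "," in value:
        return None
    start_s, _, end_s = value[len("bytes="):].partition("-")
    try:
        if start_s == "":                        # 'bytes=-500': the last 500 bytes
            start, end = max(size - int(end_s), 0), size - 1
        else:
            start = int(start_s)
            end = int(end_s) if end_s else size - 1
    except ValueError:
        return None
    end = min(end, size - 1)
    return (start, end) if 0 <= start <= end else None
```

A range that starts past the end of the file comes back as None here; a complete implementation would answer it with a 416 status rather than the full file.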

After writing all this up, I found that there is a PHP extension which provides most of the functions for this: HTTP. Use it. It’s much easier than rolling your own and you have less chance to miss some corner cases (like the fact that as per HTTP/1.1 header names are case-insensitive, meaning that If-Modified-Since and iF-mOdIfIeD-sInCe are the same thing and should be treated the same).

PS. I didn’t mention it, but this mechanism can also be used to hide the real file names. This might be needed when for whatever reason you don’t want to divulge them (because file names can provide additional information which you might not want your users to have). This can be achieved by using an additional step and giving the user a token which is translated into a file name at the server. These tokens can be:

Generated from the file name
Arbitrarily chosen
Created using a random process
Created using a deterministic process

For maximum security I recommend going with arbitrarily chosen random tokens for each file (otherwise an attacker might break the security by trying other IDs – for example if the IDs are numeric, s/he can try other numbers – or by guessing the file names, applying the generator function to them and checking whether the resulting file exists).
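
A small sketch of such a token table, assuming an in-memory mapping (a real application would keep it in its database). The `TokenMap` class and its method names are hypothetical.

```python
import secrets
from typing import Dict, Optional

class TokenMap:
    """Server-side table translating unguessable tokens into real file names."""

    def __init__(self) -> None:
        self._by_token: Dict[str, str] = {}
        self._by_name: Dict[str, str] = {}

    def token_for(self, filename: str) -> str:
        """Return the token for a file, creating a random one on first use."""
        token = self._by_name.get(filename)
        if token is None:
            token = secrets.token_urlsafe(16)   # 16 random bytes from the OS source
            self._by_name[filename] = token
            self._by_token[token] = filename
        return token

    def filename_for(self, token: str) -> Optional[str]:
        """Translate a token back; unknown tokens yield None."""
        return self._by_token.get(token)
```

Because the token is random rather than derived from the name, knowing one token (or guessing a file name) tells an attacker nothing about any other file.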

Update: I’ve looked at using mod_xsendfile with PHP, however it seems to be a dormant project (the latest posted version is for Apache 2.0, nothing there for 2.2 :-(). Another option which may be worth exploring is the following (if you are using PHP as a loadable module rather than CGI): use the virtual function to redirect the request to the static files. You can even find a good example in the comments.