08 April 2016

Are tokens enough to prevent CSRF?

CSRF attacks exploit the trust that a website has in a client like a web browser.  These attacks rely on the website trusting that a request from a client is actually the intention of the person using that client.

An attacker will try to trick the web browser into issuing a request to the server.  The server will assume that the request is valid because it trusts the client.

At its simplest, a CSRF attack could involve placing a malicious form on a webpage that causes the client to send a POST request to a URL.

As an example, imagine that a user called Alice is logged into Facebook in one tab and is browsing the internet in another tab.  A filthy pirate Bob creates a malicious form in a webpage that submits a POST request to Facebook, posting a link of Rick Astley dancing to a person's status.  Alice arrives on Bob's page and JavaScript submits the form to Facebook.  Facebook trusts Alice's web browser and there is a valid session for her, so it processes the request.  Before she knows it her Facebook status is a link to Rick Astley (who, by the way, will never give you up).

Of course Facebook is not vulnerable to this, and neither should your code be.

The best way to mitigate CSRF attacks is to generate a sufficiently random token which you store in Alice's session.  You then make sure that whenever you output a form on your site you include this token in the form.  Alice will send the token whenever she submits the form, and you can compare it to the one stored in her session to make sure that the request originates from your site.
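
As a concrete illustration, here is a minimal sketch in plain PHP (no framework, and assuming PHP 7 for random_bytes()) of generating a per-session token and rejecting submissions that don't carry it:

 <?php
 session_start();

 // Generate one sufficiently random token per session and reuse it in
 // every form you render.
 if (empty($_SESSION['csrf_token'])) {
     $_SESSION['csrf_token'] = bin2hex(random_bytes(32));
 }

 // In the form template, embed the token as a hidden field, e.g.:
 // echo '<input type="hidden" name="csrf_token" value="'
 //     . htmlspecialchars($_SESSION['csrf_token']) . '">';

 // When handling the POST, compare the submitted token to the one in the
 // session using a constant-time comparison.
 if ($_SERVER['REQUEST_METHOD'] === 'POST') {
     $submitted = isset($_POST['csrf_token']) ? $_POST['csrf_token'] : '';
     if (!hash_equals($_SESSION['csrf_token'], $submitted)) {
         http_response_code(403);
         exit('Invalid CSRF token');
     }
 }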

Bob has no way of knowing what the token in Alice's session is and so he can't trick her browser into submitting it to our site.  Our site will get a request from Alice's client but because it doesn't have the token we can reject it.

In other words the effect of the token is to stop relying on implicit trust in the client and instead set up a challenge-response system whereby the client proves it is trustworthy.  If Bob wants to send a request that will be accepted he must find a way to read a token off a form that your site has rendered for Alice.  This is not a trivial task but it can possibly be done - there are very creative ways (like this attack) to abuse requests.

Another way to prevent CSRF is to rely on multi-factor authentication.  We can group authentication factors into knowledge (something you know, like a password), possession (something you have, like a USB dongle), and inherence (something you are).

Instead of relying on just one of these factors we can use two (or more) in order to authenticate.  For example we can ask a person for a password and also require that they enter a code sent to their mobile phone, which proves they possess the phone linked to their account.

CSRF will become much harder for Bob to accomplish if our form is protected with multi-factor authentication (MFA).  Of course this comes with a user experience cost so only critical forms need to be protected with MFA.  For less critical forms the single authentication method of a CSRF token will suffice.

There is debate around whether checking that the referrer header matches your site is helpful in deterring CSRF.  It is true that it is trivial to spoof this header in a connection that you control.  However it is more difficult to get this level of control in a typical CSRF attack, where the browser sets the referrer header on an ajax call and scripts cannot override it (see the specification).  By itself it is not sufficient to deter CSRF, but it can raise the difficulty level for attackers.
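
If you do decide to add a referrer check as an extra hurdle, a rough sketch in PHP might look like this, with www.example.com standing in for your own host:

 <?php
 // A referrer check as one extra hurdle, never a replacement for the token.
 $referrer = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';

 // Some browsers and privacy tools strip the referrer entirely, so only
 // reject when a referrer is present and points somewhere else.
 if ($referrer !== '' && parse_url($referrer, PHP_URL_HOST) !== 'www.example.com') {
     http_response_code(403);
     exit('Request did not originate from this site');
 }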

Cookies should obviously not be used to mitigate CSRF.  They are sent along with any request to the domain whether the user intended to make the request or not.

Setting a short session timeout can help a little as it narrows the window during which requests will be trusted by your application.  This will also improve your session security by making it harder for fixation attacks to be effective.

Tokens are the most convenient way to make CSRF attacks harder to pull off on your site.  When used in conjunction with referrer checks and a narrow session window you can make it significantly harder for an opponent to mount a successful attack.

For critically important forms multi-factor authentication is the way to go.  It interrupts the user experience and enforces explicit authentication.  This has a negative effect on your UX but makes it impossible (I think!) for an automated CSRF attack to be effective.

11 March 2016

Exploring Russian Doll Caching


This technique was developed in the Ruby community and is a great way to approach caching partial views.  In Ruby, rendering views is more expensive than in PHP, but this technique is worth understanding as it could be applied to data models and not just views.

In the Ruby world Russian Doll caching is synonymous with key-based expiration caching.  I think it's useful to rather view the approach as being the blend of two ideas.  That's why I introduce key-based expiration separately.

Personally I think Russian Dolls are a bit of a counter-intuitive analogy.  Real life Russian Dolls each contain one additional doll, but the power of this technique rests on the fact that "dolls" can contain many other "dolls".  I find the easiest way to think about it is to say that if a child node is invalidated then its siblings and their children are not affected.  When the parent is regenerated those sibling nodes do not need to be rendered again.

Cache Invalidation

I use Laravel, which conveniently allows tagging cache entries as a way of grouping them.  I've got into the habit of tagging my cache keys with the name of the model; then whenever I update the model I invalidate the tag, which clears out all the related keys for that model.
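
Something along these lines, sketched against Laravel's Cache facade and assuming a taggable store like Redis or Memcached (the key and tag names are just illustrative):

 use Illuminate\Support\Facades\Cache;

 // Cache a rendered fragment for an hour, tagged with the model name.
 Cache::tags(['posts'])->put('post.' . $post->id, $renderedHtml, 60);

 // Later, when the model is updated, flush every key carrying that tag.
 Cache::tags(['posts'])->flush();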

In the absence of the ability to tag keys the next best approach to managing cache invalidation is to use key-based expiration.

The idea behind key-based expiration is to change your key name by adding a timestamp.  You store the timestamp separately and fetch it whenever you want to fetch the key.

When the underlying value changes you update the stored timestamp, which means the key name changes too.  You'll always be able to get the most recent value, and Memcached or Redis will handle evicting the old key names.

The practical effect of this strategy is that you must change your model to update the stored timestamp whenever you change the cached data.  You also have to retrieve the current timestamp whenever you want to get something out of the cache.
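
A rough sketch of key-based expiration, again using Laravel's Cache facade with illustrative key names:

 use Illuminate\Support\Facades\Cache;

 // The key name embeds a timestamp that is stored separately.
 function keyFor($model, $id)
 {
     $timestamp = Cache::get("timestamp.$model.$id", 0);
     return "$model.$id.$timestamp";
 }

 // Call this whenever the underlying data changes: the timestamp moves,
 // so the key name changes, and the old entries are simply never read
 // again (Memcached/Redis will evict them in time).
 function touchModel($model, $id)
 {
     Cache::forever("timestamp.$model.$id", time());
 }

 // Reading always goes through the timestamp lookup first.
 $value = Cache::get(keyFor('post', 42));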

Nested view fragments, nested cache structure

Typically a page is rendered as a template which is filled out with a view.  Blocks are inserted into the view as partial views, and these can be nested.

The idea behind Russian Doll caching is to cache each nested part of the page.  We use cache keys that mimic the frontend nesting.

If a view fragment's cache key is invalidated then all of the wrapping items' keys are also invalidated.  We'll look at how to implement this in a moment.

The wrapping items are invalidated, so will need to be rendered, but the *other* nested fragments that have not changed still remain in the cache and can be reused.  This means that only part of the page will need to be rendered from scratch.


Implementing automatic busting of containing layers

We can see that the magic of Russian Doll caching lies in the ability to bust the caches of the wrapping layers.  We'll use key-based expiration together with another refinement to implement this.

The actual implementation is non-trivial and you'll need to write your own helper class.  There are GitHub projects like Corollarium which implement Russian Doll caching for you.

In any case, let's outline the requirements.

Let's have a two-level cache, for simplicity, that looks like this:

Parent (version 1)
- Child (version 1)
- Child (version 1)
- Child (version 1)

I've created a basic two-tier cache where every item is at version 1, freshly generated.  Expanding this to multiple tiers requires letting child nodes act as parents, but while I'm busy talking through this example let's constrain ourselves to just having one parent and multiple children.

Additional cache storage needs

First, let's define our storage requirements.

We want keys to be automatically invalidated when they are updated and key-based expiration is the most convenient way to accomplish this.

This means that we'll have a version value stored for each of them that holds its most recent version.  Currently all of these values are "version 1".

In addition to storing the current version of each key we will also need to store and maintain a list of dependencies for the key.  These are cache items which the key is built from.

We need to be certain that the dependencies have not changed since our current item was cached.  This means that our dependency list must store the version that each dependency was at when the current key was generated.

The parent node will need to store its list of dependencies and the version that they were when it was cached.  When we retrieve the parent key we need to check its list of dependencies and make sure that none of them have changed.
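
In other words, for the two-tier example above the cache needs to hold something shaped roughly like this (the key names are hypothetical):

 $cache = [
     // current version of every node, maintained by key-based expiration
     'version.parent'  => 1,
     'version.child.1' => 1,
     'version.child.2' => 1,
     'version.child.3' => 1,

     // the parent fragment remembers the version each dependency was at
     // when the fragment was generated
     'fragment.parent' => [
         'html' => '<div>...</div>',
         'deps' => ['child.1' => 1, 'child.2' => 1, 'child.3' => 1],
     ],
 ];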

Putting it together

Now that we've stored all the information we need to manage our structure, let's see how it works.

Let's say that one of the children changes and is now version 2.  As part of updating its value we also update the key that stores its current version, using our key-based expiration implementation.

On the next page render our class will try to pull the parent node from cache.  It first inspects the dependency list and realises that one of the children is now on version 2, not the version it was at when the parent was cached.

We invalidate the parent cache object when we discover a dependency has changed, which means we need to regenerate the parent.  You may want to implement a dogpile lock here if you're expecting concurrency on the page.

Only the child that has changed needs to be regenerated, and not the other two.  So the parent node can be rebuilt by generating one child node and reading the other two from cache.  This obviously results in a much less expensive operation.
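
Here is a rough sketch of that retrieval logic, using Laravel's Cache facade and the hypothetical key layout from above (an illustration of the idea, not a production-ready helper):

 use Illuminate\Support\Facades\Cache;

 function getFragment($key, array $childKeys, callable $render)
 {
     $entry = Cache::get("fragment.$key");

     if ($entry !== null) {
         // Bust the parent if any dependency has moved on since it was cached.
         foreach ($entry['deps'] as $childKey => $versionWhenCached) {
             if (Cache::get("version.$childKey") !== $versionWhenCached) {
                 $entry = null;
                 break;
             }
         }
     }

     if ($entry === null) {
         // Rebuild the parent; unchanged children are still served from
         // their own cache entries inside $render().
         $deps = [];
         foreach ($childKeys as $childKey) {
             $deps[$childKey] = Cache::get("version.$childKey");
         }
         $entry = ['html' => $render(), 'deps' => $deps];
         Cache::forever("fragment.$key", $entry);
     }

     return $entry['html'];
 }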

03 February 2016

Working with classic ASP years after it died

I searched for "dead clown" but all the pictures were too disturbing.  I suppose that's kind of like the experience of trying to get classic ASP up and running with today's libraries.

I'm having to work on a legacy site that runs on classic ASP.  The real challenge is trying to get the old code to run on my Ubuntu virtual machine.

There is a lot of old advice on the web and most of it was based on much older software versions, but I persevered and have finally managed to get classic ASP running on Apache 2.4 in Ubuntu.

The process will allow you to have a shot at getting your code running, but my best advice is to use a small Windows VM.  There's no guarantee that your code will actually compile and run using this solution, and the effort required is hardly worthwhile.

The Apache module you're looking for is Apache::ASP.  You will need to build it manually and be prepared to copy pieces of it to your perl include directories.  You will also need to manually edit one of the module files.

The best instructions I found for getting Apache::ASP installed were on the CPAN site.  You'll find the source tarball on the CPAN modules download page.

I'm assuming that you're able to install the prerequisites and build the package by following those instructions.  I was able to use standard Ubuntu packages and didn't have to build everything from source:

 sudo apt-get install libapreq2-3 libapache2-request-perl  

Once you've built and installed Apache::ASP you need to edit your apache.conf file to make sure it's loaded:

 PerlModule Apache2::ASP  
  # All *.asp files are handled by Apache2::ASP  
  <Files ~ (\.asp$)>  
   SetHandler perl-script  
   PerlHandler Apache::ASP  
  </Files>  

If you try to start Apache at this point you will get an error something like Can't locate Apache2/ASP.pm in @INC (you may need to install the Apache2::ASP module)

Unfortunately the automated installs don't place the modules correctly.  I'm not a Perl developer and didn't find an easy standard way to add an external path to the include path, so I just copied the modules into my existing Perl include path.  You'll find the required files in the directory where you built Apache2::ASP.

The next problem I encountered was that Apache 2.4 has a different function name for retrieving the IP of the connecting client.  You'll spot an error in your log like this: Can't locate object method "remote_ip" via package "Apache2::Connection".

The bug fix is pretty simple and is documented at CPAN.  You'll need to change line 85 of StateManager.pm.  The file lives in the directory where you copied the modules into the Perl include path, and its exact location is in your error log.

 # See https://rt.cpan.org/Public/Bug/Display.html?id=107118  
 Change line 85:  
     $self->{remote_ip}     = $r->connection()->remote_ip();  
 To:  
     if (defined $r->useragent_ip()) {  
         $self->{remote_ip} = $r->useragent_ip();  
     } else {  
         $self->{remote_ip} = $r->connection->remote_ip();  
     }  

Finally, after all that, my code doesn't run because of compile issues - but known good test code does work.

This is in no way satisfactory for production purposes, but does help in getting a development environment up and running.

25 January 2016

Laravel - Using route parameters in middleware

I'm busy writing an application which is heavily dependent on personalized URLs.  Each visitor to the site will have a PURL which I need to communicate to the frontend so that my analytics tags can be associated with the user.

Before I go any further I should note that I'm using Piwik as my analytics package, and it respects "Do Not Track" requests.  We're not using this to track people, but we are tying it to our client's existing database of user interests.

I want the process of identifying the user to be as magical as possible so that my controllers can stay nice and skinny.  Nobody likes a fat controller right?

I decided to use middleware to trap all my web requests to assign a "responder" to the request.  Then I'll use a view composer to make sure that all of the output views have this information readily available.

The only snag in this plan was that the Laravel documentation was a little sketchy on how to get the value of a route parameter in middleware.  It turns out that the syntax I was looking for was $request->route()->parameters(), which neatly returns the route parameters in my middleware.

The result is that every web request to my application is associated with a visitor in my database and this unique id is sent magically to my frontend analytics.

So, here are enough of the working pieces to explain what my approach was:
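
The middleware piece looks roughly like this - a sketch in which the class name, the Visitor model, and the purl route parameter are illustrative stand-ins rather than the exact original code:

 <?php

 namespace App\Http\Middleware;

 use Closure;
 use App\Visitor;

 class IdentifyResponder
 {
     public function handle($request, Closure $next)
     {
         // Pull the PURL out of the route parameters - this is the bit the
         // documentation was sketchy about.
         $parameters = $request->route()->parameters();
         $purl = isset($parameters['purl']) ? $parameters['purl'] : null;

         if ($purl !== null) {
             // Look up the visitor behind the personalized URL and share it
             // with every view (a view composer can pick it up from here).
             $responder = Visitor::where('purl', $purl)->first();
             view()->share('responder', $responder);
         }

         return $next($request);
     }
 }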

19 January 2016

Using OpenSSH to setup an SFTP server on Ubuntu 14.04

I'm busy migrating an existing server to the cloud and need to replicate the SFTP setup.  They're using a password to authenticate a user and then uploading data files for a web service to consume.

YMMV - My use case is pretty specific to this legacy application so you'll need to give consideration to the directories you use.

It took a surprising amount of reading to find a consistent set of instructions so I thought I should document the setup from start to finish.

Firstly, I set up the group and user that I will be needing:

 groupadd sftponly  
 useradd -G sftponly username  
 passwd username  

Then I made a backup copy of /etc/ssh/sshd_config and edited the original.

Right at the end of the file add the following:

  Match group sftponly   
    ChrootDirectory /usr/share/nginx/html/website_directory/chroot   
    X11Forwarding no   
    AllowTcpForwarding no   
    ForceCommand internal-sftp -d /uploads   

If this block appears before global settings like UsePAM then those settings get pulled into the Match block (a Match block extends until the next Match keyword or the end of the file), your sshd_config is borked, and you won't be able to connect on port 22.

We force the user into the /uploads directory by default when they login using the ForceCommand setting.

Now change the Subsystem setting.  I've left the original as a comment in here.  The parameter "-u 0002" sets the default umask for the user.

 #Subsystem sftp /usr/lib/openssh/sftp-server  
 Subsystem sftp internal-sftp -u 0002  

I elected to place the base chroot folder inside the website directory for a few reasons.  Firstly, this is the only website or service running on this VM so it doesn't need to play nicely with other use cases.  Secondly I want the next sysadmin who is trying to work out how this all works to be able to immediately spot what is happening when she looks in the directory.

Then because my use case demanded it I enabled password logins for the sftp user by finding and changing the line in /etc/ssh/sshd_config like this:

 # Change to no to disable tunnelled clear text passwords  
 PasswordAuthentication yes  

The base chroot directory must be owned by root and not be writeable by any other groups.

 cd /usr/share/nginx/html/website_directory  
 mkdir chroot  
 chown root:root chroot/  
 chmod 755 chroot/  

If you skip this step then your connection will be dropped with a "broken pipe" message as soon as you connect.  Looking in your /var/log/auth.log file will reveal errors like this: fatal: bad ownership or modes for chroot directory

The next step is to make a directory that the user has write privileges to.  The base chroot folder is not writeable by your sftp user, so make an uploads directory and give them "writes" (ha!) to it:

 mkdir uploads  
 chown username:username uploads  
 chmod 755 uploads  

If you skip that step then when you connect you won't have any write privileges.  This is why we had to create a chroot base directory and then place the uploads folder off it.  I chose to stick the base in the web directory to make it obvious to spot, but obviously in more general cases you would place this in more sensible locations.

Finally I link the uploads directory in the chroot jail to the uploads directory where the web service expects to find files.

 cd /usr/share/nginx/html/website_directory  
 ln -s chroot/uploads uploads  

I feel a bit uneasy about a password login being used to write files to a directory used by a web service, but in my particular use case my firewall whitelists our office IP address on port 22, so nobody outside of our office can connect.  I'm also using fail2ban just in case somebody manages to get access to our VPN.

01 December 2015

Lowering your AWS bill by moving to reserved instances

The cost saving from reserving an EC2 instance is quite dramatic.  This morning I moved two web servers to reserved instances and am dropping my hosting cost by 64% for those servers.

There isn't actually any effort required in moving your on-demand EC2 instance to a reserved instance.  The only actual change is a billing one; you don't need to do anything in your instance configuration.

The only important thing to remember is that you're reserving an instance in a particular availability zone.  The billing effect will only apply to instances launched in the same availability zone.

Amazon will apply the discount of having a reserved instance to your invoice automatically.  They provide quite extensive documentation on reserved instances on their site (here).

23 November 2015

Fixing where php5 cronjob maxlife mails root user about module mcrypt already loaded

I'm running an nginx server with php5-fpm and was always getting mail in /var/mail/root telling me that the cronjob running /usr/lib/php5/maxlifetime was throwing warnings.

The warning was:

 PHP Warning:  Module 'mcrypt' already loaded in Unknown on line 0

To fix this I had a look at the file and noticed that it loops through the various SAPIs and runs a command for each.  The relevant lines in the shell script look like this:

 for sapi in apache2 apache2filter cgi fpm; do  
   if [ -e /etc/php5/${sapi}/php.ini ]; then  

So I removed the mcrypt extension from my apache2 php.ini (/etc/php5/apache2/php.ini) and now the maxlifetime shell script runs without throwing warnings.