Monday, December 15, 2014

Caching Laravel with Varnish


PHP Framework popularity as at 2013 - Sitepoint
After a very good experience using Varnish to cache a WordPress site, we decided to look at caching Laravel.

Laravel always generates cookies regardless of whether a person is logged in or not.  This interferes with Varnish which by default will pass all requests with a cookie to the backend and skip the cache.

In our particular case the site lets users log in and then presents them with custom content.  This means that cookies are not restricted to a particular path, so we can't discard cookies based on the request URL the way we did for WordPress, where we discarded them for everything except /wp-admin/* requests.

My solution was to use a package called session-monster ( Packagist ) which sets a response header if the data in the Laravel session can be ignored.  Varnish can detect this header and prevent the cookie from being set since we don't really need it.  This together with the varnish config below handily caches pages for all the users who are not logged in.

Unfortunately we're doing this as an afterthought to add value for a client, and caching was not part of our original project design.  This means that there is no development time available to make use of edge side includes (ESI), which would allow caching the static parts of a page even for logged in users.  Early proof of concept tests show that implementing ESI is not particularly difficult.  Here's a useful looking blog post on how to implement it.  Luckily in our case we don't expect there to be many logged in users compared to anonymous ones.

So assuming that you've got nginx, HHVM, and Varnish up and running, here is an example configuration file (the syntax is for Varnish 3; later versions renamed several of these hooks and variables):

 backend default {  
  .host = "127.0.0.1";  
  .port = "8080";  
 }  
 acl purge {  
  "127.0.0.1";  
  "localhost";  
 }  
 sub vcl_recv {  
   # handle purge requests  
   if (req.request == "PURGE") {  
     if (!client.ip ~ purge) {  
       error 405 "Not allowed.";  
     }  
     ban("req.url ~ "+req.url+" && req.http.host == "+req.http.host);  
     error 200 "OK";  
   }  
   if (req.url ~ "\.(png|gif|jpg|swf|css|js)$") {  
     return(lookup);  
   }  
 }  
 sub vcl_fetch {  
   # strip the cookie before the static file is inserted into cache.  
   if (req.url ~ "\.(png|gif|jpg|swf|css|js)$") {  
     unset beresp.http.set-cookie;  
   }  
   # remove some headers we never want to see  
   unset beresp.http.Server;  
   unset beresp.http.X-Powered-By;  
   unset beresp.http.X-Pingback;  
   set beresp.do_esi = true; /* Do ESI processing */  
   set beresp.ttl = 10m;  
   # don't cache response to posted requests or those with basic auth  
   if ( req.request == "POST" || req.http.Authorization ) {  
      return (hit_for_pass);  
   }  
   # Laravel always adds a session cookie - we remove it with session monster and check it here  
   # Do this before checking the page state but after post  
   if (beresp.http.X-No-Session ~ "yeah") {  
     unset beresp.http.set-cookie;  
   }  
   # only cache status ok  
   if ( beresp.status != 200 ) {  
     return (hit_for_pass);  
   }  
   # else ok to cache the response  
   return (deliver);  
 }  
 sub vcl_deliver {  
   if (obj.hits > 0) {  
     set resp.http.X-Cache = "HIT";  
   }  
   else {  
     set resp.http.X-Cache = "MISS";  
   }  
   unset resp.http.Via;  
   unset resp.http.X-Varnish;  
 }  
 sub vcl_hit {  
  if (req.request == "PURGE") {  
   purge;  
   error 200 "OK";  
  }  
 }  
 sub vcl_miss {  
  if (req.request == "PURGE") {  
   purge;  
   error 404 "Not cached";  
  }  
 }  

Installing the Laravel side of things is simple:
  1. Add a require for session monster to your composer file ( "haifanghui/session-monster": "dev-master" )
  2. Edit your application config and include the provider as in the snippet below
  3. Edit app/config/session.php and set the session lifetime to a number you feel comfortable with

The provider entry for step 2:

   'providers' => array(  
     'HaiFangHui\SessionMonster\SessionMonsterServiceProvider'  
   ),  

You need to set the session timeout so that the cookie expires sometime after the user logs out.  Even though Laravel will stop emitting the cookie when the user logs out the browser will keep sending it and breaking the cache.  
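
A quick way to check that this is working is to look at the X-Cache header that the vcl_deliver block above adds to every response.  The helper below is only a minimal sketch: the function name cache_status is made up for this example and www.example.com stands in for your own site.  It reads response headers on standard input and prints the X-Cache value.

```shell
#!/bin/sh
# Minimal sketch: print the X-Cache value from HTTP response headers on stdin.
# cache_status is a hypothetical helper name, not part of Varnish or curl.
cache_status() {
  # Strip carriage returns, then match the header name case-insensitively.
  tr -d '\r' | awk -F': *' 'tolower($1) == "x-cache" { print $2; exit }'
}

# Example (www.example.com is a placeholder):
#   curl -s -o /dev/null -D - http://www.example.com/ | cache_status
```

Requesting the same page twice should print MISS and then HIT.  Remember that a browser which still holds an old session cookie will keep being passed to the backend, so test with a client that sends no cookies, as curl does by default.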

Wordpress on Hiphop / Nginx / Varnish

I recently was asked to investigate speeding up one of the Wordpress sites of a fairly large government organization in Britain.  A large part of my investigation focused on the server stack because I felt that we could get more out of the hardware that was provisioned for us.

I decided to set up a stack on my development machine to see how it would work and if it was feasible.  I settled on nginx with HipHop (HHVM) and a Varnish frontend cache.  I realize that nginx would be just fine as both the cache and the server, but in this particular case it would not be possible to replace Apache with nginx on the live server.  I also wanted to experiment with ESI and it looked better documented in Varnish than nginx.

Installing HHVM is very easy:

 wget -O - http://dl.hhvm.com/conf/hhvm.gpg.key | apt-key add -  
 echo deb http://dl.hhvm.com/ubuntu saucy main | tee /etc/apt/sources.list.d/hhvm.list  
 apt-get update  
 apt-get install hhvm  

Installing nginx is also very easy:

 sudo apt-get update  
 sudo apt-get install nginx  

Instead of manually configuring nginx to use hhvm I used a tool which ships with it (found at /usr/share/hhvm/install_fastcgi.sh).  The Github page has documentation (here) in case you don't want to use the packaged install script.  Note that the install script will install for Apache and nginx.

There is a tool (here) that will migrate your Apache config to nginx.  I used it to get a demonstration config file which I then edited after RTFM on nginx config.

My test nginx config file ( /etc/nginx/sites-enabled/default ) looks like this:

 server {  
     listen    8080;  
     server_name fastwp.andy;  
     root /home/andy/development/fastwp.andy;  
     index index.php index.html;  
     port_in_redirect off;  
     set $cache_uri $request_uri;  
     if ( $request_method = POST) {  
       set $cache_uri 'null_cache';  
     }  
     location / {  
       add_header X-Esi 1;    
       try_files $uri $uri/ /index.php?$args;  
     }  
 }  

Take note of the port_in_redirect because it can help with issues around Varnish or nginx including 8080 in the url when doing a redirect (this can happen if you access a url without a trailing slash).  If you're getting port 8080 and you've tried this then also double check your Wordpress site config to make sure the site root does not include 8080.

Varnish is included in the Ubuntu packages, but on their site they recommend rather using the packages supplied by varnish-cache.org.  They list the steps required to set it up and I'm not going to reproduce them here because they might change - rather go to their site.

To configure Varnish on Ubuntu you need to edit /etc/default/varnish (for example with nano).  On RHEL this file is /etc/sysconfig/varnish.

There are a number of options provided.  The easiest way to get running is to pick alternative 2 by commenting out the other options.

Just make sure to change the port to 80 as below:

 DAEMON_OPTS="-a :80 \  
        -T localhost:6082 \  
        -f /etc/varnish/default.vcl \  
        -S /etc/varnish/secret \  
        -s malloc,256m"  

At this point Varnish will listen for incoming web requests on port 80 and all we need to do is wire it up to nginx. To do so nano /etc/varnish/default.vcl

I stitched my Varnish configuration together from a number of sources and will run through it piece by piece here.

Firstly we tell Varnish where to find nginx and set up an access control list (ACL) that identifies the local machine (more later).

 backend default {  
  .host = "localhost";  
  .port = "8080";  
 }  
 acl purge {  
  "127.0.0.1";  
  "localhost";  
 }  

After that we add to the various hooks that Varnish provides.  The code below will likely need to be modified for your site.  I got it from a variety of sources and there might even be some unnecessary duplication.

 sub vcl_recv {  
   # only using one backend  
   set req.backend = default;  
   # only cache example.com and optionally the www subdomain  
   if (req.http.host !~ "^(www\.)?example\.com$") {  
      return(pass);  
   }  
   # remove cookie from static content and always return cached version  
   if (req.url ~ "\.(png|gif|jpg|swf|css|js)$") {  
     unset req.http.cookie;  
     return(lookup);  
   }  
   # allow for purge option but only from the site we allow  
   if (req.request == "PURGE") {  
    if (!client.ip ~ purge) {  
     error 405 "Not allowed.";  
    }  
    ban("req.url ~ "+req.url+" && req.http.host == "+req.http.host);  
    error 200 "OK";  
   }  
   # set standard proxied ip header for getting original remote address  
   set req.http.X-Forwarded-For = client.ip;  
   # logged in users must always pass  
   if( req.url ~ "^/wp-(login|admin)" || req.http.Cookie ~ "wordpress_logged_in_" ){  
     return (pass);  
   }  
   # don't cache search results  
   if( req.url ~ "\?s=" ){  
   #  return (pass);  
   }  
   # always pass through posted requests and those with basic auth  
   if ( req.request == "POST" || req.http.Authorization ) {  
      return (pass);  
   }  
   # remove cookies from everything other than admin areas so we can cache content  
   if (!(req.url ~ "wp-(login|admin)")) {  
     unset req.http.cookie;  
   }  
   # else ok to fetch a cached page  
   return (lookup);  
 }  
 sub vcl_fetch {  
   # remove some headers we never want to see  
   unset beresp.http.Server;  
   unset beresp.http.X-Powered-By;  
   unset beresp.http.X-Pingback;  
   set beresp.do_esi = true; /* Do ESI processing */  
   set beresp.ttl = 24h;  
   # don't cache response to posted requests or those with basic auth  
   if ( req.request == "POST" || req.http.Authorization ) {  
      return (hit_for_pass);  
   }  
   # only cache status ok  
   if ( beresp.status != 200 ) {  
     return (hit_for_pass);  
   }  
   # remove cookies from static content  
   if (req.url ~ "\.(png|gif|jpg|swf|css|js)$") {  
    unset beresp.http.set-cookie;  
   }  
   # Drop any cookies Wordpress tries to send back to the client.  
   if (!(req.url ~ "wp-(login|admin)")) {  
     unset beresp.http.set-cookie;  
   }  
   # else ok to cache the response  
   return (deliver);  
 }  
 sub vcl_deliver {  
   if (obj.hits > 0) {  
     set resp.http.X-Cache = "HIT";  
   }  
   else {  
     set resp.http.X-Cache = "MISS";  
   }  
   unset resp.http.Via;  
   unset resp.http.X-Varnish;  
 }  
 sub vcl_hit {  
  if (req.request == "PURGE") {  
   purge;  
   error 200 "OK";  
  }  
 }  
 sub vcl_miss {  
  if (req.request == "PURGE") {  
   purge;  
   error 404 "Not cached";  
  }  
 }  
 sub vcl_hash {  
   hash_data( req.url );  
   if ( req.http.host ) {  
     hash_data( regsub( req.http.host, "^([^\.]+\.)+([a-z]+)$", "\1\2" ) );  
   } else {  
     hash_data( server.ip );  
   }  
   # ensure separate cache for mobile clients (WPTouch workaround)  
   if( req.http.User-Agent ~ "(iPod|iPhone|incognito|webmate|dream|CUPCAKE|WebOS|blackberry9\d\d\d)" ){  
     hash_data("touch");  
   }  
   return (hash);  
 }  
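
To exercise the purge handling above, send a request with the PURGE method from a machine listed in the purge ACL, which here means the server itself.  The following is a rough sketch rather than a tested recipe: purge_url and the CURL override are names invented for this example, and it assumes Varnish is listening on port 80 of 127.0.0.1 as configured earlier.

```shell
#!/bin/sh
# Sketch: ask the local Varnish to invalidate a path for a given host.
# purge_url is a hypothetical helper. Set CURL=echo to print the arguments
# instead of sending the request.
purge_url() {
  host=$1
  path=$2
  ${CURL:-curl} -s -X PURGE -H "Host: $host" "http://127.0.0.1$path"
}

# Example, run on the Varnish server itself:
#   purge_url example.com /about/
```

Because vcl_recv builds the ban with req.url ~ followed by the requested URL, the path is treated as a regular expression, so purging /about/ also invalidates any cached URL on that host that contains /about/.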

For further reading I recommend:

Monday, July 28, 2014

Server sent events in PHP

Server sent events are really pretty cool.  They let your application function a little bit more like an application and a little less like a click adventure in the web wonderland.  They are very simple to code and allow your backend code to notify your frontend of progress or other changes.
I could get into trouble for this class comment

At time of writing they are not supported by Internet Explorer ( see here ) but hopefully Microsoft will either stop making Internet Explorer or bring it up to speed with modern browsers.  Yeah I know that's not going to happen, but we can wish right?

You don't need to retain an open connection for every visitor to your site because browsers will reopen a closed connection after a few seconds.  The additional load for implementing SSE seems to be manageable according to people like this guy who have done tests.

My first project implementing them was for a database consistency tool that is intended to be run against the database in off-peak times to verify that things are as we expect them to be.  Basically I wanted a tool to check for dirty data, but more importantly to audit the financial transactions occurring on our site.  I wanted to play with SSE so implemented them as a notification service.  As the program runs through the various batches of tests it spits out information to the frontend about what it's busy with, if it found an error, and other interesting snippets.

There are bunches of tutorials available on them out there so I'm not going to paste code.

One thing that did concern me a bit is the possibility of forged messages.  It's pretty simple to check the origin of incoming messages against a whitelist of domains.  Here is an example from html5rocks.com that has a simple origin validation:

 source.addEventListener('message', function(e) {  
  if (e.origin != 'http://example.com') {  
   alert('Origin was not http://example.com');  
   return;  
  }  
  ...  
 }, false);  

Personally I think SSE should be in the toolkit of any web application developer.

Saturday, November 16, 2013

Useful tool when helping Windows users install their OS

This post is more of a "note to self" than an attempt to be useful to somebody else.

I am even typing this post in Internet Explorer. It feels like a Pterodactyl is about to swoop into the room and try to shit on HTML standards.

In any case I gave my backup laptop to my kid and so had to install Windows on it. Why? Well I want her to be able to play games so Windows seems the best choice. Plus she can still learn open source programming languages, albeit in a funny way.

Luckily I remembered this blog that I read and they posted this really nifty tool called Ninite. Click the link here (http://ninite.com/). This helpful tool lets you download a single install file that installs free (either OSS or free to use) software like Libre Office, Flash, Notepad++, antivirus, etc.

Let's just say that it feels almost like an Ubuntu meta-package that helpfully installs everything you need, but you get to choose.

Plus it's all free and the only software I don't trust (at the moment) is Truecrypt. Why? Well we don't know who wrote it.

So save yourself a bunch of time and use Ninite :) It's a really useful tool to introduce kids to FOSS.

Thursday, September 26, 2013

A rare Google UI failure

A Google UI employee contemplating how to make Larry and Sergei some money
I updated my Google Chrome mostly because I trust Google in terms of UI. In the past they've always released updates that make things faster or easier to reach (while perhaps adding new extra functions).

Why am I so disappointed? Well now when you open a new tab you're confronted with a Google search bar and your most visited sites. This most probably sounds like a good thing. BUT... your cursor is already in the omnibar, which does the same thing as a Google search if you don't type a valid URL. So to make use of the default Google search box you have to tab or click into it.

More annoyingly you can't choose to have your installed apps show instead of a Google search box and an acknowledgement that they track your browsing activity and share that with the American spy agency. So instead of hitting CTRL-T and clicking feedly I have to face NSA surveillance by Google showing me they're tracking me, click on the top left to open apps, then click Feedly. Why? Because for some reason Google thinks it is more important for me to have no option but to use their search engine and that I am incapable of deciding to use applications instead of sites I visit regularly.

Do I support American spy surveillance? Not really. This nation has invaded a foreign nation every decade for the past 60 years. They're the only nation to have ever used nuclear WMD on civilian targets. They routinely hack and spy on neutral foreign nations. I trust the American government a great deal less than I trust Google.

Nonetheless, let's assume for the moment that using an American Internet product is somehow "safe".

You can't choose which search engine is default. Google says "Don't be evil" .... really? As soon as you have a majority browser share you enforce your search engine as a mandatory non-changeable option? And you require me to click into it? Fuck off Larry and Sergei.... you click (since you charge people for clicks ).

And now instead of pressing CTRL-T and hitting an app (like RescueTime or my REST interface or my news browser) I have to move my mouse to the top left? How is this beneficial to me, Larry or Sergei? It really reduces the usefulness of Chrome as a developer browser. What benefit is there for Chrome to load your Google default search into my app tabs when your omnibar works the same? Are you fucking retarded? Don't you see you're stopping me from hitting my apps quickly?

Fork Chromium and get a sensible community version.

Thursday, August 15, 2013

Fixing the "smsbox_list empty" error in Kannel

I got this error even though I had an smsbox defined. 

Unfortunately I had forgotten to create the smsbox-route group, so this is a very quick fix:

#---------------------------------------------
## SMS ROUTING
##---------------------------------------------
group = smsbox-route
smsbox-id = mainbox
smsc-id = ztemodem-smsc-group
sim-buffering = false
... continues ...

Just make sure the IDs match up, so my smsbox group begins with this:
#---------------------------------------------
## SMS BOX
##---------------------------------------------
group = smsbox
smsbox-id = mainbox
... continues ...

And my SMSC looks like this:
#---------------------------------------------
## GSM MODEM SMSC
##---------------------------------------------
group = smsc
smsc-id = ztemodem-smsc-group
... continues ...

This successfully cleared the "smsbox_list empty" error and allowed messages to be delivered properly. Kannel had until then been able to send messages and receive delivery reports, but was not reading messages stored on the SIM.

I've included sim-buffering = false in the above config because some people on the kannel mailing list suggested experimenting with it.

Also, the "message-storage" tag in my modem definition is set to "sm". Some people suggested trying it on "me", but this didn't solve the problem for me and setting to "sm" now works.
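
Since the fix comes down to the ids in the smsbox-route group matching the ids in the smsbox and smsc groups, the check can be scripted.  This is only a sketch under a few assumptions: group_value and check_route are made-up helper names, there is a single smsbox-route group, and each id is one unquoted value as in the snippets above.

```shell
#!/bin/sh
# Sketch: check that the ids used in the smsbox-route group exist elsewhere
# in a Kannel config. group_value and check_route are hypothetical helpers.

# Print the value of KEY for every group whose "group =" line equals GROUP.
group_value() {
  awk -v group="$1" -v key="$2" '
    /^[ \t]*#/ { next }                       # skip comment lines
    {
      n = index($0, "=")
      if (n == 0) next                        # not a key = value line
      k = substr($0, 1, n - 1)
      v = substr($0, n + 1)
      gsub(/^[ \t]+|[ \t\r]+$/, "", k)
      gsub(/^[ \t]+|[ \t\r]+$/, "", v)
      if (k == "group") current = v           # a new group starts here
      else if (current == group && k == key) print v
    }
  ' "$3"
}

# Compare the route ids against the smsbox and smsc groups.
check_route() {
  conf=$1
  box=$(group_value smsbox-route smsbox-id "$conf")
  smsc=$(group_value smsbox-route smsc-id "$conf")
  if ! group_value smsbox smsbox-id "$conf" | grep -Fqx -- "$box"; then
    echo "no smsbox group with smsbox-id '$box'"
    return 1
  fi
  if ! group_value smsc smsc-id "$conf" | grep -Fqx -- "$smsc"; then
    echo "no smsc group with smsc-id '$smsc'"
    return 1
  fi
  echo "ids match"
}

# Example (the path is a placeholder for wherever your config lives):
#   check_route /etc/kannel/kannel.conf
```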

Wednesday, August 14, 2013

Setting up a Kannel SMS centre on a Raspberry Pi with a K3565-Z modem

Our office (who wants me to mention how awesome they are) had a need for a means to send/receive automated sms messages relating to online banking. We decided to give a Raspberry Pi the job and so I have had the fun of setting it all up.

First off I installed Raspbian ( http://www.raspbian.org/ ), which is a Debian fork.  This is easy enough to do by following the guide on eLinux ( here ), which is linked to from the downloads page of the Raspberry Pi project website.

Next came a standard LAMP stack install.  I needed to have a web-server running because we will be running a management package on the box to track messages etc.

Next came Kannel.  I used the default package available from the Raspbian repositories so there was no need for fiddling.

My kannel.conf file was the next hurdle.  For the most part this is pretty easy to set up from the examples given on Kannel's site and elsewhere on the web.  I want to just mention the modem part, which was perhaps the most challenging part of this endeavour:

 #---------------------------------------------                
 ## GSM MODEM SMSC  
 ##---------------------------------------------   
 group = smsc  
 smsc = at  
 smsc-id = ztemodem-smsc-group  
 device = /dev/gsmmodem  
 log-level = 0  
 connect-allow-ip = "127.0.0.1; localhost; 192.168.3.*"  
 modemtype = auto  
 speed = 9600  
 my-number =   
 sms-center = +27831000113  
 #validityperiod = 167  
 alt-charset = "ASCII"  
 #---------------------------------------------                
 ## MODEM DEFINITION  
 ##---------------------------------------------   
 group = modems  
 id = vodafone  
 name = "vodafone"  
 detect-string = "ZTE INCORPORATED"  
 message-storage = sm  
 init-string = "ATQ0 V1 E1 S0=0 &C1 &D2 +FCLASS=0"  

Note that I've set an init string explicitly. This is because I was getting the "got +CMT but pdu_extract failed" error consistently when I sent a message. I obtained that init string by using wvdialconf, so if you have a modem other than the K3565-Z you might be able to use it to get your init string. Apparently it's not required for all modems though.

Ah, but I'm missing the best part... how do you actually get the modem to be a modem? The K3565-Z is a "zerocd" modem. When you plug it into a Windows box it mounts as a CD-ROM, installs drivers, which then flip it into being a modem.

To illustrate here is the output of lsusb on the Raspberry with the modem as it is on powerup:

 Bus 001 Device 002: ID 0424:9512 Standard Microsystems Corp.  
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub  
 Bus 001 Device 003: ID 0424:ec00 Standard Microsystems Corp.  
 Bus 001 Device 004: ID 19d2:2000 ZTE WCDMA Technologies MSM MF627/MF628/MF628+/MF636+ HSDPA/HSUPA  

And here is the output after using usb_modeswitch -c /etc/usb_modeswitch.conf to flip it into a modem:

 Bus 001 Device 002: ID 0424:9512 Standard Microsystems Corp.  
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub  
 Bus 001 Device 003: ID 0424:ec00 Standard Microsystems Corp.  
 Bus 001 Device 006: ID 19d2:0063 ZTE WCDMA Technologies MSM K3565-Z HSDPA  

Note that the product ID has changed (from 2000 to 0063), as well as the description. I am not sure why, but the stick was never mounted as a CD-ROM for me, so I could not eject it to flip its state. So I had to use usb_modeswitch to do it.

 sudo usb_modeswitch -c /etc/usb_modeswitch.conf  

I found the correct settings for this on a forum (here ) which also gives some useful information on how to accomplish the flip. You need to append the following snippet to your /etc/usb_modeswitch.conf file.

 #######################################################  
 # ZTE-K3565 , VODAFONE BITE GSM  
 # Gediminas Simanskis  
 # www.edevices.lt  
 DefaultVendor= 0x19D2  
 DefaultProduct= 0x2000  
 TargetVendor= 0x19D2  
 TargetProduct= 0x0052  
 MessageEndpoint=0x01  
 MessageContent="5553424308E0CC852400000080000C85000000240000000000000000000000"  

You will need to experiment to find the best way to trigger the flip, but this is really the only complication in the setup.
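
One possible approach is to flip the stick only when lsusb still shows it in its power-up state.  The sketch below assumes the same USB ids as in the lsusb output earlier in this post (19d2:2000 before the flip and 19d2:0063 after it).  modem_mode and flip_if_needed are names invented for the example.

```shell
#!/bin/sh
# Sketch: flip the stick only when lsusb still shows it in its power-up state.
# modem_mode and flip_if_needed are hypothetical helpers; the USB ids are the
# ones from the lsusb output in this post.

# Read lsusb output on stdin and print storage, modem, or absent.
modem_mode() {
  awk '
    $6 == "19d2:2000" { mode = "storage" }   # power-up (zerocd) state
    $6 == "19d2:0063" { mode = "modem" }     # after usb_modeswitch
    END { print (mode == "" ? "absent" : mode) }
  '
}

# Run usb_modeswitch only if the stick has not been flipped yet.
flip_if_needed() {
  if [ "$(lsusb | modem_mode)" = "storage" ]; then
    sudo usb_modeswitch -c /etc/usb_modeswitch.conf
  fi
}
```

Calling flip_if_needed from a boot script is one option, but which trigger works best will depend on your setup.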