01 December 2015

Lowering your AWS bill by moving to reserved instances

The cost saving from reserving an EC2 instance is quite dramatic.  This morning I moved two web servers to reserved instances and am dropping my hosting cost by 64% for those servers.

There isn't actually any effort required in moving your on-demand EC2 instance to a reserved instance.  The change is purely a billing one; you don't need to touch your instance configuration.

The only important thing to remember is that you're reserving an instance in a particular availability zone.  The billing effect will only apply to instances launched in the same availability zone.

Amazon will apply the reserved instance discount to your invoice automatically, and they provide quite extensive documentation on reserved instances on their site.
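
If you prefer the command line to the console, the AWS CLI can handle the purchase too.  This is only a sketch; the instance type, availability zone, and offering id below are placeholders:

 # list matching offerings in the zone your instance runs in
 aws ec2 describe-reserved-instances-offerings \
     --instance-type t2.small --availability-zone eu-west-1a
 # purchase one of the offerings returned above
 aws ec2 purchase-reserved-instances-offering \
     --reserved-instances-offering-id <offering-id> --instance-count 1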

23 November 2015

Fixing the php5 maxlifetime cronjob mailing the root user about module mcrypt already being loaded

I'm running an nginx server with php5-fpm and kept getting mail in /var/mail/root telling me that the cronjob running /usr/lib/php5/maxlifetime was throwing warnings.

The warning was:
PHP Warning:  Module 'mcrypt' already loaded in Unknown on line 0

To fix this I had a look at the file and noticed that it loops through the various SAPIs and runs a command for each.  The relevant lines in the shell script look like this:
for sapi in apache2 apache2filter cgi fpm; do
    if [ -e /etc/php5/${sapi}/php.ini ]; then
So I removed the mcrypt extension from my apache2 php.ini (/etc/php5/apache2/php.ini) and now the maxlifetime shell script runs without throwing warnings.
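
If you want to see every ini file that loads the extension before deleting anything, a quick grep (assuming the standard Debian/Ubuntu php5 layout) will show you where the duplicates are:

 grep -Rl "mcrypt" /etc/php5/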

12 November 2015

Updating the database when migrating a WordPress site between domains

If you're using a staging server to test your WordPress changes then you'll be deploying WordPress to a new domain once your test team gives the go-ahead.

Unfortunately this can break WordPress quite badly.  All the links in your content are essentially hard-coded into the database content table, and there are settings in the options table that help WordPress decide on redirects.

Here are three useful SQL statements that will make your life a little easier when migrating.  You can include them as part of your scripted deploy or just run them manually if you don't deploy WordPress often.

Edit them to suit your domain configuration, but they'll help you to change the links and settings in your database to point to the new domain.
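
The statements themselves were shared as an embedded gist and aren't reproduced here, but the usual suspects look something like this (assuming the default wp_ table prefix, with placeholder staging and live domains):

 UPDATE wp_options SET option_value = replace(option_value, 'http://staging.example.com', 'http://www.example.com')
     WHERE option_name IN ('home', 'siteurl');
 UPDATE wp_posts SET guid = replace(guid, 'http://staging.example.com', 'http://www.example.com');
 UPDATE wp_posts SET post_content = replace(post_content, 'http://staging.example.com', 'http://www.example.com');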

26 August 2015

Setting up a new user in Ubuntu from scratch

Adding new users to Ubuntu is easy because of the convenience tools that exist.

Start with the command

sudo useradd -d /home/testuser -m testuser

This creates the user and sets up a default home directory.  The user doesn't have a password, but you could add one with passwd if you wanted to.

Then create a .ssh directory in their home directory.  Create a file called authorized_keys in that directory and copy the contents of the user's public key into it.

Chown the .ssh directory (and file) to the user and chmod the file to 600.  The directory should be mode 700.
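
In shell terms, those steps look roughly like this (assuming the user's public key has been copied to /tmp/testuser.pub):

 sudo mkdir /home/testuser/.ssh
 sudo cp /tmp/testuser.pub /home/testuser/.ssh/authorized_keys
 sudo chown -R testuser:testuser /home/testuser/.ssh
 sudo chmod 700 /home/testuser/.ssh
 sudo chmod 600 /home/testuser/.ssh/authorized_keys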

Make sure that /etc/ssh/sshd_config is set up to deny password logins.

The user should be able to log in using their public key by setting up their .ssh/config on their home machine.
Host foo
HostName server.ip.address
User testuser
IdentityFile ~/.ssh/id_rsa
If you want to set them up with your custom bash configuration then remember to copy the files into their home directory and chown them.  You will need to change their shell to bash by editing /etc/passwd or by running chsh -s /bin/bash testuser.

If you want them to have administrative privileges on the machine you need to add them to the "sudo" group with this command: usermod -aG sudo testuser.  Because you've disabled password access to the machine you'll need to add a new config file in /etc/sudoers.d that allows them to sudo without a password.  It will need to have a line like this: testuser ALL=(ALL) NOPASSWD:ALL

If you're feeling lazy you might want to use a script.  I've created one and shared it below, but I strongly suggest that you rather write your own and don't use mine for anything other than an idea.  I'm not guaranteeing that it works or even that it's particularly useful.


18 August 2015

Fixing missing msvcp110.dll in xampp on Windows

I need to use a Windows server to deploy a program I'm busy writing.

I kept getting a problem with Apache starting up that prevented it from loading the MS SQL Server PDO drivers.

The message was:
the program can't start because msvcp110.dll is missing from your computer. Try reinstalling the program to fix the problem
and this usually relates to the Visual C++ Redistributable for Visual Studio 2012 package not being installed.

I had previously installed this (it's available on the Microsoft site) but was still getting the error.

Eventually I stumbled on this topic on the Apache Friends forum, which advised copying the msvcp110.dll file to both the \xampp\apache\bin and the \xampp\php directories.

Apparently Apache wasn't able to follow the OS path to find the file.  In my case it was already in the php directory but not with the other binaries for Apache.
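
Assuming a default install on the C: drive and that the redistributable has put the DLL in System32 (the source path may differ on your machine), the copy is simply:

 copy C:\Windows\System32\msvcp110.dll C:\xampp\apache\bin\
 copy C:\Windows\System32\msvcp110.dll C:\xampp\php\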

After copying it there Apache restarted without errors and the PDO driver for MS-SQL was enabled.

11 August 2015

Fixing broken thumbnails in Drupal


If your autoscaled images are not loading on Drupal then here are some steps to take to troubleshoot the problem.

Firstly, if you have logging enabled then visit http://yoursite.com/admin/reports/dblog to see a list of the events.

If you see events with the message "Unable to generate the derived image located at...." then check the permissions on your files directory.  This is usually /sites/default/files.
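
A quick way to fix the permissions, assuming your web server runs as www-data and your files directory is in the default location, is something like:

 sudo chown -R www-data:www-data sites/default/files
 sudo chmod -R 755 sites/default/files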

Then check that the GD library is installed for PHP.  On Ubuntu you can install it with apt-get install php5-gd

If you don't see any events then try opening the image url in a new tab on your browser to confirm that your web server is properly rewriting the request to index.php instead of trying to serve a file out of the directory.

On Nginx you should consider using location blocks like this:

   # These next locations help with thumbnails - https://www.drupal.org/node/2374961  
   location @rewrite {  
     rewrite ^/(.*)$ /index.php?q=$1 last;  
   }  
   location ~ ^/sites/.*/files/styles/ {  
     try_files $uri @rewrite;  
   }  

This makes sure that requests to locations in the styles directory are routed to index.php if the requested file does not exist in the filesystem.

13 July 2015

Setting up Nginx as a reverse proxy for Apache with SSL termination

Reverse Proxy diagram from Wiki Commons
We're currently hosting client sites on a Rackspace server and using their Load Balancer feature to terminate SSL so that we don't have to install multi-site certificates.

We only attach one node to each Load Balancer so we're paying for more than we're using.  My proof of concept is to use Nginx to terminate SSL and proxy requests to the Apache server.  This will save us £225 per load balancer, and since we're using ten of them that's quite a significant saving.

My first step was to spin up a free tier EC2 instance running Ubuntu 14.04 LTS.  I guess you can replace this with your favourite cloud or on-the-metal server.

Then I installed my packages. These are the ones I remember so YMMV.

 sudo apt-get install nginx apache2 fail2ban php5-fpm mcrypt php5-mcrypt openssl php5-cli php5 libapache2-mod-php  

My network diagram is slightly different from the picture for this post in that the web server is hosted on the same machine as the proxy.

I decided to run Apache on port 8000 and listen only to connections from localhost. Nginx would listen on port 80 and forward requests to Apache. I decided to let Nginx serve static content because it's pretty quick at doing so and this saves Apache from being overwhelmed by requests.

Configuring Apache

My first port of call was to edit /etc/apache2/ports.conf and make sure that my Listen line looks like this: Listen 127.0.0.1:8000

Then I created two virtual hosts to test with.  Here's a sample:

 <VirtualHost *:8000>  
      ServerName dummy1.mydomain.co.uk  
      ServerAdmin webmaster@localhost  
      DocumentRoot /var/www/dummy3/  
      ErrorLog ${APACHE_LOG_DIR}/dummy3_error.log  
      CustomLog ${APACHE_LOG_DIR}/dummy3_access.log combined  
 </VirtualHost>  

I made a simple index.php file in two new directories /var/www/dummy1 and /var/www/dummy3 which just output two server variables for me to test with. I also copied an image file into those directories so that I could check how static assets would be served.

 <?php  
 echo $_SERVER['SCRIPT_FILENAME'] . '<br>';  
 echo $_SERVER['SERVER_SOFTWARE'] . '<br>';  

Configuring Nginx

I decided to use self-signed certificates for testing and reserve dummy2 for a trial run of a free SSL certificate.  There are quite a few certificate signers who will give you a 30-day trial certificate.

I created an /etc/nginx/ssl directory because for some reason I don't like to contaminate my conf.d directory and made subdirectories for my sites under that.

I created self-signed certificates (commands are at the top of my host file) and set up the vhosts like this:
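
The vhost file itself was embedded as a gist and isn't reproduced here, but a cut-down sketch of an SSL-terminating proxy vhost, using placeholder names and paths, might look something like this:

 # generate a self-signed certificate for testing
 openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
     -keyout /etc/nginx/ssl/dummy1/dummy1.key \
     -out /etc/nginx/ssl/dummy1/dummy1.crt

 server {
     listen 80;
     listen 443 ssl;
     server_name dummy1.mydomain.co.uk;

     ssl_certificate     /etc/nginx/ssl/dummy1/dummy1.crt;
     ssl_certificate_key /etc/nginx/ssl/dummy1/dummy1.key;

     # let Nginx serve static assets directly
     location ~* \.(jpg|jpeg|png|gif|ico|css|js)$ {
         root /var/www/dummy1;
     }

     # everything else is proxied to Apache on the loopback interface
     location / {
         proxy_pass http://127.0.0.1:8000;
         proxy_set_header Host $host;
         proxy_set_header X-Real-IP $remote_addr;
         proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
         proxy_set_header X-Forwarded-Proto $scheme;
     }
 }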

Now when I hit a static file on HTTP or HTTPS, Nginx serves it up directly. Inspecting the response headers with your favourite browser's debug tool will confirm that images are served by Nginx. Visiting the index file shows that it's loading the correct one and is being handled by Apache. Lastly, checking the certificate will show you that each site is using its own certificate.

That has the potential of saving my company £2,200, which is a happy thing to be able to do in your first week while your boss is watching :)

10 July 2015

Securing Jenkins with OAuth

Jenkins is pretty easy to secure with the help of some useful plugins.

The first plugin that I suggest using is an OAuth provider.  Our repositories are hosted on Bitbucket so I'm using their OAuth, but there is also a GitHub OAuth plugin.  The instructions to set up the plugin are very clear (see the plugin page).

When you're configuring Jenkins to use OAuth security, remember to leave the Authorization setting at "logged in users can do anything" for now.  We'll change this later, but we don't want to get locked out of Jenkins when we apply the security settings.

Now install the Role-based Authorization Strategy plugin (see the plugin page).

Add a new group called "Anonymous" and uncheck everything.

When a user logs in via OAuth they'll be shown a message by Jenkins saying that they don't have any permissions.  This means that not everybody with a Bitbucket account can access your site, so that's a good thing.

You just need to add them to the roles plugin settings.  Click Manage Jenkins then Manage and Assign Roles.  Click on assign roles and add the user.  Then tick the boxes of the roles you want to assign them.





Checking the SSL certificates for a list of domains

We have a number of domains that are secured by SSL and need to be able to automate checks for certificate validity and expiry.

Luckily there is a script to do exactly this.  On Ubuntu you can apt-get install ssl-cert-check, but there are copies of the script online in case your distro doesn't have it as a package.

Create a file with a list of the domains you want to check and the port to check on.  It should look something like this:
 yourdomain.com 443  
 www.anotherdomain.com 443  
 www.yetanotherclientdomain.com 443  

Let's assume you called your file domainlist.txt.

You can then run ssl-cert-check -f domainlist.txt to get the tool to run through the list and print the status and expiry date of each domain to the console.

The options shown in the script's help output let you have it email you if a certificate is going to expire soon.

ssl-cert-check -a -f domainlist.txt -q -x 30 -e yourmail@foo.com

If you get a message about a missing mail binary you'll spot that in the script (line 243) it looks in a variety of locations for a file called mail or mailx.  An appropriate binary in Ubuntu is contained in the heirloom-mailx package so installing that will solve your problem.
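
To automate the check, a cron entry along these lines (the schedule, file path, and address are placeholders) will run it weekly and mail you about anything expiring within 30 days:

 # /etc/cron.d/ssl-cert-check (sketch)
 0 7 * * 1 root ssl-cert-check -a -f /etc/ssl-cert-check/domainlist.txt -q -x 30 -e yourmail@foo.com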

27 May 2015

Allowing the whole wide world to read your S3 bucket

This is a bucket policy that you can use to give the whole world read access to the files in your S3 bucket.

You might want to do this if you're serving static web content from S3 and don't need the fine-grained control that the Amazon documentation details.
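
The policy itself was attached as an embedded gist; a typical public-read bucket policy, with a placeholder bucket name, looks like this:

 {
   "Version": "2012-10-17",
   "Statement": [
     {
       "Sid": "AllowPublicRead",
       "Effect": "Allow",
       "Principal": "*",
       "Action": "s3:GetObject",
       "Resource": "arn:aws:s3:::your-bucket-name/*"
     }
   ]
 }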


You will still need to set up permissions on the bucket but this policy will let people read the files you're storing on S3.

26 May 2015

Storing large values in Memcached with PHP

Memcached saved my users a minute per query
I'm working on a business intelligence tool that requires, as an intermediate calculation, a list of UK postcodes that are within a radius of a user-supplied postcode.

It currently takes about 7 seconds to query my Postgres database to get this list out.  Unfortunately I need to do this several times as part of a goal seeking function so I need to greatly improve this lookup speed.

I'm already using the Postgres earthdistance module and have properly indexed my table so I realized that I needed to look for a caching solution.

Memcached places limits on the size of the value you can store.  The default limit is 1 MB and I'm reluctant to change this because it adds to the deployment burden.  My result sets were sometimes up to 4 MB - searching on a 20-mile radius in London yields a lot of postcodes!

My idea was to split the large piece of data into several smaller pieces and to place an index referencing the pieces as the value for the key we're trying to store.

I decided to make use of PHP's gzcompress() function to reduce the size of the element, because the time spent compressing the data is still going to be drastically less than running the query and I want to do my best to avoid cache evictions.

I'm currently using Laravel so the code snippets below use the facades made available by Laravel.  I think the code is readable enough to extend to other PHP environments and I think the approach could also be ported to other languages.
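
The original snippets were embedded as gists and aren't reproduced here, but a bare-bones sketch of the chunk-and-index idea, using Laravel's Cache facade and made-up key names, might look like this:

 <?php
 // Sketch only: split a large compressed payload into sub-1MB chunks, store each
 // chunk under its own key, and store the list of chunk keys as an index under
 // the original key.  Key names and the 60 minute lifetime are arbitrary.
 function cachePutLarge($key, $data, $minutes = 60)
 {
     $compressed = gzcompress(serialize($data));
     $chunks = str_split($compressed, 900000); // stay safely under the 1 MB item limit
     $chunkKeys = [];
     foreach ($chunks as $i => $chunk) {
         $chunkKey = $key . '.chunk.' . $i;
         Cache::put($chunkKey, $chunk, $minutes);
         $chunkKeys[] = $chunkKey;
     }
     Cache::put($key, $chunkKeys, $minutes);
 }

 function cacheGetLarge($key)
 {
     $chunkKeys = Cache::get($key);
     if (empty($chunkKeys)) {
         return null;
     }
     $compressed = '';
     foreach ($chunkKeys as $chunkKey) {
         $compressed .= Cache::get($chunkKey);
     }
     return unserialize(gzuncompress($compressed));
 }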





20 May 2015

Using Fail2Ban to protect a Varnished site from scrapers

I'm using Varnish to cache a fairly busy property site.  Varnish works like a bomb for normal users and has greatly improved our page load speed.

For bots that are scraping the site, though, presumably to add the property listings to their own sites, the cache is next to useless, since the bots trawl sequentially through the whole site.

I decided to use fail2ban to block IPs that hit the site too often.

The first step in doing so was to enable a disk-based access log for Varnish so that fail2ban has something to work with.

This means setting up varnishncsa.  Add this to your /etc/rc.local file:

 varnishncsa -a -w /var/log/varnish/access.log -D -P /var/run/varnishncsa.pid  

This starts up varnishncsa in daemon mode and appends Varnish access attempts to /var/log/varnish/access.log

Now edit or create /etc/logrotate.d/varnish and make an entry to rotate this access log:

  /var/log/varnish/*log {   
     create 640 http log   
     compress   
     postrotate   
       /bin/kill -USR1 `cat /var/run/varnishncsa.pid 2>/dev/null` 2> /dev/null || true   
     endscript   
   }   

Install fail2ban. On Ubuntu you can apt-get install fail2ban

Edit /etc/fail2ban/jail.conf and add a block like this:

 [http-get-dos]  
 enabled = true  
 port = http,https  
 filter = http-get-dos  
 logpath = /var/log/varnish/access.log  
 maxretry = 300  
 findtime = 300  
 #ban for 5 minutes  
 bantime = 600  
 action = iptables[name=HTTP, port=http, protocol=tcp]  

This means that if a client makes 300 (maxretry) requests within 300 (findtime) seconds then a ban of 600 (bantime) seconds is applied.

We now need to create the filter in /etc/fail2ban/filter.d/http-get-dos.conf with the pattern that the jail will match:

 # Fail2Ban configuration file  
 #  
 # Author: http://www.go2linux.org  
 #  
 [Definition]  
 # Option: failregex  
 # Note: This regex will match any GET entry in your logs, so basically all valid and not valid entries are a match.  
 # You should set up in the jail.conf file, the maxretry and findtime carefully in order to avoid false positives.  
 failregex = ^<HOST>.*"GET  
 # Option: ignoreregex  
 # Notes.: regex to ignore. If this regex matches, the line is ignored.  
 # Values: TEXT  
 #  
 ignoreregex =  

Now let's test the regex against the log file so that we can see if it is correctly picking up the IP addresses of the visitors:

fail2ban-regex /var/log/varnish/access.log /etc/fail2ban/filter.d/http-get-dos.conf 

You should see a list of IP addresses and times followed by summary statistics.

When you restart fail2ban your scraper protection should be up and running.

14 May 2015

Solving Doctrine - A new entity was found through the relationship

There are so many different problems that people have with the Doctrine error message:

 exception 'Doctrine\ORM\ORMInvalidArgumentException' with message 'A new entity was found through the relationship 'App\Lib\Domain\Datalayer\UnicodeLookups#lookupStatus' that was not configured to cascade persist operations for entity:  

Searching through the various online sources was a bit of a nightmare.  The best documentation I found was at http://www.krueckeberg.org/ where there were a number of clearly explained examples of various associations.

More useful information about association ownership was in the Doctrine manual, but I found a more succinct explanation in the answer to this question on StackOverflow.

Now I understood associations and ownership better and was able to identify exactly what sort I was using and the syntax that was required. I was implementing a unidirectional many-to-one relationship, which is supposedly one of the simplest to map.

I had used the Doctrine reverse engineering tool to generate the stubs of my model.  I was expecting it to be able to handle this particular relationship - the tool warns that it does not properly map all relationships but this particular one actually works out of the box.

 {project root}/vendor/doctrine/orm/bin/doctrine orm:convert-mapping --from-database yml --namespace App\\Lib\\Domain\\Datalayer\\ .  
 {project root}/vendor/doctrine/orm/bin/doctrine orm:generate-entities --generate-methods=true --generate-annotations=true --regenerate-entities=true ../../../  
 {project root}/vendor/doctrine/orm/bin/doctrine orm:generate-proxies ../Proxies  

Just as an aside to explain the paths and namespaces: I'm using Laravel 5 and put my domain model into app/Lib/Domain.  I'm implementing a form of the repository design pattern so I have the following directory structure:

 app/Lib/Domain  
 +--Entities  
 +--Proxies  
 +--Mappings  
 +--Repositories  
 +--Services  

The mappings are not used at runtime but are used to generate the entities.

So my generated class looked like this:

 namespace App\Lib\Domain\Datalayer;  
 /**  
  * Lookups  
  *  
  * @ORM\Table(name="lookups", indexes={@ORM\Index(name="IDX_4CEC819D037A087", columns={"lookup_status_id"}), @ORM\Index(name="IDX_4CEC8194D39DE23", columns={"camera_event_id"})})  
  * @ORM\Entity  
  */  
 class Lookups  
 {  
   /**  
    * @var \App\Lib\Domain\Datalayer\LookupStatuses  
    *  
    * @ORM\ManyToOne(targetEntity="App\Lib\Domain\Datalayer\LookupStatuses", inversedBy="lookups", cascade={"persist"})  
    * @ORM\JoinColumns({  
    *  @ORM\JoinColumn(name="lookup_status_id", referencedColumnName="id")  
    * })  
    */  
   private $lookupStatus;  
 }  

My use case was:

  1. Find a lookup status from the table
  2. Call setLookupStatus on my lookup class
  3. Persist and flush the lookup class
  4. Error

After carefully reviewing all the documentation I linked above, and a great deal more, I realized that step 1 was the issue.  Because I was getting the object out of a cache, Doctrine thought it was new.

Of course I couldn't persist the object as the help message suggested (it already existed so I got a key violation error).  The mappings were actually correct.

So my advice for resolving this error message is to read through the documentation I linked above carefully and then make sure that Doctrine is actually aware of the entity you're using.  You may need to persist it (if it's new) or otherwise make sure it's not cached.
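
In code terms the difference is roughly this (a sketch rather than my actual repository code, assuming $entityManager, $lookup, and $statusId are already available):

 <?php
 // Fetch the status through the EntityManager so Doctrine manages it,
 // rather than hydrating it from an external cache.
 $status = $entityManager->find('App\Lib\Domain\Datalayer\LookupStatuses', $statusId);
 $lookup->setLookupStatus($status);
 $entityManager->persist($lookup);
 $entityManager->flush();

 // If the status object came out of a cache and is detached, merge() will re-attach it:
 // $status = $entityManager->merge($cachedStatus);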

29 April 2015

Monitoring Varnish for random crashes

I'm using Varnish to cache the frontend of a site that a client is busy promoting.  It does a great job of reducing requests to my backend but is prone to random crashes.  I normally get about two weeks of uptime on this particular server, which is significantly lower than other places that I've deployed Varnish.

I just don't have enough information to work with to figure out why the random crash is occurring.  The system log shows that a child process doesn't respond to CLI and so is killed.  The child never seems to come back up again.

My /var/log/messages file looks like this:

 08:31:45 varnishd[7888]: Child (16669) not responding to CLI, killing it.  
 08:31:45 varnishd[7888]: Child (16669) died signal=3  
 08:31:45 varnishd[7888]: child (25675) Started  
 08:31:45 varnishd[7888]: Child (25675) said Child starts  
 08:31:45 varnishd[7888]: Child (25675) said SMF.s0 mmap'ed 1073741824 bytes of 1073741824  
 08:32:19 varnishd[7888]: Child (25675) not responding to CLI, killing it.  
 08:32:21 varnishd[7888]: Child (25675) not responding to CLI, killing it.  
 08:32:21 varnishd[7888]: Child (25675) died signal=3  

That doesn't give me a lot to work with.  I couldn't find anything in the documentation about this sort of problem.  I don't want to uninstall Varnish so I decided to look for a way to monitor the process instead.

I first tried Monit, but after about two weeks my site was down.  After SSHing onto the box and restarting Varnish I checked the Monit logs.  Although it was able to recognize that Varnish had crashed, it was not able to successfully bring it back up.

My Monit log looked like this:

 [BST Apr 23 09:07:24] error  : 'varnish' process is not running  
 [BST Apr 23 09:07:24] info   : 'varnish' trying to restart  
 [BST Apr 23 09:07:24] info   : 'varnish' start: /etc/init.d/varnish  
 [BST Apr 23 09:07:54] error  : 'varnish' failed to start  
 [BST Apr 23 09:08:54] error  : 'varnish' process is not running  
 [BST Apr 23 09:08:54] info   : 'varnish' trying to restart  
 [BST Apr 23 09:08:54] info   : 'varnish' start: /etc/init.d/varnish  
 [BST Apr 23 09:09:24] error  : 'varnish' failed to start  
 [BST Apr 23 09:10:24] error  : 'varnish' process is not running  
 [BST Apr 23 09:10:24] info   : 'varnish' trying to restart  
 [BST Apr 23 09:10:24] info   : 'varnish' start: /etc/init.d/varnish  
 [BST Apr 23 09:10:54] error  : 'varnish' failed to start  
 [BST Apr 23 09:11:54] error  : 'varnish' service restarted 3 times within 3 cycles(s) - unmonitor  

My problem sounded a lot like this one on ServerFault so I looked for another way to monitor the process other than using Monit.

Instead of using daemonize, supervisord, or another similar program I'm trying out a simple shell script that I found at http://blog.unixy.net/2010/05/dirty-varnish-monitoring-script/.  The author says it's dirty, and I suppose it is, but it has the advantage of being dead simple and easy to control.  I've set it up as a cron job to run every five minutes.  Hopefully this will be a more effective way to make sure that Varnish doesn't stay dead for very long.

In case the source file goes down I saved a copy as a Gist:
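
The gist isn't reproduced here, but the general shape of such a watchdog is simple enough.  This is a sketch only (the URL, log path, and init script location are assumptions), run from cron every five minutes:

 #!/bin/bash
 # If Varnish doesn't answer on port 80, restart it and note the fact in a log.
 if ! curl -s -o /dev/null --max-time 10 http://127.0.0.1/; then
     echo "$(date): varnish not responding, restarting" >> /var/log/varnish-watchdog.log
     /etc/init.d/varnish restart
 fi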


21 April 2015

Fixing when queue workers keep popping the same job off a queue

From Amazon Queue documentation
My (Laravel) project uses a queue system for importing because these jobs can take a fair amount of time (up to an hour) and I want to have them run asynchronously to prevent my users from having to sit and watch a spinning ball.

I created a cron job which would run my Laravel queue work command every 5 minutes.  PHP is not really the best language for long-running processes which is why I elected to rather run a task periodically instead of listening all the time.  This introduced some latency (which I could cut down) but this is acceptable in my use case (imports happen once a month and are only run manually if an automated import fails).

The problem that I faced was that my queue listener kept popping the same job off the queue.  I didn't try running multiple listeners but I'm pretty confident weirdness would have resulted in that case as well.

Fixing the problem turned out to be a matter of configuring the visibility timeout of my queue.  I was using the SQS default of 30 seconds.  Amazon defines the visibility timeout as:
The period of time that a message is invisible to the rest of your application after an application component gets it from the queue. During the visibility timeout, the component that received the message usually processes it, and then deletes it from the queue. This prevents multiple components from processing the same message.  Source: Amazon documentation
This concept is common to queues and exists under various names: in Beanstalk the setting is called time-to-run and IronMQ refers to it as timeout.  If your job's run time exceeds the queue's visibility timeout then workers will pop off a job that is still being run in another process.
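
On SQS you can raise the timeout from the console or with the AWS CLI; the queue URL and value below are placeholders, and an import that can take an hour needs a timeout of at least an hour:

 aws sqs set-queue-attributes \
     --queue-url https://sqs.eu-west-1.amazonaws.com/123456789012/imports \
     --attributes VisibilityTimeout=3600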

17 April 2015

Setting up the admin server in HHVM 3.6.0 and Nginx

Hiphop has a built-in admin server that has a lot of useful functions.  I found out about it on an old post on the Hiphop blog.

Since those times Hiphop has moved towards using an ini file instead of a config.hdf file.

On a standard prebuilt HHVM on Ubuntu you should find the ini file in /etc/hhvm/php.ini

Facebook maintains a list of all the ini settings on Github.

It is into this file that we add two lines to enable the admin server:

 hhvm.admin_server.password = SecretPassword  
 hhvm.admin_server.port = 8888  

I then added a server to Nginx by creating this file: /etc/nginx/conf.d/admin.conf (by default Nginx includes all conf files in that directory):

 server {  
   # hhvm admin  
   listen 8889;  
   location ~ {  
     fastcgi_pass  127.0.0.1:8888;  
     include    fastcgi_params;  
   }  
 }  

Now I can run curl 'http://localhost:8889' from my shell on the box to get a list of commands. Because I host this project with Amazon and have not set up a security rule the port/server are not available to the outside world.  You may want to check your firewall rules on your server.

To run a command add the password as a get variable:

 curl 'http://localhost:8889/check-health?auth=SecretPassword'  

Searching in a radius with Postgres

Postgres has two very useful extensions - earthdistance and postgis.  PostGIS is much more accurate but I found earthdistance to be very easy to use and accurate enough for my purpose (finding UK postcodes within a radius of a point).

To install it first find your Postgres version and then install the appropriate package.  On my Debian Mint dev box it looks like the below snippet. My production machine is an Amazon RDS and you can skip this step in that environment.

 psql -V  
 sudo apt-get install postgresql-contrib postgresql-contrib-9.3  
 sudo service postgresql restart  

Having done that you should launch psql and run these two commands.  Make sure that you install cube first because it is a requirement of earthdistance.

 CREATE EXTENSION cube; CREATE EXTENSION earthdistance;  

Now that the extensions are installed you have access to all of the functions they provide.

If you want to check that they're working you can run SELECT earth(); as a quick way to test a function. It should return the earth's radius.

Earthdistance treats the earth as a perfect sphere while PostGIS is more accurate. Since I'm working in a fairly local region (less than 20 statute miles) I felt that approximating the surface as a sphere was sufficiently accurate.

I had a table of postcodes and an example query looked like this:

 SELECT * FROM postcodes  
 WHERE earth_box(ll_to_earth(-0.20728725565285000, 51.48782225571700000), 100) @> ll_to_earth(lat, lng)  

This took quite a long time (*cough* a minute *cough*) to run through the 1.7 million postcodes in my table. After a little searching I realized that I should index the calculation, with the result that queries now take about 100ms, which is plenty fast enough on my dev box.

 For reference - using the Haversine formula in a query was taking around 30 seconds to process - even after constraining my query to a box before calculating radii.

 CREATE INDEX postcode_gis_index on postcodes USING gist(ll_to_earth(lat, lng));  

So now I can search for UK postcodes that are close to a point and I can do so sufficiently quickly that my user won't get bored of watching a loading spinner gif.

30 March 2015

Ignoring duplicate inserts with Postgres when processing a batch

I'm busy on a project which involves importing fairly large datasets of around 3.3 GB at a time.  I have to read a CSV file, process each line, and generate a number of database records from the results of that process.

Users are expected to be able to rerun batches and there is overlap between different datasets.  For example: the dataset of "last year" overlaps with the dataset of "all time".  This means that we need an elegant way to handle duplicate updates.

Checking whether a record already exists (by primary key) is fine until the row count in the table gets significant.  At just over 2 million records it was taking my development machine 30 seconds to process 10,000 rows, and this number steadily increased as the row count grew.

I had to find a better way to do this and happened across the option of using a database rule to ignore duplicates.  With the rule in place there is a marked improvement in performance, as I no longer need to search the database for an existing record before inserting.
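
The rule itself was shared as an embedded gist; a minimal sketch, assuming a hypothetical table called records with a primary key column id, looks like this:

 CREATE RULE ignore_duplicate_inserts AS
     ON INSERT TO records
     WHERE EXISTS (SELECT 1 FROM records WHERE id = NEW.id)
     DO INSTEAD NOTHING;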

17 March 2015

Adding info to Laravel logs

I am coding a queue worker that is handling some pretty large (2 GB+) datasets and so wanted some details in my logs that vanilla Laravel didn't offer.

Reading the documentation at http://laravel.com/docs/4.2/errors wasn't much help until I twigged that I could manipulate the log object returned by Log::getMonolog().

Here is an example of adding memory usage to Laravel logs.

In app/start/global.php make the following changes:
 Log::useFiles(storage_path().'/logs/laravel.log');  
 $log = Log::getMonolog();  
 $log->pushProcessor(new Monolog\Processor\MemoryUsageProcessor);  

You'll find the Monolog documentation on the project's repo.

12 March 2015

Support for Postgres broken in HHVM 3.6.0

On my desktop machine I run my package upgrades every day.  The other day my Hiphop version got updated to 3.6.0 and suddenly my Postgres support died.

Running Hiphop gave a symbol not found error in the postgres.so file (undefined symbol: _ZTIN4HPHP11PDOResourceE) exactly like the issue reported on the driver repository.

I tried to recompile the postgres driver against Hiphop 3.6.0 but hit a number of problems, mostly to do with hhvm-pgsql-master/pdo_pgsql_statement.cpp it seems.

The fix for the incompatibility was unfortunately rolling back to my previous version of Hiphop.  To do this on Mint/Ubuntu just do this:

  1. Run cat /etc/*-release to get your release information
  2. Download the appropriate package for your distro from http://dl.hhvm.com/ubuntu/pool/main/h/hhvm/
  3. Remove your 3.6.0 installation of hhvm: sudo apt-get remove hhvm
  4. Install the package you downloaded : sudo dpkg -i <deb package>
After that everything should be installed properly and you can start up hhvm without a problem.





22 February 2015

Fixing puppet "Exiting; no certificate found and waitforcert is disabled" error

While debugging and setting up Puppet I am still running the agent and master from the CLI in --no-daemonize mode.  I kept getting an error on my agent - "Exiting; no certificate found and waitforcert is disabled".

The fix was quite simple and a little embarrassing.  I had forgotten to run my puppet master with root privileges, which meant that it was unable to write incoming certificate requests to disk.  That's the embarrassing part; once I looked at my shell prompt and noticed this, fixing it was quite simple.

Firstly I got the puppet ssl path by running the command puppet agent --configprint ssldir

Then I removed that directory so that my agent no longer had any certificates or requests.

On my master side I cleaned the old certificate by running puppet cert clean --all (this would remove all my agent certificates but for now I have just the one so it's quicker than tagging it).

I started my agent up with the command puppet agent --test which regenerated the certificate and sent the request to my puppet master.  Because my puppet master was now running with root privileges (*cough*) it was able to write to its ssl directory and store the request.

I could then sign the request on my puppet master by running puppet cert sign --all

When running normally the puppet master will run as the puppet user so I'm not overly worried about running it as root in CLI while I debug it.

19 February 2015

Installing the Postgres extension for Hiphop

Postgres is rapidly gaining traction as a solid relational database manager.  It provides transactional reliability (ACID) and standards compliance, and has a better reputation for handling large datasets than MariaDB/MySQL.

Luckily installing it for Hiphop is painless and there is an externally provided extension for it.

There was not a prebuilt package available for the PocketRent/hhvm-pgsql extension for my version of Hiphop so I built it following the advice in the project readme.  Their instructions worked first time.

I did not have success with using the "ini" method of including the extension so had to create a Hiphop configuration file.

 DynamicExtensionPath = /path/to/hhvm/extensions  
 DynamicExtensions {  
   * = pgsql.so  
 }  

I placed the snippet they provide (above) into a file called config.hdf in my Hiphop location.  I'm using Mint on my dev box so this was /etc/hhvm/config.hdf for me.

Then I edited /etc/default/hhvm and set Hiphop up to use the config file.  This snippet shows the change:

 ## Add additional arguments to the hhvm service start up that you can't put in CONFIG_FILE for some reason.  
 ## Default: ""  
 ## Examples:  
 ##  "-vLog.Level=Debug"        Enable debug log level  
 ##  "-vServer.DefaultDocument=app.php" Change the default document  
 ADDITIONAL_ARGS="-c /etc/hhvm/config.hdf"  

After that I restarted my hhvm service and was able to use Postgres in my Laravel project.  Hurray :)

New site in nginx is downloading PHP instead of executing it

I've just set up a new nginx host and was having problems with visiting it.

Whenever I loaded the page it would download a copy of my index file.

In my case the problem was twofold.  Firstly, I had mistyped my server name so requests were falling back to an alternative catch-all.

Secondly Chromium has a "won't fix" bug whereby redirects are cached.  Interestingly the cache also persisted into Incognito mode.

Clearing my cache with the "Clear Browsing History" menu function didn't clear out the redirect cache.  Even after stopping nginx, stopping Varnish, and even stopping hhvm, Chromium would still download the file when I tried to visit the site.

What did work was to append a GET variable to the link.  So instead of visiting http://mysite.local I tried http://mysite.local?foo=1 which invalidated the cache.

This wasn't an entirely satisfactory solution, so the next thing I tried was to open the developer tools (Ctrl+Shift+I) and right-click on the file in the network log.  Using the "Clear Browser Cache" option from that popup also worked.

So the TL;DR is:
1) Make sure that your config file is correct
2) Clear out your redirect cache or invalidate it

07 January 2015

Compressing Apache output with mod_deflate on Centos

Apache on Centos ships with mod_deflate installed and enabled by default.  To check this you can grep your config file and make sure the line which loads it is not commented out.

 grep "LoadModule deflate_module" /etc/httpd/conf/httpd.conf  

When Apache loads it reads all the config files (ending in .conf) in /etc/httpd/conf.d, so we'll add configuration options for mod_deflate into this directory. Let's use a file called deflate.conf to specify the config:

 <IfModule mod_deflate.c>  
  AddOutputFilterByType DEFLATE text/plain  
  AddOutputFilterByType DEFLATE text/html  
  AddOutputFilterByType DEFLATE text/xml  
  AddOutputFilterByType DEFLATE text/css  
  AddOutputFilterByType DEFLATE text/javascript  
  AddOutputFilterByType DEFLATE image/svg+xml  
  AddOutputFilterByType DEFLATE image/x-icon  
  AddOutputFilterByType DEFLATE application/xml  
  AddOutputFilterByType DEFLATE application/xhtml+xml  
  AddOutputFilterByType DEFLATE application/rss+xml  
  AddOutputFilterByType DEFLATE application/javascript  
  AddOutputFilterByType DEFLATE application/x-javascript  
  DeflateCompressionLevel 9  
 # Browser specific settings  
  BrowserMatch ^Mozilla/4 gzip-only-text/html  
  BrowserMatch ^Mozilla/4\.0[678] no-gzip  
  BrowserMatch \bMSIE !no-gzip !gzip-only-text/html  
  BrowserMatch \bOpera !no-gzip   
 </IfModule>  

You can check it is working by looking at your YSlow report, by using an online gzip test tool, or by checking the response headers with Chrome or Firefox's developer tools.
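
A quick check from the command line is to request a page with an Accept-Encoding header and look for Content-Encoding in the response (substitute your own site):

 curl -s -o /dev/null -D - -H "Accept-Encoding: gzip" http://yoursite.com/ | grep -i "content-encoding"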

If you're using Varnish and Apache does not have mod_deflate then you can enable gzip in your vcl as per the Varnish manual.  The page linked at the bottom of the Varnish manual (How GZIP, and GZIP+ESI works in Varnish) explains how the response from the backend is stored in a compressed state.

  sub vcl_fetch {  
    if (beresp.http.content-type ~ "text") {  
        set beresp.do_gzip = true;  
    }  
 }  

06 January 2015

Separating business logic from persistence layer in Laravel

There are several reasons to separate business logic from your persistence layer.  Perhaps the biggest advantage is that the parts of your application which are unique are not coupled to how data are persisted.  This makes the code easier to port and maintain.

I'm going to use Doctrine to replace the Eloquent ORM in Laravel.  A thorough comparison of the patterns is available here.

By using Doctrine I am also hoping to mitigate the risk of a major version upgrade of the underlying framework.  The ORM can be expected to change between major versions of a framework, and upgrading to a new release can be quite costly.

Another advantage to this approach is to limit the access that objects have to the database.  Unless a developer is aware of the business rules in place on an Eloquent model there is a chance they will mistakenly ignore them by calling the ActiveRecord save method directly.

I'm not implementing the repository pattern in all its glory in this demo.  For a more purist approach to the pattern you can read this.  The reason that I'm choosing this approach is to cut down on the number of classes and on the composer autoload, and to ensure my code is portable.  I also do not have to make any changes to my app aliases in config or create new Laravel services.

I'm going to use three objects to refer to my user table:
  • The UserEntity is a Doctrine entity to be accessed via the Doctrine entity manager by the Repository.  
  • The UserRepository is an intermediary layer akin to the data access object of other languages.  It uses the Entity to gather information that the Service layer needs.  
  • The UserService implements business logic and exposes methods to the controller.  It is the "fat model" in the "fat model / skinny controller" paradigm.
In order to be able to use Doctrine within Laravel I'm using the mitchellvanw/laravel-doctrine package.  I also use "raveren/kint" for access to the debugging "dd" shortcut.

The decision to use constructor injection when instantiating the user service is to make it easier to use a mock object when testing.  

I've deviated from the common practice of keeping a Doctrine entity's properties inaccessible from outside and have rather exposed getters and setters.  This is primarily so that in my user service layer I can conveniently reference properties of the entity in a manner that is not likely to change if I swap to another ORM.

I decided against marking the entity private and using reflection to retrieve the private properties in my repository.  I felt it was an unnecessary complication and not worth the processing cycles to ensure compatibility between methods of accessing a model property between ORMs.

The Doctrine entity class is incapable of persisting itself, so if a developer instantiates it and modifies properties in the controller they won't be able to persist it unless they call the EntityManager class.  Hopefully this is more effort than calling save() on an ActiveRecord object, and our design philosophy of avoiding persistence in controllers should also help to discourage mistakes here.

The files I created are listed below.  The drawback of avoiding any Laravel specific code is the rather ugly way of instantiating the service in the controller.  I believe, however, that my code will be easier to port to another framework than if I were to declare a Laravel service provider to make a static call to user.

You must include "app/models/user" in your composer autoload section and run the composer dump-autoload command from your shell once you've set up the directory.
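
In composer.json that means something along these lines (shown in isolation; add the path to whatever classmap entries you already have):

 "autoload": {
     "classmap": [
         "app/models/user"
     ]
 }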

app/models/user/UserEntity.php

 <?php namespace User;  
 use Doctrine\ORM\Mapping AS ORM;  
 /**  
  * @ORM\Entity  
  * @ORM\Table(name="users")  
  */  
 class UserEntity  
 {  
      // see http://docs.doctrine-project.org/projects/doctrine-orm/en/latest/reference/basic-mapping.html for info on mapping  
      /**  
       * @ORM\Id  
       * @ORM\GeneratedValue  
       * @ORM\Column(type="integer")  
       */  
      private $id;  
      /**  
       * @ORM\Column(type="string")  
       */  
      private $name;  
      /**  
       * @ORM\Column(type="string")  
       */  
      private $password;  
      /**  
       * @ORM\Column(type="datetime")  
       */  
      private $created;  
      /**  
       * @ORM\Column(type="datetime")  
       */  
      private $modified;  
      public function getId()  
      {  
           return $this->id;  
      }  
      public function getName()  
      {  
           return $this->name;  
      }  
      public function setName($name)  
      {  
           $this->name = $name;  
      }  
      public function setPassword($password)  
      {  
           $this->password = $password;  
      }  
 }  

app/models/user/UserRepository.php

 <?php namespace User;  
 class UserRepository  
 {  
   protected $entity;  
   public function __construct( UserEntity $userEntity )  
   {  
     $this->entity = $userEntity;  
   }  
   public function getUserById($userId)  
   {  
     $user = \EntityManager::find('User\UserEntity', $userId);  
     return $user;  
   }  
   public function getUserByName($userName)  
   {  
     $user = \EntityManager::getRepository( 'User\UserEntity' )->findBy( [ 'name' => $userName ] );  
     if( !is_array( $user ) || empty( $user ) )  
     {  
       return false;  
     }  
     return $user[0];  
   }  
   public function setPassword( $userDetails, $password )  
   {  
     // If user variable is numeric, assume ID  
     if ( is_numeric( $userDetails ) )  
     {  
       // Get user based on ID  
       $user = $this->getuserById( $userDetails );  
     }  
     else  
     {  
       // Since not numeric, lets try get the user based on Name  
       $user = $this->getuserByName( $userDetails );  
     }  
     $user->setPassword( $password );  
     $this->persist( $user );  
     return true;  
   }  
   public function persist( UserEntity $user )  
   {  
     // do any last moment validations here  
     \EntityManager::persist( $user );  
     \EntityManager::flush();  
   }  
 }  

app/models/user/UserService.php

 <?php namespace User;  
 /**  
  * Our UserService, containing all useful methods for business logic around Users  
  * Do not reference the entity in here.  
  */  
 class UserService  
 {  
   // Containing our user repository to make all our database calls to  
   protected $userRepo;  
   /**  
    * Loads our $userRepo with the supplied userRepository  
    *  
    * We use constructor injection to make it easier to unit test.  
    *  
    * @param userInterface $userRepo  
    * @return userService  
    */  
   public function __construct( $userRepository )  
   {  
     $this->userRepo = $userRepository;  
   }  
   /**  
    * Method to get user based either on name or ID  
    *  
    * @param mixed $user  
    * @return string  
    */  
   public function getUserName($user)  
   {  
     // If user variable is numeric, assume ID  
     if (is_numeric($user))  
     {  
       // Get user based on ID  
       $user = $this->userRepo->getuserById($user);  
     }  
     else  
     {  
       // Since not numeric, lets try get the user based on Name  
       $user = $this->userRepo->getuserByName($user);  
     }  
     // If user entity returned (rather than null) return the name of the user  
     if ($user != null)  
     {  
       return $user->getName();  
     }  
     // If nothing found, return this simple string  
     return 'user Not Found';  
   }  
   public function setPassword( $user, $password )  
   {  
     // perform any validations  
     if( strlen( $password ) < 6 )  
     {  
       throw new \ValidationException( 'Password may not be shorter than 6 characters' );  
     }  
     // perform any hashing on the password  
     $password = \Hash::make($password);  
     return $this->userRepo->setPassword( $user, $password );  
   }  
 }  

app/controllers/HomeController.php

 <?php  
 class HomeController extends BaseController {  
      public function showWelcome()  
      {  
           return View::make('hello');  
      }  
      public function setPassword( $userDetails, $password )  
      {  
           $userService = new User\UserService( new User\UserRepository( new User\UserEntity ) );  
           try  
           {  
                $response = $userService->setPassword( $userDetails, $password );  
           }  
           catch ( ValidationException $e )  
           {  
                // set error message for frontend  
                echo 'An exception was thrown ('.$e->getMessage().') - this will result in a frontend message';  
           }  
           dd( $response );  
      }  
 }  

app/routes.php

 Route::put('/password/{user}/{password}', 'HomeController@setPassword');  