Varnish, WordPress, Security.VCL

Few sites ever see 2 billion hits a month or 1,000 hits per second, but why shouldn’t they be capable of it? The correlation is loose, but the capacity to handle concurrency tends to bring speed with it.

Speed in this sense may simply mean serving a page to a single user at a time, but doing so in a decent timeframe.

Throwing hardware at the problem is one way to achieve this, but the same can be done on a tiny VPS with a little bit of caching.

Varnish 3.0

Varnish Cache is a web application accelerator also known as a caching HTTP reverse proxy. You install it in front of any server that speaks HTTP and configure it to cache the contents. Varnish Cache is really, really fast. It typically speeds up delivery with a factor of 300 – 1000x, depending on your architecture. A high level overview of what Varnish does can be seen in the video attached to this web page.
https://www.varnish-cache.org

How can Varnish help

  • Can be deployed on the same physical box as your web server
    No extra hardware required
  • Can forward requests for pages you choose straight to your web server without interfering
    Simple rules let you avoid the limitations of caching
  • Can respond to cached requests notably quicker than your web server
    Speed up your users’ experiences and reduce the load on your web server

In Practice with WordPress

Before formulating rules for WordPress specifically, let’s look at how Varnish works.
Rules are grouped into subroutines, three of which matter most here:
  • vcl_recv
    Rules applied to requests as they arrive from the client
  • vcl_fetch
    Rules applied to responses fetched from the web server
  • vcl_deliver
    Rules applied when delivering a response to the user
Alongside serving cached content, Varnish can filter headers: stripping unnecessary keys, adding caching statements, and removing cookies where they’re not required, which improves the cacheability of your pages.

vcl_recv

URL cleaning

  • Ensure any URL fragments (the part after #) don’t make it to the server to contaminate the cache
  • Clear unnecessary GET parameters from the URL, e.g. Google Analytics tracking parameters
  • Clean parts of the URL, e.g. replacing double slashes, to ensure consistency

Encoding Normalization

  • Check if the filetype is appropriate for compression
  • Ensure the correct headers are set

Strip Cookies

  • When serving static files it’s generally unnecessary to pass cookies
  • Varnish can check the filetype and remove the cookies to save the server having to parse them
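
As a variation on removing every cookie, the sketch below strips only Google Analytics cookies from the request so that any other cookies still reach the server. It is written in Varnish 3.0 syntax and assumes the analytics cookies have names beginning with __utm; adjust the pattern to the cookies your site actually sets.

```vcl
sub vcl_recv {
  if (req.http.Cookie) {
    # assumption: analytics cookies are named __utma, __utmb, __utmz, ...
    set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)__utm[a-z]+=[^;]*", "");
    # remove a separator left behind at the start of the header
    set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");
    # drop the header entirely if nothing is left
    if (req.http.Cookie ~ "^\s*$") {
      unset req.http.Cookie;
    }
  }
}
```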

Allow WordPress login

  • Not all content will be static; users should still be able to log in to WordPress and access the backend
  • Varnish can be instructed to pass requests for these pages directly through to the web server
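
One way to express this pass-through is to check for the WordPress login cookie as well as the admin paths. This is a sketch in Varnish 3.0 syntax; it assumes the default WordPress cookie prefix wordpress_logged_in_, and it has to run before any rule that unsets the Cookie header.

```vcl
sub vcl_recv {
  # send login and admin pages straight to the web server
  if (req.url ~ "^/wp-(login|admin)") {
    return (pass);
  }
  # assumption: logged-in users carry a cookie starting with wordpress_logged_in_
  if (req.http.Cookie ~ "wordpress_logged_in_") {
    return (pass);
  }
}
```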

Allow Cache Purging

  • Varnish will remember static content for a certain amount of time
  • To ensure timeliness of publishing when adding new content to your site, it is possible to purge the Varnish cache ensuring that your new content appears immediately

vcl_fetch

Clean unnecessary header parameters

Set correct cache-control headers for the content

Allow for caching of error states from pages
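
The cache-control step above can be sketched as follows, in Varnish 3.0 syntax. It separates how long Varnish keeps a static file (beresp.ttl) from what browsers are told (the Cache-Control header). The file extensions and both lifetimes are illustrative assumptions, not recommendations.

```vcl
sub vcl_fetch {
  if (req.url ~ "\.(css|js|png|gif|jpg|jpeg|ico)(\?.*)?$") {
    # static files have no need to set cookies
    unset beresp.http.Set-Cookie;
    # how long Varnish itself keeps the object (assumed value)
    set beresp.ttl = 48h;
    # what browsers and downstream caches are told (assumed value)
    set beresp.http.Cache-Control = "public, max-age=86400";
  }
}
```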

vcl_deliver

Allow for final tidying of the request being served back to the client

WordPress Setup

WP Varnish is a great plugin that lets WordPress send a simple request to the Varnish server telling it to purge its cache when you publish new content.

Find the plugin on GitHub at: https://github.com/pkhamre/wp-varnish

Security.VCL

Security.VCL is a Web Application Firewall implemented in Varnish Control Language.

Security.VCL aims to provide:
– A standardized framework for security-related filters
– Several core rule-sets
– A tool to generate Security.VCL modules from mod_security rules.
– A limited set of default ‘handlers’, for instance CGI scripts to call
upon when Bad Stuff happens.

This is done mainly by using clever VCL, and with as little impact on
normal operation as possible. The incident handlers are mainly CGI-like
scripts on a backend.

https://github.com/comotion/security.vcl

These config files let you draw on the OWASP Core Rule Set Project to reject URLs that appear to contain malicious content. They need to be applied with some care, as legitimate URLs can occasionally match a rule, but since the rules are easy to integrate and unobtrusive for the average user of your site, it’s hard to go wrong.

Find the config on GitHub at: https://github.com/comotion/security.vcl

Example Config

To help with implementation, here is a basic config file for using Varnish 3.0 with WordPress.

# include security.VCL config file
include "/etc/varnish/security/main.vcl";

# set up backend pointing to your webserver
backend default { .host = "127.0.0.1"; .port = "8080"; }

# define servers with permission to clear cached content
acl purge {
  "your.domain.co.uk";
}

sub vcl_recv {

  # allow PURGE from defined servers
  if( req.request == "PURGE" ){
    if( ! client.ip ~ purge ){
      error 405 "Not allowed.";
    }
    return( lookup );
  }

  # set proxied ip header for client address
  remove req.http.X-Forwarded-For;
  set req.http.X-Forwarded-For = client.ip;

  set req.grace = 30m;

  # ensure hash doesn't make it to server
  if( req.url ~ "#" ){
    set req.url = regsub( req.url, "#.*$", "" );
  }

  # clean google analytics
  if( req.url ~ "(\?|&)(gclid|utm_[a-z]+)=" ){
    set req.url = regsuball( req.url, "(gclid|utm_[a-z]+)=[-_A-Za-z0-9]+&?", "" );
    set req.url = regsub( req.url, "(\?|&)$", "" );
  }

  # remove urls from part way through urls
  # (runs before the // cleanup below, which would turn http:// into http:/)
  if( req.url ~ "^/\?http://" ){
    set req.url = regsub( req.url, "\?http://.*", "" );
  }

  # remove double // in urls
  set req.url = regsuball( req.url, "//", "/" );

  # normalize encoding
  if (req.http.Accept-Encoding){
    if (req.url ~ "\.(jpg|png|gif|gz|tgz|bz2|lzma|tbz)(\?.*|)$"){
      remove req.http.Accept-Encoding;
    } elsif (req.http.Accept-Encoding ~ "gzip"){
      set req.http.Accept-Encoding = "gzip";
    } elsif (req.http.Accept-Encoding ~ "deflate"){
      set req.http.Accept-Encoding = "deflate";
    } else {
      remove req.http.Accept-Encoding;
    }
  }

  # strip cookies for static files
  if (req.url ~ "^/[^?]+\.(jpeg|jpg|png|gif|ico|js|css|txt|gz|zip|lzma|bz2|tgz|tbz|html|htm)(\?.*|)$") {
    unset req.http.Cookie;
    # optionally drop the query string as well
    # set req.url = regsub(req.url, "\?.*$", "");
    # always cache
    return( lookup );
  }

  # allow certain paths to allow login and admin to not be cached
  if( req.url ~ "^/[^?]+/wp-(login|admin)" || req.url ~ "^/wp-(login|admin)" || req.url ~ "preview=true" || req.url ~ "^/login" ){
    return( pass );
  }

  # do not cache post requests or HTTP auth
  if( req.request == "POST" || req.http.Authorization ){
    return( pass );
  }

  # strip cookies for cached content
  unset req.http.Cookie;
  return( lookup );

}

sub vcl_fetch {

  set beresp.grace = 30m;
  set beresp.ttl = 48h;

  unset beresp.http.Server;
  unset beresp.http.X-Powered-By;
  unset beresp.http.x-backend;

  # set correct caching headers
  if( beresp.http.Cache-Control ~ "private" ){
    # private responses must not be stored in the shared cache
    set beresp.http.X-Cacheable = "NO:Cache-Control=private";
    return( hit_for_pass );
  } elsif( beresp.ttl < 1s ){
    # note: unreachable while beresp.ttl is forced to 48h above
    set beresp.ttl = 5s;
    set beresp.grace = 5s;
    set beresp.http.X-Cacheable = "YES:FORCED";
  } else {
    set beresp.http.X-Cacheable = "YES";
  }

  if( req.request == "POST" || req.http.Authorization ){
    return( hit_for_pass );
  }

  # cache error pages for a short amount of time, allows them to change but not cause server problems under high load
  if( beresp.status == 404 || beresp.status == 500 || beresp.status == 301 || beresp.status == 302 ){
    set beresp.ttl = 1m;
    return( deliver );
  }

  # don't cache non success responses
  if( beresp.status != 200 ){
    return( hit_for_pass );
  }

  return(deliver);
}

sub vcl_deliver {
  # set header for successful cache requests
  if( obj.hits > 0 ){
    set resp.http.X-Cache = "HIT";
  }else{
    set resp.http.X-Cache = "MISS";
  }

  # clean unnecessary headers
  unset resp.http.Via;
  unset resp.http.X-Varnish;
  unset resp.http.Age;
  unset resp.http.X-Powered-By;
  unset resp.http.X-Cacheable;
  unset resp.http.Server;
}

sub vcl_error {
  # try to request pages that error multiple times
  if( obj.status == 503 && req.restarts < 2 ){
    set obj.http.X-Restarts = req.restarts;
    return( restart );
  }
  if( obj.status == 301 ){
    set obj.http.Location = req.url;
    set obj.status = 301;
    return( deliver );
  }
}

sub vcl_hit {
  if( req.request == "PURGE" ){
    purge;
    error 200 "Purged.";
  }
}

sub vcl_miss {
  if( req.request == "PURGE" ){
    purge;
    error 200 "Purged.";
  }
}

WordPress.vcl

Thanks

Many of these rules were built thanks to examples from ocaoimh.ie


Also published on Medium.