Drupal Benchmarks

Unless otherwise stated, the benchmarks were run on Drupal 8.0.5 with the standard install profile. All benchmarks were run four times, with the first result discarded (to account for cache warming).
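As a minimal sketch of that procedure in Python: the function below drops the warm-up run and averages what is left. The text does not say how the remaining three results were combined, so the use of a plain mean is an assumption, and the function name and sample numbers are hypothetical.

```python
from statistics import mean


def mean_after_warmup(results, warmup_runs=1):
    """Average requests/second after dropping the first `warmup_runs` results."""
    kept = results[warmup_runs:]
    if not kept:
        raise ValueError("no results left after discarding warm-up runs")
    return mean(kept)


# Hypothetical requests/second from four consecutive runs (not measured data).
runs = [21.0, 33.0, 34.0, 35.0]
print(mean_after_warmup(runs))  # mean of the last three runs only
```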

Drupal 8 benchmarks

Drupal 8 on Raspberry Pi model 3 cluster

Drupal Version | Notes | Page Req/s
8.1.1 | Standard profile, Nginx cached, Gigabit / page | 2630.80
8.1.1 | Standard profile, Nginx cached, 100 Mbps / page | 1274.14
8.1.1 | Standard profile, anonymous / page | 33.33
8.1.1 | Standard profile, authenticated / page | 11.28

Drupal 8 on Raspberry Pi model 2 cluster

Drupal Version | Notes | Page Req/s
8.0.5 | Standard profile, Nginx cached, Gigabit / page | 3163.00
8.0.5 | Standard profile, anonymous / page | 33.60
8.0.5 | Standard profile, authenticated / page | 8.47
8.1.0-beta1 | Standard profile, anonymous, BigPipe* / page | 33.09
8.1.0-beta1 | Standard profile, authenticated, BigPipe* / page | 7.93
8.0.5 | Minimal profile, anonymous / page | 41.73
8.0.5 | Minimal profile, authenticated / page | 32.60

Drupal 8 on Raspberry Pi Zero cluster

Drupal Version | Notes | Page Req/s
8.0.5 | Standard profile, Nginx cached / page | 232.04
8.0.5 | Standard profile, anonymous / page | 2.49
8.0.5 | Standard profile, authenticated / page | 0.68
  • Example benchmark used (non-auth requests):
    wrk -t4 -c48 -d10 http://www.pidramble.com/?nocache=true
  • Example benchmark used (authed requests):
    ab -n 750 -c 10 -C "SESS1234=XYZ" http://www.pidramble.com/?nocache=true

*BigPipe is typically used to improve perceived page load times, and can help in certain scenarios with scalability and per-request performance. I wanted to see what impact this module (and 8.1.0) has on Drupal 8 as a whole.

Raspberry Pi vs MacBook Air vs DigitalOcean

I used the local Vagrant configuration in testing/vagrant to bootstrap Dramble locally on a set of 6 VMs on my MacBook Air (1.7 GHz i7 quad-core, 8GB RAM, Mac OS X 10.11.3). I used the DO provisioning playbook in testing/digitalocean to bootstrap Dramble on DigitalOcean’s SSD-based cloud servers (1GB instances with 2 CPU cores, 30GB SSD, and Debian 8 x64).

Here is the comparison for uncached page loads, using the standard install profile, Pi 2 vs. Pi 3 vs. MBA vs. DO:

Dramble location | Requests/second
Raspberry Pi 3 | 11.28 (auth), 33.33 (anon)
Raspberry Pi 2 | 8.47 (auth), 33.60 (anon)
MacBook Air i7 | 22.10 (auth), 163.44 (anon)
DigitalOcean Droplets | 29.48 (auth), 344.82 (anon)
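To make the gaps in this table easier to read, here is a small Python sketch that expresses each location's throughput as a multiple of a chosen baseline. The numbers are copied from the table; the dictionary layout and the function name `speedup_over` are illustrative choices, not part of the Dramble tooling.

```python
# Requests/second copied from the comparison table: (authenticated, anonymous).
RESULTS = {
    "Raspberry Pi 3": (11.28, 33.33),
    "Raspberry Pi 2": (8.47, 33.60),
    "MacBook Air i7": (22.10, 163.44),
    "DigitalOcean Droplets": (29.48, 344.82),
}


def speedup_over(baseline, results=RESULTS):
    """Return each location's throughput as a multiple of the baseline's."""
    base_auth, base_anon = results[baseline]
    return {
        name: (auth / base_auth, anon / base_anon)
        for name, (auth, anon) in results.items()
    }


for name, (auth_x, anon_x) in speedup_over("Raspberry Pi 2").items():
    print(f"{name}: {auth_x:.2f}x authenticated, {anon_x:.2f}x anonymous")
```

With the Pi 2 cluster as the baseline, the DigitalOcean droplets come out at roughly 3.5 times the authenticated throughput and roughly 10 times the anonymous throughput in this two-webserver configuration.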

I've previously run these tests on a more performance-oriented architecture (4 Raspberry Pi webservers, a balancer, and a database server, instead of just 2 webservers). That setup favored performance over availability, and with it the Pi 2 cluster held its own: it was only about 50% slower than the DigitalOcean droplets.

I'm excited to see how the Raspberry Pi 3 will do in this particular benchmark, with its upgraded and faster-clocked 64-bit CPU.

Drupal 7 vs Drupal 8

I’m doing some testing of D7 vs D8 performance, but please note this strong caveat: caching and performance metrics have changed a lot from D7 to D8. These tests used out-of-the-box Drupal installations with the standard profile, meaning anonymous page caching in D7 wasn't enabled. With it enabled, Drupal 7's anonymous number on the Dramble jumps up to 291.92 requests/second.

Additionally, when using Nginx (or Varnish) as a reverse proxy, Drupal 7 and Drupal 8 will be more or less identical in terms of anonymous throughput, since Drupal/PHP doesn't have to be touched at all.

Drupal version | Environment | Requests/second
7.43 | Dramble (standard profile, anonymous home) | 63.65
7.43 | Dramble (standard profile, authenticated home) | 37.56
8.0.5 | Dramble (standard profile, anonymous home) | 33.60
8.0.5 | Dramble (standard profile, authenticated home) | 8.47
7/8 | Dramble (cached via reverse proxy, anonymous home) | 3163.44

Benchmarks used:

  • $ wrk -t4 -c48 -d10 http://www.pidramble.com/ (anonymous).
  • $ ab -n 750 -c 10 -C "SESSxyz=XYZ" http://www.pidramble.com/?nocache=true (authenticated as uid 1).
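Both tools print their headline figure on a single summary line: wrk as `Requests/sec:` and ab as `Requests per second:`. The sketch below shows one way to pull that number out of captured output in Python. The function name and the sample output strings are hypothetical; only the two summary-line prefixes are taken from the tools themselves.

```python
import re

# Summary-line patterns: the first matches wrk, the second matches ab.
_PATTERNS = (
    re.compile(r"^Requests/sec:\s+([\d.]+)", re.MULTILINE),
    re.compile(r"^Requests per second:\s+([\d.]+)", re.MULTILINE),
)


def requests_per_second(output):
    """Extract the requests/second figure from wrk or ab text output."""
    for pattern in _PATTERNS:
        match = pattern.search(output)
        if match:
            return float(match.group(1))
    raise ValueError("no requests/second line found in output")


# Hypothetical, abbreviated output samples for illustration only.
wrk_sample = "  480 requests in 10.00s\nRequests/sec:     33.60\n"
ab_sample = "Requests per second:    8.47 [#/sec] (mean)\n"

print(requests_per_second(wrk_sample))
print(requests_per_second(ab_sample))
```

The text to parse can be captured with `subprocess.run([...], capture_output=True, text=True).stdout`, passing the same arguments as the commands listed above.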

Single Raspberry Pi Drupal 8 benchmarks

You can easily install Drupal on a single Raspberry Pi using the Drupal Pi project; it's sometimes easier to get a quick performance overview with just one Pi, and it's always a little faster to experiment on the full LEMP stack running on a single Pi, though networking and clustering capabilities are rather hard to test :)

Pi | Drupal version | PHP Version | Environment | Requests/second
3B | 7.43 | 7.0.4 | Standard profile, anonymous home | 393.51
3B | 7.43 | 7.0.4 | Standard profile, authenticated home | 52.02
3B | 8.0.5 | 7.0.4 | Standard profile, anonymous home | 128.03
3B | 8.0.5 | 7.0.4 | Standard profile, authenticated home | 12.02
3B | 8.0.5 | 5.6.x | Standard profile, anonymous home | 63.97
3B | 8.0.5 | 5.6.x | Standard profile, authenticated home | 4.87
2B | 8.0.5 | 7.0.4 | Standard profile, anonymous home | 94.35
2B | 8.0.5 | 7.0.4 | Standard profile, authenticated home | 8.61

Benchmarks used:

  • $ wrk -t4 -c48 -d10 http://www.drupalpi.dev/ (anonymous).
  • $ ab -n 750 -c 10 -C "SESSxyz=XYZ" http://www.drupalpi.dev/ (authenticated as uid 1).

MySQL benchmarks

MySQL DB Location | Type of test | Extra info | Time or Req/s
OWC Envoy USB SSD | DB export | 4.5 MB | 0:02
OWC Envoy USB SSD | DB import | 4.5 MB | 0:05
SanDisk Ultra Fit USB | DB export | 4.5 MB | 0:03
SanDisk Ultra Fit USB | DB import | 4.5 MB | 0:10
SanDisk Extreme microSD | DB export | 4.5 MB | 0:03
SanDisk Extreme microSD | DB import | 4.5 MB | 0:10

Importing and exporting a database isn't the most demanding thing to attempt for MySQL benchmarking, but I was mostly trying to get a feel for any order-of-magnitude differences in disk I/O with MySQL. I may run further database performance benchmarks at a later time to see how MySQL on the Pi handles different scenarios.
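For a rough sense of scale, the timings in the table can be turned into MB/s. This Python sketch assumes the time column is in minutes:seconds and that the full 4.5 MB dump was processed in each test; the helper names are hypothetical.

```python
# (storage, test, dump size in MB, elapsed m:ss) copied from the table above.
ROWS = [
    ("OWC Envoy USB SSD", "export", 4.5, "0:02"),
    ("OWC Envoy USB SSD", "import", 4.5, "0:05"),
    ("SanDisk Ultra Fit USB", "export", 4.5, "0:03"),
    ("SanDisk Ultra Fit USB", "import", 4.5, "0:10"),
    ("SanDisk Extreme microSD", "export", 4.5, "0:03"),
    ("SanDisk Extreme microSD", "import", 4.5, "0:10"),
]


def elapsed_seconds(text):
    """Convert an 'm:ss' string to seconds (assumed format of the time column)."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


def mb_per_second(size_mb, elapsed):
    """Average throughput in MB/s for a dump of `size_mb` taking `elapsed`."""
    return size_mb / elapsed_seconds(elapsed)


for storage, test, size_mb, elapsed in ROWS:
    print(f"{storage} {test}: {mb_per_second(size_mb, elapsed):.2f} MB/s")
```

Because the times are reported to the nearest second, these rates are coarse; they are only good for the order-of-magnitude comparison described above.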

Benchmark details

Cached HTTP requests

wrk -t4 -c100 -d30 http://www.pidramble.com/
ab -n 10000 -c 100 http://www.pidramble.com/

Rationale: Test the raw throughput of the Nginx reverse proxy. This test should only hit the first Pi, acting as a cache in front of the rest of the stack, and should just be a matter of Nginx serving as many requests per second as possible (typically 1,000+, but this depends mostly on the size of the page in question).

This test does not exercise the rest of the infrastructure: once Nginx has cached the first response locally, every later request is served from that local cache.

Current Bottleneck: Network I/O. If you use a Gigabit adapter, you can push through as many requests/sec as a ~200 Mbps connection can support. If you use the built-in Pi 2 LAN, you can get about 95 Mbps max.
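When the link is the bottleneck, throughput is roughly link bandwidth divided by bytes sent per response. The Python sketch below works that relationship in both directions. Pairing the ~200 Mbps and ~95 Mbps link rates with the cached results from the Pi 3 table (2630.80 and 1274.14 requests/second) is my assumption, and the calculation ignores protocol overhead.

```python
def bytes_per_second(link_mbps):
    """Convert a link rate in megabits/second to bytes/second."""
    return link_mbps * 1_000_000 / 8


def implied_response_bytes(link_mbps, requests_per_second):
    """Bytes on the wire per response if the link is fully saturated."""
    return bytes_per_second(link_mbps) / requests_per_second


def max_requests_per_second(link_mbps, response_bytes):
    """Upper bound on requests/second for a given response size."""
    return bytes_per_second(link_mbps) / response_bytes


# Assumed pairing of link rates with the cached Pi 3 results.
gigabit = implied_response_bytes(200, 2630.80)
fast_ethernet = implied_response_bytes(95, 1274.14)
print(f"Gigabit adapter: about {gigabit:.0f} bytes per response")
print(f"Built-in LAN:    about {fast_ethernet:.0f} bytes per response")
```

Under those assumptions both cases work out to a little over 9 KB per response, which is consistent with the same cached page being served at whatever rate the link allows.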

Uncached HTTP requests

wrk -t4 -c4 -d10 http://pidramble.com/?nocache=true
ab -n 100 -c 10 http://pidramble.com/?nocache=true

Rationale: Test the full stack for typical uncached responses. Setting the URL parameter nocache=true flags the request as uncacheable according to our Nginx proxy/balancer configuration. Requests will be distributed to the backend webservers, which in turn will load some data from the MySQL database server.

Current Bottleneck: Pi 2 CPU (specifically PHP processes on the webservers). If you run this test, you’ll notice that all the webserver Pis max out their CPU usage almost immediately. The other servers have a tiny bit of load, but nothing major. This is because Drupal 8 is (currently) a pretty CPU-heavy framework that requires a good deal of PHP processing for a standard request (even when using an opcode cache and retrieving data from Redis, which makes requests 10-15% faster than querying through MySQL).

Authenticated HTTP requests

ab -n 100 -c 10 -C "SESSxxxxxxxxxxx=xxxxxxxxxxxx" http://pidramble.com/

Rationale: This is the heaviest kind of request possible for a Drupal 8 site. Many sites don’t have much authenticated traffic, but if your site does, every request goes through the full stack, and often results in some database writes, reads, potentially some caching layer updates, etc. In all my performance optimization, I was able to squeeze out about 15-16 requests/second with a 6-Pi cluster. This number should see a linear increase with the number of webservers added, until resource limits of the cache and database server are reached.

Current Bottleneck: CPU (see previous Uncached HTTP requests benchmark). However, depending on the modules enabled and functionality provided to authenticated users, the cache or database servers could see more activity, though that is not the current bottleneck with 6 Raspberry Pi 2 nodes.
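Throughput figures like these also imply a response time. ab keeps a fixed number of requests in flight (`-c 10`), so the mean time each request takes is the concurrency divided by requests/second, which is the figure ab prints as "Time per request (mean)". The Python sketch below applies that to the authenticated results from the cluster tables, which were run with `-c 10`; the function name is hypothetical.

```python
def mean_response_seconds(concurrency, requests_per_second):
    """Mean time per request when `concurrency` requests are always in flight."""
    return concurrency / requests_per_second


# Authenticated requests/second from the cluster tables (ab run with -c 10).
AUTHENTICATED = {
    "Raspberry Pi 3 cluster": 11.28,
    "Raspberry Pi 2 cluster": 8.47,
    "Raspberry Pi Zero cluster": 0.68,
}

for cluster, rps in AUTHENTICATED.items():
    seconds = mean_response_seconds(10, rps)
    print(f"{cluster}: about {seconds:.2f} s per request at concurrency 10")
```

At a concurrency of 10 this works out to roughly 0.9 s per request on the Pi 3 cluster, 1.2 s on the Pi 2 cluster, and close to 15 s on the Pi Zero cluster.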