
Category: Linux

Linux/Ubuntu CPU Perf Scaling with Modern Intel CPUs

I’m posting this largely because all the documentation I can find, and the discussion around it, appears to be out of date, or at least not entirely accurate.

I run an Ubuntu server as my home NAS, storage server, general do-things host, and for development work on some sites I maintain. It’s built around an Intel Xeon E3-1220 v2, 16 GB of DDR3 RAM and storage running on ZFS.

By default, Ubuntu runs a process on boot called ondemand (/etc/init.d/ondemand). The process is simple enough: it’s a 73-line shell script that basically looks at the available CPU governors (/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors).

If “interactive”, “ondemand”, or “powersave” is among the available governors, it sets the first one it finds, checking in that order.

So for example, if you have an E3-1220 v2, then the output of /sys/.../scaling_available_governors will be powersave performance.

Since powersave shows up in the available list, it will then echo that to /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor.
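As a rough sketch of that selection logic (this is not the actual contents of /etc/init.d/ondemand, and pick_governor is a name made up for illustration), the first preferred governor found in the available list wins. The sketch only prints its choice; it doesn’t write to scaling_governor.

```shell
# Sketch only: mirrors the selection order described above.
# pick_governor is a hypothetical helper, not part of Ubuntu's script.
pick_governor() {
    # $1: space-separated list, as read from scaling_available_governors
    for preferred in interactive ondemand powersave; do
        for available in $1; do
            if [ "$available" = "$preferred" ]; then
                echo "$preferred"
                return 0
            fi
        done
    done
    return 1
}

# Report what would be selected on this machine, if the file exists.
avail_file=/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors
if [ -r "$avail_file" ]; then
    pick_governor "$(cat "$avail_file")" || echo "no preferred governor available"
fi
```

Calling pick_governor "powersave performance" prints powersave, which matches what happens on the E3-1220 v2.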

So far all this seems reasonable, in a sense, and for older CPUs or maybe AMD or ARM CPUs (I don’t know about these for sure, as I don’t have any AMD or ARM systems) this may be the ideal way to go about getting the clocks on the CPU to scale dynamically.

Ending an Era of OpenBSD: Or a Brief History of my Firewalls

For something approaching 20 years, I’ve used OpenBSD to firewall my network from the internet and provide basic network services (DHCP, DNS, NTP, VPN, etc.). Just recently I’ve decided to retire OpenBSD and standalone computers from the role of firewalls for something smaller, lower power, and easier to manage and upgrade.

I’ve been steadily moving towards smaller and lower power systems for as long as I’ve been doing OpenBSD based firewalls. My first machines were nothing more than mid-tower desktops that I had upgraded away from. In 2000-2003 I made my first moves towards building something more specialized, when I switched from using old towers to building purpose-specific micro-ATX pizza-box-style machines, though still with standard Athlon XP CPUs and parts.

In 2010 I replaced the micro-ATX Athlon XP with a mini-ITX based Intel Atom D510 machine. This halved the power consumption, from somewhere around 80-100 W[1] to something closer to 40 W.

Around 2015 or so I started looking into running OpenBSD off a USB flash drive instead of a standard hard drive. Part of this was to remove the power consumption of the HDD from the equation. In this final configuration, the D510 machine with 2 NICs and 2 GB of RAM turned in at a somewhat respectable 30 W, though that was hampered by an abysmally bad PSU with almost 0 power factor correction that pulled nearly 60 VA.

Lua String Compare Performance Testing (Nginx-Lua)

In another article I wrote about my ongoing attempt to move my server’s WordPress security plugin’s firewall functionality out of PHP and into the embedded Lua environment in Nginx. While I’m certainly nowhere near the scale where the C10K problem is a real issue for me, I still do my best to ensure that I’m doing things as efficiently as possible.

In my last post, I was looking at the performance degradation between doing no firewalling at all (just building the page in WordPress and serving it), and using the embedded Lua environment to do basic application firewalling tests.

In that article, I saw an approximately 425 microsecond latency impact from the Lua processing compared to just building the page. Of course, that was still roughly 2 orders of magnitude faster than doing the same work in PHP.

A large part of the actual processing being done is looking for various strings in the myriad data that’s pushed along with each request: things like known bad user agents, key bits used in SQL injection attacks, and so on.

Lua and Nginx both offer some options for searching strings. On the Lua side, there’s the built-in string.find() (Lua 5.1 docs) and associated functions. On the Nginx-Lua side of things there’s ngx.re.find() (lua-nginx-module docs), which allows calls into Nginx’s regex engine.

I’ve done a significant amount of digging trying to find performance information about both of these methods, and I haven’t been able to find any. So I sat down and did my own testing.

Nginx-Lua Module: Access Control Performance Testing

I’ve been playing with the Lua engine in Nginx for a while. My primary intent is to offload most, if not all, of my WordPress security stuff from running in the PHP environment to running in something that potentially won’t use as much in the way of resources. The first question I need to answer before I can reasonably consider doing this is what kind of overhead doing extended processing in Nginx-Lua imposes in terms of performance.

To put some perspective on this, I’ve been running the WordPress security plug-in Wordfence for a while now. When I compare my production server (which has Wordfence enabled) and my development server (which doesn’t have Wordfence installed, but is otherwise running the same plugins and code base), I see on average a 10–20 ms increase in page rendering times, and nearly 20 additional database queries per page.

The overhead from Wordfence isn’t creating a performance problem per se; however, shaving even 15 ms off a 50–60 ms page render time is an appreciable improvement. Additionally, fewer resources consumed by a bad actor means more resources are available for actual users.

In any event, the question here is how much performance overhead the Nginx-Lua module carries for doing some reasonable processing.

Setting up OpenVPN with Certificates

I did this a couple of years ago, with certificates that had a 1 year expiry date. Then my certs expired, and I’d forgotten what to do. So I figured it out again, and this time I’m writing it down.

There are two ways to set up client auth in OpenVPN: a shared secret and TLS certificates. TLS certificates are the preferred way if you can manage them, as they make it possible to revoke access for individual devices without having to change the shared secret for every other device.

To do this you need to set up a certificate authority and sign and issue your own certificates. Most OpenVPN guides tell you how to do this using OpenSSL and its associated long cryptic commands. I like my method better.

Lets Encrypt & Nginx: Www-root method and Subject Alt Names

Digital Ocean has a pretty good guide for setting up Let’s Encrypt with Nginx on Ubuntu 14.04. However, their guide requires you to turn down your Nginx server while initially getting your Let’s Encrypt TLS certificates. This of course is problematic for server/site operators who need or want service continuity while getting Let’s Encrypt certificates. They also don’t explain how to use subject alternative names to handle multiple subdomains on the same server.

Let’s Encrypt’s software requires that they be able to connect to your server to verify that you control the domain you’re attempting to register a certificate for. In the Digital Ocean guide, this process is handled by using the built-in web server in the Let’s Encrypt package. However, Let’s Encrypt does not need to operate in this manner to create new certificates; it can use the wwwroot/filesystem approach and your existing server configuration.

The process is very similar to Digital Ocean’s guide, but the order of operations is slightly different.
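To give a feel for what a webroot request with subject alternative names looks like, here is a minimal sketch. The web root path and host names are placeholders, and I’m assuming the client is invoked as certbot (the current name of the Let’s Encrypt client; older installs used letsencrypt or letsencrypt-auto). The sketch only builds and prints the command rather than running it.

```shell
# Hypothetical values: swap in your own web root and host names.
webroot=/var/www/example
domains="example.com www.example.com blog.example.com"

# Build one -d flag per name. Every name listed ends up on the same
# certificate as a subject alternative name.
set -- certonly --webroot -w "$webroot"
for d in $domains; do
    set -- "$@" -d "$d"
done

# Print the command instead of running it, so nothing is requested.
cmd="certbot $*"
echo "$cmd"
```

Because the client writes its challenge files under the web root and your existing Nginx serves them, the server never has to be stopped.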

Fixing my slow Wordpress, Nginx, & WP-Supercache Setup

I probably should have caught this one a long time ago, but I didn’t. For a while now I’ve been complaining endlessly, at least in my internal monologue, about the poor performance I’ve been seeing from WP-Supercache on my VPS. Preloaded cache files simply shouldn’t take 1-1.5 seconds to serve up. They should be quick quick quick. Yet I was seeing such slow load times.

I’ve been struggling with the issue for quite some time. I had changed from WP-Supercache to W3 Total Cache, added memcached, cached DB operations and so on and so forth trying to figure out why my pages were so slow to load.

What struck me was that when I rolled back my W3 Total Cache implementation to no caching, responsiveness stayed about the same or got slightly better. As a non-logged-in user, the inverse should have been true. Pages should have taken longer to load without a caching implementation than with one.

Then it hit me: Nginx config files are parsed in order.

Okay, let me take a step to the side here for a moment. On Dreamhost VPSes, if you’re running Nginx, the server looks for supplementary config files in ~/nginx/$domain/ for each of the virtual hosts it’s configured to serve. Knowing that Nginx config files are read in order, I organized mine using the ##-description.conf style. So I might have 10-rewrites.conf, 50-wordpress.conf, 60-supercache.conf.

On a lark the idea struck me that maybe I should try loading the supercache rules before I got to the regular Wordpress rules, which include the directives for passing requests for .php files back to the PHP back-end.

A quick rename of 45-supercache.conf to 30-supercache.conf, thus placing the supercache rules ahead of the Wordpress rules, and my non-logged-in user (i.e. reading from the static HTML cache) page response times dropped from 1-1.5s to less than 400ms.
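The effect of the numeric prefix is easy to check from a shell. This is only a sketch: the file names are made up, a throwaway directory stands in for ~/nginx/$domain/, and I’m assuming the files get pulled in by a wildcard include, which picks them up in sorted order the same way a shell glob does.

```shell
# Hypothetical file names in a temporary directory, created out of order.
confdir=$(mktemp -d)
touch "$confdir/50-wordpress.conf" \
      "$confdir/10-rewrites.conf" \
      "$confdir/30-supercache.conf"

# A glob expands in sorted order regardless of creation order, so this
# lists the files in the order they would be read.
load_order=$(cd "$confdir" && printf '%s\n' *.conf)
echo "$load_order"

rm -r "$confdir"
```

With the 30- prefix, the supercache file lists ahead of 50-wordpress.conf, so its rules are seen before the ones that hand requests to PHP.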

Suffice it to say, all my grumbling about slow performance was entirely due to poor configuration on my part.

I’m sure there’s probably a note about this somewhere in one of the myriad of guides for running Wordpress and WP-Supercache on Nginx, but I missed it.

I’d still love to improve the response time for pages that are being processed with PHP, but I can live with it being a touch slower for me knowing that it’s a whole lot faster for everybody else.

Bash: Watching Aliases

If you’re trying to watch the output of an alias, you need to make watch an alias of watch as well.

For example, if you have an alias like, say:

alias zfslist='zfs list -o name,volsize,used,available,referenced,compressratio,mountpoint'

And want to watch the output over time as if you ran:

user@host:~/$ watch zfslist

Then you need to set up watch as an alias to watch, as follows.

alias watch='watch '

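The trailing space after watch is what does the work: when an alias’s value ends in a space, bash also checks the next word on the command line for alias expansion. Without it, only the first word is expanded and watch goes looking for a program literally named zfslist.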
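If you want to see the difference without zfs or watch installed, here is a small self-contained sketch. The alias names are made up for the demo, and env stands in for watch, since it also just runs whatever program it is given.

```shell
# bash only expands aliases in scripts when asked to; other shells
# that lack shopt expand them by default, so the failure is ignored.
shopt -s expand_aliases 2>/dev/null || true

alias zfslist_demo='echo hello from an alias'  # hypothetical stand-in for zfslist
alias run_plain='env'                          # no trailing space
alias run_spaced='env '                        # trailing space

# env looks for a program literally named zfslist_demo and fails.
plain_status=0
plain_out=$(run_plain zfslist_demo 2>/dev/null) || plain_status=$?

# The trailing space makes the shell expand zfslist_demo first.
spaced_out=$(run_spaced zfslist_demo)

echo "plain exit status: $plain_status"
echo "spaced output: $spaced_out"
```

The plain version exits non-zero because no such program exists, while the spaced version prints the alias’s output.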

Unclear Instructions: Setting Ubuntu/Unity Keyboard Shortcuts

I don’t know who wrote the text for the dialog for changing Keyboard shortcuts in Ubuntu 11.10, but wow could they have been more misleading.

The instructions read, “click the row and hold down the new keys.” In reality, the only way to change the shortcut is to click the text directly under the black arrow in the above image. I spent the better part of 20 minutes trying to figure out why I couldn’t change a keyboard shortcut because the directions are utterly useless.

This should probably also be filed under: Won’t file a bug report as it’s too much bloody work, and now I won’t forget how to do it.

Wordpress on Nginx on Dreamhost

By the time this is done being written I’ll have been running Wordpress on Nginx on a Dreamhost VPS. Better yet, I’ll be doing it with a smaller, more well-defined resource footprint, with better response times, and faster page loads than I had with Apache. The tradeoff: some more upfront configuration.

I’ve covered the broad strokes about my motivation here, here, and here. In short, Nginx offers a more consistent, dependable level of resource usage while still having the capability of scaling to serve many users, reducing the possibility of crashing the VPS while under moments of heavy load.

This is long, and may end up being multiple parts, so if you’re interested follow the jump and keep reading.