Skip to content

Lua String Compare Performance Testing (Nginx-Lua)

Last updated on August 17, 2018

In another article I wrote about my ongoing attempt to move my server’s WordPress’s security plugin’s firewall functionality out of PHP and into the embedded lua environment in Nginx. While I’m certainly not nearly as the scale where the C10K problem is a real issue for me, I still do my best to insure that I’m doing things as efficiently as possible.

In my last post, I was looking at the performance degradation between doing no firewalling at all (just building the page in WordPress and serving it), and using the embedded Lua environment to do basic application firewalling tests.

In that article, I saw approximately 425 microsecond latency impact form the Lua processing compared to just building the page. Of course, that was still on the order of 2 orders of magnitude faster than doing the same work in PHP.

Part of the larger part of the actual processing that is being done, is looking for various strings in the myriad of data that’s pushed along as part of the various requests. Things like, know bad user agents, key bits used in SQL injection attacks, and various things like that.

Lua and Nginx both offer some options for searching strings. On the Lua side, there’s the built in string.find() (Lua5.1 docs) and associated functions. On the Nginx-Lua side of things there’s ngx.re.find() (lua-nginx-module docs) which allows calls into Nginx’s regex engine.

I’ve done a significant amount of digging trying to find performance informational about both of these methods, and I haven’t been able to find any. So I sat down and did my own testing.

Native Lua – string.find()

Lua’s native string.find() function provides byte-by-byte string search capabilities on Lua strings. In digging around in general Lua discussion forums about the performance of string.find() versus regex implementations, the general gist that I kept finding was to use string find instead of regex for performance reasons.

String.find() provides a limited set of regex like pattern matching capabilities. The full details are outlined in the above linked Lua5.1 docs page. The broad gist of it though is that string.find() can search for generalized patterns (e.g. %s meaning all white space characters), as well as specific character sequences, (e.g. [abc] meaning the letter a, b, or c).

While much of the search capabilities looks somewhat like regex in spirit and format, there are significant deficiencies in the string.find() engine that separate it from regex.

For starters, pipes (|) cannot be used to separate alternative matches. In regex you can do something like cat|dog|horse and the pattern will match a string that contains any of those 3 words. This is not possible in Lua’s string.find() engine. Instead, to mimic the capability you have to repeatedly search for the various strings. Something similar to the following:

local haystack = "the string you want to search for the content in"
local search = { 'cat', 'dog', 'horse' }
local found = false
for _,v in pairs( search ) do
  if haystack:find( v ) then
    found = true
    break
  end
end

The other major departure is that string.find() is case sensitive. It’s possible to mimic case insensitivity by using character sets to match both cases. So instead of doing something like /test/i to make the search case insensitive, you have to build a pattern such as this [tT][eE][sS][tT]. Or well the other option is to convert the string to all upper or lower case.

Nginx Regex

Nginx implements a Perl Compatible Regular Express (PCRE) regex engine. Honestly, there’s not a whole lot to say about this. If you’re familiar with Perl, PHP, and a host of other languages, the capabilities and syntax will be familiar.

That said, there are a couple of points worth touching on. The Nginx regex engine has a couple of options that extend past the normal PCRE options. Specifically the ones of most note are “j” and “o”. “J” enables the PCRE JIT compilation engine. To go with “j”, is the “o” which enables compile-once mode.

There are some limitations to compile-once mode, such as the number of expressions that can be cached (1000 by default, though you can change the lua_regex_cache_max_entries variable to expand it if necessary.

OpenRESTY (the people behind the lua-nginx-module) don’t provide much in the way of discussions about the performance of the Nginx regex engine, other than to say that you should use ngx.re.find() whenever possible over ngx.re.match() due to the latter needing to create lua tables.

Performance Testing

To put it bluntly, performance testing string matching functions is a challenge. There are quite a few variables that can significant alter the performance of the search function. This includes things like where the target word is in the search string, if it’s there at all, the size of the regular expression being used, the kinds of matches involved, and the size of the search string; among others.

That makes it very difficult to make a single simple test.

For these tests I generate a random string that’s at least a specified size in bytes (e.g. 2^6 characters). Every 5 characters a space is inserted. Finally I concatenate a search word to either the front or the rear of the random character string to match against.

The test is repeated in a loop 100,000 times to generate a time for the run. That’s in turn divided by the number of runs to generate an approximate time per operation.

Each test I’m going to present has a summary table in it. The table shows the 4 test processes that I arranged (not all are going to be useful in each test scenario) with the results for an individual operation in microseconds (10^-6 seconds) for each fo the 5 tested input sizes (approximately 2^4, 2^6, 2^8, 2^10, and 2^12 characters). The three numbers are for the following cases in order:

  • Match at front of test string
  • Match at rear of test string
  • No match in test string

These tests are being run on my development web server. It’s specs are:

  • Xeon E3-1220v2 (3.1 GHz, Turbo to 3.5 GHz, 8MB cache)
  • 16 GB of DDR-1066 ECC RAM
  • 128 GB Samsung EVO (EXT4)
  • 2x 4TB Seagate Iron wolfs (ZFS Mirror)

The relevant software is:

  • Nginx version 1.13.0
  • Luajit version 2.0.2
  • Ubuntu 14.04.4 LTS / 4.4.0-79 generic

Simple String Test Run

Starting off I want to look at a simple string find.

I’m searching for the string “test” in various size random strings.

Engine / Algo 26 (2^4) 83 (2^6) 315 (2^8) 1239 (2^10) 4939 (2^12) [10k loops]
Ngx.re.find joui 0.27 / 0.29 / 0.26 0.27 / 0.34 / 0.31 0.28 / 0.56 / 0.54 0.27 / 1.32 / 1.3 0.3 / 4.5 / 4.5
string.find() 0.08 / 0.08 / 0.07 0.08 / 0.07 / 0.07 0.07 / 0.1 / 0.09 0.08 / 0.23 / 0.24 0.1 / 0.8 / 0.8
string.find() icase 0.14 / 0.48 / 0.41 0.14 / 1.36 / 1.3 0.14 / 4.95 / 4.88 0.14 / 19.33 / 19.34 0.1 / 77.4 / 77.4
find loop icase 0.16 / 0.5 / 0.44 0.16 / 1.39 / 1.31 0.16 / 4.97 / 4.9 0.16 / 19.34 / 19.28 0.2 / 77.5 / 77.4

The 4th run and the 3rd run put in similar times because the 4th run is merely the 3rd run looped over a table with a single string in it. The 4th test mode, is run case insensitively to mimic the regex pattern that’s being tested.

There are a couple of points to take away from this test. Plain string searches using Lua’s built in string.find() can be very fast. Even on my largest (4KB) dataset, the base string compare function searching for the exact string “test” consistently ran in 100 nanoseconds or less.

The early takeaway is to always use string.find() when you can exactly match the string you’re searching for.

Simple Test 2: Multiple Search Strings (small list, match early)

This test looks at matching against multiple terms in a string.

The regex pattern for this test was

/test|cat|dog|lion|cheetah|leopard|lynx/i

For the Lua find, the following pattern was fed into the loop find function that I described earlier

{ 'test', 'cat', 'dog', 'lion', 'cheetah', 'leopard', 'lynx'}

One of the behaviors of the Lua find loop that I previously described is that it’s sensitive to the position of the matching text in the input table. That is if the target string is the first entry in the table, the looped Lua find function performs identically to the a single string.find() call. However, if the search is not in the first position the process will run longer.

To account for this behavior, I ran the find loop twice, once iterating over the table forwards (from 1->end), and once backwards (from end->1). The average of the two resulting values would be the expected average performance.

Engine / Algo 26 (2^4) 83 (2^6) 315 (2^8) 1239 (2^10) 4939 (2^12) [1k loops]
ngx.re joui 0.28 / 0.32 / 0.31 0.29 / 0.45 / 0.43 0.28 / 0.96 / 0.91 0.28 / 3.17 / 3.13 0.3 / 14 / 13.9
string.find 0.07 / 0.07 / 0.07 0.07 / 0.08 / 0.07 0.08 / 0.1 / 0.09 0.08 / 0.23 / 0.24 0.1 / 0.8 / 0.8
string.find icase 0.15 / 0.48 / 0.41 0.15 / 1.36 / 1.29 0.14 / 4.95 / 4.88 0.14 / 19.33 / 19.26 0.2 / 77.4 / 77.4
string.find loop icase 0.16 / 0.51 / 2.97 0.17 / 1.38 / 9.24 0.16 / 4.97 / 34.45 0.16 / 19.35 / 135.56 0.1 / 77.5 / 540.3
string.find loop rev icase 3.24 / 3.57 2.97 8.62 / 9.85 / 9.24 30.26 / 35.08 / 34.44 116.97 / 136.19 / 135.58 463.8 / 540.9 / 540.2
string.find loop avg 1.7 / 2.04 / 2.97 4.4 / 5.62 / 9.24 15.29 / 20.03 / 34.45 58.57 / 77.77 / 135.57 232 / 308.9 / 540.6

It’s at this point that we start to see the problem. While straight exact matches are very fast in Lua’s string.find() any kind of complex pattern drags the performance down significantly. Moreover, as the size of the search space increases, the time to process the data starts to scale significantly.

There’s a, potentially larger, second problem here though. Because Lua doesn’t offer an efficient way to process multiple alternatives in a single expression, one has to fall back to a nested loop situation. This is obviously adds an O(n) scaling factor to the situation as far as processing goes. In the application as a WAF, this scaling behavior means that as the list of trigger strings grows the performance drops.

Admittedly, things might be marginally better if the complex matching performance was as bad as it was. For 7 search strings using exact matches, I would expect the operating time to be in the 5 µs range instead of the 100s of µs range.

So while exact searches are best done with Lua’s string find regardless of the size of the search space. Once complex patterns are involved, sticking to string.find() is clearly no longer desirable.

Larger Alternative Sets

For this test, I added another 40 odd strings to the search set. The objective here it so see how the Nginx regex engine behaves when it has to search for many alternate strings as opposed to just a few. In other words, instead of processing a regex that looks like:

/test|cat|dog|lion|cheetah|leopard|lynx/i

We’re now searching with a patter that’s some 434 byes long (and now including complex patterns as well, [yes, I know that’s bad benchmarking I should only change one thing at a time]).

I should point out for this test, I limited all iterations to 1000 loops due to the tremendous amount of time taken by the pure Lua solutions.

Engine / Algo 26 (2^4) 83 (2^6) 315 (2^8) [10k loops] 1239 (2^10) [1k loops] 4939 (2^12) [1k loops]
ngx.re joui 0.5 / 0.99 / 0.88 0.49 / 2.77 / 2.62 0.5 / 9.8 / 8.8 1.0 / 36 / 35 1 / 146 / 147
string.find 0.08 / 0.07 / 0.07 0.07 / 0.08 / 0.07 0.1 / 0.1 / 0.1 0 / 0 / 0 0 / 1 / 1
string.find icase 0.14 / 0.48 / 0.41 0.15 / 1.36 / 1.3 0.1 / 4.9 / 4.9 0 / 19 / 20 0 / 77 / 77
string.find loop icase 0.18 / 0.51 / 17.65 0.18 / 1.4 / 53.12 0.2 / 5 / 196.8 0 / 19 / 770 0 / 77 / 3064
string.find loop rev icase 21.04 / 21.36 / 17.67 55.65 / 56.88 / 53.12 195.7 / 200.4 / 196.8 754 / 774 / 769 2991 / 3069 / 3064

The Lua string.find() behaviors aren’t so interesting here. They explode like you would expect, from an O(n) algorithm backed by a relatively slow string processing library. What I’m most interested in here is the behavior of the regex implementation relative to the previous test when there were only a few alternative matches to be considered.

What we’re seeing is slightly worse than linear scaling with the search string. I went from 7 alternatives to look for to 52, a 7.4x increase in patterns. At the same time, we see the search time for an 4K string grow from 14 µs average to 146 µs average. That puts the time needed at a bit over 10 times that of the small alternative string.

Conclusions

What can we take away from this?

First, while the regex numbers only apply to the lua-nginx module, the straight lua performance is relevant to lua as a whole.

Second, Lua’s native string.find() is extremely fast, so long as it’s being used with exact matches (case sensitive, no globing, no wildcards). Once you start throwing wildcards into the mix, string.find() slows down considerably.

Though I should point out, I’m running these tests in Luajit 2.0.2, since that’s the version that’s the version that’s available in the Ubuntu package repository. The OpenRESTY project currently is recommending using Luajit 2.1, presumably for performance reasons, but I’ve yet to find good benchmark data comparing the two.

Third, for complex text searches in Lua-Nginx, it’s clearly preferable to use the ngx.re.find functions, not native lua find calls as one might expect.

To put some real world perspective on this, since ultimately that was my point here. When I published my last article looking at lua-nginx performance in a WAF like role in front of WordPress, I noted that when lua was enabled I was seeing a 425 µs increase in latency of no processing at al (including no PHP based WordPress security plugins[1]). Going back through my WAF code, and replacing the loop based string.find operations with a single loop to generate a regex pattern from a table and a single ngx.re.find call, reduced the latency added by lua to a mere 37 µs.


  1. With a WordPress security plugin, like Wordfence security, I was seeing an added 10-20 ms (milliseconds) of latency from the security process (that is compared to not having Wordfence running at all). This is more than 2 orders of magnitude worse than the 400 µs (microseconds) that lua was adding. ↩︎
Published inComputersLinuxWordpress