A little something we tripped over this week. We’re providing an experimental DNS-over-TLS server that supports TLS v1.3. Right now TLS v1.3 is still an Internet Draft; in other words, it’s not a finished standard, though close to it. The latest version of the draft is draft 23, support for which was merged into the OpenSSL master branch yesterday, January 25th. Yup, we’re living on the bleeding edge.
Support for the final standard TLS v1.3 will be in the next OpenSSL release, v1.1.1.
We’re providing the service by fronting a regular name server with haproxy v1.8.3 built against OpenSSL master.
For some time, our experimental server has happily accepted connections for an hour or two, but then stopped accepting new connections. To deepen the mystery, it’s configured in exactly the same way as two other servers that are working fine; the only difference is that those servers are using the standard packaged OpenSSL libraries from Ubuntu Xenial. In odd moments this week I’ve been digging into why.
The answer turns out to be entropy. OpenSSL needs a source of random bits for its crypto magic, and these are provided by a Deterministic Random Bit Generator (DRBG) seeded by some entropy. This part of OpenSSL has been completely rewritten for v1.1.1, and while I’m certainly not in a position to judge the technical details, the code looks far cleaner than the previous code, and appears to offer expanded possibilities for alternate entropy sources and hardware DRBG in the future. So, a thoroughly good thing.
However, there is change in behaviour on Linux compared to OpenSSL v1.1.0 and previous. In the old version, OpenSSL would attempt to read entropy from /dev/urandom (or /dev/random or /dev/srandom if not found). It would then mix in entropy from any Entropy Gathering Daemon (EGD) present, and then mix in further entropy based on the process PID, process UID and the current time. In v1.1.1 at present (the comments indicate an ongoing discussion on this), only the first configured entropy source is used, which in the case of a default Linux build is getting entropy from /dev/urandom (and again falling back to /dev/random or /dev/srandom if not found).
We have haproxy configured to run in a chroot jail. And this chroot jail did not contain /dev/urandom and friends. As it happens, OpenSSL obtains its first slab of entropy before the chroot takes effect, so that succeeds and haproxy starts to run. When, however, OpenSSL needs to read more entropy (which by default will be after at hour at latest), it cannot open /dev/urandom and friends and get more entropy. This appears as the connection failing to open, as SSL_new() fails. This is generally reported as a memory allocation failure. If you print the OpenSSL error chain, it’s slightly more informative:
140135125062464:error:2406C06E:random number generator:RAND_DRBG_instantiate:error retrieving entropy:crypto/rand/drbg_lib.c:221: 140135125062464:error:2406B072:random number generator:RAND_DRBG_generate:in error state:crypto/rand/drbg_lib.c:479: 140135125062464:error:2406C06E:random number generator:RAND_DRBG_instantiate:error retrieving entropy:crypto/rand/drbg_lib.c:221: 140135125062464:error:2406B072:random number generator:RAND_DRBG_generate:in error state:crypto/rand/drbg_lib.c:479: 140135125062464:error:2406C06E:random number generator:RAND_DRBG_instantiate:error retrieving entropy:crypto/rand/drbg_lib.c:221: 140135125062464:error:140BA041:SSL routines:SSL_new:malloc failure:ssl/ssl_lib.c:839:
So, if you’re seeing mysterious OpenSSL failures and you are running in a chroot jail, make sure /dev/urandom at least is available.
# mkdir -p <chroot-base>/dev # mknod <chroot-base>/dev/urandom c 1 9 # chmod 0666 <chroot-base>/dev/urandom
In fact, we’d recommend you do the same if you’re using OpenSSL 1.1.0 or before in an application run in a chroot. If you don’t, and you aren’t running an EGD, the chances are that the only entropy you’re getting is from your PID, UID and the time. All of which may be guessable from a relatively small range.
At least we recommend you read this page on the OpenSSL wiki, which discusses the issue in more detail.