Background: DNS resolution on Linux is controlled by /etc/resolv.conf, where up to three nameservers can be configured, among other things (search list, timeout, attempts, etc.).
Nameservers are queried in order; the second nameserver is only asked if there is no response from the first. When there is an answer (even a negative one), further nameservers are not consulted. This can be changed with the rotate option.
The man page has more info.
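As a rough illustration of what ends up in that file, here is a minimal C++ sketch that reads resolv.conf-style text and collects the nameserver, search and options entries. The ResolvConf struct and parse_resolv_conf() are made-up names for this sketch, not part of glibc or any other library. The limit of three nameservers and the "last search line wins" rule follow the man page; everything else is simplified.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper type for this sketch, not part of any library.
struct ResolvConf {
    std::vector<std::string> nameservers;  // only the first three are kept
    std::vector<std::string> search;
    std::vector<std::string> options;
};

ResolvConf parse_resolv_conf(std::istream& in) {
    ResolvConf conf;
    std::string line;
    while (std::getline(in, line)) {
        // Lines with '#' or ';' in the first column are comments.
        if (line.empty() || line[0] == '#' || line[0] == ';') continue;

        std::istringstream words(line);
        std::string keyword;
        if (!(words >> keyword)) continue;  // whitespace-only line

        std::string value;
        if (keyword == "nameserver") {
            // Only the first token after the keyword is the address.
            if (words >> value && conf.nameservers.size() < 3)
                conf.nameservers.push_back(value);
        } else if (keyword == "search") {
            conf.search.clear();  // the last search line wins
            while (words >> value) conf.search.push_back(value);
        } else if (keyword == "options") {
            while (words >> value) conf.options.push_back(value);
        }
    }
    return conf;
}
```

Feeding it a std::ifstream opened on /etc/resolv.conf, once on the host and once inside a container, is a quick way to compare what each side actually sees.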
At least for bridged container network configuration (default), Docker mounts some host files into the container:
$ mount
...
/dev/nvme0n1p3 on /etc/resolv.conf type ext4 (rw,relatime,errors=remount-ro)
/dev/nvme0n1p3 on /etc/hostname type ext4 (rw,relatime,errors=remount-ro)
/dev/nvme0n1p3 on /etc/hosts type ext4 (rw,relatime,errors=remount-ro)
Hence, the resolv.conf that the resolver service of the host uses is also used by the container.
I.e. on a system with systemd's resolved, the nameservers used by resolved will be used by the container, and NOT the local resolver 127.0.0.53. I don't know why; I think this makes no sense and complicates configuration.
All nameservers in /etc/resolv.conf are expected to return the same information. However, this is not the case if local, private and public nameservers are mixed. Private domains (such as example.local) can only be resolved by the private nameserver. In principle, this can be configured in resolved, but it is not easily passed on to the container.
In case multiple nameservers are configured for the container and the first (local, private) nameserver is unreliable or too slow, the fallback nameserver will be queried. This leads to sporadic host name lookup failures for private hosts on a local domain.
nameserver 172.x.y.z # private, can resolve example.local
nameserver 1.1.1.1 # public
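To see why this setup fails only sporadically, here is a toy model in C++ of the "first answer wins, fall back only on silence" rule described above. Nothing here talks to a real network: the Nameserver type, private_server(), public_server(), lookup() and the host name host.example.local are all invented for this sketch.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Toy model only: all names are hypothetical, no real DNS traffic is involved.
enum class Status { Ok, NxDomain };

struct Reply {
    Status status;
    std::string address;  // empty for negative answers
};

// A nameserver is modelled as a function; std::nullopt means "no response".
using Nameserver = std::function<std::optional<Reply>(const std::string&)>;

// Private server: knows the private host, but may be unresponsive.
Nameserver private_server(bool responsive) {
    return [responsive](const std::string& host) -> std::optional<Reply> {
        if (!responsive) return std::nullopt;
        if (host == "host.example.local") return Reply{Status::Ok, "172.16.0.10"};
        return Reply{Status::NxDomain, ""};
    };
}

// Public server: always responds, but in this model knows no private names.
Nameserver public_server() {
    return [](const std::string&) -> std::optional<Reply> {
        return Reply{Status::NxDomain, ""};
    };
}

// Ask the servers in order; the first answer, positive or negative, is final.
std::optional<Reply> lookup(const std::vector<Nameserver>& servers,
                            const std::string& host) {
    for (const auto& server : servers) {
        if (auto reply = server(host)) return reply;
    }
    return std::nullopt;  // nobody responded at all
}
```

With a responsive private server the lookup succeeds; as soon as it misses its timeout, the public server's negative answer is taken as final, which is exactly the sporadic failure seen in the container.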
The dockerd config file (--config-file=/var/snap/docker/nnnn/config/daemon.json) for Snap luckily lives in /var/snap/docker/current/config/ and is editable, hurray!
Edit /var/snap/docker/current/config/daemon.json to override DNS configuration for all containers:
{ "dns": ["172.x.y.z"] }
Restart the docker container service:
sudo snap restart docker
posted at: 19:05 | path: /configuration | permanent link
A signed Windows executable allows Windows to display the publisher name in the UAC dialog, except that sometimes it doesn't work. Windows uses Authenticode to verify the integrity of a PE32 executable and provide authentication via code signing.
One way to learn more about what UAC does w.r.t. crypto is to enable CAPI2 diagnostics, i.e. event logging.
Things to remember: the entire certificate chain up to, but not including, the root CA's certificate should be in the executable, i.e. all intermediate certificates. When certificates are missing, they might be retrieved via Authority Information Access (AIA), specified in RFC 5280, using HTTP URLs given in the certificates.
Different applications implement different verification policies: caching of certificates, revocation list checks, etc. It's not clear what checks Windows, the UAC dialog, or other applications perform to verify the authenticity of an executable.
Tooling is difficult: again, it's not clear what the verification policy is. For example, Microsoft's signtool does not complain about missing intermediate certificates.
Looking for some more mystery to research: Try page hashes!
posted at: 00:45 | path: /programming | permanent link
Sometimes one feels adventurous: so I made the hasty decision to upgrade from Ubuntu 22.04 to Ubuntu 24.04 (Noble Numbat) in alpha/pre-beta state -- I swear, this has worked nicely previously.
The update went semi-smoothly; only the network/nameserver configuration was lost, but that was easy to fix by manually installing the network-manager package and editing /etc/resolv.conf. It's good if you can remember how to manually bring up a network interface, add a default route, and set a static DNS:
ifconfig -a                        # to discover available network interfaces
ifconfig enp5s0 192.168.1.123      # set static IP
route add default gw 192.168.1.1   # add default route
nano /etc/resolv.conf              # set static DNS
All looks good, except: Firefox cannot resolve any host name. Also Chromium. Also Slack is dead. Uugh, the machine is pretty useless without the Web.
However, network connectivity is good: ping works, nslookup works. It turned out that only snap applications are affected, but that was not so clear in the beginning.
Initially, I thought some browser-related package is at fault. Here are my notes of the debugging process and the train of thought; in any case, I learned quite a bit...
I used strace to look for failing syscalls. I also tried ltrace, but that did not really work.
So strace showed that the /etc/nsswitch.conf and /etc/hosts files are read, followed by a connect() call to the DNS resolver/nameserver given in /etc/resolv.conf (127.0.0.53, port 53 in my case); the DNS query is sent and the DNS response is received, but the recvfrom() syscall returns an error code (EINVAL). Inspecting the query traffic with wireshark, everything looked good on the network layer.
I made a couple of experiments: the browser had no issue connecting to IP addresses directly, also a connection to a hostname in the /etc/hosts file was no problem. So DNS...
Next, I tried to corner the issue from above and learned about Firefox's built-in debugging and logging capabilities. Firefox is my main browser; it exposes a wide range of information. Entering about:networking in the browser's address bar, one can observe recent HTTP connections, network socket information, and a glimpse of the DNS cache (see screenshot). Using the 'DNS Lookup' functionality, it is easy to confirm that Firefox cannot resolve any hostname.
Entering about:logging presents Firefox's logging manager. One can restrict logging to certain modules and log levels (supposedly, that is what '5' means). The Firefox Profiler is an online facility for analyzing log files and obviously not very helpful for tracking down network issues, hence I selected 'Logging to a file'. Since Firefox is usually run in a Snap container, the log file actually ends up in the /tmp/snap-private-tmp/snap.firefox/tmp/ directory of the file system.
A successful DNS lookup for 'example.com' produces the following lines:
[Parent 17655: Main Thread]: D/nsHostResolver Resolving host
[Parent 17655: Main Thread]: D/nsHostResolver No usable record in cache for host
[Parent 17655: Main Thread]: D/nsHostResolver NameLookup host:example.com af:0
[Parent 17655: Main Thread]: D/nsHostResolver NameLookup: example.com effectiveTRRmode: 1 flags: 0
[Parent 17655: Main Thread]: D/nsHostResolver TRR service not enabled - off or disabled
[Parent 17655: Main Thread]: D/nsHostResolver NativeLookup host:example.com af:0
[Parent 17655: Main Thread]: D/nsHostResolver DNS thread counters: total=6 any-live=0 idle=6 pending=1
[Parent 17655: Main Thread]: D/nsHostResolver DNS lookup for host
[Parent 17655: DNS Resolver #123]: E/nsHostResolver DNS lookup thread - Calling getaddrinfo for host
[Parent 17655: Main Thread]: D/nsHostResolver Resolving host
[Parent 17655: Main Thread]: D/nsHostResolver No usable record in cache for host
[Parent 17655: Main Thread]: D/nsHostResolver NameLookup host:example.com af:0
[Parent 17655: Main Thread]: D/nsHostResolver NameLookup: example.com effectiveTRRmode: 1 flags: 0
[Parent 17655: Main Thread]: D/nsHostResolver TRR service not enabled - off or disabled
[Parent 17655: DNS Resolver #123]: E/nsHostResolver DNS lookup thread - lookup completed for host
[Parent 17655: DNS Resolver #123]: D/nsHostResolver nsHostResolver::CompleteLookup example.com 7650d9e24650 0 resolver=0 stillResolving=0
[Parent 17655: DNS Resolver #123]: D/nsHostResolver nsHostResolver record 7650de6a1060 new gencnt
[Parent 17655: DNS Resolver #123]: D/nsHostResolver Caching host
[Parent 17655: DNS Resolver #123]: D/nsHostResolver CompleteLookup: example.com has 93.184.215.14
[Parent 17655: DNS Resolver #123]: D/nsHostResolver CompleteLookup: example.com has 2606:2800:21f:cb07:6820:80da:af6b:8b2c
There are a couple of interesting things: (1) it calls getaddrinfo(), which lives in the C runtime; (2) nsHostResolver is a module in Firefox's code base, which is browsable online, e.g. using searchfox; (3) it ultimately calls PR_GetAddrInfoByName(), which lives in the libnspr4 package/library and seemed like a good candidate to investigate further. Again, it's browsable online and indeed, it calls getaddrinfo().
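Since everything boils down to getaddrinfo(), it can help to call it directly, outside of Firefox and NSPR. The following is a minimal C++ sketch for that, assuming a POSIX system; resolve() and AddrInfoDeleter are names made up for this example, and the hints (any address family, stream sockets only) are my own choice, not necessarily what Firefox passes.

```cpp
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>

#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Frees the result list of getaddrinfo() when the owner goes out of scope.
struct AddrInfoDeleter {
    void operator()(addrinfo* list) const { freeaddrinfo(list); }
};

// Hypothetical helper: returns all addresses of a host as printable strings.
std::vector<std::string> resolve(const std::string& host) {
    addrinfo hints{};
    hints.ai_family = AF_UNSPEC;      // IPv4 and IPv6
    hints.ai_socktype = SOCK_STREAM;  // avoid one entry per socket type

    addrinfo* raw_list = nullptr;
    const int rc = getaddrinfo(host.c_str(), nullptr, &hints, &raw_list);
    if (rc != 0) throw std::runtime_error(gai_strerror(rc));
    std::unique_ptr<addrinfo, AddrInfoDeleter> list(raw_list);

    std::vector<std::string> addresses;
    for (const addrinfo* ai = list.get(); ai != nullptr; ai = ai->ai_next) {
        char text[INET6_ADDRSTRLEN] = {};
        const void* addr = nullptr;
        if (ai->ai_family == AF_INET)
            addr = &reinterpret_cast<const sockaddr_in*>(ai->ai_addr)->sin_addr;
        else if (ai->ai_family == AF_INET6)
            addr = &reinterpret_cast<const sockaddr_in6*>(ai->ai_addr)->sin6_addr;
        // Convert the binary address into its textual form.
        if (addr && inet_ntop(ai->ai_family, addr, text, sizeof text))
            addresses.emplace_back(text);
    }
    return addresses;
}
```

Calling resolve("example.com") from a small main() and running the binary both natively and inside the confined environment would show whether the failure is specific to the confinement.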
The ltrace tool would have been really helpful here to trace, at runtime, the arguments passed and results returned; however, it didn't work for me. So I tried attaching the IDA Pro 8.3 disassembler/debugger to the Firefox process; a free IDA Pro version is available from Hex-Rays.
In IDA, I looked for the libnspr4.so shared object in the 'Modules' list, then searched for the PR_GetAddrInfoByName() symbol in the module's function list to get a disassembly of the function.
Using the basic block graph of the function (and having the almost matching, corresponding source code), it's relatively easy to locate the potential getaddrinfo() call (right after the checks for "localhost" domain). I've set a breakpoint (press F2) on the call and on the next instruction.
Having the debugger interrupt and block the Firefox process is tedious: those functions are called too often for manual inspection of the call arguments and return code. A better way is conditional breakpoints and logging. In IDA, edit the breakpoints and enter a condition:
msg("!!!1 %s\n", get_strlit_contents(rdi, -1, STRTYPE_C)), 0
and
msg("-> %08x\n", eax), 0
respectively. Both output a message and, at the end (", 0"), return 0 as the result of the conditional expression (i.e. the breakpoint's action is NOT triggered). The first message outputs the null-terminated string pointed to by the rdi register, that is, the host name to resolve. The second message outputs the value of the eax register, i.e. the return value of the getaddrinfo() function. After instrumenting the Firefox process as described, press F9 to continue execution of the debuggee. Perform a DNS query in Firefox, and the logging -- thanks to the breakpoints -- will be shown in IDA's "Output" window.
I tried to step into (press F7 in IDA) the getaddrinfo() implementation, but quickly gave up on the code. It will somehow call connect() to query the DNS resolver as seen in strace...
After a long time deducing all the steps, and ignoring the fact that a fresh install might have been the quicker option, here are the findings:
I tried calling PR_GetAddrInfoByName() from a small test program; of course, no issue there:
// compile: gcc -g -Wall nspr.c -lnspr4
#include "nspr/nspr.h"
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char* argv[])
{
    int status = EXIT_SUCCESS;

    PR_Init(0, PR_PRIORITY_NORMAL, 0);
    /* Resolve the host name through NSPR. */
    PRAddrInfo* ai = PR_GetAddrInfoByName("example.com", PR_AF_INET, PR_AI_ADDRCONFIG | PR_AI_NOCANONNAME);
    if (!ai) {
        /* Report the NSPR error code and signal failure to the caller. */
        printf("PR_GetAddrInfoByName() = %d\n", PR_GetError());
        status = EXIT_FAILURE;
    } else {
        PR_FreeAddrInfo(ai);
    }
    PR_Cleanup();
    return status;
}
My suspicion was that either apparmor or the Snap containerization is to blame. Apparmor can be easily disabled, and hence ruled out. Snap is not so easy; I know little about it and had to defer further investigation. The plan was to run the above test program under Snap and hope that the error would reproduce...
A couple of days later, I learned that other people experienced the same problem: "Network problems with snap apps" (on askubuntu.com), "DNS for snaps like Firefox and Chromium fails" (on launchpad), "Snaps unable to connect to network under linux-lowlatency", "Noble kernel regression with new apparmor profiles/features".
A solution also became known: a kernel update from Linux 6.8.0-25-lowlatency to Linux 6.8.0-28-lowlatency -- and the issue is gone! Somewhat disappointing. I've tried to track down the actual fix; maybe it's this:
diff -u linux-6.8.0/security/apparmor/af_inet.c linux-6.8.0/security/apparmor/af_inet.c
--- linux-6.8.0/security/apparmor/af_inet.c
+++ linux-6.8.0/security/apparmor/af_inet.c
@@ -103,14 +103,12 @@
 	AA_BUG(!maddr);
 	maddr->addrtype = addrtype;
-	if (!addr) {
+	if (!addr || addrlen < offsetofend(struct sockaddr, sa_family)) {
 		maddr->addrp = NULL;
 		maddr->port = 0;
 		maddr->len = 0;
 		return 0;
 	}
-	if (addrlen < offsetofend(struct sockaddr, sa_family))
-		return -EINVAL;
 	/*
 	 * its possibly to have sk->sk_family == PF_INET6 and
This would somehow match the initial finding that recvfrom() returns -EINVAL.
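Read as plain logic, the patch turns a too-short address length from a hard error into "no address given". Below is a userspace C++ sketch of just that decision, not kernel code: check_before() and check_after() are hypothetical names, the rest of the real function is left out, and whether this change is really what fixed the recvfrom() error remains a guess.

```cpp
#include <sys/socket.h>

#include <cerrno>
#include <cstddef>

// Userspace stand-in for the kernel's offsetofend(struct sockaddr, sa_family).
constexpr int kMinAddrLen =
    static_cast<int>(offsetof(sockaddr, sa_family) + sizeof(sa_family_t));

// Before the patch: a non-null address too short to hold sa_family is rejected.
int check_before(const sockaddr* addr, int addrlen) {
    if (!addr) return 0;                        // no address: nothing to check
    if (addrlen < kMinAddrLen) return -EINVAL;  // short address: hard error
    return 0;
}

// After the patch: a too-short address is treated like a missing one.
int check_after(const sockaddr* addr, int addrlen) {
    if (!addr || addrlen < kMinAddrLen) return 0;
    return 0;  // the real function would go on to inspect the address here
}
```

A caller that hands over a valid buffer with a length of 0 gets -EINVAL from the first variant and 0 from the second.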
posted at: 22:39 | path: /rant | permanent link
user literals:
"Since the introduction of user-defined literals, the code that uses format macro constants for fixed-width integer types with no space after the preceding string literal became invalid:
std::printf("%"PRId64"\n",INT64_MIN);
has to be replaced by
std::printf("%" PRId64"\n",INT64_MIN);
"
So you want me to insert a space now?
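For the record, a small C++ sketch of the spelling that still compiles. format_i64() is a made-up helper for this example; the only point is the space between "%" and PRId64.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a 64-bit integer with the PRId64 macro.
std::string format_i64(std::int64_t value) {
    char buffer[32];
    // With the space, "%" and the expansion of PRId64 are two adjacent string
    // literals that the compiler concatenates. Without it, "%"PRId64 is lexed
    // as one string literal carrying a user-defined suffix named PRId64.
    std::snprintf(buffer, sizeof buffer, "%" PRId64, value);
    return buffer;
}
```

format_i64(INT64_MIN) yields "-9223372036854775808".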
posted at: 13:12 | path: /rant | permanent link
C++ code compiles with a release build but fails with a debug build (/D_DEBUG); MSVC, obviously.
Expectation: defining _DEBUG (or switching between release and debug build) doesn’t change whether code is accepted; apparently Microsoft has a different view...
// source code, x.cpp
#include <cstdio>
#include <string>
static constexpr std::string s = "asdf";
int main() {
    printf("%s\n", s.c_str());
}
Compile with debug:
cl /std:c++20 /D_DEBUG x.cpp
Microsoft (R) C/C++ Optimizing Compiler Version 19.39.33520 for x64
Copyright (C) Microsoft Corporation. All rights reserved.
x.cpp
x.cpp(4): error C2131: expression did not evaluate to a constant
x.cpp(4): note: (sub-)object points to memory which was heap allocated during constant evaluation
Compile as release:
cl /std:c++20 x.cpp
Microsoft (R) C/C++ Optimizing Compiler Version 19.39.33520 for x64
Copyright (C) Microsoft Corporation. All rights reserved.
x.cpp
Microsoft (R) Incremental Linker Version 14.39.33520.0
Copyright (C) Microsoft Corporation. All rights reserved.
/out:x.exe
x.obj
Bonus: when the initializer string "asdf" is longer, e.g. "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaasdf", the release build fails as well (which is OK).
There's actually a very good and detailed technical explanation.
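A variant of x.cpp that does not depend on the build mode, sketched here under the assumption that a read-only constant is all that is needed: std::string_view only refers to the literal's static storage, so no heap allocation takes place during constant evaluation. print_s() is a name made up for this example; it stands in for the body of main().

```cpp
#include <cstdio>
#include <string_view>

// A view onto the string literal; nothing is allocated at compile time,
// regardless of the literal's length or whether _DEBUG is defined.
static constexpr std::string_view s = "asdf";
static_assert(s.size() == 4);

// Prints the constant followed by a newline; returns printf's result.
int print_s() {
    // A string_view is not guaranteed to be null-terminated in general,
    // so the length is passed explicitly via "%.*s".
    return std::printf("%.*s\n", static_cast<int>(s.size()), s.data());
}
```

This sidesteps the question of when a constexpr std::string is accepted, at the price of losing the std::string interface (no c_str(), for instance).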
posted at: 10:00 | path: /programming | permanent link