pmeerw's blog

Sun, 21 Apr 2024

A long tale of an (inconvenient) bug: Firefox not working on Ubuntu 24.04 pre-beta

Sometimes one feels adventurous: so I made the hasty decision to upgrade from Ubuntu 22.04 to Ubuntu 24.04 (Noble Numbat) in alpha/pre-beta state -- I swear, this has worked nicely previously.

The update went semi-smoothly: only the network/nameserver configuration was lost, but that was easy to fix by manually installing the network-manager package and editing /etc/resolv.conf. It helps if you remember how to bring up a network interface manually, add a default route, and set a static DNS server:

ifconfig -a # to discover available network interfaces
ifconfig enp5s0 192.168.1.123 # set static IP
route add default gw 192.168.1.1 # add default route
nano /etc/resolv.conf # set static DNS

All looked good, except: Firefox could not resolve any host name. Neither could Chromium. Slack was dead, too. Ugh, the machine is pretty useless without the Web. However, network connectivity was good: ping worked, nslookup worked. It turned out that only Snap applications were affected, but that was not so clear in the beginning.

Initially, I thought some browser-related package was at fault. Here are my notes on the debugging process and my train of thought; in any case, I learned quite a bit...

System calls

I used strace to look for failing syscalls. I also tried ltrace, but that did not really work. strace showed the /etc/nsswitch.conf and /etc/hosts files being read, a connect() call to the DNS resolver/nameserver given in /etc/resolv.conf (127.0.0.53, port 53 in my case), the DNS query being sent, and the DNS response being received, but with the recvfrom() syscall returning an error (EINVAL). Inspecting the query traffic with Wireshark, everything looked good on the network layer.

I made a couple of experiments: the browser had no issue connecting to IP addresses directly, also a connection to a hostname in the /etc/hosts file was no problem. So DNS...

Firefox debugging

Next, I tried to corner the issue from above and learned about Firefox's built-in debugging and logging capabilities. Firefox is my main browser; it exposes a wide range of information. Entering about:networking in the browser's address bar, one can observe recent HTTP connections, network socket information, and a glimpse of the DNS cache (see screenshot). Using the 'DNS Lookup' functionality, it is easy to confirm that Firefox cannot resolve any hostname.

[Screenshots: good, bad]

Entering about:logging presents Firefox's logging manager. One can restrict logging to certain modules and log levels (supposedly, that is what '5' means). The Firefox Profiler is an online facility for analyzing logs and obviously not very helpful for tracking down network issues, hence I selected 'Logging to a file'. Since Firefox usually runs in a Snap container, the log file actually ends up in the /tmp/snap-private-tmp/snap.firefox/tmp/ directory of the file system.

A successful DNS lookup for 'example.com' produces the following lines:

Parent 17655: Main Thread]: D/nsHostResolver Resolving host
Parent 17655: Main Thread]: D/nsHostResolver   No usable record in cache for host
Parent 17655: Main Thread]: D/nsHostResolver NameLookup host:example.com af:0
Parent 17655: Main Thread]: D/nsHostResolver NameLookup: example.com effectiveTRRmode: 1 flags: 0
Parent 17655: Main Thread]: D/nsHostResolver TRR service not enabled - off or disabled
Parent 17655: Main Thread]: D/nsHostResolver NativeLookup host:example.com af:0
Parent 17655: Main Thread]: D/nsHostResolver   DNS thread counters: total=6 any-live=0 idle=6 pending=1
Parent 17655: Main Thread]: D/nsHostResolver   DNS lookup for host
Parent 17655: DNS Resolver #123]: E/nsHostResolver DNS lookup thread - Calling getaddrinfo for host
Parent 17655: Main Thread]: D/nsHostResolver Resolving host
Parent 17655: Main Thread]: D/nsHostResolver   No usable record in cache for host
Parent 17655: Main Thread]: D/nsHostResolver NameLookup host:example.com af:0
Parent 17655: Main Thread]: D/nsHostResolver NameLookup: example.com effectiveTRRmode: 1 flags: 0
Parent 17655: Main Thread]: D/nsHostResolver TRR service not enabled - off or disabled
Parent 17655: DNS Resolver #123]: E/nsHostResolver DNS lookup thread - lookup completed for host
Parent 17655: DNS Resolver #123]: D/nsHostResolver nsHostResolver::CompleteLookup example.com 7650d9e24650 0 resolver=0 stillResolving=0
Parent 17655: DNS Resolver #123]: D/nsHostResolver nsHostResolver record 7650de6a1060 new gencnt
Parent 17655: DNS Resolver #123]: D/nsHostResolver Caching host
Parent 17655: DNS Resolver #123]: D/nsHostResolver CompleteLookup: example.com has 93.184.215.14
Parent 17655: DNS Resolver #123]: D/nsHostResolver CompleteLookup: example.com has 2606:2800:21f:cb07:6820:80da:af6b:8b2c
There are a couple of interesting things: (1) it calls getaddrinfo(), which lives in the C runtime; (2) nsHostResolver is a module in Firefox's code base, which is browsable online, e.g. using searchfox; (3) it ultimately calls PR_GetAddrInfoByName(), which lives in the libnspr4 package/library and seemed like a good candidate to investigate further. Again, it's browsable online and indeed, it calls getaddrinfo().

Digging deeper, IDA Pro

The ltrace tool would have been really helpful here to trace the arguments passed and results returned at runtime; however, it didn't work for me. So I tried attaching the IDA Pro 8.3 disassembler/debugger to the Firefox process (a free IDA Pro version is available from Hex-Rays).

In IDA, I looked for the libnspr4.so shared object in the 'Modules' list, then searched for the PR_GetAddrInfoByName() symbol in the module's function list to get a disassembly of the function.

Using the basic block graph of the function (and having the almost matching, corresponding source code), it's relatively easy to locate the potential getaddrinfo() call (right after the checks for the "localhost" domain). I set a breakpoint (press F2) on the call and on the next instruction.

Having the debugger interrupt and block the Firefox process is tedious: those functions are called too often for manual inspection of the call arguments and return code. A better way is conditional breakpoints and logging. In IDA, edit the breakpoints and enter a condition:

msg("!!!1 %s\n", get_strlit_contents(rdi, -1, STRTYPE_C)), 0
and
msg("-> %08x\n", eax), 0
respectively. Both output a message and, at the end (", 0"), return 0 as the result of the conditional expression (i.e. the breakpoint's action is NOT triggered). The first message outputs the null-terminated string pointed to by the rdi register, that is, the host name to resolve. The second message outputs the value of the eax register, i.e. the result value of the getaddrinfo() function. After instrumenting the Firefox process as described, press F9 to continue execution of the debuggee. Perform a DNS query in Firefox, and the logging -- thanks to the breakpoints -- will be shown in IDA's "Output" window.

I tried to step into (press F7 in IDA) the getaddrinfo() implementation, but quickly gave up on the code. It will somehow call connect() to query the DNS resolver as seen in strace...

Showdown & conclusion

After a long time deducing all the steps, and ignoring the fact that a fresh install might have been the quicker option, here are the findings:

I tried calling PR_GetAddrInfoByName() from a small test program; of course, no issue there:

// compile: gcc -g -Wall nspr.c -lnspr4

#include "nspr/nspr.h"
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char* argv[]) {
  int rc = EXIT_FAILURE;

  PR_Init(0, PR_PRIORITY_NORMAL, 0);

  // resolve example.com (IPv4 only) through NSPR
  PRAddrInfo* ai = PR_GetAddrInfoByName("example.com", PR_AF_INET, PR_AI_ADDRCONFIG | PR_AI_NOCANONNAME);
  if (!ai) {
    // report the NSPR error code of the failed lookup
    printf("PR_GetAddrInfoByName() = %d\n", PR_GetError());
    goto error;
  }

  PR_FreeAddrInfo(ai);
  rc = EXIT_SUCCESS;

error:
  PR_Cleanup();

  // non-zero exit status if the lookup failed
  return rc;
}

My suspicion was that either AppArmor or the Snap containerization was to blame. AppArmor can be easily disabled, and hence ruled out. Snap is not so easy; I know little about it and had to defer further investigation. The plan was to run the above test program under Snap and hope that the error would reproduce...

Aftermath

A couple of days later, I learned that other people experienced the same problem: "Network problems with snap apps" (on askubuntu.com), "DNS for snaps like Firefox and Chromium fails" (on launchpad), "Snaps unable to connect to network under linux-lowlatency", "Noble kernel regression with new apparmor profiles/features"

And a solution also became known: a kernel update from Linux 6.8.0-25-lowlatency to Linux 6.8.0-28-lowlatency -- and the issue is gone! Somewhat disappointing. I've tried to track down the actual fix; maybe it's this:

diff -u linux-6.8.0/security/apparmor/af_inet.c linux-6.8.0/security/apparmor/af_inet.c
--- linux-6.8.0/security/apparmor/af_inet.c
+++ linux-6.8.0/security/apparmor/af_inet.c
@@ -103,14 +103,12 @@
 	AA_BUG(!maddr);
 
 	maddr->addrtype = addrtype;
-	if (!addr) {
+	if (!addr || addrlen < offsetofend(struct sockaddr, sa_family)) {
 		maddr->addrp = NULL;
 		maddr->port = 0;
 		maddr->len = 0;
 		return 0;
 	}
-	if (addrlen < offsetofend(struct sockaddr, sa_family))
-		return -EINVAL;
 
 	/*
 	 * its possibly to have sk->sk_family == PF_INET6 and
This would somewhat match the initial finding that recvfrom() returns -EINVAL.

posted at: 22:39 | path: /rant | permanent link

Wed, 06 Mar 2024

C++ - WTF user literals?!

user literals: "Since the introduction of user-defined literals, the code that uses format macro constants for fixed-width integer types with no space after the preceding string literal became invalid: std::printf("%"PRId64"\n",INT64_MIN); has to be replaced by std::printf("%" PRId64"\n",INT64_MIN);"

So you want me to insert a space now?

posted at: 13:12 | path: /rant | permanent link

Mon, 05 Feb 2024

Phishing awareness? Received from!

Does your organization ask to look for phishing cues as part of security awareness training?

Find misspelled domain names in the From: line, etc? (that can easily be faked)

It's pathetic to blame users for the phishing misery, which by and large stems from the IT industry's failure to deploy secure software and safe communication solutions.

Here's a more reliable (and easy) check: look at the email's header lines to see if the sender's email address matches the sending email server (SMTP server; SMTP is specified in RFC 5321).

Look for the first Received: from line. Here's an abridged example (pmeerw@gmail.com is messaging pmeerw@pmeerw.net):

X-Original-To: pmeerw@pmeerw.net
Delivered-To: pmeerw@pmeerw.net
Received: from mail-ot1-x32e.google.com (mail-ot1-x32e.google.com [IPv6:2607:f8b0:4864:20::32e])
    (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
     key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256
     client-signature RSA-PSS (2048 bits) client-digest SHA256)
    (Client CN "smtp.gmail.com", Issuer "GTS CA 1D4" (not verified))
    by ns.pmeerw.net (Postfix) with ESMTPS id F1252E02CD
    for ; Tue,  5 Mar 2024 16:32:48 +0100 (CET)
Received: by mail-ot1-x32e.google.com with SMTP id 46e09a7af769-6e2b466d213so1283153a34.0
        for ; Tue, 05 Mar 2024 07:32:48 -0800 (PST)
MIME-Version: 1.0
From: Peter Meerwald-Stadler 
Date: Tue, 5 Mar 2024 16:32:36 +0100
Message-ID: 
Subject: bla
To: Peter Meerwald-Stadler 

blub
So the SMTP server contacting pmeerw.net's SMTP server is mail-ot1-x32e.google.com. Hence it's plausible that it's Gmail that is delivering an email (from a Gmail address). The "Received: from" line is put there by the receiving SMTP server, a trusted machine. On the other hand, the sender may put arbitrary things in the From: and To: lines; these values do not affect the delivery of the email and hence cannot be trusted.

Need to wait for some plausible spam/phishing email to have a more interesting example... :-)
Update (March 6, 2024): Didn't take long, here's an example claiming to come from ovhcloud.com:

Received: from vps2361714.servdiscount-customer.com (vm4945647.1nvme.had.wf [45.88.77.100])
    by ns.pmeerw.net (Postfix) with ESMTP id C6A5FE0177
From: =?UTF-8?B?T1ZIY2xvdWQ=?=
To: pmeerw@pmeerw.net
Subject: =?UTF-8?B?Vm90cmUgbm9tIGRlIGRvbWFpbmU=?= "pmeerw.net" =?UTF-8?B?ZXN0IHRlbXBvcmFpcmVtZW50IHN1c3BlbmR1?=
Message-ID: <20240306031559.DA8051C773833DB1@news.ovhcloud.com>
I doubt ovhcloud sends their emails using vps2361714.servdiscount-customer.com (vm4945647.1nvme.had.wf [45.88.77.100]) and if they do I don't want to receive their sh*t anyway...

Email clients make it notoriously difficult to see this information (in Outlook it is hidden under ... / View / View Message details).

posted at: 22:00 | path: /rant | permanent link

Wed, 31 Jan 2024

GitLab, srly?!

GitLab is a popular git repo platform with integrated CI and whatnot. It can be self-hosted.

Annoying limitations:

How do people cope with these things?

posted at: 14:15 | path: /rant | permanent link

Mon, 05 Dec 2022

Trying openai's ChatGPT with an MBA expression..

OpenAI's chat is all the rage currently, so I gave it a try with a well-known MBA (mixed Boolean-arithmetic) expression: E=(x^y)+2*(x&y). This should simplify (spoiler alert) to x+y, however...

I'm not so convinced about the result, but nevertheless impressed by the answer. Also, I didn't quite get what the 'open' part in openai.com is...

posted at: 11:51 | path: /rant | permanent link

Made with PyBlosxom