pmeerw's blog

Wed, 17 Jan 2024

No newline before EOF

Configuring editors to not append a newline at the end (before the end-of-file, EOF):

(see here also)

posted at: 23:13 | path: /programming | permanent link

Mon, 26 Dec 2022

S1144 LED name badge

Got an 11x44 LED badge labelled S1144. It identifies as

usb 1-2: new full-speed USB device number 61 using xhci_hcd
usb 1-2: New USB device found, idVendor=0416, idProduct=5020, bcdDevice= 1.00
usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
usb 1-2: Product: CH546
usb 1-2: Manufacturer: wch.cn
hid-generic 0003:0416:5020.0090: hiddev1,hidraw2: USB HID v1.00 Device [wch.cn CH546] on usb-0000:02:00.0-2/input0
The CH546 is an 8051 MCU. It uses a USB HID interface. There is some Windows software to program it.

Here's what lsusb -v -v -v has to say about it:

Bus 001 Device 062: ID 0416:5020 Winbond Electronics Corp. CH546
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               1.10
  bDeviceClass            0 
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0        64
  idVendor           0x0416 Winbond Electronics Corp.
  idProduct          0x5020 
  bcdDevice            1.00
  iManufacturer           1 wch.cn
  iProduct                2 CH546
  iSerial                 0 
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0029
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          4 wch.cn
    bmAttributes         0xa0
      (Bus Powered)
      Remote Wakeup
    MaxPower               70mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass         3 Human Interface Device
      bInterfaceSubClass      0 
      bInterfaceProtocol      0 
      iInterface              5 wch.cn
        HID Device Descriptor:
          bLength                 9
          bDescriptorType        33
          bcdHID               1.00
          bCountryCode            0 Not supported
          bNumDescriptors         1
          bDescriptorType        34 Report
          wDescriptorLength      34
          Report Descriptor: (length is 34)
            Item(Global): Usage Page, data= [ 0x00 0xff ] 65280
                            (null)
            Item(Local ): Usage, data= [ 0x01 ] 1
                            (null)
            Item(Main  ): Collection, data= [ 0x01 ] 1
                            Application
            Item(Local ): Usage, data= [ 0x02 ] 2
                            (null)
            Item(Global): Logical Minimum, data= [ 0x00 ] 0
            Item(Global): Logical Maximum, data= [ 0x00 0xff ] 65280
            Item(Global): Report Size, data= [ 0x08 ] 8
            Item(Global): Report Count, data= [ 0x40 ] 64
            Item(Main  ): Input, data= [ 0x06 ] 6
                            Data Variable Relative No_Wrap Linear
                            Preferred_State No_Null_Position Non_Volatile Bitfield
            Item(Local ): Usage, data= [ 0x02 ] 2
                            (null)
            Item(Global): Logical Minimum, data= [ 0x00 ] 0
            Item(Global): Logical Maximum, data= [ 0x00 0xff ] 65280
            Item(Global): Report Size, data= [ 0x08 ] 8
            Item(Global): Report Count, data= [ 0x40 ] 64
            Item(Main  ): Output, data= [ 0x06 ] 6
                            Data Variable Relative No_Wrap Linear
                            Preferred_State No_Null_Position Non_Volatile Bitfield
            Item(Main  ): End Collection, data=none
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x82  EP 2 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               1
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               1
Device Status:     0x0000
  (Bus Powered)
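
Since the badge shows up as a hidraw node, a first step towards talking to it from Linux could be to confirm that a given node really is the badge. Below is a minimal sketch, assuming the Linux hidraw interface (HIDIOCGRAWINFO) and the IDs from the output above; the function names are made up for illustration, and the node number (hidraw2 in the log above) may change when the device is replugged. Nothing here touches the badge's actual protocol.

```c
#include <fcntl.h>
#include <linux/hidraw.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* IDs as reported by the kernel log and lsusb above. */
#define BADGE_VENDOR_ID  0x0416
#define BADGE_PRODUCT_ID 0x5020

/* Hypothetical helper: returns 1 if the ID pair matches the badge, else 0. */
int is_badge_id(uint16_t vendor, uint16_t product) {
  return vendor == BADGE_VENDOR_ID && product == BADGE_PRODUCT_ID;
}

/* Hypothetical helper: opens a hidraw node and compares its IDs.
 * Returns 1 on match, 0 on mismatch, -1 if the node cannot be queried. */
int is_badge_node(const char *path) {
  struct hidraw_devinfo info;
  int fd = open(path, O_RDONLY);
  if (fd < 0)
    return -1;
  int rc = ioctl(fd, HIDIOCGRAWINFO, &info);
  close(fd);
  if (rc < 0)
    return -1;
  /* vendor and product are signed 16-bit fields, so cast before comparing */
  return is_badge_id((uint16_t)info.vendor, (uint16_t)info.product);
}
```

Calling is_badge_node("/dev/hidraw2") would then be expected to return 1 on the setup logged above, given sufficient permissions on the device node.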

posted at: 21:22 | path: /programming | permanent link

Wed, 19 Oct 2022

xchg eax,eax -> nop?

On x86 (32-bit), a no-operation (nop) can be encoded as a CPU instruction 0x90 (among other choices). 0x90 can also be interpreted as xchg eax,eax.

On x86-64, xchg eax,eax is not a nop, as it clears the upper half of the rax register; hence, it must be encoded as 0x87 0xc0. xchg rax,rax, on the other hand, can be translated into a nop.

radare2's rasm2 makes it easy to experiment with different assembler engines for x86 (.nz is the default):

rasm2 -a x86.nz -b 64 "xchg eax,eax" // .nz .. handmade assembler
87c0
rasm2 -a x86.nz -b 32 "xchg eax,eax"
90
rasm2 -a x86.nasm -b 64 "xchg rax,rax" // using NASM, notice the extra REX.W prefix byte 0x48
4890
rasm2 -a x86.as -b 64 "xchg rax,rax" // using GNU assembler
90
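
The encoding rule described above can be written down in a few lines of C. This is only a sketch of the decision an assembler has to make for this one instruction, not code taken from any of the tools mentioned; the function name and its parameters are invented for illustration.

```c
#include <stddef.h>

/*
 * Hypothetical helper: encode "xchg <acc>,<acc>" on the accumulator.
 *   mode_bits:    32 or 64 (CPU mode)
 *   operand_bits: 32 (eax) or 64 (rax, 64-bit mode only)
 * Writes at most 2 bytes to out and returns the length,
 * or 0 for combinations this sketch does not handle.
 */
size_t encode_xchg_acc_self(int mode_bits, int operand_bits,
                            unsigned char out[2]) {
  if (mode_bits == 32 && operand_bits == 32) {
    out[0] = 0x90; /* xchg eax,eax has no effect here: plain nop */
    return 1;
  }
  if (mode_bits == 64 && operand_bits == 32) {
    /* 0x90 would be a nop and NOT clear the upper half of rax,
     * so the two-byte xchg r/m32,r32 form is required. */
    out[0] = 0x87;
    out[1] = 0xc0; /* ModRM: mod=11, reg=eax, rm=eax */
    return 2;
  }
  if (mode_bits == 64 && operand_bits == 64) {
    out[0] = 0x90; /* xchg rax,rax changes nothing, a nop is fine */
    return 1;
  }
  return 0;
}
```

The three handled cases correspond to the x86.nz and GNU assembler outputs shown above; NASM's 0x48 0x90 is an equally valid encoding of the last case.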

At least the following libraries/tools get this wrong:

As you might have guessed, these are my Hacktoberfest 2022 contributions.

posted at: 12:54 | path: /programming | permanent link

Thu, 26 Nov 2020

QEMU user-mode emulation

qemu can emulate all kinds of architectures and processors, including x86 and x86_64; it has presets for a long list of CPUs ([1], 486, pentium, Haswell, etc.).

I've tried this using qemu 4.2.1 on Ubuntu 20.04; the latest release is 5.1.0.

qemu does full-system emulation and user-mode emulation. While the former allows running a wide range of operating systems on any supported architecture [2], the latter runs programs built for another Linux or BSD target.

       Full-system                     User-mode
+---------------------+         +---------------------+
| Userspace emulation |         | Userspace emulation |
+----------+----------+         +----------+----------+
           |                               |
 +---------+--------+              +-------+-------+
 | Kernel emulation |              | Kernel native |
 +---------+--------+              +-------+-------+
           |                               |
+----------+---------+            +--------+--------+
| Hardware emulation |            | Hardware native |
+--------------------+            +-----------------+

Let's compile the following simple program (hello.c):

#include <stdio.h>
int main() {
  printf("hello world %p\n", main);
  return 0;
}
We link statically so the binary is self-contained; qemu can handle dynamically linked executables just fine as well.

To compile and link for 32-bit ARM [3]: arm-linux-gnueabihf-gcc -static -o hello-arm hello.c
For 64-bit x86: gcc -static -o hello-x86_x64 hello.c

Let's check:
$ file hello-arm
hello-arm: ELF 32-bit LSB executable, ARM, EABI5 version 1 (GNU/Linux), statically linked, for GNU/Linux 3.2.0, not stripped
$ file hello-x86_x64
hello-x86_x64: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, for GNU/Linux 3.2.0, not stripped

On Ubuntu, we need qemu-user [4], and can then execute both binaries:
$ qemu-arm -- ./hello-arm
hello world 0x10425
$ qemu-x86_64 -- ./hello-x86_x64
hello world 0x401ce5
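
To make it more obvious which binary is actually running under which emulator, the test program could report the architecture it was built for. A small sketch, assuming the usual GCC/clang predefined macros; the function names are made up, and the list of architectures is deliberately short.

```c
#include <stdio.h>

/* Hypothetical helper: name of the architecture this file was compiled for,
 * based on compiler-predefined macros (GCC and clang define these). */
const char *build_arch(void) {
#if defined(__x86_64__)
  return "x86_64";
#elif defined(__i386__)
  return "i386";
#elif defined(__aarch64__)
  return "aarch64";
#elif defined(__arm__)
  return "arm";
#else
  return "unknown";
#endif
}

/* Prints the build architecture and the pointer width in bits. */
void print_arch(void) {
  printf("built for %s, %zu-bit pointers\n", build_arch(),
         sizeof(void *) * 8);
}
```

Calling print_arch() from main() in hello.c and cross-compiling as above should then show a different architecture for each of the two binaries, regardless of the host they run on.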

qemu translates the input binary to run on the native CPU, even when the guest and host architectures match. It uses internal micro-ops (an intermediate representation); these can be observed before and after optimization:
qemu-x86_64 -d op -- ./hello-x86_x64
qemu-x86_64 -d op_opt -- ./hello-x86_x64

For example:

 mov_i64 tmp0,r13
 mov_i64 tmp1,r13
 and_i64 cc_dst,tmp0,tmp1
 discard cc_src
 discard loc10

Also the input and output assembler code can be seen:
qemu-x86_64 -d in_asm -- ./hello-x86_x64
qemu-x86_64 -d out_asm -- ./hello-x86_x64

[1] qemu -cpu help
[2] arm, m68k, mips, mips64, ppc, sparc, sparc64, etc.
[3] apt install gcc-arm-linux-gnueabihf
[4] apt install qemu-user
[5] To show log items: qemu-x86_64 -d help

posted at: 23:45 | path: /programming | permanent link

Tue, 05 May 2020

Statically checking C/C++ for unused return values

A seemingly simple problem: check C/C++ code statically for unused return values. Surprisingly, there is no easily available tooling. Let's look at some options:

  1. C++17 has the [[nodiscard]] attribute, e.g. the following code (unused-return.cpp)
    int foo() {
      return 42;
    }
    
    [[nodiscard]]
    int bar() {
      return 23;
    }
    
    int main() {
      foo();
      bar();
    }
    
    when compiled with g++-8 unused-return.cpp, will result in
    unused-return.cpp: In function ‘int main()’:
    unused-return.cpp:12:6: warning: ignoring return value of ‘int bar()’, declared with attribute nodiscard [-Wunused-result]
       bar();
       ~~~^~
    unused-return.cpp:6:5: note: declared here
     int bar() {
         ^~~
    
    (tested with GCC 8.4 / Ubuntu)

    No warning is printed for foo(); only bar(), which is annotated with [[nodiscard]], triggers one.

  2. With GCC and clang, an attribute can be added to the function declaration, e.g. unused-return.c:
    __attribute__ ((warn_unused_result))
    int bar() {
      return 23;
    }
    
    resulting in a warning
    unused-return.c: In function ‘main’:
    unused-return.c:12:3: warning: ignoring return value of ‘bar’, declared with attribute warn_unused_result [-Wunused-result]
       bar();
       ^~~~~
    
    when compiled with gcc unused-return.c (GCC 8.4/Ubuntu). Enabling additional warnings does not produce a similar warning for the unannotated function foo().
  3. Synopsys Coverity can be used; at least it will report a warning when the return value of a function is checked inconsistently. The tool is costly and probably a bit overkill...
  4. A linter can be used, e.g. the free splint tool (splint unused-return.c), but the output is quite verbose and it doesn't cover C++:
    Splint 3.1.2 --- 20 Feb 2018
    
    unused-return.c: (in function main)
    unused-return.c:11:3: Return value (type int) ignored: foo()
      Result returned by function call is not used. If this is intended, can cast
      result to (void) to eliminate message. (Use -retvalint to inhibit warning)
    unused-return.c:12:3: Return value (type int) ignored: bar()
    
    Finished checking --- 2 code warnings
    
  5. The clang-query tool can be used to more or less interactively query the AST of the program. This is explored in more detail below...
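
For C code that has to build with more than one compiler, the attribute from option 2 is often hidden behind a macro. A minimal sketch of that idea follows; MUST_CHECK and parse_percent() are hypothetical names, not part of the files mentioned in this post, and compilers other than GCC and clang simply get no annotation here.

```c
#include <stdlib.h>

/* Hypothetical wrapper macro: expands to the GCC/clang attribute,
 * and to nothing on compilers that are not known to support it. */
#if defined(__GNUC__) || defined(__clang__)
#  define MUST_CHECK __attribute__((warn_unused_result))
#else
#  define MUST_CHECK
#endif

/* Hypothetical example function: parses a decimal percentage (0..100).
 * Returns 0 and stores the value in *out on success, -1 on bad input. */
MUST_CHECK
int parse_percent(const char *s, int *out) {
  char *end;
  long v = strtol(s, &end, 10);
  if (end == s || *end != '\0' || v < 0 || v > 100)
    return -1; /* empty string, trailing junk, or out of range */
  *out = (int)v;
  return 0;
}
```

A bare call such as parse_percent("42", &v); should then produce the same -Wunused-result warning as shown for bar() above, while if (parse_percent("42", &v) == 0) compiles silently.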

Stack Overflow provides all the basics: a clang-query script which matches call expressions in the abstract syntax tree (AST) of the program, then restricts the matches to 'interesting cases'.

For a nice intro to clang-query, see this devblog article.

I've added the -w switch to suppress clang warnings when processing the input program, and some bind trickery to make the output a bit nicer.

#!/bin/sh
# unused-return.sh: Run clang-query to report unused return values.

# When --dump, print the AST of matching syntax.
if [ "x$1" = "x--dump" ]; then
  dump="set output dump"
  shift
fi

query='m
  callExpr(
    isExpansionInMainFile(),
    hasParent(anyOf(
      compoundStmt(),
      ifStmt(hasCondition(expr().bind("cond"))),
      whileStmt(hasCondition(expr().bind("cond"))),
      doStmt(hasCondition(expr().bind("cond")))
    )),
    unless(hasType(voidType())),
    unless(isTypeDependent()),
    unless(cxxOperatorCallExpr()),
    unless(callee(namedDecl(anyOf(
      hasName("memset"),
      hasName("setlength"),
      hasName("flags"),
      hasName("width"),
      hasName("__builtin_memcpy")
    )))),
    unless(equalsBoundNode("cond"))).bind("unused-return")'

clang-query-9 -extra-arg="-w" -c="set bind-root false" -c="$dump" -c="$query" "$@" --
The output should look like
Match #1:

unused-return.c:11:3: note: "unused-return" binds here
  foo();
  ^~~~~

Match #2:

unused-return.c:12:3: note: "unused-return" binds here
  bar();
  ^~~~~
2 matches.
A recent clang version is needed, tested with clang 9 / Ubuntu; clang 6 did not work.
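
To get a feeling for what the query does and does not catch, here is a small C input with a few typical call sites. The comments state what I would expect from reading the matcher expression, not what the tool actually printed; the file and the function demo() are invented for this purpose and are not part of unused-return.zip.

```c
int foo(void) { return 42; }

int demo(int x) {
  int sum = 0;

  foo();        /* parent is a compoundStmt: expected to be reported */

  if (x)
    foo();      /* unbraced if body, parent is the ifStmt but the call
                 * is not its condition: expected to be reported */

  if (foo())    /* the call is the bound "cond" node of the ifStmt:
                 * expected to be skipped */
    sum += 1;

  sum += foo(); /* parent is an assignment operator, which is not in
                 * the hasParent() list: expected to be skipped */

  return sum;
}
```

Note that the condition case relies on the call being a direct child of the if statement, which is how clang represents it for C; C++ code may add an implicit conversion node in between.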

All files for download: unused-return.zip.

Update (2020-05-05): MSVC has _Check_return_ and _Must_inspect_result_, for good measure.
Update (2020-05-06): clang-tidy has bugprone-unused-return-value to flag unused return values of certain configured functions, such as std::async(), std::unique_ptr::release(), std::remove().
Update (2020-05-06): see reddit

posted at: 01:30 | path: /programming | permanent link
