pmeerw's blog

Jan 2012

## Mon, 09 Jan 2012

### Convert float-to-int with ARM NEON intrinsics

The code below converts float values (for instance audio samples) to 16-bit signed integers using ARM NEON intrinsics, assuming `n` is a multiple of 4.
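
For comparison, a plain scalar version of the same job might look like the sketch below. It is not from the original post: the name `conv_s16_from_float_scalar` is made up, and it assumes samples are nominally in [-1, 1] and scaled by 32767, mirroring the clamp and scale constants in the NEON code further down. It uses a bare C cast, so it truncates toward zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar baseline: clamp to [-1, 1], scale, truncate. */
void conv_s16_from_float_scalar(unsigned n, const float *a, short *b)
{
    unsigned i;

    assert(a != NULL && b != NULL);

    for (i = 0; i < n; i++) {
        float v = a[i];

        /* Clamp so the scaled value always fits into a short. */
        if (v > 1.0f)
            v = 1.0f;
        if (v < -1.0f)
            v = -1.0f;

        /* A float-to-integer cast in C discards the fractional part. */
        b[i] = (short)(v * 32767.0f);
    }
}
```

Unlike the NEON version, this loop has no restriction on `n`, which makes it a convenient reference when checking the vectorised output.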

The `vcvtq_s32_f32` intrinsic rounds towards zero, not to the nearest integer. In C terms, it behaves like `trunc()` rather than `lrintf()`.

To overcome the issue, one could implement:

```
float a;
short b = trunc(a + ((a > 0) ? 0.5 : -0.5));
```
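
As a rough, self-contained sketch of that idea: the helper below (the name `round_half_away` is hypothetical) relies on the C cast for the truncation instead of calling `trunc()`, and assumes the input already lies within the 16-bit range.

```c
#include <assert.h>

/* Round to nearest, ties away from zero, using only a truncating cast. */
short round_half_away(float a)
{
    /* Converting an out-of-range float to short is undefined behaviour. */
    assert(a >= -32767.0f && a <= 32767.0f);

    /* The cast truncates toward zero, so first push the value half a
     * step away from zero, in the direction of its sign. */
    return (short)(a + ((a > 0.0f) ? 0.5f : -0.5f));
}
```

With this, `round_half_away(2.7f)` gives 3 and `round_half_away(-2.7f)` gives -3, whereas a bare cast would give 2 and -2.
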
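To get rid of the condition, the trick is to take the sign bit (the MSB of a float) and `or` it into the bit pattern of the constant `0.5` before adding the result to `a`. In C-like pseudocode, where `uint32(...)` and `float(...)` reinterpret the bits rather than convert the value:
```
float a;
short b = trunc(a + float((uint32(a) & 0x80000000) | uint32(0.5)));
```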
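
In real C the bit reinterpretation has to be spelled out. One possible way, sketched here with `memcpy` and assuming a 32-bit IEEE 754 `float`, is shown below; the helper names `float_bits`, `bits_float` and `round_half_away_bits` are invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as a 32-bit unsigned integer. */
uint32_t float_bits(float f)
{
    uint32_t u;

    assert(sizeof f == sizeof u);   /* assumes a 32-bit float */
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Reinterpret a 32-bit pattern as a float. */
float bits_float(uint32_t u)
{
    float f;

    memcpy(&f, &u, sizeof f);
    return f;
}

/* Branch-free round to nearest, ties away from zero. */
short round_half_away_bits(float a)
{
    /* Isolate a's sign bit and copy it onto 0.5f: yields +0.5f or -0.5f. */
    uint32_t sign = float_bits(a) & UINT32_C(0x80000000);
    float bias = bits_float(sign | float_bits(0.5f));

    /* The cast truncates toward zero, like trunc(). */
    return (short)(a + bias);
}
```

The mask and `or` here play the same role as the `vandq_u32` and `vorrq_u32` calls in the NEON code, just on one value at a time.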

The complete code using ARM NEON intrinsics looks as follows:

```
#include <arm_neon.h>

void conv_s16_from_float(unsigned n, const float *a, short *b) {
    unsigned i;

    const float32x4_t plusone4 = vdupq_n_f32(1.0f);
    const float32x4_t minusone4 = vdupq_n_f32(-1.0f);
    const float32x4_t half4 = vdupq_n_f32(0.5f);
    const float32x4_t scale4 = vdupq_n_f32(32767.0f);

    for (i = 0; i < n/4; i++) {
        /* take four floats at a time */
        float32x4_t v4 = ((float32x4_t *)a)[i];
        /* clamp to [-1, 1], then scale to the 16-bit range */
        v4 = vmulq_f32(vmaxq_f32(vminq_f32(v4, plusone4), minusone4), scale4);

        const float32x4_t w4 = vreinterpretq_f32_u32(vorrq_u32(vandq_u32(