Among the things I tried, I added the following flags:
Code:
-O2 -ftree-vectorize -ftree-slp-vectorize -ftree-loop-vectorize -ffast-math -ftree-vectorizer-verbose=7 -funsafe-loop-optimizations -funsafe-math-optimizations
Those flags don't apply to asm; they only affect how the compiler vectorizes C code. There are separate flags that can be set to enable or disable the asm for various architectures.
The 32-bit ARM asm is likely disabled on AArch64.
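To illustrate the distinction, here is a minimal sketch in C. It is not taken from the project in question: `scale_add` and `asm_path` are hypothetical names made up for this example. The first function is the kind of plain C loop the vectorizer flags can act on; the second uses the compiler's predefined `__aarch64__` and `__arm__` macros to show how a build typically picks (or skips) a hand-written asm path, which the optimization flags do not touch.

```c
#include <stddef.h>

/* Plain C loop: a candidate for auto-vectorization at
 * -O2 -ftree-vectorize. With GCC, -fopt-info-vec can report
 * whether the loop was actually vectorized. */
void scale_add(float *restrict dst, const float *restrict src,
               float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += k * src[i];
}

/* Hypothetical helper: names the hand-written asm path a build
 * would select. This is decided by the target architecture (and
 * the project's own build options), not by vectorizer flags. */
const char *asm_path(void)
{
#if defined(__aarch64__)
    return "aarch64";   /* 64-bit ARM: 32-bit ARM asm cannot be used here */
#elif defined(__arm__)
    return "arm32";     /* 32-bit ARM */
#else
    return "none";      /* fall back to the C code */
#endif
}
```

On an AArch64 build, `asm_path()` returns "aarch64", so any code guarded by `__arm__` is compiled out no matter which `-ftree-*` or `-ffast-math` flags are passed.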