Is it possible to use AES?
At present, the library has special code paths for AVX, AVX2 and AVX512f. I hope to add further optimizations, but they will probably be along the SSE path; I haven't evaluated whether AES instructions would offer any speedup.
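
If you want to check what your own machine offers, here is a minimal sketch of runtime feature detection. It is not the library's actual dispatch code: the names `code_path`, `pick_code_path`, `code_path_name` and `cpu_has_aes` are made up for illustration, and it assumes GCC or Clang on x86, where `__builtin_cpu_supports` is available. On other compilers or architectures it just falls back to a generic path.

```c
#include <stdio.h>

/* Hypothetical names for illustration only; not taken from the library. */
enum code_path { PATH_GENERIC = 0, PATH_SSE, PATH_AVX, PATH_AVX2, PATH_AVX512F };

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#define HAVE_CPU_BUILTINS 1
#else
#define HAVE_CPU_BUILTINS 0
#endif

/* Return the widest SIMD path the running CPU reports, widest first. */
static enum code_path pick_code_path(void) {
#if HAVE_CPU_BUILTINS
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx512f")) return PATH_AVX512F;
    if (__builtin_cpu_supports("avx2"))    return PATH_AVX2;
    if (__builtin_cpu_supports("avx"))     return PATH_AVX;
    if (__builtin_cpu_supports("sse2"))    return PATH_SSE;
#endif
    return PATH_GENERIC;
}

/* Return 1 if the CPU reports AES-NI support, 0 otherwise. */
static int cpu_has_aes(void) {
#if HAVE_CPU_BUILTINS
    __builtin_cpu_init();
    return __builtin_cpu_supports("aes") != 0;
#else
    return 0;
#endif
}

/* Human-readable label for a code path. */
static const char *code_path_name(enum code_path p) {
    switch (p) {
    case PATH_AVX512F: return "AVX512f";
    case PATH_AVX2:    return "AVX2";
    case PATH_AVX:     return "AVX";
    case PATH_SSE:     return "SSE";
    default:           return "generic";
    }
}
```

Calling `printf("%s, AES: %d\n", code_path_name(pick_code_path()), cpu_has_aes());` from `main` prints the selected path and whether AES-NI is reported. Note that AES support is a separate CPU feature flag from the AVX family, so a CPU reporting one does not necessarily report the other.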