This may be nonsense, but please bear with me:
Blake2b seems to be optimized for AVX2, whereas Power2b only supports SSE2.
Assuming that power2b uses blake2b internally, and that blake2b already has (or may have) an AVX2-optimized implementation, my question is: couldn't power2b reuse blake2b's existing AVX2 code path, instead of falling back to SSE2 as a whole?
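
To make the idea concrete, here is a rough sketch in C of what I have in mind. Everything in it is hypothetical: `compress_fn`, `compress_sse2`, `compress_avx2`, `select_compress` and `outer_hash` are made-up names, the compress bodies are toy placeholders rather than real BLAKE2b rounds, and I am not claiming this is how either codebase is actually structured. The only real API used is the GCC/Clang builtin `__builtin_cpu_supports`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical signature for the inner compression step (128-byte block). */
typedef void (*compress_fn)(uint64_t state[8], const uint8_t block[128]);

/* Toy mixing step standing in for the real rounds. Both variants must give
 * the same result; in real code they would differ only in speed. */
static void toy_mix(uint64_t state[8], const uint8_t block[128])
{
    for (size_t i = 0; i < 128; i++)
        state[i & 7] ^= (uint64_t)block[i] << (8 * (i >> 4));
}

/* Placeholder for an SSE2 implementation of the inner hash. */
static void compress_sse2(uint64_t state[8], const uint8_t block[128])
{
    toy_mix(state, block);
}

/* Placeholder for an AVX2 implementation of the inner hash. */
static void compress_avx2(uint64_t state[8], const uint8_t block[128])
{
    toy_mix(state, block);
}

/* Pick the inner implementation at run time based on CPU features. */
static compress_fn select_compress(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return compress_avx2;
#endif
    return compress_sse2; /* fallback when AVX2 is not available */
}

/* The outer algorithm stays as it is and only calls through the pointer,
 * so the inner hash can use AVX2 even if the rest is SSE2-only. */
static void outer_hash(uint64_t state[8], const uint8_t *in, size_t nblocks)
{
    compress_fn compress = select_compress();
    for (size_t i = 0; i < nblocks; i++)
        compress(state, in + 128 * i);
}
```

In other words, only the inner hash call would be dispatched by CPU feature, while the rest of the outer algorithm keeps its current SSE2 code.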