VanitySearch may not compute a good gridsize for your GPU, so make several tries using -g options in order to find best performances.
What is the best and most efficient way to determine and use the optimum gridsize for a GPU?
The author was asked a similar question in the past, and he only recommended trying different numbers[1]. Personally, I'd recommend looking at the benchmark list created by DaveF[2] as a reference. Of course, there are general guides for choosing the total block/grid size on CUDA[3-4], but I have no idea whether they can be applied to VanitySearch's gridsize.
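Since the only confirmed advice is trial and error, here is a rough Python sketch of how such a sweep could be organised. Everything in it is my own assumption rather than anything from VanitySearch: `candidate_grids` and `best_grid` are made-up helper names, the multipliers of the multiprocessor count are just a common CUDA starting point, and `measure` is a placeholder you would have to write yourself (for example, run VanitySearch with a given -g value for a fixed time and read off the reported key rate by hand or by script).

```python
from statistics import median
from typing import Callable, Iterable, List, Tuple


def candidate_grids(sm_count: int,
                    multipliers: Iterable[int] = (2, 4, 8, 16, 32)) -> List[int]:
    """Build candidate grid sizes as multiples of the GPU's multiprocessor count.

    The multipliers are an arbitrary starting range, not a VanitySearch rule.
    """
    return [sm_count * m for m in multipliers]


def best_grid(candidates: Iterable[int],
              measure: Callable[[int], float],
              repeats: int = 3) -> Tuple[int, float]:
    """Return (grid_size, rate) for the candidate with the highest median rate.

    `measure` is a hypothetical callback: it takes a grid size and returns
    the observed throughput (higher is better). Taking the median of several
    runs damps run-to-run noise.
    """
    best = None
    for grid in candidates:
        rate = median(measure(grid) for _ in range(repeats))
        if best is None or rate > best[1]:
            best = (grid, rate)
    if best is None:
        raise ValueError("no candidate grid sizes given")
    return best
```

In practice you would call `best_grid(candidate_grids(sm_count), measure)` with your GPU's multiprocessor count, then narrow the range around the winner and repeat, which is basically the manual process the author describes.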