What does this 768 correspond to? Is it the number of CUDA cores of the 750 Ti? (I need to see how this can be updated for the 780 Ti.)
Launching a CUDA kernel uses the following syntax (ignoring optional parameters for now):
kernel_name<<<blocks_per_grid, threads_per_block>>>(kernel_function_args...)
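As a rough illustration of that launch syntax, here is a minimal sketch that sizes the grid from the SM count the device reports instead of hard-coding it for one card. The kernel `fill` and the helper `blocks_for` are hypothetical names made up for this example, and the multiplier of 100 is only a starting point, not a tuned value.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread writes its global index into out[].
__global__ void fill(int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = i;
}

// Hypothetical host-side helper: block count as a multiple of the SM count.
int blocks_for(int sm_count, int multiplier)
{
    return sm_count * multiplier;
}

int main()
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        std::fprintf(stderr, "no usable CUDA device\n");
        return 1;
    }

    const int threads_per_block = 768;  // assumed value, checked against the device below
    const int blocks = blocks_for(prop.multiProcessorCount, 100);

    // Refuse to launch if the configuration exceeds what the device reports.
    if (threads_per_block > prop.maxThreadsPerBlock || blocks > prop.maxGridSize[0]) {
        std::fprintf(stderr, "launch configuration exceeds device limits\n");
        return 1;
    }

    const int n = blocks * threads_per_block;
    int *d_out = nullptr;
    if (cudaMalloc(&d_out, n * sizeof(int)) != cudaSuccess) return 1;

    // <<<blocks per grid, threads per block>>>
    fill<<<blocks, threads_per_block>>>(d_out, n);
    cudaError_t err = cudaDeviceSynchronize();

    std::printf("%s: %d SMs, %d blocks x %d threads: %s\n",
                prop.name, prop.multiProcessorCount, blocks, threads_per_block,
                cudaGetErrorString(err));

    cudaFree(d_out);
    return err == cudaSuccess ? 0 : 1;
}
```

Compiled with nvcc, the same binary should pick up the SM count of whichever card it runs on, so nothing needs editing when moving from one GPU to another.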
768 is the number of threads launched per block. The 750 Ti has 640 cores (128 per SM, i.e. streaming multiprocessor, with 5 SMs on the card). The 780 Ti has 2880 cores (192 per SM, 15 SMs on the card). I used a very basic calculation, essentially choosing a block count that is some multiple of the number of SMs. For the 780 Ti, that would be 100 * SM count, or 100 * 15 == 1500. I haven't looked closely at the 780 Ti's specs, so you might run into a limit on how many blocks per grid the card can support. You should be able to glean additional information from the following references: