if it's correct to use float64 here; it could possibly throw an error once the array allocation reaches the memory limit.
I don't know of another way to test it. This is the only one I could find online:
https[Suspicious link removed]cuting-a-python-script-on-gpu-using-cuda-and-numba-in-windows-10-1a1b10c29c9
And I only barely managed to set up the drivers so that numba works from Python. I kept reinstalling the entire system and the drivers over and over until it finally worked.
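On the float64 question, here is a rough sketch of how I'd estimate whether the array would even fit before allocating it. The array length and the amount of free memory are made-up numbers, and `array_bytes` and `fits` are just helper names for illustration, not anything from numba. It only relies on float64 taking 8 bytes per element and float32 taking 4.

```python
import struct

def array_bytes(n_elements, type_code="d"):
    """Bytes needed for n_elements values ('d' = float64, 'f' = float32)."""
    # "<" selects the standard sizes: 8 bytes for 'd', 4 bytes for 'f'.
    return n_elements * struct.calcsize("<" + type_code)

def fits(n_elements, free_bytes, type_code="d"):
    """True if the array would fit into free_bytes of memory."""
    return array_bytes(n_elements, type_code) <= free_bytes

n = 500_000_000          # hypothetical array length
free = 2 * 1024**3       # hypothetical: 2 GiB of free GPU memory

for code, name in (("d", "float64"), ("f", "float32")):
    gib = array_bytes(n, code) / 1024**3
    print(f"{name}: {gib:.2f} GiB needed, fits: {fits(n, free, code)}")
```

With these assumed numbers the float64 array needs twice the memory of the float32 one, so switching dtype may be enough to stay under the limit. As far as I know, the real free memory can be read with numba's `cuda.current_context().get_memory_info()`, but I have not checked that on this setup.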