There has been a significant change to the Windows build in RC2. PyOpenCL has been updated from 0.92 to 2011.2. The included kernels have been modified to work correctly under both 2011.2 and 0.92. Kernel developers should target PyOpenCL 2011.2 now.
In short:
Any kernels running under 2011.2 need to use is_blocking = False in order to prevent large delays getting work.
PyOpenCL 2011.2 has a built-in kernel binary caching system that is enabled by default. The included kernels disable this and continue to use their own system.