PyCuda 0.91

I’m happy to announce the availability of PyCuda 0.91. There is full, up-to-date documentation available.

The following exciting stuff is in PyCuda 0.91:

Support for Windows and Mac OS X, in addition to Linux. (Gert Wohlgemuth, Cosmin Stejerean, Znah on the Nvidia forums, and David Gadling)
Support more arithmetic operators on pycuda.gpuarray.GPUArray. (Gert Wohlgemuth)
Add pycuda.gpuarray.arange(). (Gert Wohlgemuth)
Add pycuda.curandom. (Gert Wohlgemuth)
Add pycuda.cumath. (Gert Wohlgemuth)
Add pycuda.autoinit.
Add pycuda.tools.
Add pycuda.tools.DeviceData and pycuda.tools.OccupancyRecord.
pycuda.gpuarray.GPUArray parallelizes properly on GTX200-generation devices.
Add support for compiling on CUDA 1.1. Added version query pycuda.driver.get_version(). Updated documentation to show 2.0-only functionality.
Make pycuda.driver.Function resource usage available to the program. (See, e.g. pycuda.driver.Function.registers.)
Cache kernels compiled by pycuda.driver.SourceModule.
Allow for faster, prepared kernel invocation. See pycuda.driver.Function.prepare().
Added memory pools, at pycuda.tools.DeviceMemoryPool, as experimental, undocumented functionality. For some workloads, this can cure the slowness of pycuda.driver.mem_alloc().
Fix the memset family of functions.
Improve error reporting.
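
To give a feel for how a few of the new pieces fit together, here is a minimal sketch that combines pycuda.autoinit, pycuda.gpuarray.arange(), pycuda.cumath and the GPUArray arithmetic operators. It is written against the PyCuda API as it appears in later releases, so details may differ slightly in 0.91. The function names cpu_reference and gpu_demo are made up for this illustration, and gpu_demo simply returns None when PyCuda or a CUDA device is not available.

```python
import math


def cpu_reference(n):
    """Pure-Python reference: 2*sin(i) + 1 for i = 0 .. n-1."""
    return [2.0 * math.sin(float(i)) + 1.0 for i in range(n)]


def gpu_demo(n):
    """Compute the same values on the GPU.

    Returns a list of floats, or None if PyCuda (or a usable CUDA
    device) cannot be set up on this machine.
    """
    try:
        import numpy
        import pycuda.autoinit  # noqa: F401  (sets up a context on import)
        import pycuda.gpuarray as gpuarray
        import pycuda.cumath as cumath
    except Exception:
        # Assumption for this sketch: any setup failure means "no GPU".
        return None

    # Build [0, 1, ..., n-1] directly on the device.
    x = gpuarray.arange(n, dtype=numpy.float32)

    # Elementwise sine from cumath, then scalar multiply and add
    # through the GPUArray arithmetic operators.
    y = 2 * cumath.sin(x) + 1

    # Copy the result back to the host as a plain Python list.
    return y.get().tolist()


if __name__ == "__main__":
    result = gpu_demo(8)
    if result is None:
        print("PyCuda or a CUDA device is not available; CPU reference:")
        print(cpu_reference(8))
    else:
        print(result)
```

The GPU result is in single precision, so it will only agree with the CPU reference to a few decimal places.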

Check the docs change list for a fully hyperlinked version of the above.

Have fun, Andreas