I’m happy to announce the availability of PyCuda 0.91. There is full, up-to-date documentation available.
The following exciting stuff is in PyCuda 0.91:
- Support for Windows and MacOS X, in addition to Linux. (Gert Wohlgemuth, Cosmin Stejerean, Znah on the Nvidia forums, and David Gadling)
- Support more arithmetic operators on pycuda.gpuarray.GPUArray. (Gert Wohlgemuth)
- Add pycuda.gpuarray.arange(). (Gert Wohlgemuth)
- Add pycuda.curandom. (Gert Wohlgemuth)
- Add pycuda.cumath. (Gert Wohlgemuth)
- Add pycuda.autoinit.
- Add pycuda.tools.
- Add pycuda.tools.DeviceData and pycuda.tools.OccupancyRecord.
- pycuda.gpuarray.GPUArray parallelizes properly on GTX200-generation devices.
- Add support for compiling on CUDA 1.1. Added version query pycuda.driver.get_version(). Updated documentation to show 2.0-only functionality.
- Make pycuda.driver.Function resource usage available to the program. (See, e.g., pycuda.driver.Function.registers.)
- Cache kernels compiled by pycuda.driver.SourceModule.
- Allow for faster, prepared kernel invocation. See pycuda.driver.Function.prepare().
- Added memory pools, at pycuda.tools.DeviceMemoryPool, as experimental, undocumented functionality. For some workloads, this can cure the slowness of pycuda.driver.mem_alloc().
- Fix the memset family of functions.
- Improve error reporting.
Check the docs change list for a fully hyperlinked version of the above.
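On the memory pool item: here is a rough sketch of the general idea behind a pool, in plain Python so it runs without a GPU. It is not PyCuda's implementation. The names BinnedPool, bin_size and slow_alloc are made up for illustration, slow_alloc merely stands in for an expensive allocator such as pycuda.driver.mem_alloc(), and the power-of-two binning is an assumption chosen for simplicity. The point is only that freed blocks are kept around and handed out again, so a loop that repeatedly allocates and frees similar sizes hits the slow allocator once.

```python
from collections import defaultdict


def bin_size(nbytes):
    """Round a request up to the next power of two so similar sizes share a bin."""
    if nbytes < 1:
        raise ValueError("nbytes must be positive")
    return 1 << (nbytes - 1).bit_length()


class BinnedPool:
    """Toy pool: reuse released blocks instead of calling the allocator again."""

    def __init__(self, allocate):
        # `allocate` is the expensive call we want to avoid repeating.
        self._allocate = allocate
        self._idle = defaultdict(list)  # bin size in bytes -> idle blocks

    def allocate(self, nbytes):
        size = bin_size(nbytes)
        idle = self._idle[size]
        # Prefer an idle block of the right bin; fall back to the real allocator.
        block = idle.pop() if idle else self._allocate(size)
        return size, block

    def release(self, size, block):
        # Keep the block for later reuse rather than freeing it.
        self._idle[size].append(block)


calls = []  # records every trip to the "slow" allocator


def slow_alloc(nbytes):
    # Hypothetical stand-in for an expensive device allocation.
    calls.append(nbytes)
    return bytearray(nbytes)


pool = BinnedPool(slow_alloc)
for _ in range(100):
    size, block = pool.allocate(1000)
    pool.release(size, block)

print(len(calls))  # 1
```

The trade-off is that the pool holds on to memory it is not currently using, which is why this kind of thing helps some workloads and not others.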
Have fun, Andreas