PyCUDA lets you access Nvidia‘s CUDA parallel computation API from Python. Several wrappers of the CUDA API already exist–so what’s so special about PyCUDA?

  • Object cleanup tied to lifetime of objects. This idiom, often called RAII in C++, makes it much easier to write correct, leak- and crash-free code. PyCUDA knows about dependencies, too, so (for example) it won’t detach from a context before all memory allocated in it is also freed.
  • Convenience. Abstractions like pycuda.driver.SourceModule and pycuda.gpuarray.GPUArray make CUDA programming even more convenient than with Nvidia’s C-based runtime.
  • Completeness. PyCUDA puts the full power of CUDA’s driver API at your disposal, if you wish.
  • Automatic Error Checking. All CUDA errors are automatically translated into Python exceptions.
  • Speed. PyCUDA’s base layer is written in C++, so all the niceties above are virtually free.
  • Helpful Documentation.


See the PyCUDA Documentation.

If you’d like to get an impression what PyCUDA is being used for in the real world, head over to the PyCUDA showcase.


Having trouble with PyCUDA? Maybe the nice people on the PyCUDA mailing list can help.


PyCUDA may be downloaded from its Python Package Index page or obtained directly from my source code repository by typing

git clone --recursive

You may also browse the source.


  • Boost (any recent version should work)
  • CUDA (version 2.0 beta or newer)
  • Numpy (version 1.0.4 or newer)