PyOpenCL: Arrays¶

Setup code¶

import pyopencl as cl
import numpy as np
import numpy.linalg as la

a = np.random.rand(1024, 1024).astype(np.float32)

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

Creating arrays¶

This notebook demonstrates working with PyOpenCL's arrays, which provide a friendlier (and more numpy-like) face on OpenCL's buffers. This is the module where they live:

import pyopencl.array

Now transfer the host array to a device array.

a_dev = cl.array.to_device(queue, a)

It works like a numpy array, with shape, dtype, and strides:

a_dev.shape

(1024, 1024)

a_dev.dtype

dtype('float32')

a_dev.strides

(4096, 4)
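The strides above are what a row-major (C-contiguous) layout produces: the last axis advances by one element (4 bytes for float32), and each earlier axis advances by a full row. As a minimal sketch, assuming the array is C-contiguous, the same numbers can be reproduced in plain Python. The helper name c_contiguous_strides is hypothetical and is not part of PyOpenCL or numpy.

```python
def c_contiguous_strides(shape, itemsize):
    """Return byte strides for a row-major (C-order) array.

    Hypothetical helper for illustration only.
    """
    strides = []
    step = itemsize  # the last axis moves one element at a time
    for extent in reversed(shape):
        strides.append(step)
        step *= extent  # the next axis outward skips a whole block
    return tuple(reversed(strides))


# float32 has an item size of 4 bytes
print(c_contiguous_strides((1024, 1024), 4))  # (4096, 4)
```

This matches the value reported by a_dev.strides for the 1024 x 1024 float32 array.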

Working with arrays¶

Goal: double all entries.

twice_a_dev = 2*a_dev

Easy to turn back into a numpy array.

twice_a = twice_a_dev.get()

Check!

print(la.norm(twice_a - 2*a))

0.0

Can just print the array, too.

print(twice_a_dev)

[[ 0.77836961  0.28050834  0.6613102  ...,  1.91516626  0.61054963
   1.56502569]
 [ 1.53493118  0.50901324  0.00827558 ...,  1.19049335  0.48224956
   0.1369826 ]
 [ 0.50581717  1.01614654  0.32951528 ...,  1.3467046   1.45456564
   1.40221345]
 ..., 
 [ 1.47264338  1.11805999  1.55873811 ...,  1.87507105  1.08121443
   1.99759185]
 [ 0.43069166  1.68386734  0.92028683 ...,  0.6744886   1.33184588
   1.66233599]
 [ 1.22339284  1.3037529   0.3637082  ...,  0.82762784  0.23160546
   1.58330226]]

Easy to evaluate arbitrary (elementwise) expressions.

import pyopencl.clmath

cl.clmath.sin(a_dev)**2 - (1./a_dev) + 5

array([[ 2.10573769,  2.50124764,  0.42696333, ...,  3.63703895,
         1.92534614,  3.95568466],
       [ 4.62456608,  3.88678145,  1.25435662, ...,  1.32198358,
        -5.26086712,  4.57042027],
       [ 4.30697775,  2.99277115,  2.60830212, ...,  4.4082365 ,
         2.17496896,  2.49961734],
       ..., 
       [ 3.41792631,  0.50407267, -2.78950453, ...,  3.58545685,
         4.49730206,  4.1767683 ],
       [ 3.05713558,  4.324893  ,  4.29508495, ..., -9.8753109 ,
         4.46689415,  3.88825035],
       [ 1.48695421, -2.0454402 ,  3.78699446, ...,  2.17892647,
         3.81082869,  3.16286278]], dtype=float32)

Low-level Access¶

Can still do everything manually though!

prg = cl.Program(ctx, """
    __kernel void twice(__global float *a)
    {
      // One work-item per array entry.
      int gid0 = get_global_id(0);
      int gid1 = get_global_id(1);
      // Flatten the 2D index; the row length 1024 is hardcoded here.
      int i = gid1 * 1024 + gid0;
      a[i] = 2*a[i];
    }
    """).build()
twice = prg.twice

# Global size is the array shape; None lets the implementation pick the local size.
twice(queue, a_dev.shape, None, a_dev.data)

<pyopencl.cffi_cl.Event at 0x7f434a534c50>

print(la.norm(a_dev.get() - 2*a), la.norm(a))

0.0 591.074

But the hardcoded 1024 is ... inelegant. So fix that!

(Also set the kernel's scalar argument dtypes.)
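One possible approach, sketched here rather than prescribed, is to pass the row length into the kernel as an extra scalar argument and compute the flat index from it. The index arithmetic can be checked on the host without an OpenCL device, so the sketch below emulates it in plain Python. The names flat_index and n_cols are illustrative and do not come from PyOpenCL. The sketch assumes a C-contiguous array with gid0 running along a row, which corresponds to a global size of (n_cols, n_rows).

```python
def flat_index(gid0, gid1, n_cols):
    """Emulate the kernel's index computation with the row length as a parameter."""
    return gid1 * n_cols + gid0


n_rows, n_cols = 3, 5  # deliberately non-square
a = [float(k) for k in range(n_rows * n_cols)]  # flat, row-major storage

# Emulate a 2D launch with global size (n_cols, n_rows):
# every (gid0, gid1) pair plays the role of one work-item.
for gid1 in range(n_rows):
    for gid0 in range(n_cols):
        i = flat_index(gid0, gid1, n_cols)
        a[i] = 2 * a[i]

print(a == [2.0 * k for k in range(n_rows * n_cols)])  # True
```

In the real kernel, n_cols would become an additional int argument. PyOpenCL's Kernel.set_scalar_arg_dtypes lets you declare the dtype of such scalar arguments (for example np.int32, with None for buffer arguments), so that a plain Python integer can be passed at call time.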