A more advanced, but also less lightweight, alternative is to use a templating engine of the kind used to generate web pages.
This offers great flexibility in code generation, including full flow control, which enables applications such as loop unrolling.
The example below uses a templating engine called Mako:
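from mako.template import Template

# Kernel template: ${...} substitutes Python values, and lines starting
# with % are Mako control lines that run when the template is rendered.
# The %for loop emits the operation n_unroll times per loop iteration.
# Note: the unrolled copies do not re-check i < n, so the generated kernel
# assumes n is a multiple of n_unroll * get_global_size(0).
tpl = Template(r"""
    __kernel void ${name}(${arguments})
    {
      int lid = get_local_id(0);
      int gsize = get_global_size(0);
      int work_group_start = get_local_size(0)*get_group_id(0);
      long i;

      for (i = work_group_start + lid; i < n; i += gsize)
      {
        %for i_unroll in range(n_unroll):
          ${operation};
          %if i_unroll + 1 < n_unroll:
          i += gsize;
          %endif
        %endfor
      }
    }
""", strict_undefined=True)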
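# The kernel body reads n, so n must be among the arguments. OpenCL also
# requires an address space qualifier such as __global on kernel pointer arguments.
print(tpl.render(
    name="scale",
    arguments="__global float *y, float a, __global const float *x, long n",
    operation="y[i] = a*x[i]",
    n_unroll=2,
))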
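
To get a feel for what the unrolled loop does at run time, the following pure-Python sketch mimics the control flow of the generated kernel and records which indices each work-item visits. It does not use Mako or OpenCL. The helper names `touched_indices` and `all_touched` are hypothetical and introduced only for illustration, and the work-items are simulated one after another rather than run in parallel.

```python
def touched_indices(gid, gsize, n, n_unroll):
    """Indices visited by one work-item with global id `gid` (hypothetical helper)."""
    touched = []
    i = gid  # corresponds to work_group_start + lid in the kernel
    while i < n:  # loop condition, checked once per unrolled group
        for i_unroll in range(n_unroll):
            touched.append(i)  # stands in for the substituted operation
            if i_unroll + 1 < n_unroll:
                i += gsize  # stride emitted between unrolled copies
        i += gsize  # the for loop's own increment
    return touched


def all_touched(gsize, n, n_unroll):
    """Sorted indices visited by all `gsize` work-items together."""
    return sorted(
        i for gid in range(gsize) for i in touched_indices(gid, gsize, n, n_unroll)
    )


# n is a multiple of n_unroll * gsize: every element is visited exactly once.
print(all_touched(gsize=4, n=16, n_unroll=2) == list(range(16)))
# n is not such a multiple: some visited indices fall outside [0, n).
print([i for i in all_touched(gsize=4, n=12, n_unroll=2) if i >= 12])
```

The second call illustrates a caveat of this style of unrolling: because the bound is checked only once per unrolled group, the element count should be a multiple of `n_unroll` times the global size, or the generated body needs an additional bounds check.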