{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Practice: Orthogonality of Hermite Polynomials\n", "\n", "In this exercise, modify the Monte-Carlo example to demonstrate the orthonormality of the two [Hermite polynomials](https://en.wikipedia.org/wiki/Hermite_polynomials)\n", "\n", "* $1$ and\n", "* $x^2-1$\n", "\n", "with respect to the weight $e^{-\\frac{x^2}2}$, i.e. show (numerically, using a Monte Carlo method) that\n", "\n", "$$\n", "\\int_{-\\infty}^\\infty 1 \\cdot (x^2-1) \\cdot e^{-\\frac{x^2}2}dx = 0\n", "$$\n", "\n", "and that\n", "\n", "$$\n", "\\int_{-\\infty}^\\infty (x^2-1)^2 \\cdot e^{-\\frac{x^2}2}dx = 2\\sqrt{2\\pi}.\n", "$$\n", "\n", "Realize that\n", "$$\n", "\\int_{-\\infty}^\\infty \\dots \\cdot \\frac{e^{-\\frac{x^2}2}}{\\sqrt{2\\pi}}dx\n", "$$\n", "can be evaluated by Monte-Carlo summation of $\\dots$ where the $x$ are normally distributed.\n", "\n", "Use the [Box-Muller transform](https://en.wikipedia.org/wiki/Box%E2%80%93Muller_transform) to obtain normally-distributed random numbers from the uniformly distributed ones returned by PyOpenCL's random number generator.\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Initialization" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy as np\n", "import pyopencl as cl\n", "import pyopencl.array\n", "import pyopencl.clrandom\n", "\n", "ctx = cl.create_some_context()\n", "queue = cl.CommandQueue(ctx)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Boilerplate for Random Number Generator" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": true }, "outputs": [], "source": [ "generator_preamble = \"\"\"\n", "#include \n", "\n", "typedef union {\n", " uint4 v;\n", " philox4x32_ctr_t c;\n", "} philox4x32_ctr_vec_union;\n", "\n", "\n", "uint4 philox4x32_bump(uint4 ctr)\n", "{\n", " if (++ctr.x == 0)\n", " if (++ctr.y == 0)\n", " ++ctr.z;\n", " return ctr;\n", "}\n", "\n", "uint4 philox4x32_gen(\n", " uint4 ctr,\n", " uint2 key,\n", " uint4 *new_ctr)\n", "{\n", " philox4x32_ctr_vec_union result;\n", " result.c = philox4x32(\n", " *(philox4x32_ctr_t *) &ctr,\n", " *(philox4x32_key_t *) &key);\n", " *new_ctr = philox4x32_bump(ctr);\n", " return result.v;\n", "}\n", "\n", "float4 philox4x32_f32(\n", " uint4 ctr,\n", " uint2 key,\n", " uint4 *new_ctr)\n", "{\n", " *new_ctr = ctr;\n", " return\n", " convert_float4(philox4x32_gen(*new_ctr, key, new_ctr))\n", " * 2.3283064365386963e-10f;\n", "}\n", "\"\"\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Monte-Carlo code" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "collapsed": false }, "outputs": [], "source": [ "\n", "from mako.template import Template\n", "\n", "mc_preamble_src = Template(\"\"\"\n", "\n", "#include \n", "\n", "float compute_sample(int i, unsigned int k1)\n", "{\n", " uint4 ctr = { 0, 1, 2, 3 };\n", " uint2 key2 = { i, k1 };\n", " float4 rng_res = philox4x32_f32(ctr, key2, &(ctr));\n", " \n", " float r0 = sqrt(-2*log(rng_res.s0));\n", " float v0 = r0*cos((float) (2*M_PI) * rng_res.s1);\n", " float v1 = r0*sin((float) (2*M_PI) * rng_res.s1);\n", "\n", " float r2 = sqrt(-2*log(rng_res.s2));\n", " float v2 = r2*cos((float) (2*M_PI) * rng_res.s3);\n", " float v3 = r2*sin((float) (2*M_PI) * rng_res.s3);\n", " \n", " float result = 0;\n", " \n", " %for x in [\"v0\", \"v1\", \"v2\", \"v3\"]:\n", " {\n", " float x = ${x};\n", " float H2 = x*x - 1;\n", " result += H2;\n", " }\n", " %endfor\n", " \n", " return result;\n", "}\n", "\"\"\", strict_undefined=True).render()" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "collapsed": false }, "outputs": [], "source": [ "\n", "from pyopencl.reduction import ReductionKernel\n", "\n", "rknl = ReductionKernel(ctx, np.float32,\n", " neutral=\"0\",\n", " reduce_expr=\"a+b\", map_expr=\"compute_sample(i, k1)\",\n", " arguments=\"unsigned int k1\", preamble=generator_preamble+mc_preamble_src)" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.000107572192383\n" ] } ], "source": [ "n = 10000000\n", "\n", "nsamples = 4*n\n", "result = rknl(15, range=slice(n), queue=queue).get() / nsamples\n", "\n", "print(result)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.1+" } }, "nbformat": 4, "nbformat_minor": 0 }