{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Practice: Orthogonality of Hermite Polynomials\n",
        "\n",
        "In this exercise, modify the Monte-Carlo example to demonstrate the orthonormality of the two [Hermite polynomials](https://en.wikipedia.org/wiki/Hermite_polynomials)\n",
        "\n",
        "* $1$ and\n",
        "* $x^2-1$\n",
        "\n",
        "with respect to the weight $e^{-\\frac{x^2}2}$, i.e. show (numerically, using a Monte Carlo method) that\n",
        "\n",
        "$$\n",
        "\\int_{-\\infty}^\\infty 1 \\cdot (x^2-1) \\cdot e^{-\\frac{x^2}2}dx = 0\n",
        "$$\n",
        "\n",
        "and that\n",
        "\n",
        "$$\n",
        "\\int_{-\\infty}^\\infty (x^2-1)^2  \\cdot e^{-\\frac{x^2}2}dx = 2\\sqrt{2\\pi}.\n",
        "$$\n",
        "\n",
        "Realize that\n",
        "$$\n",
        "\\int_{-\\infty}^\\infty \\dots  \\cdot \\frac{e^{-\\frac{x^2}2}}{\\sqrt{2\\pi}}dx\n",
        "$$\n",
        "can be evaluated by Monte-Carlo summation of $\\dots$ where the $x$ are normally distributed.\n",
        "\n",
        "Use the [Box-Muller transform](https://en.wikipedia.org/wiki/Box%E2%80%93Muller_transform) to obtain normally-distributed random numbers from the uniformly distributed ones returned by PyOpenCL's random number generator.\n",
        "\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Initialization"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 3,
      "metadata": {
        "collapsed": true
      },
      "outputs": [],
      "source": [
        "import numpy as np\n",
        "import pyopencl as cl\n",
        "import pyopencl.array\n",
        "import pyopencl.clrandom\n",
        "\n",
        "ctx = cl.create_some_context()\n",
        "queue = cl.CommandQueue(ctx)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Boilerplate for Random Number Generator"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 4,
      "metadata": {
        "collapsed": true
      },
      "outputs": [],
      "source": [
        "generator_preamble = \"\"\"\n",
        "#include <pyopencl-random123/philox.cl>\n",
        "\n",
        "typedef union {\n",
        "    uint4 v;\n",
        "    philox4x32_ctr_t c;\n",
        "} philox4x32_ctr_vec_union;\n",
        "\n",
        "\n",
        "uint4 philox4x32_bump(uint4 ctr)\n",
        "{\n",
        "    if (++ctr.x == 0)\n",
        "        if (++ctr.y == 0)\n",
        "            ++ctr.z;\n",
        "    return ctr;\n",
        "}\n",
        "\n",
        "uint4 philox4x32_gen(\n",
        "        uint4 ctr,\n",
        "        uint2 key,\n",
        "        uint4 *new_ctr)\n",
        "{\n",
        "    philox4x32_ctr_vec_union result;\n",
        "    result.c = philox4x32(\n",
        "        *(philox4x32_ctr_t *) &ctr,\n",
        "        *(philox4x32_key_t *) &key);\n",
        "    *new_ctr = philox4x32_bump(ctr);\n",
        "    return result.v;\n",
        "}\n",
        "\n",
        "float4 philox4x32_f32(\n",
        "        uint4 ctr,\n",
        "        uint2 key,\n",
        "        uint4 *new_ctr)\n",
        "{\n",
        "    *new_ctr = ctr;\n",
        "    return\n",
        "        convert_float4(philox4x32_gen(*new_ctr, key, new_ctr))\n",
        "        * 2.3283064365386963e-10f;\n",
        "}\n",
        "\"\"\""
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Monte-Carlo code"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 36,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "\n",
        "from mako.template import Template\n",
        "\n",
        "mc_preamble_src = Template(\"\"\"\n",
        "\n",
        "#include <pyopencl-complex.h>\n",
        "\n",
        "float compute_sample(int i, unsigned int k1)\n",
        "{\n",
        "    uint4 ctr = { 0, 1, 2, 3 };\n",
        "    uint2 key2 = { i, k1 };\n",
        "    float4 rng_res = philox4x32_f32(ctr, key2, &(ctr));\n",
        "    \n",
        "    float r0 = sqrt(-2*log(rng_res.s0));\n",
        "    float v0 = r0*cos((float) (2*M_PI) * rng_res.s1);\n",
        "    float v1 = r0*sin((float) (2*M_PI) * rng_res.s1);\n",
        "\n",
        "    float r2 = sqrt(-2*log(rng_res.s2));\n",
        "    float v2 = r2*cos((float) (2*M_PI) * rng_res.s3);\n",
        "    float v3 = r2*sin((float) (2*M_PI) * rng_res.s3);\n",
        "    \n",
        "    float result = 0;\n",
        "    \n",
        "    %for x in [\"v0\", \"v1\", \"v2\", \"v3\"]:\n",
        "    {\n",
        "        float x = ${x};\n",
        "        float H2 = x*x - 1;\n",
        "        result += H2;\n",
        "    }\n",
        "    %endfor\n",
        "    \n",
        "    return result;\n",
        "}\n",
        "\"\"\", strict_undefined=True).render()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 37,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "\n",
        "from pyopencl.reduction import ReductionKernel\n",
        "\n",
        "rknl = ReductionKernel(ctx, np.float32,\n",
        "        neutral=\"0\",\n",
        "        reduce_expr=\"a+b\", map_expr=\"compute_sample(i, k1)\",\n",
        "        arguments=\"unsigned int k1\", preamble=generator_preamble+mc_preamble_src)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 38,
      "metadata": {
        "collapsed": false
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "0.000107572192383\n"
          ]
        }
      ],
      "source": [
        "n = 10000000\n",
        "\n",
        "nsamples = 4*n\n",
        "result = rknl(15, range=slice(n), queue=queue).get() / nsamples\n",
        "\n",
        "print(result)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": true
      },
      "outputs": [],
      "source": []
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.5.1+"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}