Numerical Integration

Haden Smith edited this page Oct 21, 2024 · 15 revisions

Numerical integration, also known as numerical quadrature, is a fundamental technique for approximating definite integrals. It has wide-ranging applications in various scientific and engineering fields. For example, in statistics, the expected value of a random variable is calculated using an integral, and numerical integration can be employed to approximate this expected value.

Single Dimension Integration

The Numerics library provides several methods for performing numerical integration on single dimensional integrands. Each algorithm computes an approximation to a definite integral of the form:

$$I = \int\limits_{a}^{b}f(x) \cdot dx$$

For the first example, let's consider a simple function with a single variable:

$$f(x)=x^3$$

Integrating from $a=0$ to $b=1$ yields the exact solution:

$$\int\limits_{0}^{1}f(x) \cdot dx = \frac{1}{4}x^4 \biggr|_0^1 = \frac{1}{4} \cdot 1^4 - 0 = 0.25$$

Definite integrals can be numerically solved using Riemann sums, such as the trapezoidal rule. This method works by approximating the region under the function $f(x)$ as a trapezoid and calculating its area:

$$I =\int\limits_{a}^{b}f(x) \cdot dx \approx \left(\frac{f(a) + f(b)}{2} \right)\cdot(b-a)$$

This approximation can be improved by partitioning (or binning) the integration interval $[a,b]$ and then applying the trapezoidal rule to each subinterval and summing the results:

$$I =\int\limits_{a}^{b}f(x) \cdot dx \approx \sum_{i=1}^{N} \left(\frac{f(x_{i-1}) + f(x_i)}{2} \right)\cdot(x_i-x_{i-1})$$

where $a = x_0 < x_1 < \cdots < x_N = b$.

Now, let's implement this in Numerics. First, we need to reference the Integration namespace:

using Numerics.Mathematics.Integration;

Next, create the test function:

/// <summary>
/// Test function: f(x) = x^3
/// </summary>
public double FX(double x)
{
    return Math.Pow(x, 3);
}

The Integration class is a static class that contains the Midpoint Rule, Trapezoidal Rule, Simpson's Rule, and the 10-point Gauss-Legendre integration methods. Let's first compute the integral using the Trapezoidal Rule shown above, with 10 bins (or steps):

double result = Integration.TrapezoidalRule(FX, 0, 1, 10); // 0.25249999999999995

Increasing the number of steps will increase the accuracy. Let's compute it again using 1,000 steps:

double result = Integration.TrapezoidalRule(FX, 0, 1, 1000); // 0.25000025000000053

We can see that this result is much more accurate.
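
To make the rule concrete, here is a minimal, self-contained Python sketch of the composite trapezoidal rule (illustrative only; this is not the library's implementation) that reproduces both results above:

```python
def trapezoidal_rule(f, a, b, steps):
    """Composite trapezoidal rule with `steps` equal-width bins."""
    h = (b - a) / steps
    # Endpoints receive half weight; interior points receive full weight
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

print(trapezoidal_rule(lambda x: x ** 3, 0.0, 1.0, 10))    # ~0.2525
print(trapezoidal_rule(lambda x: x ** 3, 0.0, 1.0, 1000))  # ~0.25000025
```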

Adaptive Integration

The challenge with static numerical integration methods, such as the trapezoidal rule mentioned above, is that the user must specify both the limits of integration and the number of integration bins. If the integrand function has subregions with high variance, this approach can lead to large approximation errors. Many real-world integrand functions have substantial weight concentrated in narrow subregions, resulting in wasted integration bins in areas that contribute little to the total weight.

Adaptive integration, a more refined numerical integration method, adjusts subintervals within the integration bounds based on the behavior of the function. These methods concentrate subintervals in regions that contribute the most to the integral, overcoming the limitations of static approaches.

The Numerics library provides two adaptive integration routines: the Adaptive Simpson's Rule and the Adaptive Gauss-Lobatto method.

The Adaptive Simpson’s Rule (ASR) algorithm subdivides the integration interval recursively until a user-defined tolerance is achieved. In each subinterval, Simpson’s Rule is used to approximate the region under the function $f(x)$ as a weighted average of the trapezoidal and midpoint methods:

$$I =\int\limits_{a}^{b}f(x) \cdot dx \approx \left[f(a) + 4 \cdot f \left(\frac{a+b}{2} \right) + f(b) \right]\cdot \left(\frac{b-a}{6} \right)$$

The criterion for determining when to stop subdividing an interval is:

$$\frac{1}{15} \cdot \left| S(a, m) + S(m, b) - S(a,b) \right| \leq \epsilon + \epsilon \cdot \left| S(a, b) \right|$$

where $[a,b]$ is the integration interval, $m = \frac{a+b}{2}$, $S(\cdot)$ represents Simpson's Rule evaluated at those intervals, and $\epsilon$ is the absolute and relative error tolerance for the interval. Each subinterval is recursively subdivided and evaluated until the specified tolerance is met.

More details on the ASR and Adaptive Gauss-Lobatto (AGL) methods can be found in [1] and [2], respectively.

To use the ASR method, follow these steps:

var asr = new AdaptiveSimpsonsRule(FX, 0, 1);
asr.Integrate();
double result = asr.Result; // 0.25

For this simple test function, the ASR method requires only $5$ function evaluations to converge with an absolute and relative tolerance of $1 \times 10^{-8}$. Note that the ASR method is exact for polynomials of degree three or less.
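
The recursion and stopping criterion described above can be sketched in a few lines of Python (an illustrative sketch, not the library's code):

```python
def simpson(f, a, b):
    """Simpson's rule applied to a single interval [a, b]."""
    m = 0.5 * (a + b)
    return (f(a) + 4.0 * f(m) + f(b)) * (b - a) / 6.0

def adaptive_simpson(f, a, b, eps=1e-8):
    """Recursively subdivide [a, b] until the stopping criterion is met."""
    def recurse(a, b, whole, eps):
        m = 0.5 * (a + b)
        left, right = simpson(f, a, m), simpson(f, m, b)
        # Stop when |S(a,m) + S(m,b) - S(a,b)| / 15 <= eps + eps * |S(a,b)|
        if abs(left + right - whole) / 15.0 <= eps + eps * abs(whole):
            # Add the Richardson correction term to sharpen the estimate
            return left + right + (left + right - whole) / 15.0
        return recurse(a, m, left, eps / 2.0) + recurse(m, b, right, eps / 2.0)
    return recurse(a, b, simpson(f, a, b), eps)

print(adaptive_simpson(lambda x: x ** 3, 0.0, 1.0))  # 0.25 (Simpson's rule is exact for cubics)
```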

Alternatively, we can use the AGL method as follows:

var agl = new AdaptiveGaussLobatto(FX, 0, 1);
agl.Integrate();
double result = agl.Result; // 0.24999999999999997

The AGL method requires 18 function evaluations to converge given an absolute and relative tolerance of $1 \times 10^{-8}$.

For a more challenging test problem, let's compute the mean of a Gamma distribution with a scale of $\theta = 10$ and shape $\kappa = 5$. The true mean of the distribution is given by:

$$\mu = \theta \cdot \kappa = 50$$

The probability density function (PDF) of the Gamma distribution is:

$$f(x) = \frac{1}{\Gamma(\kappa)\theta^{\kappa}}x^{\kappa-1}e^{-\frac{x}{\theta}}$$

The mean of a continuous probability distribution is computed as:

$$\mu = \mathbb{E} [X] = \int\limits_{-\infty}^{\infty} x \cdot f(x) \cdot dx$$

Now, let's implement this in Numerics. First, we need to reference the Integration and Distributions namespaces:

using Numerics.Mathematics.Integration;
using Numerics.Distributions;

Then, using the ASR method, follow these steps:

// Create the Gamma distribution and set the integration limits
var gamma = new GammaDistribution(10, 5);
double a = gamma.InverseCDF(1E-16); // Lower limit based on a very small cumulative probability
double b = gamma.InverseCDF(1 - 1E-16); // Upper limit based on a near-1 cumulative probability

// Define the integrand function
double I(double x)
{
    return x * gamma.PDF(x);
}

// Perform the integration
var asr = new AdaptiveSimpsonsRule(I, a, b);
asr.Integrate();
double result = asr.Result; // 50.000000004866415

The ASR method requires $365$ function evaluations to reach convergence.
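
As a quick language-agnostic cross-check, the same expectation can be approximated with a composite Simpson's rule and the Gamma PDF written out directly (an illustrative Python sketch; the upper limit of 400 is an assumption chosen so the truncated tail mass is negligible):

```python
import math

theta, kappa = 10.0, 5.0  # scale and shape

def gamma_pdf(x):
    """PDF of the Gamma distribution with scale theta and shape kappa."""
    return x ** (kappa - 1) * math.exp(-x / theta) / (math.gamma(kappa) * theta ** kappa)

def composite_simpson(f, a, b, n):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3.0

mean = composite_simpson(lambda x: x * gamma_pdf(x), 0.0, 400.0, 4000)
print(mean)  # ~50
```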

Multidimensional Integration

Multidimensional integration, also known as multiple or multivariate integration, involves evaluating integrals over functions of more than one variable. Instead of integrating over a single interval, as in one-dimensional integration, you integrate over a region in a multidimensional space. This is commonly used in fields like physics, engineering, and statistics where systems often depend on multiple variables.

Solving multidimensional integrals is computationally demanding. If traditional, nonadaptive numerical integration techniques were used, the solution would require $K^D$ iterations, where $K$ is the number of integration steps (or bins) and $D$ is the number of dimensions. If there were $100$ integration steps and $5$ dimensions, the solution would need $10$ billion iterations.

To avoid these computational limitations, the Numerics library provides three multidimensional integration routines: Monte Carlo, Miser, and VEGAS. Each algorithm computes an approximation to a definite integral of the form:

$$I = \int\limits_{a_1}^{b_1} \cdots \int\limits_{a_D}^{b_D} f(x_1, \cdots ,x_D) \cdot dx_1 \cdots dx_D$$

Monte Carlo Integration

The basic idea of Monte Carlo integration is to approximate the value of a multidimensional integral by randomly sampling points in the domain and averaging the function values at those points. Consider a two-dimensional integral over a square region $[a_1, b_1] \times [a_2, b_2]$:

$$I = \int\limits_{a_1}^{b_1} \int\limits_{a_2}^{b_2} f(x,y) \cdot dx \cdot dy$$

The Monte Carlo method proceeds as follows:

  1. Generate Random Points: Randomly sample $N$ points $(x_i, y_i)$ within the square. For uniform sampling, these points will be evenly distributed across the integration domain $\Omega$.
  2. Evaluate the Function: Compute the value of the integrand $f(x_i, y_i)$ at each of the randomly sampled points.
  3. Estimate the Integral: Average the function values over the $N$ samples and multiply by the volume $V=(b_1-a_1)\cdot(b_2-a_2)$ of the integration domain to approximate the integral:
$$I \approx \frac{V}{N} \sum_{i=1}^{N} f(x_i, y_i)$$

The standard error of the Monte Carlo estimate is given by:

$$\sigma \approx V \cdot \sqrt{ \frac{\langle f^2 \rangle - \langle f \rangle^2}{N} }$$

where $\langle f \rangle = \frac{1}{N} \sum_{i=1}^{N} f(x_i, y_i)$ and $\langle f^2 \rangle = \frac{1}{N} \sum_{i=1}^{N} f(x_i, y_i)^2$.

The error of the Monte Carlo estimate decreases inversely with the square root of the number of samples $N$, meaning that as $N$ increases, the estimate becomes more accurate.

A classic example is computing $\pi$ using Monte Carlo integration. The integrand function is:

$$f(x, y) = \begin{cases} 1, & x^2 + y^2 \lt 1 \\\ 0, & x^2 + y^2 \ge 1 \end{cases}$$

Integrating over the region from $a=[-1,-1]$ to $b=[1,1]$ gives the exact result of $\pi=3.141593\dots$.

Here's how we can implement this in the Numerics library:

/// <summary>
/// Test function: the integral of this function should equal π (~3.14)
/// </summary>
/// <param name="vals">Array of values.</param>
public double PI(double[] vals)
{
    var x = vals[0];
    var y = vals[1];
    return (x * x + y * y < 1) ? 1 : 0;
}

Now, let's solve this using Monte Carlo integration. First, we run it with $N=100,000$ iterations:

var a = new double[] { -1, -1 };
var b = new double[] { 1, 1 };
var mc = new MonteCarloIntegration(PI, 2, a, b);
mc.Random = new MersenneTwister(12345); // Set the random number generator for repeatability
mc.MaxIterations = 100000;
mc.Integrate();
var result = mc.Result; // 3.15512
// Standard Error: 0.005163039594657394

We see that the result is close but still has a noticeable error. Now, let’s run it again with the default setting, where the maximum iterations are $N=100,000,000$.

var a = new double[] { -1, -1 };
var b = new double[] { 1, 1 };
var mc = new MonteCarloIntegration(PI, 2, a, b);
mc.Random = new MersenneTwister(12345); // Set the random number generator for repeatability
mc.Integrate();
var result = mc.Result; // 3.1412028
// Standard Error: 0.00051938965808650454

This result is much closer to the true value of $\pi$.

Unlike traditional methods, the complexity of Monte Carlo integration grows slowly with the number of dimensions, making it particularly useful for high-dimensional problems. The Monte Carlo approach is simple to implement in higher dimensions and can handle irregular domains and complex integrands. However, it converges slowly; the error decreases as $O \left( \frac{1}{\sqrt{N}} \right)$, meaning to halve the error, you need to quadruple the number of samples.
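
The same procedure can be reproduced in a few lines of plain Python, independent of the library (an illustrative sketch):

```python
import math
import random

def mc_pi(n, seed=12345):
    """Estimate pi by uniform Monte Carlo sampling over [-1, 1] x [-1, 1]."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y < 1.0:
            inside += 1
    volume = 4.0  # area of the square [-1, 1] x [-1, 1]
    p = inside / n
    estimate = volume * p
    # Standard error: V * sqrt((<f^2> - <f>^2) / N); f is an indicator, so <f^2> = <f>
    std_error = volume * math.sqrt((p - p * p) / n)
    return estimate, std_error

est, se = mc_pi(100_000)
print(est, se)  # estimate near 3.14, standard error of a few thousandths
```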

Adaptive Stratified Sampling

The Miser integration algorithm is a type of adaptive Monte Carlo method designed for efficient evaluation of multidimensional integrals. It is particularly well-suited for integrands that exhibit regions of high variance, as it allocates more samples to areas where the integrand contributes more to the total integral. The algorithm combines the flexibility of Monte Carlo integration with adaptive subdivision techniques to enhance accuracy and efficiency in complex, high-dimensional problems.

Key Concepts of the Miser Algorithm:

  1. Adaptive Subdivision: Miser improves upon basic Monte Carlo integration by recursively subdividing the integration domain into smaller regions. The algorithm then allocates more samples to the subregions where the integrand has higher variance, focusing computational resources where they are most needed.

  2. Variance-Based Sampling: The Miser algorithm estimates the variance of the integrand in different subregions. Subregions with higher variance are given a greater proportion of the total samples. This reduces the error by refining the integral in the parts of the domain that contribute the most to the integral’s value.

For more details on stratified sampling and the Miser algorithm, see [2] and [3].
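
The variance-based allocation idea can be illustrated with a deliberately simple Python toy: one level of stratification, with the sampling budget split in proportion to each stratum's estimated standard deviation. This captures only the core idea, not the full recursive Miser algorithm:

```python
import math
import random

rng = random.Random(12345)
f = lambda x: x ** 8  # variance concentrated near x = 1; exact integral is 1/9

def sample_stats(lo, hi, n):
    """Plain Monte Carlo mean and standard deviation of f on [lo, hi]."""
    vals = [f(rng.uniform(lo, hi)) for _ in range(n)]
    mean = sum(vals) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in vals) / n)
    return mean, sd

# Pilot phase: probe the variance of f in each half of [0, 1]
strata = [(0.0, 0.5), (0.5, 1.0)]
pilots = [sample_stats(lo, hi, 500) for lo, hi in strata]

# Allocation phase: split the remaining budget in proportion to each
# stratum's volume times its estimated standard deviation
budget = 19_000
weights = [(hi - lo) * sd for (lo, hi), (_, sd) in zip(strata, pilots)]
total_w = sum(weights)
estimate = 0.0
for (lo, hi), w in zip(strata, weights):
    n = max(1, int(budget * w / total_w))
    mean, _ = sample_stats(lo, hi, n)
    estimate += (hi - lo) * mean  # stratum volume times stratum mean

print(estimate)  # close to 1/9
```

Most of the budget lands in $[0.5, 1]$, where the integrand actually varies; automating this allocation recursively is exactly what Miser does.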

Now, let's solve the $\pi$ test function using Miser with $N=100,000$ iterations:

var a = new double[] { -1, -1 };
var b = new double[] { 1, 1 };
var miser = new Miser(PI, 2, a, b);
miser.Random = new MersenneTwister(12345); // Set the random number generator for repeatability
miser.MaxIterations = 100000;
miser.Integrate();
var result = miser.Result; // 3.1420673978501474
// Standard Error: 0.0011021039112025584

With the same number of samples, Miser produces a more accurate result with smaller variance than basic Monte Carlo integration.

Adaptive Importance Sampling

The VEGAS integration method is a Monte Carlo-based numerical integration technique designed for efficiently evaluating high-dimensional integrals, particularly when dealing with functions that have significant variability in certain regions of the integration space [4] [5]. It is widely used in computational physics and other fields requiring the evaluation of complex integrals.

Key Features of the VEGAS Algorithm:

  1. Importance Sampling: VEGAS employs importance sampling to focus the integration effort on regions where the integrand contributes most significantly to the integral. This helps to improve the accuracy of the integral estimate while reducing variance.

  2. Adaptive Grid: The algorithm adapts the sampling grid based on the characteristics of the integrand. It divides the integration domain into smaller subregions, and the sampling density is adjusted according to the estimated contribution of each region to the overall integral.

  3. Iterative Approach: VEGAS works in iterations, refining the sampling strategy with each pass. In the first iteration, a uniform grid is typically used. After evaluating the integrand, the method estimates the probability distribution of the function values, allowing the grid to be adjusted in subsequent iterations to better capture areas with higher contributions.

For more details on importance sampling and the VEGAS algorithm, see [2] and [3].
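
The importance-sampling idea that VEGAS builds on can be shown with a bare-bones Python toy (this illustrates only the weighting, not the adaptive grid): draw samples from a proposal density $p(x)$ that roughly follows the integrand, and weight each sample by $f(x)/p(x)$:

```python
import random

rng = random.Random(12345)
f = lambda x: x ** 3  # target: the integral of x^3 over [0, 1] is 0.25

# Proposal density p(x) = 2x on (0, 1]; if U ~ Uniform(0, 1], then
# X = sqrt(U) has density 2x, which roughly follows the shape of f
n = 20_000
total = 0.0
for _ in range(n):
    u = 1.0 - rng.random()  # in (0, 1], avoids x = 0
    x = u ** 0.5
    total += f(x) / (2.0 * x)  # weight each sample by f(x) / p(x)
estimate = total / n
print(estimate)  # close to 0.25, with lower variance than uniform sampling
```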

Now, let's solve the $\pi$ test function using VEGAS. The VEGAS method requires the integrand function to take a point $x$ and an importance sampling weight $w$ as inputs. For this example, we will reuse the previous test function without utilizing the weight value:

var a = new double[] { -1, -1 };
var b = new double[] { 1, 1 };
var vegas = new Vegas((x, w) => { return PI(x); }, 2, a, b);
vegas.Random = new MersenneTwister(12345); // Set the random number generator for repeatability
vegas.Integrate();
var result = vegas.Result; // 3.1418009008273735
// Standard Error: 0.0021254440306027445

In Numerics, the VEGAS method iteratively adapts and refines the grid until convergence to a relative tolerance of $1 \times 10^{-3}$ is achieved. For this test problem, only $19,600$ function evaluations are required to reach convergence.

For a more challenging test, let's compute the mean of the sum of independent Normal distributions. The multidimensional integrand function can be written as

$$f(x_1, \cdots ,x_D) = \left( \sum_{k=1}^{D} x_k \right) \cdot \prod_{k=1}^{D} \phi (x_k | \mu_k, \sigma_k)$$

where $\phi(\cdot)$ is the PDF of the $k$-th Normal distribution with a mean $\mu_k$ and standard deviation $\sigma_k$. The exact solution for the mean of the sum of these random variables is:

$$E[X] = \sum_{k=1}^{D} \mu_k$$

For this test, we use five Normal distributions with means $\mu = [10, 30, 17, 99, 68]$ and standard deviations $\sigma = [2, 15, 5, 14, 7]$. Therefore, the exact solution is:

$$E[X] = 10+30+17+99+68=224$$

Here's how we can implement this in the Numerics library. First, we need to reference the Integration and Distributions namespaces:

using Numerics.Mathematics.Integration;
using Numerics.Distributions;

Then, follow these steps:

// Create the Normal distributions and set the integration limits
var mu = new double[] { 10, 30, 17, 99, 68 };
var sigma = new double[] { 2, 15, 5, 14, 7 };
var dists = new Normal[5];
var min = new double[5];
var max = new double[5];
for (int i = 0; i < 5; i++)
{
    dists[i] = new Normal(mu[i], sigma[i]);
    min[i] = dists[i].InverseCDF(1E-16); // Lower limit based on a very small cumulative probability
    max[i] = dists[i].InverseCDF(1 - 1E-16); // Upper limit based on a near-1 cumulative probability
}

// Define the integrand function
double SumOfNormals(double[] x, double w)
{
    double sum = 0;
    double prod = 1;
    for (int i = 0; i < mu.Length; i++)
    {
        sum += x[i];
        prod *= dists[i].PDF(x[i]);
    }
    return sum * prod;
}

// Perform the integration
var vegas = new Vegas(SumOfNormals, 5, min, max);
vegas.Integrate();
var result = vegas.Result; // 224.07455771892427
// Standard Error: 0.222433294342273

For this more complex test problem, $468,750$ function evaluations are required to achieve convergence.
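
As a quick sanity check in Python: because the product of the Normal PDFs is itself a valid 5-dimensional density, sampling each coordinate from its own Normal and averaging $\sum_k x_k$ estimates the same expectation directly (importance sampling with the product density as the proposal; an illustrative sketch, not the library's method):

```python
import random

rng = random.Random(12345)
mu = [10.0, 30.0, 17.0, 99.0, 68.0]
sigma = [2.0, 15.0, 5.0, 14.0, 7.0]

# Draw each coordinate from its own Normal and average the sums
n = 50_000
total = 0.0
for _ in range(n):
    total += sum(rng.gauss(m, s) for m, s in zip(mu, sigma))
estimate = total / n
print(estimate)  # close to 224
```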

References

[1] P. J. Davis and P. Rabinowitz, Methods of Numerical Integration, 2nd ed., Mineola, New York: Dover Publications, Inc., 2007.

[2] W. H. Press, S. A. Teukolsky, W. T. Vetterling and B. P. Flannery, Numerical Recipes: The Art of Scientific Computing, 3rd ed., Cambridge, UK: Cambridge University Press, 2017.

[3] A. Ciric, A Guide to Monte Carlo & Quantum Monte Carlo Methods, Createspace Independent Publishing Platform, 2016.

[4] G. Lepage, "A New Algorithm for Adaptive Multidimensional Integration," Journal of Computational Physics, vol. 27, no. 1, pp. 192-203, 1978.

[5] G. Lepage, "VEGAS: An Adaptive Multidimensional Integration Program," Cornell University, 1980.
