Box-Muller Transformation
The Box-Muller transformation is a method to construct a random variable that follows a Gaussian distribution by using a random variable that follows a uniform distribution.
Content
The Box-Muller transform is a method used to generate normally distributed random variables from uniformly distributed random variables.
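Specifically, it involves selecting two independent random variables, \(U_1\) and \(U_2\), which are uniformly distributed over the interval \([0,1]\). The random variables \(X\) and \(Y\) are constructed as follows:
\[ X = \cos(2\pi U_1)\sqrt{-2\ln U_2}, \qquad Y = \sin(2\pi U_1)\sqrt{-2\ln U_2} \]
Then, \(X\) and \(Y\) are independent and each follows a normal distribution with a mean of 0 and a variance of 1.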
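As a quick illustration, the transformation can be evaluated by hand for one pair of inputs. The values below are hand-picked for convenient arithmetic and are not taken from the source; \(U_1\) is assumed to set the angle and \(U_2\) the radius, matching the code later in this article.

```latex
U_1 = \tfrac{1}{8}, \qquad U_2 = e^{-2} \approx 0.1353
\sqrt{-2\ln U_2} = \sqrt{-2 \cdot (-2)} = 2, \qquad 2\pi U_1 = \tfrac{\pi}{4}
X = 2\cos\tfrac{\pi}{4} = \sqrt{2} \approx 1.4142, \qquad
Y = 2\sin\tfrac{\pi}{4} = \sqrt{2} \approx 1.4142
```

A single pair says nothing about the distribution; it only shows how the two uniform inputs map to one point in the plane.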
Derivation
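Assume that \(X\) and \(Y\) are independent, normally distributed random variables with a mean of 0 and a variance of 1. Let \(p(X)\) and \(p(Y)\) be their probability density functions. Thus, we have:
\[ p(X) = \frac{1}{\sqrt{2\pi}} e^{-X^2/2}, \qquad p(Y) = \frac{1}{\sqrt{2\pi}} e^{-Y^2/2} \]
Since \(X\) and \(Y\) are independent, their joint probability density function is:
\[ p(X, Y) = p(X)\,p(Y) = \frac{1}{2\pi} e^{-(X^2 + Y^2)/2} \]
We now perform a coordinate transformation by defining:
\[ X = R\cos\theta, \qquad Y = R\sin\theta, \qquad R \ge 0,\ \theta \in [0, 2\pi] \]
Because the area element transforms as \(dX\,dY = R\,dR\,d\theta\), the joint distribution in polar coordinates becomes:
\[ p(R, \theta) = \frac{1}{2\pi} R\, e^{-R^2/2} \]
From this, we can derive the distribution functions for \(R\) and \(\theta\), denoted as \(P_R\) and \(P_\theta\):
\[ P_R(r) = \int_0^{2\pi}\!\!\int_0^{r} \frac{1}{2\pi} \rho\, e^{-\rho^2/2}\, d\rho\, d\theta = 1 - e^{-r^2/2} \]
\[ P_\theta(\varphi) = \int_0^{\varphi}\!\!\int_0^{\infty} \frac{1}{2\pi} \rho\, e^{-\rho^2/2}\, d\rho\, d\theta = \frac{\varphi}{2\pi} \]
It is evident that \(\theta\) follows a uniform distribution over \([0, 2\pi]\). Let
\[ Z = P_R(R) = 1 - e^{-R^2/2} \]
The inverse function is given by:
\[ R = P_R^{-1}(Z) = \sqrt{-2\ln(1 - Z)} \]
When \(Z\) is uniformly distributed over \([0,1]\), the distribution function of \(R\) is \(P_R\). Moreover, \(1 - Z\) is then also uniformly distributed over \([0,1]\), so it can be replaced by a uniform variable directly. Therefore, we can select two independent random variables \(U_1\) and \(U_2\), uniformly distributed over \([0,1]\), such that:
\[ \theta = 2\pi U_1, \qquad R = \sqrt{-2\ln U_2} \]
Substituting these into the earlier expressions, we obtain:
\[ X = R\cos\theta = \cos(2\pi U_1)\sqrt{-2\ln U_2}, \qquad Y = R\sin\theta = \sin(2\pi U_1)\sqrt{-2\ln U_2} \]
Thus, the original expressions for \(X\) and \(Y\) are recovered, where they follow a normal distribution with mean 0 and variance 1.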
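The inverse-function step in the derivation can be checked numerically. The following is a minimal sketch, not part of the original article: it samples the radius by inverting the distribution function 1 - exp(-r^2 / 2) and compares the empirical distribution with the closed form. The names `sampleRadius`, `empiricalCdf`, and `radiusCdf` are hypothetical helpers introduced only for this illustration.

```typescript
// Hypothetical helper: sample the radius R by inverting F(r) = 1 - exp(-r^2 / 2).
function sampleRadius(): number {
  const z = Math.random(); // uniform on [0, 1), so 1 - z lies in (0, 1]
  return Math.sqrt(-2 * Math.log(1 - z));
}

// Fraction of samples that are <= r (the empirical distribution function).
function empiricalCdf(samples: number[], r: number): number {
  return samples.filter((s) => s <= r).length / samples.length;
}

// Closed-form distribution function of the radius from the derivation.
function radiusCdf(r: number): number {
  return 1 - Math.exp(-(r * r) / 2);
}

const radii: number[] = Array.from({ length: 100000 }, sampleRadius);
for (const r of [0.5, 1, 2]) {
  // The two printed numbers should agree to roughly two decimal places.
  console.log(r, empiricalCdf(radii, r).toFixed(3), radiusCdf(r).toFixed(3));
}
```

Because the samples are random, the printed values vary slightly from run to run; only their closeness to the closed form is meaningful.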
Verification
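To generate a random variable following a standard normal distribution directly by inverse transform sampling, we would need the inverse of the normal cumulative distribution function. That inverse is expressed through the inverse of the error function, which is not elementary. Using a series expansion to approximate this inverse can introduce significant computational error.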
Elementary functions, on the other hand, have well-established high-precision algorithms. Thus, we aim to construct a normally distributed random variable from a uniformly distributed one using elementary functions.
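Let \(U_1, U_2\) be independent random variables uniformly distributed over \([0,1]\). Consider the following transformations:
\[ X = \cos(2\pi U_1)\sqrt{-2\ln U_2}, \qquad Y = \sin(2\pi U_1)\sqrt{-2\ln U_2} \]
Next, we verify that \(X\) and \(Y\) are independent and follow a standard normal distribution.
Define:
\[ R = \sqrt{-2\ln U_2}, \qquad \theta = 2\pi U_1 \]
Note that \(X = R\cos\theta\) and \(Y = R\sin\theta\). We now show that \(R\) and \(\theta\), under the polar coordinate transformation, are independent and that they yield standard normal random variables.
Since \(U_1\) and \(U_2\) are independent, so are \(\theta\) and \(R\), and the probability density function of one variable is unchanged when the other is fixed. The probability density function of \(\theta\) is:
\[ p_\theta(\theta) = \frac{1}{2\pi}, \qquad \theta \in [0, 2\pi] \]
which is straightforward. For \(R\), the cumulative distribution function is:
\[ P(R \le r) = P\!\left(\sqrt{-2\ln U_2} \le r\right) = P\!\left(U_2 \ge e^{-r^2/2}\right) = 1 - e^{-r^2/2} \]
Thus, the probability density function of \(R\) is:
\[ p_R(r) = \frac{d}{dr}\left(1 - e^{-r^2/2}\right) = r\, e^{-r^2/2} \]
Hence, the joint probability density of \(R\) and \(\theta\) is:
\[ p(r, \theta) = p_R(r)\, p_\theta(\theta) = \frac{1}{2\pi}\, r\, e^{-r^2/2} \]
Since the polar coordinates yield:
\[ x^2 + y^2 = r^2, \qquad dx\, dy = r\, dr\, d\theta \]
we obtain the joint probability density of \(X\) and \(Y\):
\[ p(x, y) = \frac{1}{2\pi} e^{-(x^2 + y^2)/2} = \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \cdot \frac{1}{\sqrt{2\pi}} e^{-y^2/2} \]
This is precisely the probability density function of two independent, normally distributed random variables \(X\) and \(Y\), each following a standard normal distribution.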
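The independence claim can also be probed empirically. The sketch below is an illustration under stated assumptions rather than a proof: it draws many pairs and checks that the sample correlation between the two outputs, and between their squares, is close to zero. Zero correlation is necessary for independence but not sufficient. The names `samplePair`, `mean`, and `correlation` are hypothetical helpers introduced here.

```typescript
// Hypothetical helper: one Box-Muller pair from two uniform draws.
function samplePair(): [number, number] {
  const u1 = Math.random();
  const u2 = 1 - Math.random(); // lies in (0, 1], which avoids log(0)
  const r = Math.sqrt(-2 * Math.log(u2));
  const theta = 2 * Math.PI * u1;
  return [r * Math.cos(theta), r * Math.sin(theta)];
}

function mean(a: number[]): number {
  return a.reduce((s, v) => s + v, 0) / a.length;
}

// Pearson sample correlation coefficient of two equally long arrays.
function correlation(a: number[], b: number[]): number {
  const ma = mean(a);
  const mb = mean(b);
  let cov = 0;
  let va = 0;
  let vb = 0;
  for (let i = 0; i < a.length; i++) {
    cov += (a[i] - ma) * (b[i] - mb);
    va += (a[i] - ma) ** 2;
    vb += (b[i] - mb) ** 2;
  }
  return cov / Math.sqrt(va * vb);
}

const xs: number[] = [];
const ys: number[] = [];
for (let i = 0; i < 100000; i++) {
  const [x, y] = samplePair();
  xs.push(x);
  ys.push(y);
}

// Both correlations should be close to 0 if the outputs are independent.
console.log("corr(X, Y):", correlation(xs, ys).toFixed(4));
console.log(
  "corr(X^2, Y^2):",
  correlation(xs.map((v) => v * v), ys.map((v) => v * v)).toFixed(4)
);
```

Checking the squares as well catches some dependence that the plain correlation would miss, for example if both outputs tended to be large at the same time.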
Code Implementation
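// Returns a uniform random number in (0, 1].
// Math.random() is in [0, 1), so 1 - Math.random() excludes 0 and keeps Math.log finite.
function randInterval(): number {
  return 1 - Math.random();
}

// Returns two independent samples from a normal distribution
// with mean mu and standard deviation sigma.
function boxMuller(mu: number, sigma: number): [number, number] {
  const u = randInterval();
  const v = randInterval();
  const r = Math.sqrt(-2 * Math.log(v)); // radius, computed once for both outputs
  const x = Math.cos(2 * Math.PI * u) * r;
  const y = Math.sin(2 * Math.PI * u) * r;
  // Scale and shift the standard normal samples to N(mu, sigma^2)
  return [x * sigma + mu, y * sigma + mu];
}

// Usage example: outputs two normally distributed random numbers
const [value1, value2] = boxMuller(0, 1);
console.log(value1, value2);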
To verify that the boxMuller function generates numbers that follow a normal distribution, one can generate a large set of random numbers and plot a histogram to compare with a standard normal distribution. You can use libraries such as plotly to visualize the histogram and evaluate the output.
Here is a test program that generates a dataset using the boxMuller function and performs basic statistical tests, along with plotting the histogram. Assuming you are running in a Node.js environment, you can use plotly-nodejs or directly use plotly.js in the browser to generate the graph.
First, install the required dependencies:
npm i plotly.js-dist
Then, execute the following code:
import * as fs from "fs";

// Uniform random number in (0, 1]; excluding 0 keeps Math.log finite.
function randInterval(): number {
  return 1 - Math.random();
}

// Generate two random numbers following a normal distribution using Box-Muller
function boxMuller(mu: number, sigma: number): [number, number] {
  const u = randInterval();
  const v = randInterval();
  const r = Math.sqrt(-2 * Math.log(v));
  const x = Math.cos(2 * Math.PI * u) * r;
  const y = Math.sin(2 * Math.PI * u) * r;
  return [x * sigma + mu, y * sigma + mu];
}

// Generate n normally distributed random numbers
function generateNormalDistribution(
  mu: number,
  sigma: number,
  n: number
): number[] {
  const values: number[] = [];
  while (values.length < n) {
    values.push(...boxMuller(mu, sigma));
  }
  return values.slice(0, n); // Drop the extra value when n is odd
}

// Generate the data
const mu = 0;
const sigma = 1;
const sampleSize = 10000; // Sample size
const values = generateNormalDistribution(mu, sigma, sampleSize);

// Calculate the mean and (population) variance of the generated random numbers
const mean = values.reduce((acc, val) => acc + val, 0) / values.length;
const variance =
  values.reduce((acc, val) => acc + (val - mean) ** 2, 0) / values.length;
console.log(`Mean: ${mean}`);
console.log(`Variance: ${variance}`);

// Describe the histogram
const trace = {
  x: values,
  type: "histogram",
  xbins: {
    size: 0.1, // Bin size for the histogram
  },
  marker: {
    color: "blue",
  },
  opacity: 0.7,
  name: "Box-Muller Distribution",
};
const data = [trace];
const layout = {
  title: { text: "Generated Normal Distribution" },
  xaxis: { title: { text: "Value" } },
  yaxis: { title: { text: "Frequency" } },
  bargap: 0.05,
};

// plotly.js draws into a browser DOM, so instead of calling it from Node.js
// we write an HTML page that loads the installed bundle and plots on open.
// Assumption: box-muller.html stays next to node_modules and the bundle
// is located at node_modules/plotly.js-dist/plotly.js.
const html = `<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8" />
    <script src="./node_modules/plotly.js-dist/plotly.js"></script>
  </head>
  <body>
    <div id="chart"></div>
    <script>
      Plotly.newPlot("chart", ${JSON.stringify(data)}, ${JSON.stringify(layout)});
    </script>
  </body>
</html>`;
fs.writeFileSync("box-muller.html", html);
console.log("Box-Muller distribution data saved as box-muller.html.");
This program uses the boxMuller function to generate 10,000 random numbers that follow a normal distribution. It calculates the mean and variance of the generated dataset. The random numbers are visualized using a histogram, allowing for a direct comparison with a standard normal distribution. The histogram is saved as a file named box-muller.html, which can be opened in a browser for viewing.
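As a plot-free complement to the histogram, the samples can be compared against the well-known coverage of a normal distribution: about 68.27%, 95.45%, and 99.73% of values fall within one, two, and three standard deviations of the mean. The following is a minimal sketch under that assumption; `standardNormalSamples` and `fractionWithin` are hypothetical helpers, not functions from the program above.

```typescript
// Hypothetical helper: n standard normal samples via Box-Muller.
function standardNormalSamples(n: number): number[] {
  const out: number[] = [];
  while (out.length < n) {
    const r = Math.sqrt(-2 * Math.log(1 - Math.random())); // 1 - random() is in (0, 1]
    const theta = 2 * Math.PI * Math.random();
    out.push(r * Math.cos(theta), r * Math.sin(theta));
  }
  return out.slice(0, n); // drop the extra value when n is odd
}

// Fraction of samples within k standard deviations of the mean.
function fractionWithin(
  samples: number[],
  mu: number,
  sigma: number,
  k: number
): number {
  return (
    samples.filter((s) => Math.abs(s - mu) <= k * sigma).length / samples.length
  );
}

// Theoretical coverage for k = 1, 2, 3 standard deviations.
const expectedCoverage = [0.6827, 0.9545, 0.9973];
const draws = standardNormalSamples(100000);
expectedCoverage.forEach((p, i) => {
  const k = i + 1;
  const observed = fractionWithin(draws, 0, 1, k);
  console.log(`within ${k} sigma: ${observed.toFixed(4)} (expected about ${p})`);
});
```

The observed fractions fluctuate from run to run, so they should be read as approximate agreement rather than exact matches.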