Finite discrete approximation to the normal distribution
I wish to derive a finite (that is, finitely supported) discrete approximation to a normal distribution that meets the following requirements:
- It should have exactly the same mean and variance
- It must be symmetric
- It should resemble the normal distribution in some sense (unimodal pmf, etc.)
- It should be discrete and finite (having finite support), with a pre-determined set which contains the support (for example, the integers).
Naive attempt
Here's a naive attempt. Suppose we wish to give an approximation to $\mathcal{N}(\mu,\sigma^2)$. Let the support be $S=\left[\lfloor \mu-d\sigma \rfloor,\lceil \mu+d\sigma\rceil\right]$ (for some natural $d$, perhaps $3$), and define the following pmf:
\begin{equation*}
f(k) =
\begin{cases}
\Phi_{\mu,\sigma^2}\left(k+\frac{1}{2}\right) & k = \min{S}\\
\Phi_{\mu,\sigma^2}\left(k+\frac{1}{2}\right)
- \Phi_{\mu,\sigma^2}\left(k-\frac{1}{2}\right) & \min{S} < k < \max{S}\\
1 - \Phi_{\mu,\sigma^2}\left(k-\frac{1}{2}\right) & k = \max{S}
\end{cases}
\end{equation*}
where $\Phi_{\mu,\sigma^2}$ is the cdf of $\mathcal{N}(\mu,\sigma^2)$. This is a legitimate pmf (sums to $1$), it is symmetric, unimodal, discrete and finite, and has mean $\mu$ -- but it does not have variance $\sigma^2$ (I think it always has a larger variance).
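The pmf above can be sketched in a few lines of Python to check the variance numerically. This is only a minimal illustration, assuming integer support and the standard library's `math.erf` for the normal cdf; the helper names (`normal_cdf`, `naive_pmf`, `mean_and_variance`) and the test values $\mu=5$, $\sigma=1$, $d=3$ are chosen here and are not part of the original post.

```python
import math

def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2), written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def naive_pmf(mu, sigma, d=3):
    """Naive pmf on the integers floor(mu - d*sigma) .. ceil(mu + d*sigma).

    Interior points get Phi(k + 1/2) - Phi(k - 1/2); the two endpoints
    absorb the whole tail beyond them, as in the piecewise definition.
    """
    lo, hi = math.floor(mu - d * sigma), math.ceil(mu + d * sigma)
    pmf = {}
    for k in range(lo, hi + 1):
        upper = 1.0 if k == hi else normal_cdf(k + 0.5, mu, sigma)
        lower = 0.0 if k == lo else normal_cdf(k - 0.5, mu, sigma)
        pmf[k] = upper - lower
    return pmf

def mean_and_variance(pmf):
    """Mean and variance of a pmf given as {value: probability}."""
    mean = sum(k * p for k, p in pmf.items())
    var = sum((k - mean) ** 2 * p for k, p in pmf.items())
    return mean, var

# Illustrative inputs (an integer mean, so the support is symmetric about it).
pmf = naive_pmf(mu=5.0, sigma=1.0, d=3)
mean, var = mean_and_variance(pmf)
print(sum(pmf.values()), mean, var)
```

For these inputs the probabilities sum to 1 and the mean is 5 up to rounding error, while the variance comes out above $\sigma^2 = 1$.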
Can you fix this naive solution somehow?
probability-distributions normal-distribution
Do you mean it has finite support? (i.e. $f(k)=0$ for $|k|\geq K$ for some $K>0$)
– Dan
Sep 5 '14 at 8:59
Indeed, I'll clarify this.
– Bach
Sep 5 '14 at 9:00
If you wish to change the variance for a specified support you will have to change the shape of your distribution away from a Gaussian over that interval. An easy way (but almost certainly not what you're after) is to add two delta functions with suitable coefficients to each end of the support.
– Dan
Sep 5 '14 at 9:05
Your requirements cannot be met for arbitrary mean $\mu$. This is because a symmetric approximating distribution with support $\{ a_1, \ldots, a_n\}$ will have a mean that's necessarily either $a_{(n+1)/2}$ (odd $n$) or $\frac{1}{2}(a_{n/2} + a_{n/2 + 1})$ (even $n$); e.g., if the support is a set of consecutive integers, then the mean must be a multiple of $\frac{1}{2}$.
– r.e.s.
Sep 7 '14 at 16:59
(cont'd) In fact, your "naive solution" is not symmetric, and yet does not exactly match an arbitrary mean. For example, with $\mu = 4.9, \sigma = 1$, and support $\{\lfloor \mu- 3 \sigma \rfloor, \ldots, \lceil \mu + 3 \sigma\rceil\}= \{1,\ldots,8\}$, it gives $f(1)=0.0003369\ldots \ne f(8)=0.004661\ldots$ (not symmetric) and mean $=4.8998\ldots \ne 4.9$ (not an exact match).
– r.e.s.
Sep 7 '14 at 16:59
edited Sep 5 '14 at 9:00
asked Sep 5 '14 at 8:46
Bach
1 Answer
Intuitively, two distributions that closely approximate one another should have similar PDFs (or PMFs).
$$P(X=k)$$
A literal one-to-one numerical comparison cannot strictly be made between a PMF and a PDF, but the two can be compared through their statistical measures: mean, variance, skewness, and kurtosis.
$$\mu, \sigma^2, \frac{\mu_3}{\sigma^3}, \frac{\mu_4}{\sigma^4}$$
For a given Normal distribution $X \sim \mathcal{N}(\mu, {\sigma}^2)$, these measures are, respectively,
$$\mu, {\sigma}^2, 0, 3$$
General form
In exploring possible solutions, I arrived at this general template.
Given $m_0, m_n \in \mathbb{Z}$, $\mu_0 \in \mathbb{R}$ and $\sigma_0 \in \mathbb{R}^{+}$ as inputs, where $m_0 \lt \mu \lt m_n$, construct a function $f(k) \ge 0$ defined over a finite support $S$ of size $n+1$.
$$k \in S = \{m_0, m_1, \ldots, m_n\}$$
$$ m_i = m_0 + i $$
Then define a discrete distribution with a PMF proportional to $f(k)$ as follows.
$$P(X=k) = \frac{f(k)}{C_S}$$
$$C_S = \sum_{k \in S}f(k)$$
If desired, the inputs may be restricted further. Given $d \in \mathbb{R}^{+}$, choose $m_0, m_n$ such that
$$m_0 = \lfloor \mu_0 - \sigma_0 d \rfloor$$
$$m_n = \lceil \mu_0 + \sigma_0 d \rceil$$
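The template above can be sketched in Python roughly as follows. The function names (`support_bounds`, `normalized_pmf`, `standardized_measures`) are hypothetical helpers introduced for illustration, and the triangular weight function in the example is a placeholder used only to exercise the normalization, not a proposed approximation.

```python
import math

def support_bounds(mu0, sigma0, d):
    """m_0 = floor(mu0 - sigma0*d) and m_n = ceil(mu0 + sigma0*d)."""
    return math.floor(mu0 - sigma0 * d), math.ceil(mu0 + sigma0 * d)

def normalized_pmf(f, m0, mn):
    """Return {k: f(k) / C_S} for k in S = {m0, m0 + 1, ..., mn}."""
    weights = {k: f(k) for k in range(m0, mn + 1)}
    c_s = sum(weights.values())  # normalizing constant C_S
    if c_s <= 0:
        raise ValueError("f must be positive somewhere on the support")
    return {k: w / c_s for k, w in weights.items()}

def standardized_measures(pmf):
    """Mean, variance, skewness mu_3/sigma^3 and kurtosis mu_4/sigma^4."""
    mean = sum(k * p for k, p in pmf.items())

    def central(r):
        return sum((k - mean) ** r * p for k, p in pmf.items())

    var = central(2)
    return mean, var, central(3) / var ** 1.5, central(4) / var ** 2

# Placeholder shape: a triangle on {-3, ..., 3}.
m0, mn = support_bounds(mu0=0.0, sigma0=1.0, d=3.0)
pmf = normalized_pmf(lambda k: 4 - abs(k), m0, mn)
print(standardized_measures(pmf))
```

Because the normalization divides by $C_S$, any non-negative $f$ with positive total weight yields a valid PMF on $S$; only the shape of $f$ decides how close the four measures land to the targets.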
Acceptance testing of $f(k)$
This general form clearly meets question requirement #4, so for each $f(k)$ I want to evaluate how well the associated distribution meets the question requirements #1, #2 and #3. I am interpreting requirement #2 more specifically as follows:
- It must be symmetric where possible ($2\mu \in \mathbb{Z}$) or at least have skewness $\frac{\mu_3}{\sigma^3} \approx 0$
Fixing the naive attempt
With this general form, some improvements to the naive attempt become apparent.
- There is a free parameter $d$. So instead of $d=3$, broaden the support to at least $d=6$.
- The general form preserves the unit sum of a PMF. So folding the tails of the normal distribution into the endpoints of the PMF with a piecewise function is unnecessary.
- The inputs $\mu_0, \sigma_0$ may not exactly equal $\mu, \sigma$, but an appropriate choice of inputs may hit the desired $\mu, \sigma$ correctly.
Discrete Normal PMF ($f_1(k)$)
Summary: (Fix the naive attempt:) Create a PMF over a finite support that is exactly proportional to the continuity-corrected values obtained from a Normal CDF of the same mean. This approximation inflates the variance, so find a $\sigma_0$ that gives the desired $\sigma$.
Given the general form, let $f_1(k)$ be defined as follows.
$$f_1(k) = \Phi_{\mu_0,\sigma_0}(k+\frac{1}{2})-\Phi_{\mu_0,\sigma_0}(k-\frac{1}{2})$$
In testing a range of inputs, I have found that the observed variance is too big. Specifically, I find
$$\{\mu, \sigma^2, \frac{\mu_3}{\sigma^3}, \frac{\mu_4}{\sigma^4}\} \approx \{\mu_0, \sigma_0^2+\frac{1}{12}, 0, 3\}$$
The extra $\frac{1}{12}$ equals the variance of a uniform distribution on a unit-width interval, as expected when values are rounded to the nearest integer.
So I require inputs such that $\mu_0=\mu$ and $\sigma_0=\sqrt{\sigma^2 - \frac{1}{12}}$, which needs $\sigma^2 > \frac{1}{12}$.
The approximation appears to improve as $\sigma d \to \infty$.
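A minimal sketch of $f_1(k)$ with the corrected $\sigma_0$, assuming integer support and `math.erf` for the Normal CDF. The names `f1_pmf` and `measures`, and the inputs $\mu=5$, $\sigma=2$, $d=6$, are illustrative choices rather than anything fixed by the construction.

```python
import math

def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2), written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def f1_pmf(mu, sigma, d=6.0):
    """PMF proportional to f_1(k), using sigma_0 = sqrt(sigma^2 - 1/12)."""
    if sigma ** 2 <= 1.0 / 12.0:
        raise ValueError("need sigma^2 > 1/12 for a real, positive sigma_0")
    mu0, sigma0 = mu, math.sqrt(sigma ** 2 - 1.0 / 12.0)
    m0, mn = math.floor(mu0 - sigma0 * d), math.ceil(mu0 + sigma0 * d)
    f1 = {
        k: normal_cdf(k + 0.5, mu0, sigma0) - normal_cdf(k - 0.5, mu0, sigma0)
        for k in range(m0, mn + 1)
    }
    c_s = sum(f1.values())  # normalizing constant C_S
    return {k: w / c_s for k, w in f1.items()}

def measures(pmf):
    """Mean, variance, skewness and kurtosis of a pmf {value: probability}."""
    mean = sum(k * p for k, p in pmf.items())
    m2, m3, m4 = (sum((k - mean) ** r * p for k, p in pmf.items()) for r in (2, 3, 4))
    return mean, m2, m3 / m2 ** 1.5, m4 / m2 ** 2

pmf = f1_pmf(mu=5.0, sigma=2.0, d=6.0)
mean, var, skew, kurt = measures(pmf)
print(mean, var, skew, kurt)
```

With these inputs the variance lands very close to the target $\sigma^2 = 4$, while the kurtosis is close to, but not exactly, 3.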
Now, does $f_1(k)$ meet the requirements?
- #1: Yes, except that the variance is not exact (but it could be improved by refinements to $\sigma_0$)
- #2: Yes (as restated in this answer)
- #3: Yes (but one Normal is employed to approximate the statistics of another, which raises the question, "Which Normal is being approximated?")
And does $f_1(k)$ have any limitations?
- $\sigma_0 \ge 1$ and $d \ge 6$ (so that the variance is approximately correct)
Improving on $f_1(k)$
I mentioned that a continuity correction of the Normal distribution gives us $f_1(k)$, but I might have arrived at the same function a different way. If I integrate the PDF of $\mathcal{N}$ to obtain the CDF, and then approximate its derivative using $h=1$ in the following central-difference formula, I get the same function.
$$g'(a)\approx\frac{g(a+\frac{h}{2})-g(a-\frac{h}{2})}{h}$$
This restatement raises the question, "Why not create the PMF directly from the PDF?"
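The central-difference observation can be checked numerically with a short sketch. Here $g$ is taken to be the Normal CDF, and `central_difference` is a hypothetical helper; the values $\mu_0 = 0$, $\sigma_0 = 2$ are arbitrary illustrative inputs.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2), written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def central_difference(g, a, h=1.0):
    """Approximate g'(a) by (g(a + h/2) - g(a - h/2)) / h."""
    return (g(a + h / 2.0) - g(a - h / 2.0)) / h

mu0, sigma0 = 0.0, 2.0
g = lambda x: normal_cdf(x, mu0, sigma0)

rows = []
for k in range(-2, 3):
    f1_k = normal_cdf(k + 0.5, mu0, sigma0) - normal_cdf(k - 0.5, mu0, sigma0)
    rows.append((k, central_difference(g, k), f1_k, normal_pdf(k, mu0, sigma0)))
    print(rows[-1])
```

The second and third columns agree because the two expressions are the same computation when $h=1$; the fourth column (the PDF itself) is close but not identical, which is the gap that motivates $f_2(k)$.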
Discrete Normal PMF ($f_2(k)$)
Summary: Create a PMF over a finite support that is exactly proportional to a Normal PDF and which (in consequence) has statistical measures that match, within some margin of error.
Given the general form, let $f_2(k)$ be defined as follows.
$$f_2(k) = e^{\frac{-(k-\mu_0)^2}{2{\sigma_0}^2}}$$
In testing a range of inputs, I have found decent agreement. Specifically, the statistical measures of the PMF approximate those of the Normal PDF from which they are derived.
$$\{\mu, \sigma^2, \frac{\mu_3}{\sigma^3}, \frac{\mu_4}{\sigma^4}\} \approx \{\mu_0, \sigma_0^2, 0, 3\}$$
The approximation appears to improve as $d\sigma \to \infty$.
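A minimal sketch of $f_2(k)$, assuming integer support chosen from $d$ as in the general form. The names `f2_pmf` and `measures` and the inputs $\mu_0 = 5$, $\sigma_0 = 2$, $d = 6$ are illustrative only.

```python
import math

def f2_pmf(mu0, sigma0, d=6.0):
    """PMF proportional to exp(-(k - mu0)^2 / (2 sigma0^2)) on m_0..m_n."""
    m0, mn = math.floor(mu0 - sigma0 * d), math.ceil(mu0 + sigma0 * d)
    f2 = {
        k: math.exp(-((k - mu0) ** 2) / (2.0 * sigma0 ** 2))
        for k in range(m0, mn + 1)
    }
    c_s = sum(f2.values())  # normalizing constant C_S
    return {k: w / c_s for k, w in f2.items()}

def measures(pmf):
    """Mean, variance, skewness and kurtosis of a pmf {value: probability}."""
    mean = sum(k * p for k, p in pmf.items())
    m2, m3, m4 = (sum((k - mean) ** r * p for k, p in pmf.items()) for r in (2, 3, 4))
    return mean, m2, m3 / m2 ** 1.5, m4 / m2 ** 2

pmf = f2_pmf(mu0=5.0, sigma0=2.0, d=6.0)
mean, var, skew, kurt = measures(pmf)
print(mean, var, skew, kurt)
```

No variance correction is applied here: for these inputs the variance of the PMF is already very close to $\sigma_0^2 = 4$.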
Now, does $f_2(k)$ meet the requirements?
- #1: Approximately, but not exactly (but it could be improved by numerical methods)
- #2: Yes (as restated in this answer)
- #3: Yes (the PMF is in exact proportion to the PDF of the Normal being approximated)
And does $f_2(k)$ have any limitations?
- $\sigma_0 \ge 1$ and $d \ge 6$ (so that the variance is approximately correct)
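One possible reading of "improved by numerical methods" is to keep the support fixed and solve for the $\sigma_0$ that makes the variance of the $f_2$ PMF equal to the target $\sigma^2$. The sketch below uses plain bisection; this is an assumption about how such a refinement could be done, not a method given above. It relies on the variance increasing with $\sigma_0$ over the search bracket, which holds when the support is symmetric about $\mu_0$ (that is, when $2\mu \in \mathbb{Z}$). The names `f2_variance` and `solve_sigma0` and the bracket $[0.1\sigma, 10\sigma]$ are hypothetical choices.

```python
import math

def f2_variance(mu0, sigma0, m0, mn):
    """Variance of the PMF proportional to exp(-(k-mu0)^2/(2 sigma0^2)) on m0..mn."""
    ks = list(range(m0, mn + 1))
    exps = [-((k - mu0) ** 2) / (2.0 * sigma0 ** 2) for k in ks]
    shift = max(exps)  # subtract the largest exponent so the weights cannot all underflow
    w = [math.exp(e - shift) for e in exps]
    total = sum(w)
    mean = sum(k * wk for k, wk in zip(ks, w)) / total
    return sum((k - mean) ** 2 * wk for k, wk in zip(ks, w)) / total

def solve_sigma0(mu, sigma, d=6.0, iters=200):
    """Bisection for sigma_0 so that the f_2 PMF has variance sigma^2."""
    m0, mn = math.floor(mu - sigma * d), math.ceil(mu + sigma * d)
    target = sigma ** 2
    lo, hi = 0.1 * sigma, 10.0 * sigma  # assumed bracket
    if not (f2_variance(mu, lo, m0, mn) < target < f2_variance(mu, hi, m0, mn)):
        raise ValueError("target variance is not bracketed on this support")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f2_variance(mu, mid, m0, mn) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi), m0, mn

# A small sigma, where taking sigma_0 = sigma directly undershoots the variance.
mu, sigma = 0.0, 0.5
sigma0, m0, mn = solve_sigma0(mu, sigma)
print(sigma0, f2_variance(mu, sigma, m0, mn), f2_variance(mu, sigma0, m0, mn))
```

In this example the uncorrected choice $\sigma_0 = \sigma = 0.5$ gives a variance below $0.25$, and the solved $\sigma_0$ is slightly larger than $\sigma$. The match is exact only up to floating-point tolerance.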
add a comment |
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
0
down vote
Intuitively, two distributions which closely approximate one another would have similar PDF (or PMF).
$$P(X=k)$$
A literal one to one numerical comparison cannot strictly be made between a PMF and a PDF, yet a simple comparison can be made for statistical measures: mean, variance, skewness, and kurtosis.
$$mu, sigma^2, frac{mu_3}{sigma^3}, frac{mu_4}{sigma^4}$$
A given Normal distribution $X sim mathcal{N}(mu, {sigma}^2)$ corresponds to the following measures, respectively.
$$mu, {sigma}^2, 0, 3$$
General form
In exploring possible solutions, I arrived at this general template.
Given $m_0, m_n in mathbb{Z}$, $mu_0 in mathbb{R}$ and $sigma_0 in mathbb{R}^{+}$ as inputs, where $m_0 lt mu lt m_n$.
Construct a function $f(k) ge 0$ defined over a finite support $S$ of size $n+1$.
$$k in S = {m_0, m_1, ldots, m_n}$$
$$ m_i = m_0 + i $$
And then define a discrete distribution with a PMF proportional to $f(k)$ as follows.
$$P(X=k) = frac{f(k)}{C_S}$$
$$C_S = sum_{k in S}f(k)$$
As desired, the inputs may be restricted further as follows. Given $d in mathbb{R}^{+}$ choose $m_0, m_n$ such that
$$m_0 = lfloor mu_0 - sigma_0 d rfloor$$
$$m_n = lceil mu_0 + sigma_0 d rceil$$
Acceptance testing of $f(k)$
This general form clearly meets question requirement #4, so for each $f(k)$ I want to evaluate how well the associated distribution meets the question requirements #1, #2 and #3. I am interpreting requirement #2 more specifically as follows:
- It must be symmetric where possible ($2mu in mathbb{Z}$) or at least have skewness $frac{mu_3}{sigma^3} approx 0$
Fixing the naive attempt
With this general form, some improvements to the naive attempt become apparent.
- There is a free parameter $d$. So instead of a conservative $d=3$, choose to broaden the support to at least $d=6$
- The general form preserves the unit sum of a PMF. So having the tails of the normal distribution capped into the endpoints of the PMF by a piecewise function is unnecessary.
- The inputs $mu_0, sigma_0$ may not exactly equal $mu, sigma$, but an appropriate choice of inputs may target the desired $mu, sigma$ correctly.
Discrete Normal PMF ($f_1(k)$)
Summary: (Fix the naive attempt:) Create a PMF over a finite support that is exactly proportional to the continuity correction values obtained from a Normal CDF of the same mean. The variance increases by this approximation, so find a $sigma_0$ that gives the desired $sigma$.
Given the general form, let $f_1(k)$ be defined as follows.
$$f_1(k) = Phi_{mu_0,sigma_0}(k+frac{1}{2})-Phi_{mu_0,sigma_0}(k-frac{1}{2})$$
In testing a range of inputs, I have found the observed variance is too big. Specifically, I find
$${mu, sigma^2, frac{mu_3}{sigma^3}, frac{mu_4}{sigma^4}} approx {mu_0, sigma_0^2+frac{1}{12}, 0, 3}$$
So I require inputs such that $mu_0=mu$ and $sigma_0=sqrt{sigma^2 - frac{1}{12}}$.
The approximation appears to improve as $sigma d to infty$.
Now, does $f_1(k)$ meet the requirements?
- Yes, except variance is not exact (but could be improved by refinements to $sigma_0$)
- Yes (as reposed in this answer)
- Yes (but one Normal is employed to approximate the statistics of another, begging the question, "Which Normal is being approximated?")
And does $f_1(k)$ have any limitations?
- $sigma_0 ge 1$ and $d ge 6$ (so that the variance is approximately correct)
Improving on $f_1(k)$
I mentioned that a continuity correction of the Normal distribution gives us $f_1(k)$, but I might have arrived at the same function a different way. It turns out if I integrate the PDF of $mathcal{N}$, and then approximate the derivative using $h=1$ and the following formula, that I will get the same function.
$$g'(a)approxfrac{g(a+frac{h}{2})-g(a-frac{h}{2})}{h}$$
This reposing begs a question of, "Why not create the PMF directly from the PDF?"
Discrete Normal PMF ($f_2(k)$)
Summary: Create a PMF over a finite support that is exactly in proportion to a Normal PDF and which (in consequence) has statistical measures that match (allowing some margin of error).
Given the general form, let $f_2(k)$ be defined as follows.
$$f_2(k) = e^{frac{-(x-mu_0)^2}{2{sigma_0}^2}}$$
In testing a range of inputs, I have found decent agreement. Specifically, the statistical measures of the PMF are approximate to that of the Normal PDF from which they are derived.
${mu, sigma^2, frac{mu_3}{sigma^3}, frac{mu_4}{sigma^4}} approx {mu_0, sigma_0, 0, 3}$
The approximation appears to improve as $dsigma to infty$.
Now, does $f_2(k)$ meet the requirements?
- Approximately, but not exact (but could be improved by numerical methods)
- Yes (as reposed in this answer)
- Yes (the PMF is in exact proportion to the PDF of the Normal being approximated)
And does $f_2(k)$ have any limitations?
- $sigma_0 ge 1$ and $d ge 6$ (so that the variance is approximately correct)
add a comment |
up vote
0
down vote
Intuitively, two distributions which closely approximate one another would have similar PDF (or PMF).
$$P(X=k)$$
A literal one to one numerical comparison cannot strictly be made between a PMF and a PDF, yet a simple comparison can be made for statistical measures: mean, variance, skewness, and kurtosis.
$$mu, sigma^2, frac{mu_3}{sigma^3}, frac{mu_4}{sigma^4}$$
A given Normal distribution $X sim mathcal{N}(mu, {sigma}^2)$ corresponds to the following measures, respectively.
$$mu, {sigma}^2, 0, 3$$
General form
In exploring possible solutions, I arrived at this general template.
Given $m_0, m_n in mathbb{Z}$, $mu_0 in mathbb{R}$ and $sigma_0 in mathbb{R}^{+}$ as inputs, where $m_0 lt mu lt m_n$.
Construct a function $f(k) ge 0$ defined over a finite support $S$ of size $n+1$.
$$k in S = {m_0, m_1, ldots, m_n}$$
$$ m_i = m_0 + i $$
And then define a discrete distribution with a PMF proportional to $f(k)$ as follows.
$$P(X=k) = frac{f(k)}{C_S}$$
$$C_S = sum_{k in S}f(k)$$
As desired, the inputs may be restricted further as follows. Given $d in mathbb{R}^{+}$ choose $m_0, m_n$ such that
$$m_0 = lfloor mu_0 - sigma_0 d rfloor$$
$$m_n = lceil mu_0 + sigma_0 d rceil$$
Acceptance testing of $f(k)$
This general form clearly meets question requirement #4, so for each $f(k)$ I want to evaluate how well the associated distribution meets the question requirements #1, #2 and #3. I am interpreting requirement #2 more specifically as follows:
- It must be symmetric where possible ($2mu in mathbb{Z}$) or at least have skewness $frac{mu_3}{sigma^3} approx 0$
Fixing the naive attempt
With this general form, some improvements to the naive attempt become apparent.
- There is a free parameter $d$. So instead of a conservative $d=3$, choose to broaden the support to at least $d=6$
- The general form preserves the unit sum of a PMF. So having the tails of the normal distribution capped into the endpoints of the PMF by a piecewise function is unnecessary.
- The inputs $mu_0, sigma_0$ may not exactly equal $mu, sigma$, but an appropriate choice of inputs may target the desired $mu, sigma$ correctly.
Discrete Normal PMF ($f_1(k)$)
Summary: (Fix the naive attempt:) Create a PMF over a finite support that is exactly proportional to the continuity correction values obtained from a Normal CDF of the same mean. The variance increases by this approximation, so find a $sigma_0$ that gives the desired $sigma$.
Given the general form, let $f_1(k)$ be defined as follows.
$$f_1(k) = Phi_{mu_0,sigma_0}(k+frac{1}{2})-Phi_{mu_0,sigma_0}(k-frac{1}{2})$$
In testing a range of inputs, I have found the observed variance is too big. Specifically, I find
$${mu, sigma^2, frac{mu_3}{sigma^3}, frac{mu_4}{sigma^4}} approx {mu_0, sigma_0^2+frac{1}{12}, 0, 3}$$
So I require inputs such that $mu_0=mu$ and $sigma_0=sqrt{sigma^2 - frac{1}{12}}$.
The approximation appears to improve as $sigma d to infty$.
Now, does $f_1(k)$ meet the requirements?
- Yes, except variance is not exact (but could be improved by refinements to $sigma_0$)
- Yes (as reposed in this answer)
- Yes (but one Normal is employed to approximate the statistics of another, begging the question, "Which Normal is being approximated?")
And does $f_1(k)$ have any limitations?
- $sigma_0 ge 1$ and $d ge 6$ (so that the variance is approximately correct)
Improving on $f_1(k)$
I mentioned that a continuity correction of the Normal distribution gives us $f_1(k)$, but I might have arrived at the same function a different way. It turns out if I integrate the PDF of $mathcal{N}$, and then approximate the derivative using $h=1$ and the following formula, that I will get the same function.
$$g'(a)approxfrac{g(a+frac{h}{2})-g(a-frac{h}{2})}{h}$$
This reposing begs a question of, "Why not create the PMF directly from the PDF?"
Discrete Normal PMF ($f_2(k)$)
Summary: Create a PMF over a finite support that is exactly in proportion to a Normal PDF and which (in consequence) has statistical measures that match (allowing some margin of error).
Given the general form, let $f_2(k)$ be defined as follows.
$$f_2(k) = e^{frac{-(x-mu_0)^2}{2{sigma_0}^2}}$$
In testing a range of inputs, I have found decent agreement. Specifically, the statistical measures of the PMF are approximate to that of the Normal PDF from which they are derived.
${mu, sigma^2, frac{mu_3}{sigma^3}, frac{mu_4}{sigma^4}} approx {mu_0, sigma_0, 0, 3}$
The approximation appears to improve as $dsigma to infty$.
Now, does $f_2(k)$ meet the requirements?
- Approximately, but not exact (but could be improved by numerical methods)
- Yes (as reposed in this answer)
- Yes (the PMF is in exact proportion to the PDF of the Normal being approximated)
And does $f_2(k)$ have any limitations?
- $sigma_0 ge 1$ and $d ge 6$ (so that the variance is approximately correct)
add a comment |
up vote
0
down vote
up vote
0
down vote
Intuitively, two distributions which closely approximate one another would have similar PDF (or PMF).
$$P(X=k)$$
A literal one to one numerical comparison cannot strictly be made between a PMF and a PDF, yet a simple comparison can be made for statistical measures: mean, variance, skewness, and kurtosis.
$$mu, sigma^2, frac{mu_3}{sigma^3}, frac{mu_4}{sigma^4}$$
A given Normal distribution $X sim mathcal{N}(mu, {sigma}^2)$ corresponds to the following measures, respectively.
$$mu, {sigma}^2, 0, 3$$
General form
In exploring possible solutions, I arrived at this general template.
Given $m_0, m_n in mathbb{Z}$, $mu_0 in mathbb{R}$ and $sigma_0 in mathbb{R}^{+}$ as inputs, where $m_0 lt mu lt m_n$.
Construct a function $f(k) ge 0$ defined over a finite support $S$ of size $n+1$.
$$k in S = {m_0, m_1, ldots, m_n}$$
$$ m_i = m_0 + i $$
And then define a discrete distribution with a PMF proportional to $f(k)$ as follows.
$$P(X=k) = frac{f(k)}{C_S}$$
$$C_S = sum_{k in S}f(k)$$
As desired, the inputs may be restricted further as follows. Given $d in mathbb{R}^{+}$ choose $m_0, m_n$ such that
$$m_0 = lfloor mu_0 - sigma_0 d rfloor$$
$$m_n = lceil mu_0 + sigma_0 d rceil$$
Acceptance testing of $f(k)$
This general form clearly meets question requirement #4, so for each $f(k)$ I want to evaluate how well the associated distribution meets the question requirements #1, #2 and #3. I am interpreting requirement #2 more specifically as follows:
- It must be symmetric where possible ($2mu in mathbb{Z}$) or at least have skewness $frac{mu_3}{sigma^3} approx 0$
Fixing the naive attempt
With this general form, some improvements to the naive attempt become apparent.
- There is a free parameter $d$. So instead of a conservative $d=3$, choose to broaden the support to at least $d=6$
- The general form preserves the unit sum of a PMF. So having the tails of the normal distribution capped into the endpoints of the PMF by a piecewise function is unnecessary.
- The inputs $mu_0, sigma_0$ may not exactly equal $mu, sigma$, but an appropriate choice of inputs may target the desired $mu, sigma$ correctly.
Discrete Normal PMF ($f_1(k)$)
Summary: (Fix the naive attempt:) Create a PMF over a finite support that is exactly proportional to the continuity correction values obtained from a Normal CDF of the same mean. The variance increases by this approximation, so find a $sigma_0$ that gives the desired $sigma$.
Given the general form, let $f_1(k)$ be defined as follows.
$$f_1(k) = Phi_{mu_0,sigma_0}(k+frac{1}{2})-Phi_{mu_0,sigma_0}(k-frac{1}{2})$$
In testing a range of inputs, I have found the observed variance is too big. Specifically, I find
$${mu, sigma^2, frac{mu_3}{sigma^3}, frac{mu_4}{sigma^4}} approx {mu_0, sigma_0^2+frac{1}{12}, 0, 3}$$
So I require inputs such that $mu_0=mu$ and $sigma_0=sqrt{sigma^2 - frac{1}{12}}$.
The approximation appears to improve as $sigma d to infty$.
Now, does $f_1(k)$ meet the requirements?
- Yes, except variance is not exact (but could be improved by refinements to $sigma_0$)
- Yes (as reposed in this answer)
- Yes (but one Normal is employed to approximate the statistics of another, begging the question, "Which Normal is being approximated?")
And does $f_1(k)$ have any limitations?
- $sigma_0 ge 1$ and $d ge 6$ (so that the variance is approximately correct)
Improving on $f_1(k)$
I mentioned that a continuity correction of the Normal distribution gives us $f_1(k)$, but I might have arrived at the same function a different way. It turns out if I integrate the PDF of $mathcal{N}$, and then approximate the derivative using $h=1$ and the following formula, that I will get the same function.
$$g'(a)approxfrac{g(a+frac{h}{2})-g(a-frac{h}{2})}{h}$$
This reposing begs a question of, "Why not create the PMF directly from the PDF?"
Discrete Normal PMF ($f_2(k)$)
Summary: Create a PMF over a finite support that is exactly in proportion to a Normal PDF and which (in consequence) has statistical measures that match (allowing some margin of error).
Given the general form, let $f_2(k)$ be defined as follows.
$$f_2(k) = e^{frac{-(x-mu_0)^2}{2{sigma_0}^2}}$$
In testing a range of inputs, I have found decent agreement. Specifically, the statistical measures of the PMF are approximate to that of the Normal PDF from which they are derived.
${mu, sigma^2, frac{mu_3}{sigma^3}, frac{mu_4}{sigma^4}} approx {mu_0, sigma_0, 0, 3}$
The approximation appears to improve as $dsigma to infty$.
Now, does $f_2(k)$ meet the requirements?
- Approximately, but not exact (but could be improved by numerical methods)
- Yes (as reposed in this answer)
- Yes (the PMF is in exact proportion to the PDF of the Normal being approximated)
And does $f_2(k)$ have any limitations?
- $sigma_0 ge 1$ and $d ge 6$ (so that the variance is approximately correct)
Intuitively, two distributions which closely approximate one another would have similar PDF (or PMF).
$$P(X=k)$$
A literal one to one numerical comparison cannot strictly be made between a PMF and a PDF, yet a simple comparison can be made for statistical measures: mean, variance, skewness, and kurtosis.
$$mu, sigma^2, frac{mu_3}{sigma^3}, frac{mu_4}{sigma^4}$$
A given Normal distribution $X sim mathcal{N}(mu, {sigma}^2)$ corresponds to the following measures, respectively.
$$mu, {sigma}^2, 0, 3$$
General form
In exploring possible solutions, I arrived at this general template.
Given $m_0, m_n \in \mathbb{Z}$, $\mu_0 \in \mathbb{R}$ and $\sigma_0 \in \mathbb{R}^{+}$ as inputs, where $m_0 < \mu_0 < m_n$,
construct a function $f(k) \ge 0$ defined over a finite support $S$ of size $n+1$.
$$k \in S = \{m_0, m_1, \ldots, m_n\}$$
$$m_i = m_0 + i$$
Then define a discrete distribution with a PMF proportional to $f(k)$ as follows.
$$P(X=k) = \frac{f(k)}{C_S}$$
$$C_S = \sum_{k \in S} f(k)$$
If desired, the inputs may be restricted further as follows. Given $d \in \mathbb{R}^{+}$, choose $m_0, m_n$ such that
$$m_0 = \lfloor \mu_0 - \sigma_0 d \rfloor$$
$$m_n = \lceil \mu_0 + \sigma_0 d \rceil$$
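The general form above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `support`, `normalize` and `measures` are hypothetical, and the triangular weight at the end is a placeholder for $f(k)$, not one of the functions discussed in this answer.

```python
import math

def support(mu0, sigma0, d):
    """Integers from floor(mu0 - sigma0*d) to ceil(mu0 + sigma0*d), inclusive."""
    m0 = math.floor(mu0 - sigma0 * d)
    mn = math.ceil(mu0 + sigma0 * d)
    return list(range(m0, mn + 1))

def normalize(f, S):
    """Return the PMF {k: f(k) / C_S}, where C_S is the sum of f over S."""
    weights = [f(k) for k in S]
    c_s = sum(weights)
    return {k: w / c_s for k, w in zip(S, weights)}

def measures(pmf):
    """Mean, variance, skewness and kurtosis of a PMF given as a dict."""
    mean = sum(k * p for k, p in pmf.items())
    var = sum((k - mean) ** 2 * p for k, p in pmf.items())
    mu3 = sum((k - mean) ** 3 * p for k, p in pmf.items())
    mu4 = sum((k - mean) ** 4 * p for k, p in pmf.items())
    return mean, var, mu3 / var ** 1.5, mu4 / var ** 2

# Placeholder weight (an assumption for illustration): a triangle peaked at 0.
S = support(0.0, 1.0, 3.0)
pmf = normalize(lambda k: 4 - abs(k), S)
mean, var, skew, kurt = measures(pmf)
print(S, mean, var, skew, kurt)
```

Any non-negative $f(k)$ can be passed to `normalize`, which is what lets the same acceptance test be applied to each candidate function.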
Acceptance testing of $f(k)$
This general form clearly meets question requirement #4, so for each $f(k)$ I want to evaluate how well the associated distribution meets the question requirements #1, #2 and #3. I am interpreting requirement #2 more specifically as follows:
- It must be symmetric where possible ($2\mu \in \mathbb{Z}$) or at least have skewness $\frac{\mu_3}{\sigma^3} \approx 0$
Fixing the naive attempt
With this general form, some improvements to the naive attempt become apparent.
- There is a free parameter $d$. So instead of a conservative $d=3$, broaden the support to at least $d=6$.
- The general form preserves the unit sum of a PMF. So there is no need for a piecewise function that folds the tails of the normal distribution into the endpoints of the PMF.
- The inputs $\mu_0, \sigma_0$ need not equal $\mu, \sigma$ exactly; an appropriate choice of inputs can be made to hit the desired $\mu, \sigma$.
Discrete Normal PMF ($f_1(k)$)
Summary (fixing the naive attempt): Create a PMF over a finite support that is exactly proportional to the continuity-corrected values obtained from a Normal CDF with the same mean. This approximation inflates the variance, so choose a $\sigma_0$ that yields the desired $\sigma$.
Given the general form, let $f_1(k)$ be defined as follows.
$$f_1(k) = \Phi_{\mu_0,\sigma_0^2}\left(k+\frac{1}{2}\right)-\Phi_{\mu_0,\sigma_0^2}\left(k-\frac{1}{2}\right)$$
In testing a range of inputs, I found that the observed variance is too large. Specifically, I find
$$\left\{\mu, \sigma^2, \frac{\mu_3}{\sigma^3}, \frac{\mu_4}{\sigma^4}\right\} \approx \left\{\mu_0, \sigma_0^2+\frac{1}{12}, 0, 3\right\}$$
So I require inputs such that $\mu_0=\mu$ and $\sigma_0=\sqrt{\sigma^2 - \frac{1}{12}}$.
The approximation appears to improve as $\sigma d \to \infty$.
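As a rough numerical check of the $\frac{1}{12}$ adjustment, here is a minimal sketch. The target values $\mu=10$, $\sigma=2$, $d=6$ and the helper names `norm_cdf`, `f1_pmf` and `mean_var` are assumptions chosen for illustration; the Normal CDF is computed from `math.erf`.

```python
import math

def norm_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2) evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def f1_pmf(mu0, sigma0, d):
    """PMF proportional to Phi(k + 1/2) - Phi(k - 1/2) on the integer support."""
    m0 = math.floor(mu0 - sigma0 * d)
    mn = math.ceil(mu0 + sigma0 * d)
    S = range(m0, mn + 1)
    w = [norm_cdf(k + 0.5, mu0, sigma0) - norm_cdf(k - 0.5, mu0, sigma0) for k in S]
    c_s = sum(w)
    return {k: wk / c_s for k, wk in zip(S, w)}

def mean_var(pmf):
    """Mean and variance of a PMF given as a dict."""
    mean = sum(k * p for k, p in pmf.items())
    var = sum((k - mean) ** 2 * p for k, p in pmf.items())
    return mean, var

# Assumed targets for illustration only.
target_mu, target_sigma, d = 10.0, 2.0, 6.0
sigma0 = math.sqrt(target_sigma ** 2 - 1.0 / 12.0)  # shrink the input variance by 1/12
mean, var = mean_var(f1_pmf(target_mu, sigma0, d))
print(mean, var, target_sigma ** 2)
```

For these inputs the printed variance should land very close to the target $\sigma^2 = 4$; smaller $\sigma_0$ or $d$ can be expected to widen the gap.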
Now, does $f_1(k)$ meet the requirements?
- #1: Yes, except that the variance is not exact (it could be improved by refining $\sigma_0$)
- #2: Yes (as restated in this answer)
- #3: Yes (but one Normal is employed to approximate the statistics of another, which raises the question, "Which Normal is being approximated?")
And does $f_1(k)$ have any limitations?
- It needs $\sigma_0 \ge 1$ and $d \ge 6$ (so that the variance is approximately correct)
Improving on $f_1(k)$
I mentioned that a continuity correction of the Normal distribution gives $f_1(k)$, but the same function can be reached another way. Integrate the PDF of $\mathcal{N}$ to obtain the CDF, then approximate the derivative of that CDF with the following central-difference formula using $h=1$; the result is the same function.
$$g'(a) \approx \frac{g\left(a+\frac{h}{2}\right)-g\left(a-\frac{h}{2}\right)}{h}$$
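Written out, with $g = \Phi_{\mu_0,\sigma_0^2}$, $a = k$ and $h = 1$, and writing $\varphi_{\mu_0,\sigma_0^2}$ for the Normal PDF (a symbol introduced here only for this step), the central difference reads:

$$\varphi_{\mu_0,\sigma_0^2}(k) = \Phi_{\mu_0,\sigma_0^2}'(k) \approx \Phi_{\mu_0,\sigma_0^2}\left(k+\frac{1}{2}\right)-\Phi_{\mu_0,\sigma_0^2}\left(k-\frac{1}{2}\right) = f_1(k)$$

A Taylor expansion of the right-hand side suggests how close the two are: the leading error term of this central difference is $\frac{h^2}{24}g'''(a)$, so

$$f_1(k) = \varphi_{\mu_0,\sigma_0^2}(k) + \frac{1}{24}\varphi_{\mu_0,\sigma_0^2}''(k) + \cdots$$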
This restatement raises the question, "Why not create the PMF directly from the PDF?"
Discrete Normal PMF ($f_2(k)$)
Summary: Create a PMF over a finite support that is exactly proportional to a Normal PDF and which, as a consequence, has statistical measures that match (within some margin of error).
Given the general form, let $f_2(k)$ be defined as follows.
$$f_2(k) = e^{-\frac{(k-\mu_0)^2}{2\sigma_0^2}}$$
In testing a range of inputs, I found decent agreement. Specifically, the statistical measures of the PMF approximate those of the Normal PDF from which it is derived.
$$\left\{\mu, \sigma^2, \frac{\mu_3}{\sigma^3}, \frac{\mu_4}{\sigma^4}\right\} \approx \left\{\mu_0, \sigma_0^2, 0, 3\right\}$$
The approximation appears to improve as $d\sigma \to \infty$.
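The agreement can be checked numerically with a short sketch. The inputs $\mu_0=10$, $\sigma_0=2$, $d=6$ and the helper names `f2_pmf` and `measures` are assumptions picked for illustration.

```python
import math

def f2_pmf(mu0, sigma0, d):
    """PMF proportional to exp(-(k - mu0)^2 / (2 sigma0^2)) on the integer support."""
    m0 = math.floor(mu0 - sigma0 * d)
    mn = math.ceil(mu0 + sigma0 * d)
    S = list(range(m0, mn + 1))
    w = [math.exp(-((k - mu0) ** 2) / (2.0 * sigma0 ** 2)) for k in S]
    c_s = sum(w)
    return {k: wk / c_s for k, wk in zip(S, w)}

def measures(pmf):
    """Mean, variance, skewness and kurtosis of a PMF given as a dict."""
    mean = sum(k * p for k, p in pmf.items())
    var = sum((k - mean) ** 2 * p for k, p in pmf.items())
    mu3 = sum((k - mean) ** 3 * p for k, p in pmf.items())
    mu4 = sum((k - mean) ** 4 * p for k, p in pmf.items())
    return mean, var, mu3 / var ** 1.5, mu4 / var ** 2

# Assumed inputs for illustration only.
mean, var, skew, kurt = measures(f2_pmf(10.0, 2.0, 6.0))
print(mean, var, skew, kurt)
```

Lowering $d$ or $\sigma_0$ in the last call is a quick way to see where the approximation starts to drift.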
Now, does $f_2(k)$ meet the requirements?
- #1: Approximately, but not exactly (it could be improved by numerical methods)
- #2: Yes (as restated in this answer)
- #3: Yes (the PMF is in exact proportion to the PDF of the Normal being approximated)
And does $f_2(k)$ have any limitations?
- It needs $\sigma_0 \ge 1$ and $d \ge 6$ (so that the variance is approximately correct)
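One possible numerical refinement, which is an assumption of this sketch and not something tested in the answer, is to hold the support fixed and solve for $\sigma_0$ by bisection so that the variance of the $f_2$ PMF matches $\sigma^2$ to floating-point precision. On a fixed support that is symmetric about $\mu$ the mean stays at $\mu$ and the variance grows with $\sigma_0$, which is what makes bisection applicable. The names `f2_pmf`, `variance` and `solve_sigma0`, the bracket $[0.5\sigma, 2\sigma]$, and the deliberately narrow $d=3$ are all illustrative choices.

```python
import math

def f2_pmf(S, mu0, sigma0):
    """Probabilities proportional to exp(-(k - mu0)^2 / (2 sigma0^2)) for k in S."""
    w = [math.exp(-((k - mu0) ** 2) / (2.0 * sigma0 ** 2)) for k in S]
    c_s = sum(w)
    return [wk / c_s for wk in w]

def variance(S, p):
    """Variance of the distribution with support S and probabilities p."""
    mean = sum(k * pk for k, pk in zip(S, p))
    return sum((k - mean) ** 2 * pk for k, pk in zip(S, p))

def solve_sigma0(S, mu, sigma, iters=200):
    """Bisect on sigma0 until the PMF variance on the fixed support S equals sigma^2."""
    target = sigma ** 2
    lo, hi = 0.5 * sigma, 2.0 * sigma  # assumed bracket; checked on the next lines
    if not (variance(S, f2_pmf(S, mu, lo)) < target < variance(S, f2_pmf(S, mu, hi))):
        raise ValueError("target variance is not bracketed on this support")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if variance(S, f2_pmf(S, mu, mid)) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Assumed targets; d = 3 is chosen so that truncation visibly matters.
mu, sigma, d = 0.0, 1.0, 3.0
S = list(range(math.floor(mu - sigma * d), math.ceil(mu + sigma * d) + 1))
sigma0 = solve_sigma0(S, mu, sigma)
p = f2_pmf(S, mu, sigma0)
print(sigma0, variance(S, p))
```

The same loop could be pointed at $f_1$ by swapping the weight function. When the support is not symmetric about $\mu$, the mean would have to be solved for as well.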
edited Aug 18 '17 at 3:29
answered Aug 18 '17 at 3:21
pvtrs
Do you mean it has finite support? (i.e. $f(k)=0$ for $|k| \geq K$ for some $K>0$)
– Dan
Sep 5 '14 at 8:59
Indeed, I'll clarify this.
– Bach
Sep 5 '14 at 9:00
If you wish to change the variance for a specified support you will have to change the shape of your distribution away from a Gaussian over that interval. An easy way (but almost certainly not what you're after) is to add two delta functions with suitable coefficients to each end of the support.
– Dan
Sep 5 '14 at 9:05
Your requirements cannot be met for arbitrary mean $\mu$. This is because a symmetric approximating distribution with support $\{a_1, \ldots, a_n\}$ will have a mean that's necessarily either $a_{(n+1)/2}$ (odd $n$) or $\frac{1}{2}(a_{n/2} + a_{n/2 + 1})$ (even $n$); e.g., if the support is a set of consecutive integers, then the mean must be a multiple of $\frac{1}{2}$.
– r.e.s.
Sep 7 '14 at 16:59
(cont'd) In fact, your "naive solution" is not symmetric, and yet does not exactly match an arbitrary mean. For example, with $\mu = 4.9$, $\sigma = 1$, and support $\{\lfloor \mu - 3\sigma \rfloor, \ldots, \lceil \mu + 3\sigma \rceil\} = \{1, \ldots, 8\}$, it gives $f(1)=0.0003369\ldots \ne f(8)=0.004661\ldots$ (not symmetric) and mean $=4.8998\ldots \ne 4.9$ (not an exact match).
– r.e.s.
Sep 7 '14 at 16:59
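The counterexample in the comment above can be reproduced with a small sketch of the naive piecewise pmf from the question. The helper names `norm_cdf` and `naive_pmf` are hypothetical, and the CDF is computed from `math.erf`.

```python
import math

def norm_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2) evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def naive_pmf(mu, sigma, d=3):
    """Naive pmf from the question: the tails are folded into the two endpoints."""
    lo = math.floor(mu - d * sigma)
    hi = math.ceil(mu + d * sigma)
    pmf = {}
    for k in range(lo, hi + 1):
        upper = 1.0 if k == hi else norm_cdf(k + 0.5, mu, sigma)
        lower = 0.0 if k == lo else norm_cdf(k - 0.5, mu, sigma)
        pmf[k] = upper - lower
    return pmf

pmf = naive_pmf(4.9, 1.0)
mean = sum(k * p for k, p in pmf.items())
print(min(pmf), max(pmf), pmf[1], pmf[8], mean)
```

The two endpoint probabilities differ by roughly an order of magnitude, and the mean falls slightly below $4.9$, in line with the figures quoted in the comment.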