The derivation of the Wald interval












I'm asking about the binomial proportion confidence interval, also known as the Wald interval.

Recall that
$$\lim_{n \to \infty} P_p\left( -z_{1-\frac{\alpha}{2}}\frac{\sigma}{\sqrt{n}}+\bar{X}_n \leq p \leq z_{1-\frac{\alpha}{2}}\frac{\sigma}{\sqrt{n}}+\bar{X}_n \right) = 1-\alpha,$$
with $\sigma = \sqrt{p(1-p)}$.

Starting from the expression above, and the fact that for $\hat{p}=\dfrac{\sum X_i}{n}$, $\hat{\sigma}=\sqrt{\hat{p}(1-\hat{p})}$ is consistent for $\sigma$ ($X_i \sim \mathrm{Bin}(1,p)$), what argument can I use to show that
$$\left[-z_{1-\frac{\alpha}{2}}\frac{\hat{\sigma}}{\sqrt{n}}+\bar{X}_n ,\; z_{1-\frac{\alpha}{2}}\frac{\hat{\sigma}}{\sqrt{n}}+\bar{X}_n\right]$$
is a confidence interval?












  • Thanks for the edit.
    – kcesc04
    Sep 23 '15 at 16:09

  • I've edited your question a bit more to tidy it up, but I'm not sure exactly what you're asking. Technically speaking, any interval is a confidence interval; some just have a higher confidence level than others. Are you asking how to show that the given interval matches the specified error level $\alpha$?
    – Ilmari Karonen
    Sep 23 '15 at 17:29

  • No, by this definition of a CI: $\lim_{n \to \infty} P_p\left( -z_{1-\frac{\alpha}{2}}\frac{\sigma}{\sqrt{n}}+\bar{X}_n \leq p \leq z_{1-\frac{\alpha}{2}}\frac{\sigma}{\sqrt{n}}+\bar{X}_n \right) = 1-\alpha$
    – kcesc04
    Sep 23 '15 at 20:48

  • knowing that the variance of the binomial depends on $p$
    – kcesc04
    Sep 23 '15 at 20:49























statistics statistical-inference














edited Sep 23 '15 at 17:31
Ilmari Karonen

asked Sep 23 '15 at 15:51
kcesc04






















1 Answer












$begingroup$

The Wald confidence interval for binomial success probability $p$ depends on two approximations.



(1) That $Z = \frac{\hat p - p}{\sqrt{p(1-p)/n}}$ is approximately standard normal, $\mathsf{Norm}(0, 1)$. Thus one would have
$P(-1.96 < Z < 1.96) \approx 0.95.$ This is a good approximation
if $n$ is large and $p$ is not too far from $1/2.$
[A common rule of thumb is that $np$ and $n(1-p)$ should both
exceed 5.]



From there, simple algebra gives
$$P\left(\hat p - 1.96\sqrt{p(1-p)/n} < p < \hat p + 1.96\sqrt{p(1-p)/n} \right) \approx 0.95.$$
This is promising because $p$ is 'isolated' (after a fashion) between two 'bounds', but not useful in practice for making a confidence interval because $\sqrt{p(1-p)/n}$ is unknown.



(2) This leads to the second assumption, that if $n$ is sufficiently
large, then $\hat p$ will be sufficiently close to $p$ that we can
write

$$P\left(\hat p - 1.96\sqrt{\hat p(1-\hat p)/n} < p < \hat p + 1.96\sqrt{\hat p(1- \hat p)/n} \right) \approx 0.95.$$
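For the asymptotic statement in the question, step (2) can be made precise with Slutsky's theorem. What follows is only a sketch under the assumption $0 < p < 1$ (so that $\sigma > 0$), using the central limit theorem and the consistency of $\hat\sigma = \sqrt{\hat p(1-\hat p)}$ already stated in the question; the arrows $\xrightarrow{d}$ and $\xrightarrow{P}$ denote convergence in distribution and in probability.

$$\begin{aligned}
\frac{\sqrt{n}\,(\hat p - p)}{\sigma} &\xrightarrow{d} \mathsf{Norm}(0,1) && \text{(central limit theorem)} \\
\frac{\sigma}{\hat\sigma} &\xrightarrow{P} 1 && \text{(consistency of } \hat\sigma,\ \sigma > 0\text{)} \\
\frac{\sqrt{n}\,(\hat p - p)}{\hat\sigma} = \frac{\sigma}{\hat\sigma}\cdot\frac{\sqrt{n}\,(\hat p - p)}{\sigma} &\xrightarrow{d} \mathsf{Norm}(0,1) && \text{(Slutsky's theorem)} \\
\lim_{n\to\infty} P_p\left( \left|\frac{\sqrt{n}\,(\hat p - p)}{\hat\sigma}\right| \le z_{1-\frac{\alpha}{2}} \right) &= 1-\alpha .
\end{aligned}$$

The event in the last line is the same as $\hat p - z_{1-\frac{\alpha}{2}}\hat\sigma/\sqrt{n} \le p \le \hat p + z_{1-\frac{\alpha}{2}}\hat\sigma/\sqrt{n}$, so the interval with $\hat\sigma$ in place of $\sigma$ has asymptotic coverage $1-\alpha$.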



Thus an approximate 95% confidence interval for $p$ is
of the form $\hat p \pm 1.96\sqrt{\hat p(1- \hat p)/n}.$
Other confidence levels work the same way, with the appropriate
quantile from standard normal tables replacing 1.96. (For
example, 1.645 for a 90% CI and 2.576 for a 99% CI.)
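As a concrete illustration of the formula, here is a minimal Python sketch. The function name `wald_interval` and its parameters are hypothetical, chosen only for this example; the normal quantile comes from the standard library's `statistics.NormalDist` rather than from tables.

```python
import math
from statistics import NormalDist


def wald_interval(successes, n, conf_level=0.95):
    """Wald interval p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n).

    Hypothetical helper for illustration; not taken from any library.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n
    # Two-sided normal quantile: about 1.96 for 95%, 1.645 for 90%, 2.576 for 99%.
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


lo, hi = wald_interval(40, 100)
print(f"95% Wald CI for 40 successes in 100 trials: ({lo:.3f}, {hi:.3f})")
```

For 40 successes in 100 trials this gives roughly (0.304, 0.496). Note that with 0 or $n$ successes the estimated standard error is zero and the interval collapses to a single point.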



Notes:
Unfortunately, as shown by intensive computations for various values of
$n$ and $p,$ the actual 'coverage probability'
of the Wald interval with 1.96 can be far from 95% (and what is worse,
often far below 95%). The same holds for other 'target'
confidence levels. (A key reference is Brown, Cai, and DasGupta, 2001.)
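To make 'coverage probability' concrete: for fixed $n$ and $p$ it is the total binomial probability of those outcomes $x$ whose interval contains $p$. The notation $L(x)$, $U(x)$ for the interval endpoints and $C_n(p)$ for the coverage is introduced here only for illustration.

$$C_n(p) = \sum_{x=0}^{n} \binom{n}{x}\, p^x (1-p)^{n-x}\; \mathbf{1}\{L(x) \le p \le U(x)\}, \qquad L(x),\,U(x) = \frac{x}{n} \mp 1.96\sqrt{\frac{\frac{x}{n}\left(1-\frac{x}{n}\right)}{n}} .$$

As a small worked case, take $n = 10$ and $p = 1/2$. The Wald interval contains $1/2$ only for $x = 3, \dots, 7$, so

$$C_{10}(1/2) = \frac{120 + 210 + 252 + 210 + 120}{2^{10}} = \frac{912}{1024} \approx 0.891,$$

well short of the nominal 0.95.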



If $n$ is several hundred or several thousand (as in a public opinion poll
or a large-scale
simulation), the Wald interval is tolerably accurate.
Otherwise, for a 95% CI with smaller $n$, a considerable
improvement is to add two artificial successes
and two artificial failures to the data before computing $\hat p$ and $n$;
that is, with $x$ observed successes, use $(x+2)/(n+4)$ and $n+4$ in the Wald formula.
This adjustment (due to Agresti and Coull, 1998) is now widely
used instead of the Wald interval. (See Wikipedia.)
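A minimal Python sketch of this 'two extra successes, two extra failures' adjustment follows. The name `plus_four_interval` is hypothetical, and the multiplier is fixed at 1.96 because the adjustment as described here is specific to the 95% level.

```python
import math

Z_95 = 1.96  # normal multiplier for a 95% interval, as in the text


def plus_four_interval(successes, n):
    """Wald formula applied after adding 2 successes and 2 failures.

    Hypothetical helper for illustration; not taken from any library.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    n_adj = n + 4
    p_adj = (successes + 2) / n_adj
    half_width = Z_95 * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - half_width, p_adj + half_width


lo, hi = plus_four_interval(0, 10)
print(f"0 successes in 10 trials: ({lo:.3f}, {hi:.3f})")
```

With 0 successes in 10 trials the result is roughly (-0.040, 0.326), whereas the unadjusted Wald interval collapses to the single point 0. The endpoints are not clipped in this sketch; in practice a negative lower bound is usually reported as 0.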



The Wilson interval (again, Wikipedia) results from squaring the inequality
$-1.96 < Z < 1.96$ and solving the resulting quadratic inequality to (truly) isolate $p$, without making assumption (2). Replacing 1.96 by 2 in the
95% Wilson CI gives nearly the same result as the simpler Agresti-Coull interval.
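The closed form obtained from that quadratic can be sketched in Python as below. The name `wilson_interval` is hypothetical; the formula is the standard Wilson score interval, with centre $(\hat p + z^2/2n)/(1 + z^2/n)$.

```python
import math
from statistics import NormalDist


def wilson_interval(successes, n, conf_level=0.95):
    """Wilson score interval: the set of p with |Z| <= z, solved for p.

    Hypothetical helper for illustration; not taken from any library.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)
    )
    return center - half_width, center + half_width


lo, hi = wilson_interval(40, 100)
print(f"95% Wilson CI for 40 successes in 100 trials: ({lo:.3f}, {hi:.3f})")
```

For 40 successes in 100 trials this gives roughly (0.309, 0.498). With $z = 2$ the centre becomes $(x+2)/(n+4)$, which is exactly the adjusted proportion used by the Agresti-Coull interval.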



The plots below show actual coverage probabilities
of Wald and Agresti-Coull "95%" CIs for 2000 values of $p$
between 0 and 1 for $n = 100$. The rapid oscillation of
coverage probabilities for even small changes in $p$ is due
to the discreteness of the binomial distribution.

[Figure: actual coverage probability versus $p$ for Wald and Agresti-Coull 95% CIs, $n = 100$.]
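Coverage values of this kind can be recomputed exactly, without simulation, by summing binomial probabilities over the outcomes whose interval contains $p$. The sketch below does so for $n = 100$; the grid of 2000 midpoints in $(0, 1)$, the fixed multiplier 1.96, and all function names are assumptions made for this example, not the code behind the original plots.

```python
import math

Z_95 = 1.96  # multiplier for a nominal 95% interval


def wald(x, n):
    p_hat = x / n
    half = Z_95 * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def agresti_coull(x, n):
    # Add two successes and two failures, then apply the Wald formula.
    return wald(x + 2, n + 4)


def coverage(intervals, n, p):
    """Exact probability that the interval for X ~ Binomial(n, p) contains p."""
    total = 0.0
    for x, (lo, hi) in enumerate(intervals):
        if lo <= p <= hi:
            total += math.comb(n, x) * p**x * (1 - p) ** (n - x)
    return total


n = 100
# Assumed grid: 2000 equally spaced midpoints, which avoids p = 0 and p = 1.
grid = [(i + 0.5) / 2000 for i in range(2000)]

# The intervals depend only on x, so compute them once per method.
wald_ints = [wald(x, n) for x in range(n + 1)]
ac_ints = [agresti_coull(x, n) for x in range(n + 1)]

wald_cov = [coverage(wald_ints, n, p) for p in grid]
ac_cov = [coverage(ac_ints, n, p) for p in grid]

print(f"minimum Wald coverage on the grid:          {min(wald_cov):.3f}")
print(f"minimum Agresti-Coull coverage on the grid: {min(ac_cov):.3f}")
```

Plotting `wald_cov` and `ac_cov` against `grid` shows the sawtooth pattern described above; the Wald coverage falls far below 0.95 for $p$ near 0 or 1, while the Agresti-Coull coverage stays much closer to the nominal level.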

















$endgroup$





















edited Oct 3 '15 at 6:45

answered Oct 3 '15 at 6:00
BruceET





























