How do you estimate the mean (average) of a histogram? [closed]












0












$begingroup$


I am having trouble finding tutorials on this topic. I understand that a mean read off a histogram is only an estimate, but is there some sort of formula or process for obtaining it?





















$endgroup$



closed as off-topic by JMoravitz, Xander Henderson, heropup, Clarinetist, Leucippus Jan 14 at 4:58


This question appears to be off-topic. The users who voted to close gave this specific reason:


  • "This question is missing context or other details: Please provide additional context, which ideally explains why the question is relevant to you and our community. Some forms of context include: background and motivation, relevant definitions, source, possible strategies, your current progress, why the question is interesting or important, etc." – JMoravitz, Xander Henderson, heropup, Clarinetist, Leucippus

If this question can be reworded to fit the rules in the help center, please edit the question.
















  • $begingroup$
    The same way you estimate them if the values were displayed normally rather than as a histogram. For mean and median you just get a feel for it and try to guesstimate a number "somewhere in the middle." For mode, that is simply the most frequently occurring number. Be aware though that unless the values are labeled in a histogram, mode can be quite tricky. Imagine a histogram with the values $4,5,6,100,200,200$. It might look like there are three fives at a glance rather than the three small numbers all being different which would make your guess at a mode incorrect.
    $endgroup$
    – JMoravitz
    Jan 13 at 23:51










  • $begingroup$
    cs.uni.edu/~campbell/stat/histrev2.html
    $endgroup$
    – D.B.
    Jan 14 at 0:14
















statistics














edited Jan 14 at 0:23 by 9766Joe

asked Jan 13 at 23:43 by 9766Joe














2 Answers


















0












$begingroup$

Taking an example from Wikipedia's histogram article, you might see something like the following histogram

[Figure: histogram]

and you might guess a mean around $25$ (between the fifth and sixth bins), which has most of the data to the left, counterbalanced by the more extreme values to the right.



If you want a more precise figure, you can look at the location and dimensions of each bin. In this example, Wikipedia actually gives numbers, so let's use those:



Bin-left   Bin-width   Bin-height
       0           5          836
       5           5         2737
      10           5         3723
      15           5         3926
      20           5         3596
      25           5         1438
      30           5         3273
      35           5          642
      40           5          824
      45          15          613
      60          30          215
      90          60           57


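To find the mean (the average moment, or leverage, about $0$), you need the area of each bin (width times height) and the midpoint of each bin; multiplying the two together gives the bin's leverage. The heights appear to be rounded, so a listed area can differ slightly from width times height (for example, $5 \times 2737 = 13685$, whereas the area is listed as $13687$).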



Bin-left   Bin-width   Bin-height   Bin-area   Bin-midpoint   Bin-leverage
       0           5          836       4180            2.5        10450.0
       5           5         2737      13687            7.5       102652.5
      10           5         3723      18618           12.5       232725.0
      15           5         3926      19634           17.5       343595.0
      20           5         3596      17981           22.5       404572.5
      25           5         1438       7190           27.5       197725.0
      30           5         3273      16369           32.5       531992.5
      35           5          642       3212           37.5       120450.0
      40           5          824       4122           42.5       175185.0
      45          15          613       9200           52.5       483000.0
      60          30          215       6461           75.0       484575.0
      90          60           57       3435          120.0       412200.0
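In symbols, the calculation in this table can be written as follows. The notation is introduced here for illustration and is not part of the original answer: $A_i$ is the area of bin $i$, $m_i$ is its midpoint, and the last column is the product $A_i m_i$.

```latex
\[
  \bar{x} \;\approx\; \frac{\sum_i A_i \, m_i}{\sum_i A_i},
  \qquad
  m_i = \text{left}_i + \tfrac{1}{2}\,\text{width}_i .
\]
```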


Adding up the areas gives $124089$ and adding up the leverages gives $3499122.5$; dividing the latter by the former gives a mean of about $28.2$.
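The same arithmetic can be checked with a few lines of Python. This is only a sketch: the name `histogram_mean` and the `(left, width, area)` layout are chosen here for illustration, and the numbers are copied from the table above. It relies on the same simplification as the table, namely that each bin's mass sits at its midpoint.

```python
# Bins copied from the table above as (left edge, width, area).
# The area of a bin is proportional to the number of observations in it.
bins = [
    (0, 5, 4180),
    (5, 5, 13687),
    (10, 5, 18618),
    (15, 5, 19634),
    (20, 5, 17981),
    (25, 5, 7190),
    (30, 5, 16369),
    (35, 5, 3212),
    (40, 5, 4122),
    (45, 15, 9200),
    (60, 30, 6461),
    (90, 60, 3435),
]


def histogram_mean(bins):
    """Estimate the mean, treating each bin as if its mass sat at its midpoint."""
    total_area = sum(area for _, _, area in bins)
    # Leverage of a bin = area * midpoint, with midpoint = left + width / 2.
    total_leverage = sum(area * (left + width / 2) for left, width, area in bins)
    return total_leverage / total_area


estimate = histogram_mean(bins)
print(round(estimate, 1))  # 28.2
```

Working from areas rather than heights keeps the unequal bin widths from distorting the result: the three wide bins on the right are low, but they still carry a noticeable share of the total.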



This might be a slight overestimate, (a) because people tend to answer survey questions with round numbers, so more of the data lies to the left than to the right within these bins, and (b) because among the extreme values on the right, smaller values may be more likely than larger ones, i.e. the bins may not actually be rectangles. Even ignoring those points, this calculated mean and the original guess of $25$ are not far apart.















$endgroup$





















    1












    $begingroup$

    Suppose (for example) that your histogram shows the weights of people in kilograms. The histogram has 3 columns -- one for people who weigh 50 to 60 kg, one for people who weigh 60 to 70 kg, and one for people who weigh 70 to 80 kg. Suppose the first column has height 2, the second column has height 3, and the third has height 1. In summary we have



    Weight 50 to 60  --  column height = 2
    Weight 60 to 70  --  column height = 3
    Weight 70 to 80  --  column height = 1


    If we don't have any further info, it's reasonable to assume that the two people in the 50-to-60 category both weigh 55 kg (the average of 50 and 60).



    Continuing this approach, we assume that our 6 people have weights 55, 55, 65, 65, 65, 75.



    I expect you know how to compute the mean of those 6 numbers.
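For completeness, here is a minimal Python sketch of that last step. The `(lower, upper, count)` layout and the variable names are assumptions made for illustration; the column edges and heights are the ones from the example above.

```python
from statistics import mean

# (lower edge, upper edge, count) for each column of the example histogram.
columns = [(50, 60, 2), (60, 70, 3), (70, 80, 1)]

# Pretend every person in a column weighs exactly the column's midpoint.
weights = [(lo + hi) / 2 for lo, hi, count in columns for _ in range(count)]

print(weights)                  # [55.0, 55.0, 65.0, 65.0, 65.0, 75.0]
print(round(mean(weights), 2))  # 63.33
```

Expanding the histogram into a list like this is only practical for small counts. For large counts, the equivalent weighted sum (count times midpoint, divided by the total count) gives the same number without building the list.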

















    $endgroup$













    • $begingroup$
      Thank you for the answer!
      $endgroup$
      – 9766Joe
      Jan 14 at 0:44


















First answer (the Wikipedia histogram example): answered Jan 14 at 1:22 by Henry























Second answer (the weights example): answered Jan 14 at 0:42 by bubba; edited Jan 14 at 0:45














