Variance of alternate flipping rounds























I did the following exercise, but I would like to extend the question to the variance of the variate.




Bob and Bub each has his own coin. The chance of coming up "heads" is $\rho$ for Bob's coin and $\tau$ for Bub's. They flip alternately: first Bob, then Bub, then Bob again, etc. Let Bob's flip followed by Bub's flip constitute a round, and let $R$ denote the number of rounds until each gets "heads" at least once. For $\rho = 1/3$, $\tau = 2/5$, what is the expectation of $R$?




The general answer for the expectation is:

$$\mathbb{E}[R]=\frac{1 + \frac{\rho}{\tau} + \frac{\tau}{\rho} - (\rho + \tau)}{\rho + \tau - \rho \, \tau}$$



This agrees with a Monte Carlo simulation I ran (with $10^5$ repeats), which approximates the expectation and variance (for $\rho = 1/3$ and $\tau = 2/5$) as $3.84647$ and $6.48666$ respectively.
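A simulation of this kind could look roughly like the sketch below. It is not the asker's code: the function names `simulate_rounds` and `estimate_mean_var`, the seed, and the use of Python's standard `random` module are all assumptions made for illustration.

```python
import random


def simulate_rounds(rho, tau, rng):
    """Play one game; return the number of rounds until both players have seen a head."""
    bob_done = bub_done = False
    rounds = 0
    while not (bob_done and bub_done):
        rounds += 1
        # A player who already has a head needs no further flips for the purpose of R,
        # so the short-circuit `or` skips the draw in that case.
        bob_done = bob_done or rng.random() < rho
        bub_done = bub_done or rng.random() < tau
    return rounds


def estimate_mean_var(rho, tau, repeats=100_000, seed=0):
    """Return the sample mean and the unbiased sample variance of R."""
    rng = random.Random(seed)
    samples = [simulate_rounds(rho, tau, rng) for _ in range(repeats)]
    mean = sum(samples) / repeats
    var = sum((s - mean) ** 2 for s in samples) / (repeats - 1)
    return mean, var


if __name__ == "__main__":
    mean, var = estimate_mean_var(1 / 3, 2 / 5)
    print(mean, var)
```

With $10^5$ repeats the estimates fluctuate in the second decimal place from seed to seed, so agreement with a formula to about two digits is the most one should expect.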



Is anybody able to calculate the variance symbolically?


































  • Out of curiosity, where does your expected value formula come from?
    – Remy
    Dec 1 '17 at 0:23










  • @Remy I derived it. In a nutshell, $R$ is distributed geometrically with parameter $p = \rho + \tau - \rho \, \tau$ (probability of success of either one of them) plus a geometric variate with parameter $p = \rho$ and weighted with probability $\mathbb{P}\{\text{Bob didn't have success, but Bub did} \, | \, \text{either had}\}$ plus a similar variate for the reverse case.
    – BoLe
    Dec 2 '17 at 12:51

















probability expectation variance














edited Mar 17 at 19:56

























asked Nov 30 '17 at 14:48









BoLe

























1 Answer



accepted










$\large\textbf{Outline}$



Setup the Notations : As titled.



Solution.1 : Direct application of the conditional decomposition of expectation. This is the foolproof approach if one wants a quick numeric evaluation and doesn't want to be bothered with analysis.



Solution.2 : A framework that provides perspectives and better calculation.



Appendix.A : Supplementary material to Solution.1.



Appendix.B.1 : In-site links to existing questions that are closely related.



Appendix.B.2 : Supplementary material to Solution.2.





$\large\textbf{Setup the Notations}$



Let $X$ be the number of flips Bob needs to get his first head (success).

Let $Y$ be the same for Bub. We have $X\sim \mathrm{Geo}[\rho]$ independent of $Y\sim \mathrm{Geo}[\tau]$.



The following basics for a Geometric distribution will be useful here:
\begin{align*}
\mu_{_X} &\equiv \mathbb{E}[X] = \frac1{\rho} & & \tag*{Eq.(1)} \label{Eq01} \\
A_X &\equiv \mathbb{E}[X(X+1)] = \frac2{ \rho^2 } & &\tag*{Eq.(2)} \label{Eq02} \\
S_X &\equiv \mathbb{E}[X^2 ] = \frac{2 - \rho}{ \rho^2 } & &\tag*{Eq.(3)} \label{Eq03} \\
V_X &\equiv \mathbb{V}[X] = \frac{1 - \rho}{ \rho^2 } & &\tag*{Eq.(4)} \label{Eq04} \\
Q_X &\equiv \mathbb{E}[(X+1)^2] = A_X + \mu_{_X} + 1
= \frac{ 2 + \rho + \rho^2 }{ \rho^2 } & &\tag*{Eq.(5)} \label{Eq05}
\end{align*}

Recall that we often use $A_X$ to obtain $S_X$ (via $S_X = A_X - \mu_{_X}$) because $A_X$ is easier to derive (it is a more natural quantity for the Geometric distribution).

The shorthands for $Y$ are analogous: $\mu_{_Y}$, $V_Y$, etc.
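As a quick sanity check of Eq.(1) to Eq.(5), the closed forms can be compared against truncated series. This is only a sketch: the names `geometric_basics` and `truncated_expectation` and the cutoff of 500 terms are illustrative choices, not part of the original answer.

```python
from fractions import Fraction


def geometric_basics(p):
    """Closed forms Eq.(1) to Eq.(5) for X ~ Geo[p] supported on {1, 2, ...}."""
    p = Fraction(p)
    mean = 1 / p                    # Eq.(1): E[X]
    a = 2 / p**2                    # Eq.(2): E[X(X+1)]
    s = (2 - p) / p**2              # Eq.(3): E[X^2]
    v = (1 - p) / p**2              # Eq.(4): variance
    q = (2 + p + p**2) / p**2       # Eq.(5): E[(X+1)^2]
    return mean, a, s, v, q


def truncated_expectation(p, f, terms=500):
    """Approximate E[f(X)] by summing f(k) * Pr{X = k} for k = 1..terms."""
    p = float(p)
    return sum(f(k) * p * (1 - p) ** (k - 1) for k in range(1, terms + 1))


if __name__ == "__main__":
    p = Fraction(1, 3)
    print(geometric_basics(p))
    print(truncated_expectation(p, lambda k: k * k))         # compare with Eq.(3)
    print(truncated_expectation(p, lambda k: (k + 1) ** 2))  # compare with Eq.(5)
```

For $\rho = 1/3$ the closed forms evaluate to $3$, $18$, $15$, $6$, and $22$, and the truncated sums agree with Eq.(3) and Eq.(5) to floating-point accuracy.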





$\large\textbf{Solution.1}$



Just as the expectation can be derived (which you apparently know how to do), the 2nd moment (and thus the variance) can be obtained by conditioning on the result of the first round. Denote the event $\{ \text{Bob head, Bub tail} \}$ as just $HT$ (and similarly $HH$, $TT$, $TH$), and recall that $X$ is for Bob flipping alone as if Bub doesn't exist.
\begin{align*}
\mathbb{E}[ R^2 ] &= \rho \tau \, \mathbb{E}\left[ R^2 \,\middle|~ HH\right]
+ (1 - \rho)(1 - \tau)\,\mathbb{E}\left[ R^2 \,\middle|~ TT\right] \\
&\hspace{36pt} + \rho (1 - \tau) \, \mathbb{E}\left[ R^2 \,\middle|~ HT\right]
+ \tau (1 - \rho)\,\mathbb{E}\left[ R^2 \,\middle|~TH\right] \\
&= \rho \tau + (1 - \rho)(1 - \tau)\,\mathbb{E} \left[ (1+R)^2 \right] + \rho (1 - \tau) \, \mathbb{E}\left[ (1+Y)^2 \right] + \tau (1 - \rho)\,\mathbb{E}\left[ (1+X)^2 \right]
\end{align*}

Please let me know if you need justification for $\mathbb{E}[ R^2 ~|~~HT] = \mathbb{E}[ (1+Y)^2 ]$ and the like (after $HT$ Bob is done, so by memorylessness the remaining number of rounds is distributed as $Y$). Moving on, use the \ref{Eq05} shorthand $Q_X$ and $Q_Y$ for now and rearrange.
$$ \mathbb{E}[ R^2 ] = \rho \tau + (1 - \rho)(1 - \tau) \left( \mathbb{E}[ R^2 ] + 2 \mathbb{E}[ R ] + 1 \right) + \rho (1 - \tau) Q_Y + \tau (1 - \rho)\,Q_X$$
Denote $\lambda = 1 - (1 - \rho)(1 - \tau) = \rho + \tau - \rho \tau$, and collect $\mathbb{E}[ R^2 ]$ on the left.
\begin{equation*}
\lambda\,\mathbb{E}[ R^2 ] = \rho \tau + (1 - \lambda) \left( 2 \mathbb{E}[ R ] + 1 \right) + \rho (1 - \tau) Q_Y + \tau (1 - \rho)\,Q_X \tag*{Eq.(6)} \label{Eq06}
\end{equation*}

Note that the denominator of $\mathbb{E}[ R ]$ is just $\lambda$. Along with the symmetry, this suggests that the numerator of $\mathbb{E}[ R ]$ can be rewritten in a better form invoking the basic \ref{Eq01}.
\begin{align*}
1 + \frac{ \rho }{ \tau } + \frac{ \tau }{ \rho } - (\rho + \tau) &= \bigl( 1 + \frac{ \rho }{ \tau } - \rho \bigr)
+ \bigl( 1 + \frac{ \tau }{ \rho } - \tau \bigr) - 1 \\
&= \frac{ \tau + \rho - \rho\tau }{ \tau } + \frac{ \rho + \tau - \rho\tau }{ \rho } - 1 \\
\implies \mathbb{E}[ R ] = \frac1{\rho} + \frac1{\tau} &- \frac1{\lambda} = \mu_{_X} + \mu_{_Y} - \frac1{\lambda} \tag*{Eq.(7)} \label{Eq07}
\end{align*}

It's not a coincidence that the expectation can be expressed as such. The reason will be elaborated in the next section for Solution.2.



For the sake of computing the numeric value, \ref{Eq06} is a good place to stop. With the given parameters $\rho = 1/3$ and $\tau = 2/5$, we have $\mathbb{E}[ R ] = 23/6$, $Q_X = 22$, and $Q_Y = 16$. That makes $\mathbb{E}[ R^2 ] = 190/9$ and the variance $$V_R \equiv \mathbb{E}[ R^2 ] - \mathbb{E}[R]^2 = \frac{77}{12}~.$$ See the end of Solution.2 for a Mathematica code block for the numerical evaluation and more.

One can quickly check the value of $\mathbb{E}[ R ] \approx 3.8333$ relative to $\mu_{_X} = 3$ and $\mu_{_Y} = 2.5$, as well as $V_R \approx 6.416667$ in relation to $V_X = 6$ and $V_Y = 15/4$. Both quantities for $R$ are slightly larger than the larger of the corresponding quantities for $X$ and $Y$, which is reasonable.
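The arithmetic above can be reproduced in exact rational arithmetic by solving Eq.(6) for $\mathbb{E}[R^2]$. The following is a minimal sketch; the function name `second_moment_by_conditioning` is made up for this illustration, and Python's `fractions` module is used only to avoid rounding.

```python
from fractions import Fraction


def second_moment_by_conditioning(rho, tau):
    """Solve Eq.(6) for E[R^2]; return (E[R], E[R^2], Var[R]) as exact fractions."""
    rho, tau = Fraction(rho), Fraction(tau)
    lam = rho + tau - rho * tau                # lambda = 1 - (1 - rho)(1 - tau)
    mean_r = 1 / rho + 1 / tau - 1 / lam       # Eq.(7)
    q_x = (2 + rho + rho**2) / rho**2          # Eq.(5) for X
    q_y = (2 + tau + tau**2) / tau**2          # Eq.(5) for Y
    # Right-hand side of Eq.(6); dividing by lambda isolates E[R^2].
    rhs = (rho * tau
           + (1 - lam) * (2 * mean_r + 1)
           + rho * (1 - tau) * q_y
           + tau * (1 - rho) * q_x)
    second = rhs / lam
    return mean_r, second, second - mean_r**2


if __name__ == "__main__":
    print(second_moment_by_conditioning(Fraction(1, 3), Fraction(2, 5)))
    # (Fraction(23, 6), Fraction(190, 9), Fraction(77, 12))
```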



Now, if you have a strong inclination for algebraic manipulation, then the following is what you might arrive at, after some trial and error. Recall the shorthand for the 2nd moment \ref{Eq03}:
\begin{equation*}
\mathbb{E}[ R^2 ] = \frac{ 2 - \rho }{ \rho^2 } + \frac{ 2 - \tau }{ \tau^2 } - \frac{ 2 - \lambda }{ \lambda^2 } = S_X + S_Y - \frac{ 2 - \lambda }{ \lambda^2 } \tag*{Eq.(8)} \label{Eq08}
\end{equation*}

Again, this is not a coincidence. Along with \ref{Eq07}, their respective 3rd terms seem to belong to another Geometric random variable with the 'success' parameter $\lambda$. The proper probabilistic analysis is the subject of Solution.2 up next.

By the way, blindly shuffling the terms around is usually not the best thing one can do. Nonetheless, just for the record, Appendix.A shows one way of going from \ref{Eq06} to \ref{Eq08} mechanically.





$\large\textbf{Solution.2}$



Denote $W \equiv \min(X,Y)$ as the smaller of the two and $Z \equiv \max(X,Y)$ as the larger. The key observation to solve this problem is that
$$Z \overset{d}{=} R~, \qquad\qquad \textbf{the maximum has the same distribution as the 'rounds'.}$$
This allows one to think about the whole scenario differently (not as rounds of a two-player game). For all $k \in \mathbb{N}$, since $X \perp Y$ we have
\begin{align*}
\Pr\{ Z = k \} &= \Pr\{ X < Y = Z = k \} \\
&\hspace{36pt} + \Pr\{ Y < X = Z = k \} \\
&\hspace{72pt} + \Pr\{ X = Y = Z = k \} \\
&= \tau (1 - \tau)^{k-1} (1 - (1-\rho)^{k-1}) \\
&\hspace{36pt} + \rho (1 - \rho)^{k-1} (1 - (1-\tau)^{k-1}) \\
&\hspace{72pt} + \rho\tau (1 - \rho)^{k-1} (1 - \tau)^{k-1} \tag*{Eq.(9)} \label{Eq09}
\end{align*}
Here $1 - (1-\rho)^{k-1} = \Pr\{ X < k \}$ because the inequality is strict; as a check, $k = 1$ gives $\Pr\{ Z = 1 \} = \rho\tau$.

In principle, now that the distribution of $Z$ (thus $R$) is completely specified, everything one would like to know can be calculated. This is indeed a valid approach (to obtain the mean and variance), and the terms involved are all basic series with good symmetry.
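One independent way to tabulate the distribution of $Z$ is through its CDF, since $\max(X,Y) \le k$ exactly when both $X \le k$ and $Y \le k$. The sketch below uses that route to check normalization and the first two moments numerically; the names `pmf_max` and `truncated_moments` and the cutoff of 200 terms are assumptions made for illustration.

```python
from fractions import Fraction


def pmf_max(rho, tau, k):
    """Pr{Z = k} for Z = max(X, Y), using Pr{Z <= j} = Pr{X <= j} * Pr{Y <= j}."""
    def cdf(j):
        return (1 - (1 - rho) ** j) * (1 - (1 - tau) ** j)
    return cdf(k) - cdf(k - 1)


def truncated_moments(rho, tau, terms=200):
    """Return (total probability, mean, 2nd moment) summed over k = 1..terms."""
    probs = [pmf_max(rho, tau, k) for k in range(1, terms + 1)]
    total = sum(probs)
    mean = sum(k * p for k, p in enumerate(probs, start=1))
    second = sum(k * k * p for k, p in enumerate(probs, start=1))
    return total, mean, second


if __name__ == "__main__":
    # Exact value at k = 1: both players must succeed in the first round.
    print(pmf_max(Fraction(1, 3), Fraction(2, 5), 1))
    # Floating-point check of normalization, E[Z], and E[Z^2].
    print(truncated_moments(1 / 3, 2 / 5))
```

For $\rho = 1/3$ and $\tau = 2/5$ the truncated sums should come out very close to $1$, $23/6$, and $190/9$, matching Solution.1.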



Note that $Z$ is not Geometric (while $X$, $Y$, and $W$ are), nor is it Negative Binomial. At this point, it is not really of interest that \ref{Eq09} can be rearranged into a more compact and illuminating form ... because we can do even better.



There are two more observations that allow one not only to calculate better but also to understand the whole picture.
\begin{align*}
&X+Y = W + Z \tag*{Eq.(10)} \label{Eq10} \\
&W \sim \mathrm{Geo}[\lambda] \tag*{Eq.(11)} \label{Eq11}
\end{align*}

\ref{Eq10} is the special case of the sum of the order statistics being equal to the original sum. In general, with many summands, this is not very useful, but here with just two terms it is crucial.



Back to the two-player game scenario, the fact that $W$ is Geometric with success probability $\lambda = 1 - (1 - \rho)(1 - \tau)$ is easy to see: a round 'fails' if and only if both flips 'fail', with probability $(1 - \rho)(1 - \tau) = 1 - \lambda$.



The contour of $W = k$ is the L-shaped boundary of the '2-dim tail', which fits perfectly with the scaling (memoryless) nature of the joint distribution of two independent Geometric variables. Please picture in the figure below that $\Pr\left\{W = k_0 + k ~ \middle|~~ W > k_0 \right\} = \Pr\{ W = k\}$ manifests itself as the L-shape scaling away.

The contour of $Z = k$ looks like $\daleth$, and together with the L-contour of $W$ they make a cross. This is \ref{Eq10}, the identity $X+Y = W+Z$, visualized in the figure below. See Appendix.B.1 for the linked in-site posts on related topics.



[Figure: contours of $W = k$ and $Z = k$ in the $(X, Y)$ plane]



Immediately we know the expectation to be:
\begin{equation*}
\mathbb{E}[ Z ] = \mathbb{E}[ X + Y - W ] = \mu_{_X} + \mu_{_Y} - \mu_{_W} = \frac1{\rho} + \frac1{\tau} - \frac1{\lambda} \tag*{Eq.(7.better)}
\end{equation*}



This derivation provides perspectives different from (or better, arguably) those obtained by conditioning on the first round.



The derivation of the 2nd moment reveals a more intriguing property of the setting.
\begin{align*}
\mathbb{E}[ Z^2 ] &= \mathbb{E}[ (X + Y - W)^2 ] \\
&= \mathbb{E}[ (X + Y)^2 ] + \mathbb{E}[ W^2 ] - 2\mathbb{E}[ (X + Y) W ] \\
&= \mathbb{E}[ X^2 ] + \mathbb{E}[ Y^2 ] + 2 \mathbb{E}[ XY ] + \mathbb{E}[ W^2 ] - 2\mathbb{E}[ (W+Z) W ] \qquad\because X+Y = W+Z\\
&= \mathbb{E}[ X^2 ] + \mathbb{E}[ Y^2 ] - \mathbb{E}[ W^2 ] + 2\mathbb{E}[ XY ] - 2\mathbb{E}[ WZ ]
\end{align*}

Here's the kicker: when $X \neq Y$, by definition $W$ and $Z$ each take one of them so $WZ = XY$, and when $X = Y$ we have $WZ = XY$ just the same!! Consequently, $\mathbb{E}[ WZ ] = \mathbb{E}[ XY ] =\mu_{_X} \mu_{_Y}$ always, and they cancel.
\begin{equation*}
\mathbb{E}[ Z^2 ] = \mathbb{E}[ X^2 ] + \mathbb{E}[ Y^2 ] - \mathbb{E}[ W^2 ] = \frac{ 2 - \rho }{ \rho^2 } + \frac{ 2 - \tau }{ \tau^2 } - \frac{ 2 - \lambda }{ \lambda^2 } \tag*{Eq.(8.better)} \label{Eq08better}
\end{equation*}

This is the proper derivation, and in \ref{Eq08} the 3rd term following $S_X + S_Y$ is indeed $-S_W$.



Keep in mind that both $W$ and $Z$ are clearly correlated with $X$ and $Y$, while our intuition also says that the correlation between $W$ and $Z$ is positive (see Appendix.B.2 for a discussion).

While we're at it, this relation in fact holds for all (higher) moments, $$\mathbb{E}[ Z^n ] = \mathbb{E}[ X^n ] + \mathbb{E}[ Y^n ] - \mathbb{E}[ W^n ]~.$$ This is true by the same argument that gave us $WZ = XY$: pointwise, $\{W, Z\}$ is just $\{X, Y\}$ reordered, so $W^n + Z^n = X^n + Y^n$. Deriving this identity from \ref{Eq09} by doing the explicit sums is also easy.
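The moment identity can be checked numerically by computing $\mathbb{E}[Z^n]$ directly from the joint distribution of $(X, Y)$ on a truncated grid and comparing it with the three one-dimensional Geometric moments. This is a rough sketch; the function names and the cutoff of 200 are illustrative assumptions, and the truncation error is negligible only because the tails decay geometrically.

```python
def geometric_moment(p, n, cutoff=200):
    """Approximate E[X^n] for X ~ Geo[p] by summing over k = 1..cutoff."""
    return sum(k ** n * p * (1 - p) ** (k - 1) for k in range(1, cutoff + 1))


def max_moment_from_joint(rho, tau, n, cutoff=200):
    """Approximate E[max(X, Y)^n] from the joint pmf of independent X and Y."""
    total = 0.0
    for x in range(1, cutoff + 1):
        px = rho * (1 - rho) ** (x - 1)
        for y in range(1, cutoff + 1):
            py = tau * (1 - tau) ** (y - 1)
            total += max(x, y) ** n * px * py
    return total


if __name__ == "__main__":
    rho, tau = 1 / 3, 2 / 5
    lam = rho + tau - rho * tau  # success parameter of W = min(X, Y)
    for n in range(1, 5):
        lhs = max_moment_from_joint(rho, tau, n)
        rhs = (geometric_moment(rho, n) + geometric_moment(tau, n)
               - geometric_moment(lam, n))
        print(n, lhs, rhs)
```

The two columns should agree to floating-point accuracy for each $n$; the $n = 2$ row reproduces $190/9 \approx 21.111$.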



Without further ado, the variance of $Z$ (thus the variance of $R$), previously known only as the unsatisfying "\ref{Eq06} minus the square of $\mathbb{E}[ Z ]$", can now be put in its proper form.
\begin{align*}
V_Z &\equiv \mathbb{E}[ Z^2 ] - \mathbb{E}[ Z ]^2 \\
&= \mathbb{E}[ X^2 ] + \mathbb{E}[ Y^2 ] - \mathbb{E}[ W^2 ] - ( \mu_{_X} + \mu_{_Y} - \mu_{_W} )^2 \\
&= V_X + V_Y - V_W + 2( \mu_{_W} \mu_{_Z} - \mu_{_X} \mu_{_Y}) \tag*{Eq.(12.a)} \\
&= \frac{ 1 - \rho }{ \rho^2 } + \frac{ 1 - \tau }{ \tau^2 } - \frac{ 1 - \lambda }{ \lambda^2 } + 2\left( \frac1{ \lambda } \bigl( \frac1{ \rho } + \frac1{ \tau } - \frac1{ \lambda } \bigr) - \frac1{ \rho \tau }\right) \tag*{Eq.(12.b)}
\end{align*}

This expression (or the symbolic one just above) is a perfectly nice formula to me. You can rearrange it to your heart's content, for example, like this
\begin{equation*}
V_Z = \bigl( \frac1{ \rho } - \frac1{ \tau } \bigr)^2 - \frac1{ \lambda^2 } + 2 \bigl(\frac1{\lambda} - \frac12\bigr) \bigl( \frac1{ \rho } + \frac1{ \tau } - \frac1{ \lambda } \bigr) \tag*{Eq.(12.c)}
\end{equation*}

which emphasizes the role played by the difference between the means $\mu_{_X}$ and $\mu_{_Y}$.
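To confirm that Eq.(12.b) and Eq.(12.c) are the same quantity, both can be evaluated in exact rational arithmetic. A minimal sketch, with `variance_forms` as a made-up helper name:

```python
from fractions import Fraction


def variance_forms(rho, tau):
    """Evaluate Eq.(12.b) and Eq.(12.c) exactly and return both values."""
    rho, tau = Fraction(rho), Fraction(tau)
    lam = rho + tau - rho * tau
    mean_z = 1 / rho + 1 / tau - 1 / lam   # Eq.(7)

    def geo_var(p):
        return (1 - p) / p**2              # Eq.(4)

    form_b = (geo_var(rho) + geo_var(tau) - geo_var(lam)
              + 2 * (mean_z / lam - 1 / (rho * tau)))
    form_c = ((1 / rho - 1 / tau) ** 2 - 1 / lam**2
              + 2 * (1 / lam - Fraction(1, 2)) * mean_z)
    return form_b, form_c


if __name__ == "__main__":
    print(variance_forms(Fraction(1, 3), Fraction(2, 5)))
    # (Fraction(77, 12), Fraction(77, 12))
```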



I'm not motivated to pursue a 'better' form beyond Eq.(12), and if anyone knows any alternative expressions that are significant either algebraically or probabilistically, please do share.



Let me conclude with a Mathematica code block verifying the numeric values and identities for the variance, the 2nd moment, as well as the earliest algebra of the conditioning of expectation.



    (* t = tau, p = rho, m = E[R], S = E[R^2], h = lambda = 1-(1-p)(1-t) *)
    ClearAll[t, p, m, S, V, h, tmp, simp, sq, var, basics];
    basics[p_] := {1/p, (1 - p)/p^2, (2 - p)/p^2, (2 + p + p^2)/p^2}; (* EX, VX, E[X^2], E[(1+X)^2] *)
    Row@{basics[1/3], Spacer@10, basics[2/5], Spacer@10, basics[1 - (1 - 1/3) (1 - 2/5)]}
    simp = Simplify[#, 0 < p < 1 && 0 < t < 1] &;
    h = 1 - (1 - p) (1 - t);
    m = (1 + t/p + p/t - (t + p))/h;
    sq[a_] := (2 - a)/a^2; (* 2nd moment of a Geometric distribution *)
    var[a_] := (1 - a)/a^2;
    S = tmp /. Part[#, 1]&@Solve[tmp == p t + (1 - p) (1 - t) (1 + 2 m + tmp) + p (1 - t) (1 + 2/t + sq@t) + (1 - p) t (1 + 2/p + sq@p), tmp] //simp (* this simplification doesn't matter ... *)
    V = S - m^2 // simp (* neither does this. *)
    {tmp = V /. {p -> 1/3, t -> 2/5}, tmp // N}
    (* below: verify the various identities for 2nd moment and variance. Difference being zero means equal *)
    {sq@t + sq@p - sq@h - S, var@p + var@t - var@h + 2 m 1/h - 2/(p t) - V, (1/p - 1/t)^2 - 1/h^2 + 2 (1/p + 1/t - 1/h) (1/h - 1/2) - V} // simp




$\large\textbf{Appendix.A: }\normalsize\text{an algebraic route from \ref{Eq06} to \ref{Eq08}}$



Consider \ref{Eq06} one part at a time. First, the $2(1 - \lambda)\, \mathbb{E}[ R ]$. From the first line to the 2nd line, $1 - \lambda = (1 - \rho)(1 - \tau)$ is inserted for the 2nd term:
\begin{align*}
2(1 - \lambda)\, \mathbb{E}[ R ] &= 2(1 - \lambda )\frac{ - 1}{\lambda} + 2(1 - \lambda) \bigl( \frac1{ \rho } + \frac1{ \tau } \bigr) \\
&= 2\frac{\lambda - 1}{\lambda} + \frac{2(1 - \rho ) }{ \rho }(1 - \tau) + (1 - \rho ) \frac{ 2(1 - \tau) }{ \tau } \\
&= 2 - \frac2{\lambda} + \bigl( \frac{2 - \rho}{ \rho } - 1\bigr) (1 - \tau) + \bigl( \frac{2 - \tau}{ \tau } - 1 \bigr) (1 - \rho ) \\
&= \color{magenta}{2 - \frac2{\lambda} } + \frac{2 - \rho}{ \rho^2 } (1 - \tau)\rho \color{magenta}{- (1 - \tau)} + \frac{2 - \tau}{ \tau^2 } (1 - \rho )\tau \color{magenta}{- (1 - \rho )}
\end{align*}

Next in line are the $Q_X$ and $Q_Y$ terms.
\begin{align*}
\rho (1 - \tau) Q_Y + \tau (1 - \rho)\,Q_X &= \rho (1 - \tau) \bigl( \frac{ 2 + \tau }{ \tau^2 } + 1\bigr)
+ \tau (1 - \rho) \bigl( \frac{ 2 + \rho }{ \rho^2 } + 1 \bigr) \\
&= \rho \bigl( \frac{ 2 - \tau - \tau^2 }{ \tau^2 } + 1 - \tau \bigr) + \tau \bigl( \frac{ 2 - \rho - \rho^2 }{ \rho^2 } + 1 - \rho \bigr) \\
&= \rho \frac{ 2 - \tau}{ \tau^2 } + \tau \frac{ 2 - \rho }{ \rho^2 } \color{magenta}{- 2\tau\rho}
\end{align*}

Putting things back together in \ref{Eq06}, the magenta terms combine with $\rho\tau$ and $1 - \lambda$ (multiplied by 1):
\begin{align*}
\lambda\,\mathbb{E}[ R^2 ] &= \rho \tau + (1 - \lambda) + (1 - \lambda) 2 \mathbb{E}[ R ] + \rho (1 - \tau) Q_Y + \tau (1 - \rho)\,Q_X \\
&= \rho \tau + (1 - \lambda) \color{magenta}{ {}+ 2 - \frac2{\lambda} - (1 - \tau) - (1 - \rho ) - 2\tau\rho} \\
&\hphantom{{}= \rho \tau} + \frac{2 - \rho}{ \rho^2 } (1 - \tau)\rho + \frac{2 - \tau}{ \tau^2 } (1 - \rho )\tau \tag{from $\mathbb{E}[R]$}\\
&\hphantom{{}= \rho \tau } + \tau \frac{ 2 - \rho }{ \rho^2 } + \rho \frac{ 2 - \tau}{ \tau^2 } \tag{from $Q_X$ etc}\\
&= 1 - \frac2{\lambda} + \frac{ 2 - \rho }{ \rho^2 } \bigl( \rho + \tau - \rho\tau \bigr) + \frac{ 2 - \tau }{ \tau^2 } \bigl( \rho + \tau - \rho\tau \bigr) \\
&= 1 - \frac2{\lambda} + \frac{ 2 - \rho }{ \rho^2 } \lambda + \frac{ 2 - \tau }{ \tau^2 } \lambda
\end{align*}

Finally, dividing by the coefficient $\lambda$ of $\mathbb{E}[ R^2 ]$ on the left-hand side gives \ref{Eq08}, since $\frac1{\lambda} - \frac2{\lambda^2} = -\frac{ 2 - \lambda }{ \lambda^2 }$.





$\large\textbf{Appendix.B.1: }\normalsize\text{In-Site Links to Related Questions}$



I found many existing posts about this topic (minimum of two independent non-identical Geometric distributions). In chronological order: 90782, 845706, 1040620, 1056296, 1169142, 1207241, and 2669667.



Unlike for the min, only 2 posts about the max have been found so far: 971214 and 1983481. Neither post goes beyond the 1st moment, and only a couple of the answers address the 'real geometry' of the situation.



In particular, consider the trinomial random variable $T$ that splits the 2-dim plane into 3 regions.
\begin{equation*}
T \equiv \mathrm{Sign}(Y - X) = \begin{cases}
\hphantom{-{}} 1 & \text{if}~~Y > X \quad \text{, with probability}~~ \rho( 1 - \tau) / \lambda \\
\hphantom{-{}} 0 & \text{if}~~X = Y \quad \text{, with probability}~~ \rho \tau / \lambda \\
-1 & \text{if}~~Y < X \quad \text{, with probability}~~ \tau( 1 - \rho) / \lambda
\end{cases}
\end{equation*}
(Here $Y > X$ means Bob succeeds first, so the first decisive round is $HT$, which has probability $\rho(1 - \tau)$.)

This trisection of above-diagonal, diagonal, and below-diagonal is fundamental to the calculation of both $W$ and $Z$.



It can be easily shown that $T \perp W$, i.e., that the trisection is independent of the minimum. For example, $\Pr\left\{ T = 1~ \middle|~~ W = k\right\} = \Pr\{ T = 1\}$ for all $k$.
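The independence claim can be checked directly from the joint pmf: conditioning on $W = k$ and splitting into the three regions should give the same three probabilities for every $k$. The sketch below does this in exact arithmetic; the function name `conditional_sign_given_min` is an assumption made for illustration.

```python
from fractions import Fraction


def conditional_sign_given_min(rho, tau, k):
    """Return (Pr{Y > X | W = k}, Pr{X = Y | W = k}, Pr{Y < X | W = k}) exactly."""
    rho, tau = Fraction(rho), Fraction(tau)
    px_k = rho * (1 - rho) ** (k - 1)   # Pr{X = k}
    py_k = tau * (1 - tau) ** (k - 1)   # Pr{Y = k}
    tail_x = (1 - rho) ** k             # Pr{X > k}
    tail_y = (1 - tau) ** k             # Pr{Y > k}
    above = px_k * tail_y               # X = k < Y
    diag = px_k * py_k                  # X = Y = k
    below = py_k * tail_x               # Y = k < X
    total = above + diag + below        # Pr{W = k}
    return above / total, diag / total, below / total


if __name__ == "__main__":
    for k in range(1, 6):
        print(k, conditional_sign_given_min(Fraction(1, 3), Fraction(2, 5), k))
```

Every row prints the same triple, which is what $T \perp W$ asserts.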



Note that $T$ is not independent of $Z$ (even with the corner $k=1$ removed).

In the continuous analogue, $\{X,Y,W\}$ are Exponential with all the nice properties, and $Z$ is neither Exponential nor Gamma (the analogue of Negative Binomial).

The density of the $Z$ analogue is easy and commonly used, but it doesn't seem to have a name. At best one can categorize it as a special case of the phase-type (PH) distribution.





$\large\textbf{Appendix.B.2: }\normalsize\text{Covariance between $\min(X,Y)$ and $\max(X,Y)$, along with related discussions.}$



Recall the expectations: $\mathbb{E}[X] = \mu_{_X} = 1 / \rho$, and similarly
\begin{align*}
\mu_{_Y} &= \frac1{ \tau }~, & \mu_{_W} &= \frac1{ \lambda } = \frac1{\rho + \tau - \rho\tau}~, & &\text{and}\quad
\mu_{_Z} = \mu_{_X} + \mu_{_Y} - \mu_{_W}
\end{align*}

The covariance between the minimum and the maximum is
\begin{equation*}
C_{WZ} \equiv \mathbb{E}\left[ (W - \mu_{_W} ) (Z - \mu_{_Z} ) \right] = \mathbb{E}[ WZ ] - \mu_{_W} \mu_{_Z} = \mu_{_X} \mu_{_Y} - \mu_{_W} \mu_{_Z} \tag*{Eq.(B)} \label{EqB}
\end{equation*}

The last equality invokes the intriguing $WZ = XY$ mentioned just before \ref{Eq08better}.



A positive covariance (thus correlation) expresses the intuitive idea that when the min is large then the max also tends to be large.
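For concreteness, Eq.(B) can be evaluated exactly and its sign checked over a grid of parameters. This is only an illustrative sketch: `cov_min_max` is a made-up name, and the grid of tenths is an arbitrary choice that does not replace a proof.

```python
from fractions import Fraction


def cov_min_max(rho, tau):
    """Cov(min(X, Y), max(X, Y)) via Eq.(B): mu_X * mu_Y - mu_W * mu_Z."""
    rho, tau = Fraction(rho), Fraction(tau)
    lam = rho + tau - rho * tau
    mu_x, mu_y, mu_w = 1 / rho, 1 / tau, 1 / lam
    mu_z = mu_x + mu_y - mu_w
    return mu_x * mu_y - mu_w * mu_z


if __name__ == "__main__":
    print(cov_min_max(Fraction(1, 3), Fraction(2, 5)))   # 10/9
    # Sign check on a coarse grid of (rho, tau) inside the open unit square.
    grid = [Fraction(i, 10) for i in range(1, 10)]
    print(all(cov_min_max(r, t) > 0 for r in grid for t in grid))
```

For $\rho = 1/3$ and $\tau = 2/5$ this gives $\mu_{_X}\mu_{_Y} - \mu_{_W}\mu_{_Z} = 15/2 - (5/3)(23/6) = 10/9$.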



Here we do a 'verbal proof' of $C_{WZ} > 0$ for all combinations of parameters. The purpose is not really a technical proof but rather an illustration of the relations between $\{W, Z\}$ and $\{X, Y\}$.

Since $\mu_{_X} + \mu_{_Y} = \mu_{_W} + \mu_{_Z}$, the covariance $C_{WZ}$ is comparing products (of two non-negative numbers) given a fixed sum. We know that for a fixed sum, the closer the two 'sides' (think rectangle) are, the larger the product.

Being the min and max, $\mu_{_W}$ and $\mu_{_Z}$ are more 'extreme' and can never be as close as $\mu_{_X}$ and $\mu_{_Y}$, throughout the entire parametric space $( \rho, \tau ) \in (0, 1)^2$. Boom, QED.

(the extreme cases where at least one of $\{\rho, \tau\}$ is zero or one are all ill-defined when it comes to min and max)



An algebraic proof for $C_{WZ} > 0$ (or for everything in this appendix) is easy. Let me emphasize the descriptive aspect.



Consider what it means for $W$ as a Geometric distribution to be the minimum of $X$ and $Y$.
\begin{align*}
\frac1{\lambda} &< \min( \frac1{\rho}, \frac1{\tau} ) &
&\text{faster to 'succeed' on average} \\
\lambda &> \max( \rho, \tau ) & &\text{better than the higher success parameter}
\end{align*}

That is, the mean of the 'minimum flip' $\mu_{_W} < \min( \mu_{_X}, \mu_{_Y} )$ is more extreme at the lower end.



On the other hand, $Z$ is NOT a Geometric distribution. However, one can define an auxiliary $Z' \sim \mathrm{Geo}[1 / \mu_{_Z}]$ with an equivalent mean $\mu_{_{Z'}} = \mu_{_Z}$. Once we have arrived at \ref{EqB}, the role of $Z$ can be replaced: $C_{WZ} = \mu_{_X} \mu_{_Y} - \mu_{_W} \mu_{_{Z'}} $

Now we can have similar descriptions for $\mu_{_Z}$, while it is actually $\mu_{_{Z'}}$ for the Geometric distribution that is being compared with $X$ and $Y$.
\begin{align*}
\mu_{_Z} &> \max( \mu_{_X}, \mu_{_Y} ) &
&\text{slower to 'succeed' on average} \\
\frac1{ \mu_{_Z} } &< \min( \rho, \tau ) & &\text{worse than the lower success parameter}
\end{align*}

That is, the mean of the 'maximum flip' is more extreme at the higher end.



This concludes the verbal argument for how $\mu_{_W}$ and $\mu_{_Z}$ are more dissimilar (than $\mu_{_X}$ and $\mu_{_Y}$ are), thus making a smaller product.





























  • Thank you for this comprehensive, inspiring answer. Solution One I don't have problem understanding, I'm upset a bit that I didn't complete my analysis. I turned back while trying to form this recursion involving nonlinear terms, like E[(1+X)^2]. Solution Two and the Appendices I am able to follow as well.
    – BoLe
    Mar 22 at 10:45











Your Answer





StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
});
});
}, "mathjax-editing");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "69"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
noCode: true, onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f2544529%2fvariance-of-alternate-flipping-rounds%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes








up vote
2
down vote



accepted










$largetextbf{Outline}$



Setup the Notations : As titled.



Solution.1 : Direct application of the conditional decomposition of expectation. This is the foolproof approach if one wants a quick numeric evaluation and doesn't want to be bothered with analysis.



Solution.2 : A framework that provides perspectives and better calculation.



Appendix.A : Supplementary material to Solution.1.



Appendix.B.1 : In-site links to existing questions that are closely related.



Appendix.B.2 : Supplementary material to Solution.2.





$largetextbf{Setup the Notations}$



Let $X$ be the total number of trials of Bob's flips when his first head (success) appears.



Let $Y$ be that for Bub. We have $Xsim mathrm{Geo}[rho]$ independent to $Ysim mathrm{Geo}[tau]$.



The following basics for a Geometric distribution will be useful here:
begin{align*}
mu_{_X} &equiv mathbb{E}[X] = frac1{rho} & & tag*{Eq.(1)}
label{Eq01} \
A_X &equiv mathbb{E}[X(X+1)] = frac2{ rho^2 } & &tag*{Eq.(2)} label{Eq02} \
S_X &equiv mathbb{E}[X^2 ] = frac{2 - rho}{ rho^2 } & &tag*{Eq.(3)} label{Eq03} \
V_X &equiv mathbb{V}[X] = frac{1 - rho}{ rho^2 } & &tag*{Eq.(4)} label{Eq04} \
Q_X &equiv mathbb{E}[(X+1)^2] = A_X + mu_{_X} + 1
= frac{ 2 + rho + rho^2 }{ rho^2 } & &tag*{Eq.(5)} label{Eq05}
end{align*}

Recall that we often use $A_X$ to obtain $S_X$ because $A_X$ is easier to derive (it is a more natural quantity for the Geometric distribution).



The shorthands for $Y$ are the same $mu_{_Y}$ and $V_Y$ etc.





$largetextbf{Solution.1}$



Just like how the expectation can be derived (which apparently you know how to), the 2nd moment (thus variance) can be obtained by conditioning on the results of the round. Denote the events of ${ text{Bob head, Bub tail} }$ as just $HT$, and recall that $X$ is for Bob flipping alone as if Bub doesn't exist.
begin{align*}
mathbb{E}[ R^2 ] &= rho tau , mathbb{E}left[ R^2 ,middle|~ HHright]
+ (1 - rho)(1 - tau),mathbb{E}left[ R^2 ,middle|~ TTright] \
&hspace{36pt} + rho (1 - tau) , mathbb{E}left[ R^2 ,middle|~ HTright]
+ tau (1 - rho),mathbb{E}left[ R^2 ,middle|~THright] \
&= rho tau + (1 - rho)(1 - tau),mathbb{E} left[ (1+R)^2 right] + rho (1 - tau) , mathbb{E}left[ (1+Y)^2 right] + tau (1 - rho),mathbb{E}left[ (1+X)^2 right]
end{align*}

Please let me know if you need justification for $mathbb{E}[ R^2 ~|~~TH] = mathbb{E}[ (1+Y)^2 ]$ and the alike. Moving on, use the ref{Eq04} shorthand $Q_X$ and $Q_Y$ for now and rearrange.
$$ mathbb{E}[ R^2 ] = rho tau + (1 - rho)(1 - tau) left( mathbb{E}[ R^2 ] + 2 mathbb{E}[ R ] + 1 right) + rho (1 - tau) Q_Y + tau (1 - rho),Q_X$$
Denote $lambda = 1 - (1 - rho)(1 - tau) = rho + tau - rho tau$, and collect $mathbb{E}[ R^2 ]$ on the left.
begin{equation*}
lambda,mathbb{E}[ R^2 ] = rho tau + (1 - lambda) left( 2 mathbb{E}[ R ] + 1 right) + rho (1 - tau) Q_Y + tau (1 - rho),Q_X tag*{Eq.(6)} label{Eq06}
end{equation*}

Note that the denominator of $mathbb{E}[ R ]$ is just $lambda$. Along with the symmetry, this suggests that the numerator of $mathbb{E}[ R ]$ can be rewritten into a better form invoking the basic ref{Eq01}.
begin{align*}
1 + frac{ rho }{ tau } + frac{ tau }{ rho } - (rho + tau) &= bigl( 1 + frac{ rho }{ tau } - rho bigr)
+ bigl( 1 + frac{ tau }{ rho } - tau bigr) - 1 \
&= frac{ tau + rho - rhotau }{ tau } + frac{ rho + tau - rhotau }{ rho } - 1 \
implies mathbb{E}[ R ] = frac1{rho} + frac1{tau} &- frac1{lambda} = mu_{_X} + mu_{_Y} - frac1{lambda} tag*{Eq.(7)} label{Eq07}
end{align*}

It's not a coincidence that the expectation can be expressed as such. The reason will be elaborated in the next section for Solution.2.



For the sake of computing the numeric value, ref{Eq06} was a good place to stop. With the given parameters $rho = 1/3$ and $tau = 2/5$, we have $mathbb{E}[ R ] = 23/6$, $Q_X = 22$, and $Q_Y = 16$. That makes $mathbb{E}[ R^2 ] = 190/9$ and the variance $$V_R equiv mathbb{E}[ R^2 ] - mathbb{E}[R]^2 = frac{77}{12}~.$$ See the end of Solution.2 for a Mathematica code block for the numerical evaluation and more.
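As a cross-check of the numbers above, here is a minimal Python sketch (separate from the Mathematica verification) that plugs the given parameters into ref{Eq06} using exact fractions. The names `q`, `mean_r`, and `second_moment` are just my own labels for $Q$, $mathbb{E}[R]$, and $mathbb{E}[R^2]$.

```python
from fractions import Fraction

rho, tau = Fraction(1, 3), Fraction(2, 5)
lam = rho + tau - rho * tau                      # lambda = 1 - (1 - rho)(1 - tau)

def q(p):
    """E[(1 + X)^2] for X ~ Geo[p], i.e. Eq.(5)."""
    return (2 + p + p**2) / p**2

mean_r = 1 / rho + 1 / tau - 1 / lam             # Eq.(7)
# Eq.(6): lambda E[R^2] = rho tau + (1 - lambda)(2 E[R] + 1)
#                         + rho (1 - tau) Q_Y + tau (1 - rho) Q_X
second_moment = (rho * tau + (1 - lam) * (2 * mean_r + 1)
                 + rho * (1 - tau) * q(tau) + tau * (1 - rho) * q(rho)) / lam
variance = second_moment - mean_r**2

print(mean_r, q(rho), q(tau), second_moment, variance)
```

With exact arithmetic the five printed values should be $23/6$, $22$, $16$, $190/9$, and $77/12$.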



One can quickly check the value of $mathbb{E}[ R ] approx 3.8333$ relative to $mu_{_X} = 3$ and $mu_{_Y} = 2.5$, as well as $V_R approx 6.416667$ in relation to $V_X = 6$ and $V_Y = 15/4$. Both quantities for $R$ are slightly larger than the larger of the corresponding values for $X$ and $Y$, which is reasonable.
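For comparison with the Monte Carlo figures quoted in the question, a simulation of the game can be sketched as follows. This is only an illustration under my own choices (the function names, the seed, and the number of repeats are not from the original post); it plays the rounds literally and estimates the mean and variance of $R$.

```python
import random

def simulate_rounds(rho, tau, rng):
    """Play one game; return the number of rounds until both players have seen a head."""
    bob_done = bub_done = False
    rounds = 0
    while not (bob_done and bub_done):
        rounds += 1
        # One flip each per round; flips after a player's first head are irrelevant.
        bob_done = bob_done or rng.random() < rho
        bub_done = bub_done or rng.random() < tau
    return rounds

def estimate(rho, tau, repeats=200_000, seed=12345):
    """Sample mean and unbiased sample variance of R over `repeats` games."""
    rng = random.Random(seed)
    samples = [simulate_rounds(rho, tau, rng) for _ in range(repeats)]
    mean = sum(samples) / repeats
    var = sum((r - mean) ** 2 for r in samples) / (repeats - 1)
    return mean, var

mean, var = estimate(1 / 3, 2 / 5)
print(mean, var)  # expected to land near 23/6 and 77/12, up to sampling noise
```

The sample variance fluctuates noticeably more than the sample mean, so an estimate that is off in the second decimal place is not alarming.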



Now, if you have a strong inclination for algebraic manipulation, then the following is what you might arrive at, after some trial and error. Recall the shorthand for the 2nd moment ref{Eq03}:
begin{equation*}
mathbb{E}[ R^2 ] = frac{ 2 - rho }{ rho^2 } + frac{ 2 - tau }{ tau^2 } - frac{ 2 - lambda }{ lambda^2 } = S_X + S_Y - frac{ 2 - lambda }{ lambda^2 } tag*{Eq.(8)} label{Eq08}
end{equation*}

Again, this is not a coincidence. Along with ref{Eq07}, their respective 3rd terms seem to come from another Geometric random variable with the 'success' parameter $lambda$. The proper probabilistic analysis is the subject of Solution.2 up next.



By the way, blindly shuffling the terms around is usually not the best thing one can do. Nonetheless, just for the record, Appendix.A shows one way of going from ref{Eq06} to ref{Eq08} mechanically.





$largetextbf{Solution.2}$



Denote $W equiv min(X,Y)$ as the smaller among the two and $Z equiv max(X,Y)$ as the larger. The key observation to solve this problem is that
$$Z overset{d}{=} R~, qquadqquad textbf{the maximum has the same distribution as the 'rounds'.}$$
This allows one to think about the whole scenario differently (not as rounds of a two-player game). For all $k in mathbb{N}$, since $X perp Y$ we have
begin{align*}
Pr{ Z = k } &= Pr{ X < Y = Z = k } \
&hspace{36pt} + Pr{ Y < X = Z = k } \
&hspace{72pt} + Pr{ X = Y = Z = k } \
&= tau (1 - tau)^{k-1} (1 - (1-rho)^{k-1}) \
&hspace{36pt} + rho (1 - rho)^{k-1} (1 - (1-tau)^{k-1}) \
&hspace{72pt} + rhotau (1 - rho)^{k-1} (1 - tau)^{k-1} tag*{Eq.(9)} label{Eq09}
end{align*}

In principle, now that the distribution of $Z$ (thus $R$) is completely specified, everything one would like to know can be calculated. This is indeed a valid approach (to obtain the mean and variance), and the terms involved are all basic series with good symmetry.
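To make the "everything can be calculated" remark concrete, here is a small Python sketch that evaluates the probability mass function of $Z$ directly and sums the series numerically. It uses $Pr{ X < k } = 1 - (1-rho)^{k-1}$ for the strict inequalities; the function name `pmf_max` and the truncation at 400 terms are my own choices.

```python
def pmf_max(k, rho, tau):
    """Pr{Z = k} for Z = max(X, Y), split as X < Y = k, Y < X = k, and X = Y = k."""
    x_before = 1 - (1 - rho) ** (k - 1)          # Pr{X < k}
    y_before = 1 - (1 - tau) ** (k - 1)          # Pr{Y < k}
    return (tau * (1 - tau) ** (k - 1) * x_before
            + rho * (1 - rho) ** (k - 1) * y_before
            + rho * tau * ((1 - rho) * (1 - tau)) ** (k - 1))

rho, tau = 1 / 3, 2 / 5
ks = range(1, 400)                               # the tail beyond this is negligible
total = sum(pmf_max(k, rho, tau) for k in ks)
mean = sum(k * pmf_max(k, rho, tau) for k in ks)
second = sum(k * k * pmf_max(k, rho, tau) for k in ks)

print(total, mean, second - mean**2)
```

The total should come out as 1 and the other two values as roughly $23/6$ and $77/12$, up to floating-point error. A useful sanity check is $k = 1$, where only the $X = Y$ term survives and the probability is $rho tau$.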



Note that $Z$ is not Geometric (while $X$, $Y$, and $W$ are), nor is it Negative Binomial. At this point, it is not really of interest that ref{Eq09} can be rearranged into a more compact and illuminating form ...... because we can do even better.



There are two more observations that allow one not only to calculate better but also to understand the whole picture.
begin{align*}
&X+Y = W + Z tag*{Eq.(10)} label{Eq10} \
&W sim mathrm{Geo}[lambda] tag*{Eq.(11)} label{Eq11}
end{align*}

ref{Eq10} is the two-variable special case of the sum of the order statistics being equal to the original sum. In general, with many summands, this is not very useful, but here with just two terms it is crucial.



Back to the two-player game scenario, the fact that $W$ is Geometric with success probability $lambda = 1 - (1 - rho)(1 - tau)$ is easy to see: a round 'fails' if and only if both flips 'fail', with a probability $(1 - rho)(1 - tau) = 1 - lambda$.



The contour of $W = k$ is the L-shaped boundary of the '2-dim tail', which fits perfectly with the scaling (memoryless) nature of the joint distribution of two independent Geometric variables. Please picture in the figure below how $Prleft{W = k_0 + k ~ middle|~~ W > k_0 right} = Pr{ W = k}$ manifests itself as the L-shape scaling away.
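The L-shaped contour can also be checked numerically. The sketch below sums the joint probabilities over the two arms of the L and compares the result with the $mathrm{Geo}[lambda]$ mass function; the helper names and the cutoff of 400 are assumptions of this illustration.

```python
def geo_pmf(k, p):
    """Pr{X = k} for X ~ Geo[p] supported on 1, 2, 3, ..."""
    return p * (1 - p) ** (k - 1)

def pmf_min(k, rho, tau, cutoff=400):
    """Pr{min(X, Y) = k}: the arm {x = k, y >= k} plus the arm {y = k, x > k}."""
    arm_x = sum(geo_pmf(k, rho) * geo_pmf(y, tau) for y in range(k, cutoff))
    arm_y = sum(geo_pmf(x, rho) * geo_pmf(k, tau) for x in range(k + 1, cutoff))
    return arm_x + arm_y

rho, tau = 1 / 3, 2 / 5
lam = 1 - (1 - rho) * (1 - tau)
for k in range(1, 6):
    print(k, pmf_min(k, rho, tau), geo_pmf(k, lam))
```

The two printed columns should agree up to floating-point error for every $k$.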



The contour of $Z = k$ looks like $daleth$, and together with the L-contour of $W$ they make a cross. This is the identity ref{Eq10}, $X+Y = W+Z$, visualized in the figure below. See Appendix.B.1 for the linked in-site posts of related topics.



[Figure: the L-shaped contour of $W = k$ and the contour of $Z = k$ in the $(X, Y)$ plane.]



Immediately we know the expectation to be:
begin{equation*}
mathbb{E}[ Z ] = mathbb{E}[ X + Y - W ] = mu_{_X} + mu_{_Y} - mu_{_W} = frac1{rho} + frac1{tau} - frac1{lambda} tag*{Eq.(7.better)}
end{equation*}



This derivation provides perspectives different from (or better, arguably) those obtained by conditioning on the first round.



Derivation of the 2nd moment reveals a more intriguing property of the setting.
begin{align*}
mathbb{E}[ Z^2 ] &= mathbb{E}[ (X + Y - W)^2 ] \
&= mathbb{E}[ (X + Y)^2 ] + mathbb{E}[ W^2 ] - 2mathbb{E}[ (X + Y) W ] \
&= mathbb{E}[ X^2 ] + mathbb{E}[ Y^2 ] + 2 mathbb{E}[ XY ] + mathbb{E}[ W^2 ] - 2mathbb{E}[ (W+Z) W ] qquadbecause X+Y = W+Z\
&= mathbb{E}[ X^2 ] + mathbb{E}[ Y^2 ] - mathbb{E}[ W^2 ] + 2mathbb{E}[ XY ] - 2mathbb{E}[ WZ ]
end{align*}

Here's the kicker: when $X neq Y$, by definition $W$ and $Z$ each take one of them so $WZ = XY$, and when $X = Y$ we have $WZ = XY$ just the same!! Consequently, $mathbb{E}[ ZW ] = mathbb{E}[ XY ] =mu_{_X} mu_{_Y}$ always, and they cancel.
begin{equation*}
mathbb{E}[ Z^2 ] = mathbb{E}[ X^2 ] + mathbb{E}[ Y^2 ] - mathbb{E}[ W^2 ] = frac{ 2 - rho }{ rho^2 } + frac{ 2 - tau }{ tau^2 } - frac{ 2 - lambda }{ lambda^2 } tag*{Eq.(8.better)} label{Eq08better}
end{equation*}

This is the proper derivation, and in ref{Eq08} the 3rd term following $S_X + S_Y$ is indeed $-S_W$.



Keep in mind that both $W$ and $Z$ are clearly correlated with $X$ and $Y$, while our intuition also says that the correlation between $W$ and $Z$ is positive (see Appendix.B.2 for a discussion).



While we're at it, this relation in fact holds for moments of any order, $$mathbb{E}[ Z^n ] = mathbb{E}[ X^n ] + mathbb{E}[ Y^n ] - mathbb{E}[ W^n ]~.$$ This follows from the same case analysis that gave us $WZ = XY$: the pair $(W, Z)$ is always a rearrangement of $(X, Y)$, so $W^n + Z^n = X^n + Y^n$. Deriving the identity from ref{Eq09} by doing the explicit sums is also easy.
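The pointwise identities behind these moment relations are easy to confirm by brute force. The following Python sketch checks them on a small grid of integer pairs and then evaluates the 2nd moment of the maximum for the given parameters; the grid size and the helper name `geo_second_moment` are arbitrary choices of mine.

```python
from fractions import Fraction
from itertools import product

# Pointwise: (min, max) is a rearrangement of (x, y), so sums, products,
# and sums of n-th powers all agree.
for x, y in product(range(1, 8), repeat=2):
    w, z = min(x, y), max(x, y)
    assert w + z == x + y and w * z == x * y
    assert all(w**n + z**n == x**n + y**n for n in range(1, 6))

def geo_second_moment(p):
    """E[X^2] for X ~ Geo[p], i.e. Eq.(3)."""
    return (2 - p) / p**2

rho, tau = Fraction(1, 3), Fraction(2, 5)
lam = rho + tau - rho * tau
second_moment_max = (geo_second_moment(rho) + geo_second_moment(tau)
                     - geo_second_moment(lam))
print(second_moment_max)
```

For $rho = 1/3$ and $tau = 2/5$ this gives $190/9$, in agreement with the value obtained from the conditioning argument.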



Without further ado, the variance of $Z$ (thus the variance of $R$), previously known only as the unsatisfying "ref{Eq06} minus the square of $mathbb{E}[ Z ]$'', can now be put in its proper form.
begin{align*}
V_Z &equiv mathbb{E}[ Z^2 ] - mathbb{E}[ Z ]^2 \
&= mathbb{E}[ X^2 ] + mathbb{E}[ Y^2 ] - mathbb{E}[ W^2 ] - ( mu_{_X} + mu_{_Y} - mu_{_W} )^2 \
&= V_X + V_Y - V_W + 2( mu_{_W} mu_{_Z} - mu_{_X} mu_{_Y}) tag*{Eq.(12.a)} \
&= frac{ 1 - rho }{ rho^2 } + frac{ 1 - tau }{ tau^2 } - frac{ 1 - lambda }{ lambda^2 } + 2left( frac1{ lambda } bigl( frac1{ rho } + frac1{ tau } - frac1{ lambda } bigr) - frac1{ rho tau }right) tag*{Eq.(12.b)}
end{align*}

This expression (or the symbolic one just above) is a perfectly nice formula to me. You can rearrange it to your heart's content, for example, like this
begin{equation*}
V_Z = bigl( frac1{ rho } - frac1{ tau } bigr)^2 - frac1{ lambda^2 } + 2 (frac1{lambda} - frac12) bigl( frac1{ rho } + frac1{ tau } - frac1{ lambda } bigr) tag*{Eq.(12.c)}
end{equation*}

which emphasizes the role played by the difference between the means of $X$ and $Y$.
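As a quick check that the rearranged forms agree, here is a Python sketch with exact fractions that evaluates Eq.(12.a) and Eq.(12.c) side by side. The function name `variance_forms` and the test grid are my own; this is a numerical spot check, not a proof.

```python
from fractions import Fraction

def variance_forms(rho, tau):
    """Return V_Z computed via Eq.(12.a) and via Eq.(12.c)."""
    lam = rho + tau - rho * tau
    mu_x, mu_y, mu_w = 1 / rho, 1 / tau, 1 / lam
    mu_z = mu_x + mu_y - mu_w
    var = lambda p: (1 - p) / p**2               # variance of Geo[p]
    form_a = var(rho) + var(tau) - var(lam) + 2 * (mu_w * mu_z - mu_x * mu_y)
    form_c = (mu_x - mu_y) ** 2 - mu_w**2 + 2 * (mu_w - Fraction(1, 2)) * mu_z
    return form_a, form_c

print(variance_forms(Fraction(1, 3), Fraction(2, 5)))

# Spot check on a grid of rational parameters in (0, 1).
grid = [Fraction(i, 10) for i in range(1, 10)]
print(all(a == c for a, c in (variance_forms(r, t) for r in grid for t in grid)))
```

Both forms evaluate to $77/12$ for the given parameters, and the grid check should print `True`.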



I'm not motivated to pursue a 'better' form beyond Eq.(12), and if anyone knows any alternative expressions that are significant either algebraically or probabilistically, please do share.



Let me conclude with a Mathematica code block verifying the numeric values and identities for the variance, the 2nd moment, as well as the earliest algebra of the conditioning of expectation.



(* t = tau, p = rho, m = E[R], S = E[R^2], h = lambda = 1-(1-p)(1-t) *)
ClearAll[t, p, m, S, V, h, tmp, simp, sq, var, basics];
basics[p_] := {1/p, (1 - p)/p^2, (2 - p)/p^2, (2 + p + p^2)/p^2}; (* EX, VX, E[X^2], E[(1+X)^2] *)
Row@{basics[1/3], Spacer@10, basics[2/5], Spacer@10, basics[1 - (1 - 1/3) (1 - 2/5)]}
simp = Simplify[#, 0 < p < 1 && 0 < t < 1] &;
h = 1 - (1 - p) (1 - t); m = (1 + t/p + p/t - (t + p))/h;
sq[a_] := (2 - a)/a^2; (* 2nd moment of a Geometric distribution *)
var[a_] := (1 - a)/a^2; (* variance of a Geometric distribution *)
S = tmp /. Part[#, 1] &@Solve[tmp == p t + (1 - p) (1 - t) (1 + 2 m + tmp) + p (1 - t) (1 + 2/t + sq@t) + (1 - p) t (1 + 2/p + sq@p), tmp] // simp (* this simplification doesn't matter ... *)
V = S - m^2 // simp (* neither does this. *)
{tmp = V /. {p -> 1/3, t -> 2/5}, tmp // N}
(* below: verify the various identities for the 2nd moment and the variance. A difference of zero means equal *)
{sq@t + sq@p - sq@h - S, var@p + var@t - var@h + 2 m 1/h - 2/(p t) - V, (1/p - 1/t)^2 - 1/h^2 + 2 (1/p + 1/t - 1/h) (1/h - 1/2) - V} // simp




$largetextbf{Appendix.A: }normalsizetext{an algebraic route from ref{Eq06} to ref{Eq08}}$



Consider ref{Eq06} one part at a time. First, the term $2(1 - lambda), mathbb{E}[ R ]$. From the first line to the 2nd line, $1 - lambda = (1 - rho)(1 - tau)$ is inserted for the 2nd term:
begin{align*}
2(1 - lambda), mathbb{E}[ R ] &= 2(1 - lambda )frac{ - 1}{lambda} + 2(1 - lambda) bigl( frac1{ rho } + frac1{ tau } bigr) \
&= 2frac{lambda - 1}{lambda} + frac{2(1 - rho ) }{ rho }(1 - tau) + (1 - rho ) frac{ 2(1 - tau) }{ tau } \
&= 2 - frac2{lambda} + bigl( frac{2 - rho}{ rho } - 1bigr) (1 - tau) + bigl( frac{2 - tau}{ tau } - 1 bigr) (1 - rho ) \
&= color{magenta}{2 - frac2{lambda} } + frac{2 - rho}{ rho^2 } (1 - tau)rho color{magenta}{- (1 - tau)} + frac{2 - tau}{ tau^2 } (1 - rho )tau color{magenta}{- (1 - rho )}
end{align*}

Next in line are the $Q_X$ and $Q_Y$ terms.
begin{align*}
rho (1 - tau) Q_Y + tau (1 - rho),Q_X &= rho (1 - tau) bigl( frac{ 2 + tau }{ tau^2 } + 1bigr)
+ tau (1 - rho) bigl( frac{ 2 + rho }{ rho^2 } + 1 bigr) \
&= rho bigl( frac{ 2 - tau - tau^2 }{ tau^2 } + 1 - tau bigr) + tau bigl( frac{ 2 - rho - rho^2 }{ rho^2 } + 1 - rho bigr) \
&= rho frac{ 2 - tau}{ tau^2 } + tau frac{ 2 - rho }{ rho^2 } color{magenta}{- 2taurho}
end{align*}

Put things back together in ref{Eq06}, the magenta terms combine with $rhotau$ and $1 - lambda$ (multiplied by 1):
begin{align*}
lambda,mathbb{E}[ R^2 ] &= rho tau + (1 - lambda) + (1 - lambda) 2 mathbb{E}[ R ] + rho (1 - tau) Q_Y + tau (1 - rho),Q_X \
&= rho tau + (1 - lambda) color{magenta}{ {}+ 2 - frac2{lambda} - (1 - tau) - (1 - rho ) - 2taurho} \
&hphantom{{}= rho tau} + frac{2 - rho}{ rho^2 } (1 - tau)rho + frac{2 - tau}{ tau^2 } (1 - rho )tau tag{from $mathbb{E}[R]$}\
&hphantom{{}= rho tau } + tau frac{ 2 - rho }{ rho^2 } + rho frac{ 2 - tau}{ tau^2 } tag{from $Q_X$ etc}\
&= 1 - frac2{lambda} + frac{ 2 - rho }{ rho^2 } bigl( rho + tau - rhotau bigr) + frac{ 2 - tau }{ tau^2 } bigl( rho + tau - rhotau bigr) \
&= 1 - frac2{lambda} + frac{ 2 - rho }{ rho^2 } lambda + frac{ 2 - tau }{ tau^2 } lambda
end{align*}

Finally we can divide by the coefficient of $mathbb{E}[ R^2 ]$ on the left hand side and obtain ref{Eq08}.
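The algebra above can be spot-checked numerically. The sketch below compares the 2nd moment obtained from ref{Eq06} with the closed form ref{Eq08} on a grid of rational parameters; it is only a sanity check under my own choice of grid and function names, not a substitute for the derivation.

```python
from fractions import Fraction

def second_moment_eq6(rho, tau):
    """E[R^2] from Eq.(6), using Eq.(7) for E[R] and Eq.(5) for Q."""
    lam = rho + tau - rho * tau
    q = lambda p: (2 + p + p**2) / p**2
    mean_r = 1 / rho + 1 / tau - 1 / lam
    return (rho * tau + (1 - lam) * (2 * mean_r + 1)
            + rho * (1 - tau) * q(tau) + tau * (1 - rho) * q(rho)) / lam

def second_moment_eq8(rho, tau):
    """E[R^2] from Eq.(8): S_X + S_Y - (2 - lambda) / lambda^2."""
    lam = rho + tau - rho * tau
    s = lambda p: (2 - p) / p**2
    return s(rho) + s(tau) - s(lam)

grid = [Fraction(i, 10) for i in range(1, 10)]
mismatches = [(r, t) for r in grid for t in grid
              if second_moment_eq6(r, t) != second_moment_eq8(r, t)]
print(mismatches)
```

An empty list means the two expressions agree exactly at every grid point.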





$largetextbf{Appendix.B.1: }normalsizetext{In-Site Links to Related Questions}$



I found many existing posts about this topic (minimum of two independent non-identical Geometric distributions). In chronological order: 90782, 845706, 1040620, 1056296, 1169142, 1207241, and 2669667.



Unlike the min, only 2 posts about the max have been found so far: 971214 and 1983481. Neither of the posts goes beyond the 1st moment, and only a couple of the answers address the 'real geometry' of the situation.



In particular, consider the trinomial random variable $T$ that splits the 2-dim plane into the 3 regions.
begin{equation*}
T equiv mathrm{Sign}(Y - X) = begin{cases}
hphantom{-{}} 1 & text{if}~~Y > X quad text{, with probability}~~ rho( 1 - tau) / lambda \
hphantom{-{}} 0 & text{if}~~X = Y quad text{, with probability}~~ rho tau / lambda \
-1 & text{if}~~Y < X quad text{, with probability}~~ tau( 1 - rho) / lambda
end{cases}
end{equation*}

This trisection of above-diagonal, diagonal, and below-diagonal is fundamental to the calculation of both $W$ and $Z$.



It can be easily shown that $T perp W$, that is, the trisection is independent of the minimum. For example, $Prleft{ T = 1~ middle|~~ W = kright} = Pr{ T = 1}$ for all $k$.
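The independence claim can be illustrated numerically. Note that $Y > X$ means Bob gets his head first, so $Pr{ T = 1 } = rho (1 - tau) / lambda$. The Python sketch below computes the conditional probability from the joint mass function for several values of $k$; the helper names and the cutoff are assumptions of this example.

```python
def geo_pmf(k, p):
    """Pr{X = k} for X ~ Geo[p] supported on 1, 2, 3, ..."""
    return p * (1 - p) ** (k - 1)

def prob_y_greater_given_min(k, rho, tau, cutoff=400):
    """Pr{Y > X | min(X, Y) = k} for independent X ~ Geo[rho], Y ~ Geo[tau]."""
    y_greater = geo_pmf(k, rho) * sum(geo_pmf(y, tau) for y in range(k + 1, cutoff))
    x_greater = geo_pmf(k, tau) * sum(geo_pmf(x, rho) for x in range(k + 1, cutoff))
    tie = geo_pmf(k, rho) * geo_pmf(k, tau)
    return y_greater / (y_greater + x_greater + tie)

rho, tau = 1 / 3, 2 / 5
lam = 1 - (1 - rho) * (1 - tau)
print([prob_y_greater_given_min(k, rho, tau) for k in range(1, 6)])
print(rho * (1 - tau) / lam)
```

Every entry of the list should match the unconditional value on the last line (here $1/3$), regardless of $k$.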



Note that $T$ is not independent of $Z$ (even with the corner $k=1$ removed).



In the continuous analogue, ${X,Y,W}$ are Exponential with all the nice properties, and $Z$ is neither Exponential nor Gamma (analogue of Negative Binomial).



The density of the $Z$ analogue is easy and commonly used, but it doesn't seem to have a name. At best one can categorize it as a special case of the phase-type (PH) distribution.





$largetextbf{Appendix.B.2: }normalsizetext{Covariance between $min(X,Y)$ and $max(X,Y)$, along with related discussions.}$



Recall the expectations: $mathbb{E}[X] = mu_{_X} = 1 / rho$, and the similar
begin{align*}
mu_{_Y} &= frac1{ tau }~, & mu_{_W} &= frac1{ lambda } = frac1{rho + tau - rhotau}~, & &text{and}quad
mu_{_Z} = mu_{_X} + mu_{_Y} - mu_{_W}
end{align*}

The covariance between the minimum and the maximum is
begin{equation*}
C_{WZ} equiv mathbb{E}left[ (W - mu_{_W} ) (Z - mu_{_Z} ) right] = mathbb{E}[ WZ ] - mu_{_W} mu_{_Z} = mu_{_X} mu_{_Y} - mu_{_W} mu_{_Z} tag*{Eq.(B)} label{EqB}
end{equation*}

The last equal sign invokes the intriguing $WZ = XY$ mentioned just before ref{Eq08better}.



A positive covariance (thus correlation) expresses the intuitive idea that when the min is large then the max also tends to be large.



Here we do a 'verbal proof' of $C_{WZ} > 0$ for all combinations of parameters. The purpose is not really a technical proof but rather an illustration of the relations between ${W, Z}$ and ${X, Y}$.



Since $mu_{_X} + mu_{_Y} = mu_{_W} + mu_{_Z}$, the covariance $C_{WZ}$ is comparing products (of two non-negative numbers) given a fixed sum. We know that for a fixed sum, the closer the two 'sides' (think rectangle) are, the larger the product.



Being the min and max, $mu_{_W}$ and $mu_{_Z}$ are more 'extreme' and can never be as close as $mu_{_X}$ and $mu_{_Y}$, throughout the entire parametric space ${ rho, tau } in (0, 1)^2$. Boom, QED.



(the extreme cases where at least one of ${rho, tau}$ is zero or one are all ill-defined when it comes to min and max)
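For readers who prefer numbers to the verbal argument, here is a small Python sketch that evaluates ref{EqB} exactly and scans a grid of parameters. The function name `cov_min_max` and the grid are my own choices. Expanding the product also shows that the same quantity equals $(mu_{_X} - mu_{_W})(mu_{_Y} - mu_{_W})$, which the code checks as well.

```python
from fractions import Fraction

def cov_min_max(rho, tau):
    """C_WZ = mu_X mu_Y - mu_W mu_Z for independent Geometric X and Y, as in Eq.(B)."""
    lam = rho + tau - rho * tau
    mu_x, mu_y, mu_w = 1 / rho, 1 / tau, 1 / lam
    mu_z = mu_x + mu_y - mu_w
    cov = mu_x * mu_y - mu_w * mu_z
    # Equivalent factored form: both factors are positive because lambda > max(rho, tau).
    assert cov == (mu_x - mu_w) * (mu_y - mu_w)
    return cov

print(cov_min_max(Fraction(1, 3), Fraction(2, 5)))

grid = [Fraction(i, 20) for i in range(1, 20)]
print(all(cov_min_max(r, t) > 0 for r in grid for t in grid))
```

For $rho = 1/3$ and $tau = 2/5$ the covariance is $10/9$, and the grid scan should print `True`.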



An algebraic proof for $C_{WZ} > 0$ (or for everything in this appendix) is easy. Let me emphasize the descriptive aspect.



Consider what it means for $W$ as a Geometric distribution to be the minimum of $X$ and $Y$.
begin{align*}
frac1{lambda} &< min( frac1{rho}, frac1{tau} ) &
&text{faster to 'succeed' on average} \
lambda &> max( rho, tau ) & &text{better than the higher success parameter}
end{align*}

That is, the mean of the 'minimum flip' $mu_{_W} < min( mu_{_X}, mu_{_Y} )$ is more extreme at the lower end.



On the other hand, $Z$ is NOT a Geometric distribution. However, one can define an auxiliary $Z' sim mathrm{Geo}[1 / mu_{_Z}]$ with an equivalent mean $mu_{_{Z'}} = mu_{_Z}$. Once we have arrived at ref{EqB}, the role of $Z$ can be replaced: $C_{WZ} = mu_{_X} mu_{_Y} - mu_{_W} mu_{_{Z'}} $



Now we can have similar descriptions for $mu_{_Z}$, while it is actually $mu_{_{Z'}}$ for the Geometric distribution that is being compared with $X$ and $Y$.
begin{align*}
mu_{_Z} &> max( mu_{_X}, mu_{_Y} ) &
&text{slower to 'succeed' on average} \
frac1{ mu_{_Z} } &< min( rho, tau ) & &text{worse than the lower success parameter}
end{align*}

That is, the mean of the 'maximum flip' is more extreme at the higher end.



This concludes the verbal argument for how $mu_{_W}$ and $mu_{_Z}$ are more dissimilar (than the relation between $mu_{_X}$ and $mu_{_Y}$) thus making a smaller product.





























  • Thank you for this comprehensive, inspiring answer. Solution One I don't have problem understanding, I'm upset a bit that I didn't complete my analysis. I turned back while trying to form this recursion involving nonlinear terms, like E[(1+X)^2]. Solution Two and the Appendices I am able to follow as well.
    – BoLe
    Mar 22 at 10:45


















accepted










$largetextbf{Outline}$



Setup the Notations : As titled.



Solution.1 : Direct application of the conditional decomposition of expectation. This is the foolproof approach if one wants a quick numeric evaluation and doesn't want to be bothered with analysis.



Solution.2 : A framework that provides perspectives and better calculation.



Appendix.A : Supplementary material to Solution.1.



Appendix.B.1 : In-site links to existing questions that are closely related.



Appendix.B.2 : Supplementary material to Solution.2.





$largetextbf{Setup the Notations}$



Let $X$ be the total number of trials of Bob's flips when his first head (success) appears.



Let $Y$ be that for Bub. We have $Xsim mathrm{Geo}[rho]$ independent of $Ysim mathrm{Geo}[tau]$.



The following basics for a Geometric distribution will be useful here:
begin{align*}
mu_{_X} &equiv mathbb{E}[X] = frac1{rho} & & tag*{Eq.(1)}
label{Eq01} \
A_X &equiv mathbb{E}[X(X+1)] = frac2{ rho^2 } & &tag*{Eq.(2)} label{Eq02} \
S_X &equiv mathbb{E}[X^2 ] = frac{2 - rho}{ rho^2 } & &tag*{Eq.(3)} label{Eq03} \
V_X &equiv mathbb{V}[X] = frac{1 - rho}{ rho^2 } & &tag*{Eq.(4)} label{Eq04} \
Q_X &equiv mathbb{E}[(X+1)^2] = A_X + mu_{_X} + 1
= frac{ 2 + rho + rho^2 }{ rho^2 } & &tag*{Eq.(5)} label{Eq05}
end{align*}

Recall that we often use $A_X$ to obtain $S_X$ because $A_X$ is easier to derive (it is a more natural quantity for the Geometric distribution).
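These basics are easy to sanity-check. The Python sketch below encodes Eq.(1) to Eq.(5) with exact fractions and compares one of them against a truncated series; the function names and the truncation length are my own, and the Geometric distribution is taken on $1, 2, 3, dots$ (number of trials up to and including the first success), as in the setup above.

```python
from fractions import Fraction

def geometric_basics(p):
    """Return (mu, A, S, V, Q) of Eq.(1)-(5) for X ~ Geo[p] supported on 1, 2, ..."""
    p = Fraction(p)
    mu = 1 / p                       # E[X]
    a = 2 / p**2                     # E[X(X+1)]
    s = (2 - p) / p**2               # E[X^2]
    v = (1 - p) / p**2               # Var[X]
    q = (2 + p + p**2) / p**2        # E[(X+1)^2]
    return mu, a, s, v, q

def series_moment(p, f, terms=2000):
    """Truncated sum of f(k) * Pr{X = k}, evaluated in floating point."""
    p = float(p)
    return sum(f(k) * p * (1 - p) ** (k - 1) for k in range(1, terms + 1))

mu, a, s, v, q = geometric_basics(Fraction(1, 3))
print(mu, a, s, v, q)
print(series_moment(Fraction(1, 3), lambda k: k * k))
```

For $rho = 1/3$ the five shorthands come out as $3$, $18$, $15$, $6$, and $22$, and the relations $S_X = A_X - mu_{_X}$ and $Q_X = A_X + mu_{_X} + 1$ can be read off directly.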



The shorthands for $Y$ are the same $mu_{_Y}$ and $V_Y$ etc.





$largetextbf{Solution.1}$



Just as the expectation can be derived (which you evidently know how to do), the 2nd moment (and thus the variance) can be obtained by conditioning on the result of the first round. Denote the event ${ text{Bob head, Bub tail} }$ as just $HT$ (and similarly $HH$, $TT$, $TH$), and recall that $X$ is for Bob flipping alone as if Bub doesn't exist.
begin{align*}
mathbb{E}[ R^2 ] &= rho tau , mathbb{E}left[ R^2 ,middle|~ HHright]
+ (1 - rho)(1 - tau),mathbb{E}left[ R^2 ,middle|~ TTright] \
&hspace{36pt} + rho (1 - tau) , mathbb{E}left[ R^2 ,middle|~ HTright]
+ tau (1 - rho),mathbb{E}left[ R^2 ,middle|~THright] \
&= rho tau + (1 - rho)(1 - tau),mathbb{E} left[ (1+R)^2 right] + rho (1 - tau) , mathbb{E}left[ (1+Y)^2 right] + tau (1 - rho),mathbb{E}left[ (1+X)^2 right]
end{align*}

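Please let me know if you need justification for $mathbb{E}[ R^2 ~|~~HT] = mathbb{E}[ (1+Y)^2 ]$ and the like: after $HT$, Bob is done and only Bub's remaining flips matter, and by memorylessness their count is distributed as $Y$. Moving on, use the ref{Eq05} shorthand $Q_X$ and $Q_Y$ for now and rearrange.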
$$ mathbb{E}[ R^2 ] = rho tau + (1 - rho)(1 - tau) left( mathbb{E}[ R^2 ] + 2 mathbb{E}[ R ] + 1 right) + rho (1 - tau) Q_Y + tau (1 - rho),Q_X$$
Denote $lambda = 1 - (1 - rho)(1 - tau) = rho + tau - rho tau$, and collect $mathbb{E}[ R^2 ]$ on the left.
begin{equation*}
lambda,mathbb{E}[ R^2 ] = rho tau + (1 - lambda) left( 2 mathbb{E}[ R ] + 1 right) + rho (1 - tau) Q_Y + tau (1 - rho),Q_X tag*{Eq.(6)} label{Eq06}
end{equation*}

Note that the denominator of $mathbb{E}[ R ]$ is just $lambda$. Along with the symmetry, this suggests that the numerator of $mathbb{E}[ R ]$ can be rewritten into a better form invoking the basic ref{Eq01}.
begin{align*}
1 + frac{ rho }{ tau } + frac{ tau }{ rho } - (rho + tau) &= bigl( 1 + frac{ rho }{ tau } - rho bigr)
+ bigl( 1 + frac{ tau }{ rho } - tau bigr) - 1 \
&= frac{ tau + rho - rhotau }{ tau } + frac{ rho + tau - rhotau }{ rho } - 1 \
implies mathbb{E}[ R ] = frac1{rho} + frac1{tau} &- frac1{lambda} = mu_{_X} + mu_{_Y} - frac1{lambda} tag*{Eq.(7)} label{Eq07}
end{align*}

It's not a coincidence that the expectation can be expressed as such. The reason will be elaborated in the next section for Solution.2.



For the sake of computing the numeric value, ref{Eq06} was a good place to stop. With the given parameters $rho = 1/3$ and $tau = 2/5$, we have $mathbb{E}[ R ] = 23/6$, $Q_X = 22$, and $Q_Y = 16$. That makes $mathbb{E}[ R^2 ] = 190/9$ and the variance $$V_R equiv mathbb{E}[ R^2 ] - mathbb{E}[R]^2 = frac{77}{12}~.$$ See the end of Solution.2 for a Mathematica code block for the numerical evaluation and more.



One can quickly check the value of $mathbb{E}[ R ] approx 3.8333$ relative to $mu_{_X} = 3$ and $mu_{_Y} = 2.5$, as well as $V_R approx 6.416667$ in relation to $V_X = 6$ and $V_Y = 15/4$. Both quantities for $R$ are slightly larger than the larger of the corresponding values for $X$ and $Y$, which is reasonable.



Now, if you have a strong inclination for algebraic manipulation, then the following is what you might arrive at, after some trial and error. Recall the shorthand for the 2nd moment ref{Eq03}:
begin{equation*}
mathbb{E}[ R^2 ] = frac{ 2 - rho }{ rho^2 } + frac{ 2 - tau }{ tau^2 } - frac{ 2 - lambda }{ lambda^2 } = S_X + S_Y - frac{ 2 - lambda }{ lambda^2 } tag*{Eq.(8)} label{Eq08}
end{equation*}

Again, this is not a coincidence. Along with ref{Eq07}, their respective 3rd terms seem to come from another Geometric random variable with the 'success' parameter $lambda$. The proper probabilistic analysis is the subject of Solution.2 up next.



By the way, blindly shuffling the terms around is usually not the best thing one can do. Nonetheless, just for the record, Appendix.A shows one way of going from ref{Eq06} to ref{Eq08} mechanically.





$largetextbf{Solution.2}$



Denote $W equiv min(X,Y)$ as the smaller among the two and $Z equiv max(X,Y)$ as the larger. The key observation to solve this problem is that
$$Z overset{d}{=} R~, qquadqquad textbf{the maximum has the same distribution as the 'rounds'.}$$
This allows one to think about the whole scenario differently (not as rounds of a two-player game). For all $k in mathbb{N}$, since $X perp Y$ we have
begin{align*}
Pr{ Z = k } &= Pr{ X < Y = Z = k } \
&hspace{36pt} + Pr{ Y < X = Z = k } \
&hspace{72pt} + Pr{ X = Y = Z = k } \
&= tau (1 - tau)^{k-1} (1 - (1-rho)^{k-1}) \
&hspace{36pt} + rho (1 - rho)^{k-1} (1 - (1-tau)^{k-1}) \
&hspace{72pt} + rhotau (1 - rho)^{k-1} (1 - tau)^{k-1} tag*{Eq.(9)} label{Eq09}
end{align*}

In principle, now that the distribution of $Z$ (thus $R$) is completely specified, everything one would like to know can be calculated. This is indeed a valid approach (to obtain the mean and variance), and the terms involved are all basic series with good symmetry.



Note that $Z$ is not Geometric (while $X$, $Y$, and $W$ are), nor is it Negative Binomial. At this point, it is not really of interest that ref{Eq09} can be rearranged into a more compact and illuminating form ...... because we can do even better.



There are two more observations that allow one not only to calculate better but also to understand the whole picture.
begin{align*}
&X+Y = W + Z tag*{Eq.(10)} label{Eq10} \
&W sim mathrm{Geo}[lambda] tag*{Eq.(11)} label{Eq11}
end{align*}

ref{Eq10} is the two-variable special case of the sum of the order statistics being equal to the original sum. In general, with many summands, this is not very useful, but here with just two terms it is crucial.



Back to the two-player game scenario, the fact that $W$ is Geometric with success probability $lambda = 1 - (1 - rho)(1 - tau)$ is easy to see: a round 'fails' if and only if both flips 'fail', with a probability $(1 - rho)(1 - tau) = 1 - lambda$.



The contour of $W = k$ is the L-shaped boundary of the '2-dim tail', which fits perfectly with the scaling (memoryless) nature of the joint distribution of two independent Geometric variables. Please picture in the figure below how $Prleft{W = k_0 + k ~ middle|~~ W > k_0 right} = Pr{ W = k}$ manifests itself as the L-shape scaling away.



The contour of $Z = k$ looks like $daleth$, and together with the L-contour of $W$ they make a cross. This is the identity ref{Eq10}, $X+Y = W+Z$, visualized in the figure below. See Appendix.B.1 for the linked in-site posts of related topics.



[Figure: the L-shaped contour of $W = k$ and the contour of $Z = k$ in the $(X, Y)$ plane.]



Immediately we know the expectation to be:
begin{equation*}
mathbb{E}[ Z ] = mathbb{E}[ X + Y - W ] = mu_{_X} + mu_{_Y} - mu_{_W} = frac1{rho} + frac1{tau} - frac1{lambda} tag*{Eq.(7.better)}
end{equation*}



This derivation provides perspectives different from (or better, arguably) those obtained by conditioning on the first round.



Derivation of the 2nd moment reveals a more intriguing property of the setting.
begin{align*}
mathbb{E}[ Z^2 ] &= mathbb{E}[ (X + Y - W)^2 ] \
&= mathbb{E}[ (X + Y)^2 ] + mathbb{E}[ W^2 ] - 2mathbb{E}[ (X + Y) W ] \
&= mathbb{E}[ X^2 ] + mathbb{E}[ Y^2 ] + 2 mathbb{E}[ XY ] + mathbb{E}[ W^2 ] - 2mathbb{E}[ (W+Z) W ] qquadbecause X+Y = W+Z\
&= mathbb{E}[ X^2 ] + mathbb{E}[ Y^2 ] - mathbb{E}[ W^2 ] + 2mathbb{E}[ XY ] - 2mathbb{E}[ WZ ]
end{align*}

Here's the kicker: when $X neq Y$, by definition $W$ and $Z$ each take one of them so $WZ = XY$, and when $X = Y$ we have $WZ = XY$ just the same!! Consequently, $mathbb{E}[ ZW ] = mathbb{E}[ XY ] =mu_{_X} mu_{_Y}$ always, and they cancel.
begin{equation*}
mathbb{E}[ Z^2 ] = mathbb{E}[ X^2 ] + mathbb{E}[ Y^2 ] - mathbb{E}[ W^2 ] = frac{ 2 - rho }{ rho^2 } + frac{ 2 - tau }{ tau^2 } - frac{ 2 - lambda }{ lambda^2 } tag*{Eq.(8.better)} label{Eq08better}
end{equation*}

This is the proper derivation, and in ref{Eq08} the 3rd term following $S_X + S_Y$ is indeed $-S_W$.



Keep in mind that both $W$ and $Z$ are clearly correlated with $X$ and $Y$, while our intuition also says that the correlation between $W$ and $Z$ is positive (see Appendix.B.2 for a discussion).



While we're at it, this relation in fact holds for moments of any order, $$mathbb{E}[ Z^n ] = mathbb{E}[ X^n ] + mathbb{E}[ Y^n ] - mathbb{E}[ W^n ]~.$$ This follows from the same case analysis that gave us $WZ = XY$: the pair $(W, Z)$ is always a rearrangement of $(X, Y)$, so $W^n + Z^n = X^n + Y^n$. Deriving the identity from ref{Eq09} by doing the explicit sums is also easy.



Without further ado, the variance of $Z$ (thus the variance of $R$), previously known only as the unsatisfying "ref{Eq06} minus the square of $mathbb{E}[ Z ]$'', can now be put in its proper form.
begin{align*}
V_Z &equiv mathbb{E}[ Z^2 ] - mathbb{E}[ Z ]^2 \
&= mathbb{E}[ X^2 ] + mathbb{E}[ Y^2 ] - mathbb{E}[ W^2 ] - ( mu_{_X} + mu_{_Y} - mu_{_W} )^2 \
&= V_X + V_Y - V_W + 2( mu_{_W} mu_{_Z} - mu_{_X} mu_{_Y}) tag*{Eq.(12.a)} \
&= frac{ 1 - rho }{ rho^2 } + frac{ 1 - tau }{ tau^2 } - frac{ 1 - lambda }{ lambda^2 } + 2left( frac1{ lambda } bigl( frac1{ rho } + frac1{ tau } - frac1{ lambda } bigr) - frac1{ rho tau }right) tag*{Eq.(12.b)}
end{align*}

This expression (or the symbolic one just above) is a perfectly nice formula to me. You can rearrange it to your heart's content, for example, like this
begin{equation*}
V_Z = bigl( frac1{ rho } - frac1{ tau } bigr)^2 - frac1{ lambda^2 } + 2 (frac1{lambda} - frac12) bigl( frac1{ rho } + frac1{ tau } - frac1{ lambda } bigr) tag*{Eq.(12.c)}
end{equation*}

which emphasizes the role played by the difference between the means of $X$ and $Y$.



I'm not motivated to pursue a 'better' form beyond Eq.(12), and if anyone knows any alternative expressions that are significant either algebraically or probabilistically, please do share.



Let me conclude with a Mathematica code block verifying the numeric values and identities for the variance, the 2nd moment, as well as the earliest algebra of the conditioning of expectation.



(* t = tau, p = rho, m = E[R], S = E[R^2], h = lambda = 1-(1-p)(1-t) *)
ClearAll[t, p, m, S, V, h, tmp, simp, sq, var, basics];
basics[p_] := {1/p, (1 - p)/p^2, (2 - p)/p^2, (2 + p + p^2)/p^2}; (* EX, VX, E[X^2], E[(1+X)^2] *)
Row@{basics[1/3], Spacer@10, basics[2/5], Spacer@10, basics[1 - (1 - 1/3) (1 - 2/5)]}
simp = Simplify[#, 0 < p < 1 && 0 < t < 1] &;
h = 1 - (1 - p) (1 - t); m = (1 + t/p + p/t - (t + p))/h;
sq[a_] := (2 - a)/a^2; (* 2nd moment of a Geometric distribution *)
var[a_] := (1 - a)/a^2; (* variance of a Geometric distribution *)
S = tmp /. Part[#, 1] &@Solve[tmp == p t + (1 - p) (1 - t) (1 + 2 m + tmp) + p (1 - t) (1 + 2/t + sq@t) + (1 - p) t (1 + 2/p + sq@p), tmp] // simp (* this simplification doesn't matter ... *)
V = S - m^2 // simp (* neither does this. *)
{tmp = V /. {p -> 1/3, t -> 2/5}, tmp // N}
(* below: verify the various identities for the 2nd moment and the variance. A difference of zero means equal *)
{sq@t + sq@p - sq@h - S, var@p + var@t - var@h + 2 m 1/h - 2/(p t) - V, (1/p - 1/t)^2 - 1/h^2 + 2 (1/p + 1/t - 1/h) (1/h - 1/2) - V} // simp




$largetextbf{Appendix.A: }normalsizetext{an algebraic route from ref{Eq06} to ref{Eq08}}$



Consider ref{Eq06} one part at a time. First, the term $2(1 - lambda), mathbb{E}[ R ]$. From the first line to the 2nd line, $1 - lambda = (1 - rho)(1 - tau)$ is inserted for the 2nd term:
begin{align*}
2(1 - lambda), mathbb{E}[ R ] &= 2(1 - lambda )frac{ - 1}{lambda} + 2(1 - lambda) bigl( frac1{ rho } + frac1{ tau } bigr) \
&= 2frac{lambda - 1}{lambda} + frac{2(1 - rho ) }{ rho }(1 - tau) + (1 - rho ) frac{ 2(1 - tau) }{ tau } \
&= 2 - frac2{lambda} + bigl( frac{2 - rho}{ rho } - 1bigr) (1 - tau) + bigl( frac{2 - tau}{ tau } - 1 bigr) (1 - rho ) \
&= color{magenta}{2 - frac2{lambda} } + frac{2 - rho}{ rho^2 } (1 - tau)rho color{magenta}{- (1 - tau)} + frac{2 - tau}{ tau^2 } (1 - rho )tau color{magenta}{- (1 - rho )}
end{align*}

Next in line are the $Q_X$ and $Q_Y$ terms.
begin{align*}
rho (1 - tau) Q_Y + tau (1 - rho),Q_X &= rho (1 - tau) bigl( frac{ 2 + tau }{ tau^2 } + 1bigr)
+ tau (1 - rho) bigl( frac{ 2 + rho }{ rho^2 } + 1 bigr) \
&= rho bigl( frac{ 2 - tau - tau^2 }{ tau^2 } + 1 - tau bigr) + tau bigl( frac{ 2 - rho - rho^2 }{ rho^2 } + 1 - rho bigr) \
&= rho frac{ 2 - tau}{ tau^2 } + tau frac{ 2 - rho }{ rho^2 } color{magenta}{- 2taurho}
end{align*}

Put things back together in ref{Eq06}, the magenta terms combine with $rhotau$ and $1 - lambda$ (multiplied by 1):
begin{align*}
lambda,mathbb{E}[ R^2 ] &= rho tau + (1 - lambda) + (1 - lambda) 2 mathbb{E}[ R ] + rho (1 - tau) Q_Y + tau (1 - rho),Q_X \
&= rho tau + (1 - lambda) color{magenta}{ {}+ 2 - frac2{lambda} - (1 - tau) - (1 - rho ) - 2taurho} \
&hphantom{{}= rho tau} + frac{2 - rho}{ rho^2 } (1 - tau)rho + frac{2 - tau}{ tau^2 } (1 - rho )tau tag{from $mathbb{E}[R]$}\
&hphantom{{}= rho tau } + tau frac{ 2 - rho }{ rho^2 } + rho frac{ 2 - tau}{ tau^2 } tag{from $Q_X$ etc}\
&= 1 - frac2{lambda} + frac{ 2 - rho }{ rho^2 } bigl( rho + tau - rhotau bigr) + frac{ 2 - tau }{ tau^2 } bigl( rho + tau - rhotau bigr) \
&= 1 - frac2{lambda} + frac{ 2 - rho }{ rho^2 } lambda + frac{ 2 - tau }{ tau^2 } lambda
end{align*}

Finally we can divide by the coefficient of $mathbb{E}[ R^2 ]$ on the left hand side and obtain ref{Eq08}.





$largetextbf{Appendix.B.1: }normalsizetext{In-Site Links to Related Questions}$



I found many existing posts about this topic (minimum of two independent non-identical Geometric distributions). In chronological order: 90782, 845706, 1040620, 1056296, 1169142, 1207241, and 2669667.



Unlike the min, only 2 posts about the max have been found so far: 971214 and 1983481. Neither of the posts goes beyond the 1st moment, and only a couple of the answers address the 'real geometry' of the situation.



In particular, consider the trinomial random variable $T$ that splits the 2-dim plane into the 3 regions.
begin{equation*}
T equiv mathrm{Sign}(Y - X) = begin{cases}
hphantom{-{}} 1 & text{if}~~Y > X quad text{, with probability}~~ rho( 1 - tau) / lambda \
hphantom{-{}} 0 & text{if}~~X = Y quad text{, with probability}~~ rho tau / lambda \
-1 & text{if}~~Y < X quad text{, with probability}~~ tau( 1 - rho) / lambda
end{cases}
end{equation*}

This trisection of above-diagonal, diagonal, and below-diagonal is fundamental to the calculation of both $W$ and $Z$.



It can be easily shown that $T perp W$, that is, the trisection is independent of the minimum. For example, $Prleft{ T = 1~ middle|~~ W = kright} = Pr{ T = 1}$ for all $k$.



Note that $T$ is not independent of $Z$ (even with the corner $k=1$ removed).



In the continuous analogue, ${X,Y,W}$ are Exponential with all the nice properties, and $Z$ is neither Exponential nor Gamma (analogue of Negative Binomial).



The density of the $Z$ analogue is easy and commonly used, but it doesn't seem to have a name. At best one can categorize it as a special case of the phase-type (PH) distribution.





$\large\textbf{Appendix.B.2: }\normalsize\text{Covariance between $\min(X,Y)$ and $\max(X,Y)$, along with related discussions.}$



Recall the expectations: $\mathbb{E}[X] = \mu_{_X} = 1 / \rho$, and similarly
\begin{align*}
\mu_{_Y} &= \frac1{ \tau }~, & \mu_{_W} &= \frac1{ \lambda } = \frac1{\rho + \tau - \rho\tau}~, & &\text{and}\quad
\mu_{_Z} = \mu_{_X} + \mu_{_Y} - \mu_{_W}
\end{align*}

The covariance between the minimum and the maximum is
\begin{equation*}
C_{WZ} \equiv \mathbb{E}\left[ (W - \mu_{_W} ) (Z - \mu_{_Z} ) \right] = \mathbb{E}[ WZ ] - \mu_{_W} \mu_{_Z} = \mu_{_X} \mu_{_Y} - \mu_{_W} \mu_{_Z} \tag*{Eq.(B)} \label{EqB}
\end{equation*}

The last equal sign invokes the intriguing $WZ = XY$ mentioned just before \ref{Eq08better}.



A positive covariance (thus correlation) expresses the intuitive idea that when the min is large then the max also tends to be large.



Here we do a 'verbal proof' of $C_{WZ} > 0$ for all combinations of parameters. The purpose is less a rigorous proof than an illustration of the relations between $\{W, Z\}$ and $\{X, Y\}$.



Since $\mu_{_X} + \mu_{_Y} = \mu_{_W} + \mu_{_Z}$, the covariance $C_{WZ}$ is comparing products (of two non-negative numbers) given a fixed sum. We know that for a fixed sum, the closer the two 'sides' (think rectangle) are, the larger the product.



Being the min and max, $\mu_{_W}$ and $\mu_{_Z}$ are more 'extreme' and can never be as close as $\mu_{_X}$ and $\mu_{_Y}$, throughout the entire parametric space $\{ \rho, \tau \} \in (0, 1)^2$. Boom, QED.



(The extreme cases where at least one of $\{\rho, \tau\}$ is zero or one are all ill-defined when it comes to min and max.)



An algebraic proof for $C_{WZ} > 0$ (or for everything in this appendix) is easy. Let me emphasize the descriptive aspect.
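For the record, here is one way the algebraic route might go. This is a sketch added for completeness, not the argument pursued in this appendix. Substituting $\mu_{_Z} = \mu_{_X} + \mu_{_Y} - \mu_{_W}$ into \ref{EqB}, the covariance factors:
\begin{align*}
C_{WZ} &= \mu_{_X} \mu_{_Y} - \mu_{_W} \bigl( \mu_{_X} + \mu_{_Y} - \mu_{_W} \bigr) = \bigl( \mu_{_X} - \mu_{_W} \bigr) \bigl( \mu_{_Y} - \mu_{_W} \bigr) \\
&= \frac{ \lambda - \rho }{ \rho \lambda } \cdot \frac{ \lambda - \tau }{ \tau \lambda } = \frac{ \tau (1 - \rho) }{ \rho \lambda } \cdot \frac{ \rho (1 - \tau) }{ \tau \lambda } = \frac{ (1 - \rho)(1 - \tau) }{ \lambda^2 } = \frac{ 1 - \lambda }{ \lambda^2 }
\end{align*}

Both factors in the first line are positive because $\mu_{_W} < \min( \mu_{_X}, \mu_{_Y} )$, so $C_{WZ} > 0$ on $(0,1)^2$. The last expression has the form of the Geometric variance $V_W$ with parameter $\lambda$. With $\rho = 1/3$ and $\tau = 2/5$ it gives $C_{WZ} = 10/9$, which agrees with $\mu_{_X} \mu_{_Y} - \mu_{_W} \mu_{_Z} = 15/2 - 115/18$.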



Consider what it means for $W$, as a Geometric distribution, to be the minimum of $X$ and $Y$.
\begin{align*}
\frac1{\lambda} &< \min( \frac1{\rho}, \frac1{\tau} ) &
&\text{faster to 'succeed' on average} \\
\lambda &> \max( \rho, \tau ) & &\text{better than the higher success parameter}
\end{align*}

That is, the mean of the 'minimum flip' $\mu_{_W} < \min( \mu_{_X}, \mu_{_Y} )$ is more extreme at the lower end.



On the other hand, $Z$ is NOT a Geometric distribution. However, one can define an auxiliary $Z' \sim \mathrm{Geo}[1 / \mu_{_Z}]$ with the same mean $\mu_{_{Z'}} = \mu_{_Z}$. Since \ref{EqB} involves $Z$ only through its mean, $Z$ can be replaced there: $C_{WZ} = \mu_{_X} \mu_{_Y} - \mu_{_W} \mu_{_{Z'}}$.



Now we can have similar descriptions for $\mu_{_Z}$, while it is actually $\mu_{_{Z'}}$ of the Geometric distribution that is being compared with $X$ and $Y$.
\begin{align*}
\mu_{_Z} &> \max( \mu_{_X}, \mu_{_Y} ) &
&\text{slower to 'succeed' on average} \\
\frac1{ \mu_{_Z} } &< \min( \rho, \tau ) & &\text{worse than the lower success parameter}
\end{align*}

That is, the mean of the 'maximum flip' is more extreme at the higher end.



This concludes the verbal argument for how $\mu_{_W}$ and $\mu_{_Z}$ are more dissimilar (than $\mu_{_X}$ and $\mu_{_Y}$ are to each other), thus making a smaller product.





























  • Thank you for this comprehensive, inspiring answer. Solution One I don't have problem understanding, I'm upset a bit that I didn't complete my analysis. I turned back while trying to form this recursion involving nonlinear terms, like E[(1+X)^2]. Solution Two and the Appendices I am able to follow as well.
    – BoLe
    Mar 22 at 10:45



















$\large\textbf{Outline}$



Setup the Notations : As titled.



Solution.1 : Direct application of the conditional decomposition of expectation. This is the foolproof approach if one wants a quick numeric evaluation and doesn't want to be bothered with analysis.



Solution.2 : A framework that provides perspectives and better calculation.



Appendix.A : Supplementary material to Solution.1.



Appendix.B.1 : In-site links to existing questions that are closely related.



Appendix.B.2 : Supplementary material to Solution.2.





$\large\textbf{Setup the Notations}$



Let $X$ be the number of flips Bob needs until his first head (success) appears.



Let $Y$ be the same for Bub. We have $X \sim \mathrm{Geo}[\rho]$ independent of $Y \sim \mathrm{Geo}[\tau]$.



The following basics for a Geometric distribution will be useful here:
\begin{align*}
\mu_{_X} &\equiv \mathbb{E}[X] = \frac1{\rho} & & \tag*{Eq.(1)}
\label{Eq01} \\
A_X &\equiv \mathbb{E}[X(X+1)] = \frac2{ \rho^2 } & &\tag*{Eq.(2)} \label{Eq02} \\
S_X &\equiv \mathbb{E}[X^2 ] = \frac{2 - \rho}{ \rho^2 } & &\tag*{Eq.(3)} \label{Eq03} \\
V_X &\equiv \mathbb{V}[X] = \frac{1 - \rho}{ \rho^2 } & &\tag*{Eq.(4)} \label{Eq04} \\
Q_X &\equiv \mathbb{E}[(X+1)^2] = A_X + \mu_{_X} + 1
= \frac{ 2 + \rho + \rho^2 }{ \rho^2 } & &\tag*{Eq.(5)} \label{Eq05}
\end{align*}

Recall that we often use $A_X$ to obtain $S_X$ because $A_X$ is easier to derive (it is a more natural quantity for the Geometric distribution).



The shorthands for $Y$ are analogous: $\mu_{_Y}$, $V_Y$, etc.
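If you would like to double-check the closed forms Eq.(1) to Eq.(5) numerically, a minimal Python sketch along the following lines should do. The helper names `geometric_moment` and `closed_forms` are mine, and the series is simply truncated after 2000 terms, which is an assumption that is harmless for $\rho = 1/3$ but not for parameters very close to zero.

```python
from fractions import Fraction


def geometric_moment(p, f, terms=2000):
    """Truncated sum of f(k) * P(X = k) for X ~ Geo[p] on k = 1, 2, ... (float arithmetic)."""
    p = float(p)
    return sum(f(k) * p * (1 - p) ** (k - 1) for k in range(1, terms + 1))


def closed_forms(p):
    """Eq.(1)-(5): mean, E[X(X+1)], E[X^2], variance, E[(X+1)^2]."""
    return {
        "mu": 1 / p,
        "A": 2 / p**2,
        "S": (2 - p) / p**2,
        "V": (1 - p) / p**2,
        "Q": (2 + p + p**2) / p**2,
    }


p = Fraction(1, 3)
exact = closed_forms(p)
numeric = {
    "mu": geometric_moment(p, lambda k: k),
    "A": geometric_moment(p, lambda k: k * (k + 1)),
    "S": geometric_moment(p, lambda k: k**2),
    "Q": geometric_moment(p, lambda k: (k + 1) ** 2),
}
numeric["V"] = numeric["S"] - numeric["mu"] ** 2

# Each closed form should agree with its truncated series up to rounding error.
for name, value in exact.items():
    assert abs(float(value) - numeric[name]) < 1e-9, name
print(exact)
```

For $\rho = 1/3$ the exact values are $\mu_{_X} = 3$, $A_X = 18$, $S_X = 15$, $V_X = 6$ and $Q_X = 22$.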





$\large\textbf{Solution.1}$



Just as the expectation can be derived (which you apparently know how to do), the 2nd moment (and thus the variance) can be obtained by conditioning on the result of the first round. Denote the event $\{ \text{Bob head, Bub tail} \}$ as just $HT$, and similarly for the other outcomes; recall that $X$ describes Bob flipping alone, as if Bub didn't exist.
\begin{align*}
\mathbb{E}[ R^2 ] &= \rho \tau \, \mathbb{E}\left[ R^2 \,\middle|~ HH\right]
+ (1 - \rho)(1 - \tau)\,\mathbb{E}\left[ R^2 \,\middle|~ TT\right] \\
&\hspace{36pt} + \rho (1 - \tau) \, \mathbb{E}\left[ R^2 \,\middle|~ HT\right]
+ \tau (1 - \rho)\,\mathbb{E}\left[ R^2 \,\middle|~TH\right] \\
&= \rho \tau + (1 - \rho)(1 - \tau)\,\mathbb{E} \left[ (1+R)^2 \right] + \rho (1 - \tau) \, \mathbb{E}\left[ (1+Y)^2 \right] + \tau (1 - \rho)\,\mathbb{E}\left[ (1+X)^2 \right]
\end{align*}

Please let me know if you need justification for $\mathbb{E}[ R^2 ~|~~HT] = \mathbb{E}[ (1+Y)^2 ]$ and the like (after $HT$ Bob is done, so only Bub's remaining flips matter). Moving on, use the \ref{Eq05} shorthand $Q_X$ and $Q_Y$ for now and rearrange.
$$ \mathbb{E}[ R^2 ] = \rho \tau + (1 - \rho)(1 - \tau) \left( \mathbb{E}[ R^2 ] + 2 \mathbb{E}[ R ] + 1 \right) + \rho (1 - \tau) Q_Y + \tau (1 - \rho)\,Q_X$$
Denote $\lambda = 1 - (1 - \rho)(1 - \tau) = \rho + \tau - \rho \tau$, and collect $\mathbb{E}[ R^2 ]$ on the left.
\begin{equation*}
\lambda\,\mathbb{E}[ R^2 ] = \rho \tau + (1 - \lambda) \left( 2 \mathbb{E}[ R ] + 1 \right) + \rho (1 - \tau) Q_Y + \tau (1 - \rho)\,Q_X \tag*{Eq.(6)} \label{Eq06}
\end{equation*}

Note that the denominator of $\mathbb{E}[ R ]$ is just $\lambda$. Along with the symmetry, this suggests that the numerator of $\mathbb{E}[ R ]$ can be rewritten into a better form invoking the basic \ref{Eq01}.
\begin{align*}
1 + \frac{ \rho }{ \tau } + \frac{ \tau }{ \rho } - (\rho + \tau) &= \bigl( 1 + \frac{ \rho }{ \tau } - \rho \bigr)
+ \bigl( 1 + \frac{ \tau }{ \rho } - \tau \bigr) - 1 \\
&= \frac{ \tau + \rho - \rho\tau }{ \tau } + \frac{ \rho + \tau - \rho\tau }{ \rho } - 1 \\
\implies \mathbb{E}[ R ] = \frac1{\rho} + \frac1{\tau} &- \frac1{\lambda} = \mu_{_X} + \mu_{_Y} - \frac1{\lambda} \tag*{Eq.(7)} \label{Eq07}
\end{align*}

It's not a coincidence that the expectation can be expressed as such. The reason will be elaborated in the next section for Solution.2.



For the sake of computing the numeric value, \ref{Eq06} is a good place to stop. With the given parameters $\rho = 1/3$ and $\tau = 2/5$, we have $\lambda = 3/5$, $\mathbb{E}[ R ] = 23/6$, $Q_X = 22$, and $Q_Y = 16$. That makes $\mathbb{E}[ R^2 ] = 190/9$ and the variance $$V_R \equiv \mathbb{E}[ R^2 ] - \mathbb{E}[R]^2 = \frac{77}{12}~.$$ See the end of Solution.2 for a Mathematica code block for the numerical evaluation and more.
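The arithmetic above can be reproduced exactly with rational numbers. The following is a minimal Python sketch (the function name `second_moment_by_conditioning` is mine, not anything standard) that plugs the parameters into \ref{Eq06} and divides by $\lambda$.

```python
from fractions import Fraction


def second_moment_by_conditioning(rho, tau):
    """Solve Eq.(6) for E[R^2]; returns (E[R], E[R^2], Var[R]) as exact fractions."""
    lam = rho + tau - rho * tau
    mean_r = 1 / rho + 1 / tau - 1 / lam            # Eq.(7)
    q_x = (2 + rho + rho**2) / rho**2               # Eq.(5) for Bob's coin
    q_y = (2 + tau + tau**2) / tau**2               # Eq.(5) for Bub's coin
    rhs = (rho * tau
           + (1 - lam) * (2 * mean_r + 1)
           + rho * (1 - tau) * q_y
           + tau * (1 - rho) * q_x)                 # right-hand side of Eq.(6)
    second = rhs / lam
    return mean_r, second, second - mean_r**2


mean_r, second, var_r = second_moment_by_conditioning(Fraction(1, 3), Fraction(2, 5))
print(mean_r, second, var_r)  # 23/6 190/9 77/12
```

Working with `Fraction` rather than floats keeps the result comparable digit for digit with the fractions quoted in the text.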



One can quickly check the value of $\mathbb{E}[ R ] \approx 3.8333$ relative to $\mu_{_X} = 3$ and $\mu_{_Y} = 2.5$, as well as $V_R \approx 6.416667$ in relation to $V_X = 6$ and $V_Y = 15/4$. Both quantities for $R$ are slightly larger than the larger of the corresponding values for $X$ and $Y$, which is reasonable.



Now, if you have a strong inclination for algebraic manipulation, then the following is what you might arrive at, after some trial and error. Recall the shorthand for the 2nd moment \ref{Eq03}:
\begin{equation*}
\mathbb{E}[ R^2 ] = \frac{ 2 - \rho }{ \rho^2 } + \frac{ 2 - \tau }{ \tau^2 } - \frac{ 2 - \lambda }{ \lambda^2 } = S_X + S_Y - \frac{ 2 - \lambda }{ \lambda^2 } \tag*{Eq.(8)} \label{Eq08}
\end{equation*}

Again, this is not a coincidence. Here and in \ref{Eq07}, the respective 3rd terms look like the moments of another Geometric random variable with 'success' parameter $\lambda$. The proper probabilistic analysis is the subject of Solution.2 up next.



By the way, blindly shuffling the terms around is usually not the best thing one can do. Nonetheless, just for the record, Appendix.A shows one way of going from \ref{Eq06} to \ref{Eq08} mechanically.





$\large\textbf{Solution.2}$



Denote $W \equiv \min(X,Y)$ as the smaller of the two and $Z \equiv \max(X,Y)$ as the larger. The key observation to solve this problem is that
$$Z \overset{d}{=} R~, \qquad\qquad \textbf{the maximum has the same distribution as the 'rounds'.}$$
This allows one to think about the whole scenario differently (not as rounds of a two-player game). For all $k \in \mathbb{N}$, since $X \perp Y$ we have
\begin{align*}
\Pr\{ Z = k \} &= \Pr\{ X < Y = Z = k \} \\
&\hspace{36pt} + \Pr\{ Y < X = Z = k \} \\
&\hspace{72pt} + \Pr\{ X = Y = Z = k \} \\
&= \tau (1 - \tau)^{k-1} (1 - (1-\rho)^{k-1}) \\
&\hspace{36pt} + \rho (1 - \rho)^{k-1} (1 - (1-\tau)^{k-1}) \\
&\hspace{72pt} + \rho\tau (1 - \rho)^{k-1} (1 - \tau)^{k-1} \tag*{Eq.(9)} \label{Eq09}
\end{align*}

In principle, now that the distribution of $Z$ (thus $R$) is completely specified, everything one would like to know can be calculated. This is indeed a valid approach (to obtain the mean and variance), and the terms involved are all basic series with good symmetry.
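To make the "everything can be calculated" remark concrete, here is a minimal Python sketch that builds $\Pr\{Z = k\}$ from the three disjoint cases $X < Y = k$, $Y < X = k$ and $X = Y = k$, and compares it with the difference of the product CDF $\Pr\{Z \le k\} = \Pr\{X \le k\}\Pr\{Y \le k\}$. The function names are mine, and the mean is estimated from a series truncated at 400 terms, which is an assumption that is safe for these parameters.

```python
from fractions import Fraction


def pmf_max(rho, tau, k):
    """P(Z = k), split into the cases X < Y = k, Y < X = k and X = Y = k."""
    a, b = 1 - rho, 1 - tau
    return (tau * b ** (k - 1) * (1 - a ** (k - 1))    # Y = k and X <= k - 1
            + rho * a ** (k - 1) * (1 - b ** (k - 1))  # X = k and Y <= k - 1
            + rho * tau * (a * b) ** (k - 1))          # X = Y = k


def cdf_max(rho, tau, k):
    """P(Z <= k) = P(X <= k) * P(Y <= k), by independence."""
    return (1 - (1 - rho) ** k) * (1 - (1 - tau) ** k)


rho, tau = Fraction(1, 3), Fraction(2, 5)

# The case-by-case pmf must equal the increment of the CDF, exactly.
for k in range(1, 30):
    assert pmf_max(rho, tau, k) == cdf_max(rho, tau, k) - cdf_max(rho, tau, k - 1)

mean_estimate = sum(k * float(pmf_max(rho, tau, k)) for k in range(1, 400))
print(mean_estimate)
```

The truncated mean should land very close to $23/6$, the value obtained in Solution.1.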



Note that $Z$ is not Geometric (while $X$, $Y$, and $W$ are), nor is it Negative Binomial. At this point, it is not really of interest that \ref{Eq09} can be rearranged into a more compact and illuminating form ... because we can do even better.



There are two more observations that allow one not only to calculate better but also to understand the whole picture.
\begin{align*}
&X+Y = W + Z \tag*{Eq.(10)} \label{Eq10} \\
&W \sim \mathrm{Geo}[\lambda] \tag*{Eq.(11)} \label{Eq11}
\end{align*}

The first is the special case of the sum of the order statistics being equal to the original sum. In general, with many summands, this is not very useful, but here with just two terms it is crucial.



Back to the two-player game scenario, the fact that $W$ is Geometric with success probability $\lambda = 1 - (1 - \rho)(1 - \tau)$ is easy to see: a round 'fails' if and only if both flips 'fail', with a probability $(1 - \rho)(1 - \tau) = 1 - \lambda$.
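For those who prefer to see the game played out, here is a small Monte Carlo sketch in Python. The function names, the seed and the number of games are arbitrary choices of mine; the simulation records both the round of the first head by anyone (which should behave like $W$) and the round at which both players have a head (the $R$ of the question).

```python
import random


def play(rho, tau, rng):
    """Simulate one game; return (round of the first head by anyone, round when both have a head)."""
    bob_done = bub_done = False
    first_success = None
    rounds = 0
    while not (bob_done and bub_done):
        rounds += 1
        # Flips made after a player's first head affect neither W nor R, so they are skipped.
        if not bob_done and rng.random() < rho:
            bob_done = True
        if not bub_done and rng.random() < tau:
            bub_done = True
        if first_success is None and (bob_done or bub_done):
            first_success = rounds
    return first_success, rounds


def simulate(rho, tau, games=200_000, seed=12345):
    """Return sample estimates (mean of W, mean of R, variance of R)."""
    rng = random.Random(seed)
    w_total = r_total = r_sq_total = 0
    for _ in range(games):
        w, r = play(rho, tau, rng)
        w_total += w
        r_total += r
        r_sq_total += r * r
    mean_w = w_total / games
    mean_r = r_total / games
    var_r = r_sq_total / games - mean_r**2
    return mean_w, mean_r, var_r


mean_w, mean_r, var_r = simulate(1 / 3, 2 / 5)
print(mean_w, mean_r, var_r)
```

With $\rho = 1/3$ and $\tau = 2/5$ the estimates should hover around $1/\lambda = 5/3$, $23/6$ and $77/12$ respectively, up to sampling noise.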



The contour of $W = k$ is the L-shaped boundary of the '2-dim tail', which fits perfectly with the scaling (memoryless) nature of the joint distribution of two independent Geometric variables. Please picture, in the figure below, how $\Pr\left\{W = k_0 + k ~ \middle|~~ W > k_0 \right\} = \Pr\{ W = k\}$ manifests itself as the L-shape scaling away.



The contour of $Z = k$ looks like $\daleth$, and together with the L-contour of $W$ they make a cross. This is the identity \ref{Eq10}, $X+Y = W+Z$, visualized in the figure below. See Appendix.B.1 for the linked in-site posts on related topics.



[Figure: contours of $W = k$ and $Z = k$ in the $(X, Y)$ plane]



Immediately we know the expectation to be:
\begin{equation*}
\mathbb{E}[ Z ] = \mathbb{E}[ X + Y - W ] = \mu_{_X} + \mu_{_Y} - \mu_{_W} = \frac1{\rho} + \frac1{\tau} - \frac1{\lambda} \tag*{Eq.(7.better)}
\end{equation*}



This derivation provides perspectives different from (or better, arguably) those obtained by conditioning on the first round.



The derivation of the 2nd moment reveals a more intriguing property of the setting.
\begin{align*}
\mathbb{E}[ Z^2 ] &= \mathbb{E}[ (X + Y - W)^2 ] \\
&= \mathbb{E}[ (X + Y)^2 ] + \mathbb{E}[ W^2 ] - 2\mathbb{E}[ (X + Y) W ] \\
&= \mathbb{E}[ X^2 ] + \mathbb{E}[ Y^2 ] + 2 \mathbb{E}[ XY ] + \mathbb{E}[ W^2 ] - 2\mathbb{E}[ (W+Z) W ] \qquad\because X+Y = W+Z\\
&= \mathbb{E}[ X^2 ] + \mathbb{E}[ Y^2 ] - \mathbb{E}[ W^2 ] + 2\mathbb{E}[ XY ] - 2\mathbb{E}[ WZ ]
\end{align*}

Here's the kicker: when $X \neq Y$, by definition $W$ and $Z$ each take one of them so $WZ = XY$, and when $X = Y$ we have $WZ = XY$ just the same!! Consequently, $\mathbb{E}[ ZW ] = \mathbb{E}[ XY ] =\mu_{_X} \mu_{_Y}$ always (the last step by independence), and the two cross terms cancel.
\begin{equation*}
\mathbb{E}[ Z^2 ] = \mathbb{E}[ X^2 ] + \mathbb{E}[ Y^2 ] - \mathbb{E}[ W^2 ] = \frac{ 2 - \rho }{ \rho^2 } + \frac{ 2 - \tau }{ \tau^2 } - \frac{ 2 - \lambda }{ \lambda^2 } \tag*{Eq.(8.better)} \label{Eq08better}
\end{equation*}

This is the proper derivation, and in \ref{Eq08} the 3rd term following $S_X + S_Y$ is indeed $-S_W$.



Keep in mind that both $W$ and $Z$ are clearly correlated with $X$ and $Y$, while our intuition also says that the correlation between $W$ and $Z$ is positive (see Appendix.B.2 for a discussion).



While we're at it, this relation in fact holds true for moments of any order, $$\mathbb{E}[ Z^n ] = \mathbb{E}[ X^n ] + \mathbb{E}[ Y^n ] - \mathbb{E}[ W^n ]~.$$ This is true due to the same argument that gave us $WZ = XY$: pointwise, $\{W, Z\}$ is just $\{X, Y\}$ reordered, so $W^n + Z^n = X^n + Y^n$. Deriving this identity by doing the explicit sums with \ref{Eq09} is also easy.



Without further ado, the variance of $Z$ (thus the variance of $R$), previously known only as the unsatisfying "\ref{Eq06} minus the square of $\mathbb{E}[ Z ]$", can now be put in its proper form.
\begin{align*}
V_Z &\equiv \mathbb{E}[ Z^2 ] - \mathbb{E}[ Z ]^2 \\
&= \mathbb{E}[ X^2 ] + \mathbb{E}[ Y^2 ] - \mathbb{E}[ W^2 ] - ( \mu_{_X} + \mu_{_Y} - \mu_{_W} )^2 \\
&= V_X + V_Y - V_W + 2( \mu_{_W} \mu_{_Z} - \mu_{_X} \mu_{_Y}) \tag*{Eq.(12.a)} \\
&= \frac{ 1 - \rho }{ \rho^2 } + \frac{ 1 - \tau }{ \tau^2 } - \frac{ 1 - \lambda }{ \lambda^2 } + 2\left( \frac1{ \lambda } \bigl( \frac1{ \rho } + \frac1{ \tau } - \frac1{ \lambda } \bigr) - \frac1{ \rho \tau }\right) \tag*{Eq.(12.b)}
\end{align*}

This expression (or the symbolic one just above) is a perfectly nice formula to me. You can rearrange it to your heart's content, for example, like this
\begin{equation*}
V_Z = \bigl( \frac1{ \rho } - \frac1{ \tau } \bigr)^2 - \frac1{ \lambda^2 } + 2 (\frac1{\lambda} - \frac12) \bigl( \frac1{ \rho } + \frac1{ \tau } - \frac1{ \lambda } \bigr) \tag*{Eq.(12.c)}
\end{equation*}

which emphasizes the role played by the difference of the means, $\mu_{_X} - \mu_{_Y}$.
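As a cross-check that the rearrangements agree, the following minimal Python sketch evaluates Eq.(12.a) and Eq.(12.c) with exact fractions for a few parameter pairs. The function names and the extra test pairs are my own choices, and agreement on sample points is of course only evidence, not a proof of the identity.

```python
from fractions import Fraction


def geo_var(p):
    """Variance of a Geometric distribution with success parameter p, Eq.(4)."""
    return (1 - p) / p**2


def variance_forms(rho, tau):
    """Return the variance of Z computed via Eq.(12.a) and via Eq.(12.c)."""
    lam = rho + tau - rho * tau
    mu_x, mu_y, mu_w = 1 / rho, 1 / tau, 1 / lam
    mu_z = mu_x + mu_y - mu_w
    form_a = geo_var(rho) + geo_var(tau) - geo_var(lam) + 2 * (mu_w * mu_z - mu_x * mu_y)
    form_c = (mu_x - mu_y) ** 2 - mu_w**2 + 2 * (mu_w - Fraction(1, 2)) * mu_z
    return form_a, form_c


pairs = [
    (Fraction(1, 3), Fraction(2, 5)),   # the parameters of the question
    (Fraction(1, 2), Fraction(1, 2)),
    (Fraction(1, 10), Fraction(9, 10)),
]
for rho, tau in pairs:
    form_a, form_c = variance_forms(rho, tau)
    assert form_a == form_c
print(variance_forms(*pairs[0]))
```

For the parameters of the question both forms evaluate to $77/12$, in agreement with Solution.1.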



I'm not motivated to pursue a 'better' form beyond Eq.(12), and if anyone knows any alternative expressions that are significant either algebraically or probabilistically, please do share.



Let me conclude with a Mathematica code block verifying the numeric values and identities for the variance, the 2nd moment, as well as the earliest algebra of the conditioning of expectation.



(* t = tau, p = rho, m = E[R], S = E[R^2], h = lambda = 1-(1-p)(1-t) *)
ClearAll[t, p, m, S, V, h, tmp, simp, sq, var, basics];
basics[p_] := {1/p, (1 - p)/p^2, (2 - p)/p^2, (2 + p + p^2)/p^2}; (* EX, VX, E[X^2], E[(1+X)^2] *)
Row@{basics[1/3], Spacer@10, basics[2/5], Spacer@10, basics[1 - (1 - 1/3) (1 - 2/5)]}
simp = Simplify[#, 0 < p < 1 && 0 < t < 1] &;
h = 1 - (1 - p) (1 - t); m = (1 + t/p + p/t - (t + p))/h;
sq[a_] := (2 - a)/a^2; (* 2nd moment of a Geometric distribution *)
var[a_] := (1 - a)/a^2; (* variance of a Geometric distribution *)
S = tmp /. Part[#, 1]&@Solve[tmp == p t + (1 - p) (1 - t) (1 + 2 m + tmp) + p (1 - t) (1 + 2/t + sq@t) + (1 - p) t (1 + 2/p + sq@p), tmp] //simp (* this simplification doesn't matter ... *)
V = S - m^2 // simp (* neither does this. *)
{tmp = V /. {p -> 1/3, t -> 2/5}, tmp // N}
(* below: verify the various identities for the 2nd moment and the variance. A difference of zero means equality. *)
{sq@t + sq@p - sq@h - S, var@p + var@t - var@h + 2 m 1/h - 2/(p t) - V, (1/p - 1/t)^2 - 1/h^2 + 2 (1/p + 1/t - 1/h) (1/h - 1/2) - V} // simp




$\large\textbf{Appendix.A: }\normalsize\text{an algebraic route from \ref{Eq06} to \ref{Eq08}}$



Consider \ref{Eq06} one part at a time. First, the term $2(1 - \lambda)\, \mathbb{E}[ R ]$. From the first line to the 2nd line, $1 - \lambda = (1 - \rho)(1 - \tau)$ is inserted into the 2nd term:
\begin{align*}
2(1 - \lambda)\, \mathbb{E}[ R ] &= 2(1 - \lambda )\frac{ - 1}{\lambda} + 2(1 - \lambda) \bigl( \frac1{ \rho } + \frac1{ \tau } \bigr) \\
&= 2\frac{\lambda - 1}{\lambda} + \frac{2(1 - \rho ) }{ \rho }(1 - \tau) + (1 - \rho ) \frac{ 2(1 - \tau) }{ \tau } \\
&= 2 - \frac2{\lambda} + \bigl( \frac{2 - \rho}{ \rho } - 1\bigr) (1 - \tau) + \bigl( \frac{2 - \tau}{ \tau } - 1 \bigr) (1 - \rho ) \\
&= \color{magenta}{2 - \frac2{\lambda} } + \frac{2 - \rho}{ \rho^2 } (1 - \tau)\rho \color{magenta}{- (1 - \tau)} + \frac{2 - \tau}{ \tau^2 } (1 - \rho )\tau \color{magenta}{- (1 - \rho )}
\end{align*}

Next in line are the $Q_X$ and $Q_Y$ terms.
\begin{align*}
\rho (1 - \tau) Q_Y + \tau (1 - \rho)\,Q_X &= \rho (1 - \tau) \bigl( \frac{ 2 + \tau }{ \tau^2 } + 1\bigr)
+ \tau (1 - \rho) \bigl( \frac{ 2 + \rho }{ \rho^2 } + 1 \bigr) \\
&= \rho \bigl( \frac{ 2 - \tau - \tau^2 }{ \tau^2 } + 1 - \tau \bigr) + \tau \bigl( \frac{ 2 - \rho - \rho^2 }{ \rho^2 } + 1 - \rho \bigr) \\
&= \rho \frac{ 2 - \tau}{ \tau^2 } + \tau \frac{ 2 - \rho }{ \rho^2 } \color{magenta}{- 2\tau\rho}
\end{align*}

Putting things back together in \ref{Eq06}, the magenta terms combine with $\rho\tau$ and $1 - \lambda$ (multiplied by 1):
\begin{align*}
\lambda\,\mathbb{E}[ R^2 ] &= \rho \tau + (1 - \lambda) + (1 - \lambda) 2 \mathbb{E}[ R ] + \rho (1 - \tau) Q_Y + \tau (1 - \rho)\,Q_X \\
&= \rho \tau + (1 - \lambda) \color{magenta}{ {}+ 2 - \frac2{\lambda} - (1 - \tau) - (1 - \rho ) - 2\tau\rho} \\
&\hphantom{{}= \rho \tau} + \frac{2 - \rho}{ \rho^2 } (1 - \tau)\rho + \frac{2 - \tau}{ \tau^2 } (1 - \rho )\tau \tag{from $\mathbb{E}[R]$}\\
&\hphantom{{}= \rho \tau } + \tau \frac{ 2 - \rho }{ \rho^2 } + \rho \frac{ 2 - \tau}{ \tau^2 } \tag{from $Q_X$ etc}\\
&= 1 - \frac2{\lambda} + \frac{ 2 - \rho }{ \rho^2 } \bigl( \rho + \tau - \rho\tau \bigr) + \frac{ 2 - \tau }{ \tau^2 } \bigl( \rho + \tau - \rho\tau \bigr) \\
&= 1 - \frac2{\lambda} + \frac{ 2 - \rho }{ \rho^2 } \lambda + \frac{ 2 - \tau }{ \tau^2 } \lambda
\end{align*}

Finally, dividing both sides by $\lambda$, the coefficient of $\mathbb{E}[ R^2 ]$ on the left-hand side, gives \ref{Eq08}.



















$largetextbf{Outline}$



Setup the Notations : As titled.



Solution.1 : Direct application of the conditional decomposition of expectation. This is the foolproof approach if one wants a quick numeric evaluation and doesn't want to be bothered with analysis.



Solution.2 : A framework that provides perspectives and better calculation.



Appendix.A : Supplementary material to Solution.1.



Appendix.B.1 : In-site links to existing questions that are closely related.



Appendix.B.2 : Supplementary material to Solution.2.





$largetextbf{Setup the Notations}$



Let $X$ be the total number of trials of Bob's flips when his first head (success) appears.



Let $Y$ be that for Bub. We have $Xsim mathrm{Geo}[rho]$ independent to $Ysim mathrm{Geo}[tau]$.



The following basics for a Geometric distribution will be useful here:
begin{align*}
mu_{_X} &equiv mathbb{E}[X] = frac1{rho} & & tag*{Eq.(1)}
label{Eq01} \
A_X &equiv mathbb{E}[X(X+1)] = frac2{ rho^2 } & &tag*{Eq.(2)} label{Eq02} \
S_X &equiv mathbb{E}[X^2 ] = frac{2 - rho}{ rho^2 } & &tag*{Eq.(3)} label{Eq03} \
V_X &equiv mathbb{V}[X] = frac{1 - rho}{ rho^2 } & &tag*{Eq.(4)} label{Eq04} \
Q_X &equiv mathbb{E}[(X+1)^2] = A_X + mu_{_X} + 1
= frac{ 2 + rho + rho^2 }{ rho^2 } & &tag*{Eq.(5)} label{Eq05}
end{align*}

Recall that we often use $A_X$ to obtain $S_X$ because $A_X$ is easier to derive (it is a more natural quantity for the Geometric distribution).



The shorthands for $Y$ are the same $mu_{_Y}$ and $V_Y$ etc.





$largetextbf{Solution.1}$



Just like how the expectation can be derived (which apparently you know how to), the 2nd moment (thus variance) can be obtained by conditioning on the results of the round. Denote the events of ${ text{Bob head, Bub tail} }$ as just $HT$, and recall that $X$ is for Bob flipping alone as if Bub doesn't exist.
begin{align*}
mathbb{E}[ R^2 ] &= rho tau , mathbb{E}left[ R^2 ,middle|~ HHright]
+ (1 - rho)(1 - tau),mathbb{E}left[ R^2 ,middle|~ TTright] \
&hspace{36pt} + rho (1 - tau) , mathbb{E}left[ R^2 ,middle|~ HTright]
+ tau (1 - rho),mathbb{E}left[ R^2 ,middle|~THright] \
&= rho tau + (1 - rho)(1 - tau),mathbb{E} left[ (1+R)^2 right] + rho (1 - tau) , mathbb{E}left[ (1+Y)^2 right] + tau (1 - rho),mathbb{E}left[ (1+X)^2 right]
end{align*}

Please let me know if you need justification for $mathbb{E}[ R^2 ~|~~TH] = mathbb{E}[ (1+Y)^2 ]$ and the alike. Moving on, use the ref{Eq04} shorthand $Q_X$ and $Q_Y$ for now and rearrange.
$$ mathbb{E}[ R^2 ] = rho tau + (1 - rho)(1 - tau) left( mathbb{E}[ R^2 ] + 2 mathbb{E}[ R ] + 1 right) + rho (1 - tau) Q_Y + tau (1 - rho),Q_X$$
Denote $lambda = 1 - (1 - rho)(1 - tau) = rho + tau - rho tau$, and collect $mathbb{E}[ R^2 ]$ on the left.
begin{equation*}
lambda,mathbb{E}[ R^2 ] = rho tau + (1 - lambda) left( 2 mathbb{E}[ R ] + 1 right) + rho (1 - tau) Q_Y + tau (1 - rho),Q_X tag*{Eq.(6)} label{Eq06}
end{equation*}

Note that the denominator of $mathbb{E}[ R ]$ is just $lambda$. Along with the symmetry, this suggests that the numerator of $mathbb{E}[ R ]$ can be rewritten into a better form invoking the basic ref{Eq01}.
begin{align*}
1 + frac{ rho }{ tau } + frac{ tau }{ rho } - (rho + tau) &= bigl( 1 + frac{ rho }{ tau } - rho bigr)
+ bigl( 1 + frac{ tau }{ rho } - tau bigr) - 1 \
&= frac{ tau + rho - rhotau }{ tau } + frac{ rho + tau - rhotau }{ rho } - 1 \
implies mathbb{E}[ R ] = frac1{rho} + frac1{tau} &- frac1{lambda} = mu_{_X} + mu_{_Y} - frac1{lambda} tag*{Eq.(7)} label{Eq07}
end{align*}

It's not a coincidence that the expectation can be expressed as such. The reason will be elaborated in the next section for Solution.2.



For the sake of computing the numeric value, ref{Eq06} was a good place to stop. With the given parameters $rho = 1/3$ and $tau = 2/5$, we have $mathbb{E}[ R ] = 23/6$, $Q_X = 22$, and $Q_Y = 16$. That makes $mathbb{E}[ R^2 ] = 190/9$ and the variance $$V_R equiv mathbb{E}[ R^2 ] - mathbb{E}[R]^2 = frac{77}{12}~.$$ See the end of Solution.2 for a Mathematica code block for the numerical evaluation and more.



One can quickly check the value of $mathbb{E}[ R ] approx 3.8333$ relative to $mu_{_X} = 3$ and $mu_{_Y} = 2.5$, as well as $V_R approx 6.146667$ in relation to $V_X = 6$ and $V_Y = 15/4$. Both of the quantities for $R$ are slightly larger than the max of $X$ and $Y$, which is reasonable.



Now, if you have a strong inclination for algebraic manipulation, then the following is what you might arrive at, after some trial and error. Recall the shorthand for the 2nd moment ref{Eq03}:
begin{equation*}
mathbb{E}[ R^2 ] = frac{ 2 - rho }{ rho^2 } + frac{ 2 - tau }{ tau^2 } - frac{ 2 - lambda }{ lambda^2 } = S_X + S_Y - frac{ 2 - lambda }{ lambda^2 } tag*{Eq.(8)} label{Eq08}
end{equation*}

Again, this is not a coincidence. Along with ref{Eq07}, their respective 3rd terms seem to be another Geometric random variable with the `success' parameter $lambda$. The proper probabilistic analysis is the subject of Solution.2 up next.



By the way, blindly shuffling the terms around is usually not the best thing one can do. Nonetheless, just for the record, Appendix.A shows one way of going from ref{Eq06} to ref{Eq08} mechanically.





$largetextbf{Solution.2}$



Denote $W equiv min(X,Y)$ as the smaller among the two and $Z equiv max(X,Y)$ as the larger. The key observation to solve this problem is that
$$Z overset{d}{=} R~, qquadqquad textbf{the maximum has the same distribution as the 'rounds'.}$$
This allows one to think about the whole scenario differently (not as rounds of a two-player game). For all $k in mathbb{N}$, since $X perp Y$ we have
begin{align*}
Pr{ Z = k } &= Pr{ X < Y = Z = k } \
&hspace{36pt} + Pr{ Y < X = Z = k } \
&hspace{72pt} + Pr{ X = Y = Z = k } \
&= tau (1 - tau)^{k-1} (1 - (1-rho)^k) \
&hspace{36pt} + rho (1 - rho)^{k-1} (1 - (1-tau)^k) \
&hspace{72pt} + rhotau (1 - rho)^{k-1} (1 - tau)^{k-1} tag*{Eq.(9)} label{Eq09}
end{align*}

In principle, now that the distribution of $Z$ (thus $R$) is completely specified, everything one would like to know can be calculated. This is indeed a valid approach (to obtain the mean and variance), and the terms involved are all basic series with good symmetry.



Note that $Z$ is not Geometric (while $X$, $Y$, and $W$ are), nor is it Negative Binomial. At this point, it is not really of interest that ref{Eq09} can be rearranged into a more compact and illuminating form ...... because we can do even better.



There are two more observations that allow one to not only to better calculate but also understand the whole picture.
begin{align*}
&X+Y = W + Z tag*{Eq.(10)} label{Eq10} \
&W sim mathrm{Geo[lambda]} tag*{Eq.(11)} label{Eq11}
end{align*}

This is the special case of the sum of the order statistics being equal to the original sum. In general with many summands this not very useful, but here with just two terms it is crucial.



Back to the two-player game scenario, the fact that $W$ is Geometric with success probability $lambda = 1 - (1 - rho)(1 - tau)$ is easy to see: a round 'fails' if and only if both flips 'fail', with a probability $(1 - rho)(1 - tau) = 1 - lambda$.



The contour of $W = k$ is L-shaped boundary of the '2-dim tail', which fits perfectly with the scaling nature (memoryless) of the joint of two independent Geometric distribution. Pleas picture in the figure below that $Prleft{W = k_0 + k ~ middle|~~ W > k_0 right} = Pr{ W = k}$ manifests itself as the L-shape scaling away.



The contour of $Z = k$ looks like $daleth$, and together with the L-contour of $W$ they make a cross. This is ref{Eq10} the identity $X+Y = W+Z$, visualized in the figures below. See Appendix.B.1 for the linked in-site posts of related topics.



enter image description here



Immediately we know the expectation to be:
begin{equation*}
mathbb{E}[ Z ] = mathbb{E}[ X + Y - W ] = mu_{_X} + mu_{_Y} - mu_{_W} = frac1{rho} + frac1{tau} - frac1{lambda} tag*{Eq.(7.better)}
end{equation*}



This derivation provides perspectives different from (or better, arguably) those obtained by conditioning on the first round.



Derivation of the 2nd moment reveals a more intriguing properties of the setting.
begin{align*}
mathbb{E}[ Z^2 ] &= mathbb{E}[ (X + Y - W)^2 ] \
&= mathbb{E}[ (X + Y)^2 ] + mathbb{E}[ W^2 ] - 2mathbb{E}[ (X + Y) W ] \
&= mathbb{E}[ X^2 ] + mathbb{E}[ Y^2 ] + 2 mathbb{E}[ XY ] + mathbb{E}[ W^2 ] - 2mathbb{E}[ (W+Z) W ] qquadbecause X+Y = W+Z\
&= mathbb{E}[ X^2 ] + mathbb{E}[ Y^2 ] - mathbb{E}[ W^2 ] + 2mathbb{E}[ XY ] - 2mathbb{E}[ WZ ]
end{align*}

Here's the kicker: when $X neq Y$, by definition $W$ and $Z$ each take one of them so $WZ = XY$, and when $X = Y$ we have $WZ = XY$ just the same!! Consequently, $mathbb{E}[ ZW ] = mathbb{E}[ XY ] =mu_{_X} mu_{_Y}$ always, and they cancel.
begin{equation*}
mathbb{E}[ Z^2 ] = mathbb{E}[ X^2 ] + mathbb{E}[ Y^2 ] - mathbb{E}[ W^2 ] = frac{ 2 - rho }{ rho^2 } + frac{ 2 - tau }{ tau^2 } - frac{ 2 - lambda }{ lambda^2 } tag*{Eq.(8.better)} label{Eq08better}
end{equation*}

This is the proper derivation, and in ref{Eq08} the 3rd term following $S_X + S_Y$ is indeed $-S_W$.



Keep in mind that both $W$ and $Z$ are clearly correlated with $X$ and $Y$, while our intuition also says that the correlation between $W$ and $Z$ is positive (see Appendix.B for a discussion).



While we're at it, this relation in fact holds for moments of any order, $$\mathbb{E}[ Z^n ] = \mathbb{E}[ X^n ] + \mathbb{E}[ Y^n ] - \mathbb{E}[ W^n ]~.$$This is true by the same argument that gave us $WZ = XY$: the pair $\{W, Z\}$ is always the pair $\{X, Y\}$, so $W^n + Z^n = X^n + Y^n$ pointwise. Deriving this identity from Eq.(9) by doing the explicit sums is also easy.
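A numerical check of the moment identity for $n = 1, \dots, 4$ can be sketched in Python as follows. The sums are truncated at an assumed cutoff `N = 400`, and the pmf of $Z$ is built from $\Pr\{Z = k\} = \Pr\{X = k\}\Pr\{Y \le k\} + \Pr\{Y = k\}\Pr\{X < k\}$; all function names are illustrative.

```python
rho, tau = 1 / 3, 2 / 5
lam = 1 - (1 - rho) * (1 - tau)
N = 400  # truncation point; the geometric tails beyond it are negligible

def geom_pmf(p, k):
    return p * (1 - p) ** (k - 1)

def geom_cdf(p, k):
    return 1 - (1 - p) ** k

def geom_moment(p, n):
    """n-th raw moment of Geometric(p) on 1, 2, ..., by truncated summation."""
    return sum(k**n * geom_pmf(p, k) for k in range(1, N))

def max_moment(n):
    """E[Z^n] with P(Z = k) = P(X = k) P(Y <= k) + P(Y = k) P(X < k)."""
    return sum(
        k**n * (geom_pmf(rho, k) * geom_cdf(tau, k)
                + geom_pmf(tau, k) * geom_cdf(rho, k - 1))
        for k in range(1, N)
    )

# Difference between the two sides of the identity for n = 1..4
gaps = {n: max_moment(n)
           - (geom_moment(rho, n) + geom_moment(tau, n) - geom_moment(lam, n))
        for n in range(1, 5)}
print(gaps)
```

The gaps should all be zero up to floating-point rounding.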



Without further ado, the variance of $Z$ (thus the variance of $R$), previously known only as the unsatisfying "Eq.(6) minus the square of $\mathbb{E}[ Z ]$", can now be put in its proper form.
\begin{align*}
V_Z &\equiv \mathbb{E}[ Z^2 ] - \mathbb{E}[ Z ]^2 \\
&= \mathbb{E}[ X^2 ] + \mathbb{E}[ Y^2 ] - \mathbb{E}[ W^2 ] - ( \mu_{_X} + \mu_{_Y} - \mu_{_W} )^2 \\
&= V_X + V_Y - V_W + 2( \mu_{_W} \mu_{_Z} - \mu_{_X} \mu_{_Y}) \tag*{Eq.(12.a)} \\
&= \frac{ 1 - \rho }{ \rho^2 } + \frac{ 1 - \tau }{ \tau^2 } - \frac{ 1 - \lambda }{ \lambda^2 } + 2\left( \frac1{ \lambda } \bigl( \frac1{ \rho } + \frac1{ \tau } - \frac1{ \lambda } \bigr) - \frac1{ \rho \tau }\right) \tag*{Eq.(12.b)}
\end{align*}

This expression (or the symbolic one just above) is a perfectly nice formula to me. You can rearrange it to your heart's content, for example, like this
\begin{equation*}
V_Z = \bigl( \frac1{ \rho } - \frac1{ \tau } \bigr)^2 - \frac1{ \lambda^2 } + 2 \bigl(\frac1{\lambda} - \frac12\bigr) \bigl( \frac1{ \rho } + \frac1{ \tau } - \frac1{ \lambda } \bigr) \tag*{Eq.(12.c)}
\end{equation*}
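To see that the two forms agree, here is a minimal Python sketch that evaluates Eq.(12.b) and Eq.(12.c) in exact rational arithmetic. The function names `var_12b` and `var_12c` are invented for this sketch.

```python
from fractions import Fraction

def var_12b(rho, tau):
    """Variance of Z = max(X, Y) in the form of Eq.(12.b)."""
    lam = rho + tau - rho * tau
    mu_z = 1 / rho + 1 / tau - 1 / lam
    geo_var = lambda p: (1 - p) / p**2      # variance of Geometric(p)
    return (geo_var(rho) + geo_var(tau) - geo_var(lam)
            + 2 * (mu_z / lam - 1 / (rho * tau)))

def var_12c(rho, tau):
    """The rearranged form of Eq.(12.c)."""
    lam = rho + tau - rho * tau
    mu_z = 1 / rho + 1 / tau - 1 / lam
    return ((1 / rho - 1 / tau) ** 2 - 1 / lam**2
            + 2 * (1 / lam - Fraction(1, 2)) * mu_z)

rho, tau = Fraction(1, 3), Fraction(2, 5)
print(var_12b(rho, tau), var_12c(rho, tau))
```

For $\rho = 1/3$, $\tau = 2/5$ both forms give $77/12 \approx 6.4167$, which is consistent with the Monte Carlo estimate $6.48666$ quoted in the question.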

which emphasizes the role played by the difference between the means of $X$ and $Y$.



I'm not motivated to pursue a 'better' form beyond Eq.(12), and if anyone knows any alternative expressions that are significant either algebraically or probabilistically, please do share.



Let me conclude with a Mathematica code block verifying the numeric values and identities for the variance, the 2nd moment, as well as the earliest algebra of the conditioning of expectation.



(* t = tau, p = rho, m = E[R], S = E[R^2], h = lambda = 1 - (1 - p)(1 - t) *)
ClearAll[t, p, m, S, V, h, tmp, simp, sq, var, basics];
(* for a Geometric variable X with parameter p: E[X], Var[X], E[X^2], E[(1+X)^2] *)
basics[p_] := {1/p, (1 - p)/p^2, (2 - p)/p^2, (2 + p + p^2)/p^2};
Row@{basics[1/3], Spacer@10, basics[2/5], Spacer@10, basics[1 - (1 - 1/3) (1 - 2/5)]}
simp = Simplify[#, 0 < p < 1 && 0 < t < 1] &;
h = 1 - (1 - p) (1 - t);
m = (1 + t/p + p/t - (t + p))/h;
sq[a_] := (2 - a)/a^2; (* 2nd moment of a Geometric distribution *)
var[a_] := (1 - a)/a^2; (* variance of a Geometric distribution *)
(* solve the first-round conditioning equation for E[R^2] *)
S = tmp /. Part[#, 1]&@Solve[tmp == p t + (1 - p) (1 - t) (1 + 2 m + tmp) + p (1 - t) (1 + 2/t + sq@t) + (1 - p) t (1 + 2/p + sq@p), tmp] //simp (* this simplification doesn't matter ... *)
V = S - m^2 // simp (* neither does this. *)
(* exact and numeric value of the variance at p = 1/3, t = 2/5 *)
{tmp = V /. {p -> 1/3, t -> 2/5}, tmp // N}
(* below: verify the various identities for 2nd moment and variance. Difference being zero means equal *)
{sq@t + sq@p - sq@h - S, var@p + var@t - var@h + 2 m 1/h - 2/(p t) - V, (1/p - 1/t)^2 - 1/h^2 + 2 (1/p + 1/t - 1/h) (1/h - 1/2) - V} // simp
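Independently of the algebra, the numbers can be cross-checked by simulation. The following Python sketch plays the game directly; it is not the simulation used in the question, and the sample size, the seed, and the names `rounds_until_both` and `simulate` are assumptions of this sketch.

```python
import random

def rounds_until_both(rho, tau, rng):
    """Play rounds (Bob flips, then Bub) until each has seen heads at least once."""
    bob_done = bub_done = False
    rounds = 0
    while not (bob_done and bub_done):
        rounds += 1
        # a player who already has heads is not flipped again; this does not change R
        bob_done = bob_done or rng.random() < rho
        bub_done = bub_done or rng.random() < tau
    return rounds

def simulate(rho, tau, n, seed=0):
    """Return the sample mean and the unbiased sample variance of R over n games."""
    rng = random.Random(seed)
    xs = [rounds_until_both(rho, tau, rng) for _ in range(n)]
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return mean, var

mean, var = simulate(1 / 3, 2 / 5, 200_000)
print(mean, var)  # compare with 23/6 and 77/12
```

With a couple of hundred thousand games the estimates should land within a few hundredths of the exact values $23/6$ and $77/12$.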




$\large\textbf{Appendix.A: }\normalsize\text{an algebraic route from Eq.(6) to Eq.(8)}$



Consider Eq.(6) one part at a time. First, the term $2(1 - \lambda)\, \mathbb{E}[ R ]$. From the first line to the second, $1 - \lambda = (1 - \rho)(1 - \tau)$ is inserted into the second term:
\begin{align*}
2(1 - \lambda)\, \mathbb{E}[ R ] &= 2(1 - \lambda )\frac{ - 1}{\lambda} + 2(1 - \lambda) \bigl( \frac1{ \rho } + \frac1{ \tau } \bigr) \\
&= 2\frac{\lambda - 1}{\lambda} + \frac{2(1 - \rho ) }{ \rho }(1 - \tau) + (1 - \rho ) \frac{ 2(1 - \tau) }{ \tau } \\
&= 2 - \frac2{\lambda} + \bigl( \frac{2 - \rho}{ \rho } - 1\bigr) (1 - \tau) + \bigl( \frac{2 - \tau}{ \tau } - 1 \bigr) (1 - \rho ) \\
&= \color{magenta}{2 - \frac2{\lambda} } + \frac{2 - \rho}{ \rho^2 } (1 - \tau)\rho \color{magenta}{- (1 - \tau)} + \frac{2 - \tau}{ \tau^2 } (1 - \rho )\tau \color{magenta}{- (1 - \rho )}
\end{align*}

Next in line are the $Q_X$ terms.
\begin{align*}
\rho (1 - \tau) Q_Y + \tau (1 - \rho)\,Q_X &= \rho (1 - \tau) \bigl( \frac{ 2 + \tau }{ \tau^2 } + 1\bigr)
+ \tau (1 - \rho) \bigl( \frac{ 2 + \rho }{ \rho^2 } + 1 \bigr) \\
&= \rho \bigl( \frac{ 2 - \tau - \tau^2 }{ \tau^2 } + 1 - \tau \bigr) + \tau \bigl( \frac{ 2 - \rho - \rho^2 }{ \rho^2 } + 1 - \rho \bigr) \\
&= \rho \frac{ 2 - \tau}{ \tau^2 } + \tau \frac{ 2 - \rho }{ \rho^2 } \color{magenta}{- 2\tau\rho}
\end{align*}

Putting things back together in Eq.(6), the magenta terms combine with $\rho\tau$ and $1 - \lambda$ (multiplied by 1):
\begin{align*}
\lambda\,\mathbb{E}[ R^2 ] &= \rho \tau + (1 - \lambda) + (1 - \lambda) 2 \mathbb{E}[ R ] + \rho (1 - \tau) Q_Y + \tau (1 - \rho)\,Q_X \\
&= \rho \tau + (1 - \lambda) \color{magenta}{ {}+ 2 - \frac2{\lambda} - (1 - \tau) - (1 - \rho ) - 2\tau\rho} \\
&\hphantom{{}= \rho \tau} + \frac{2 - \rho}{ \rho^2 } (1 - \tau)\rho + \frac{2 - \tau}{ \tau^2 } (1 - \rho )\tau \tag{from $\mathbb{E}[R]$}\\
&\hphantom{{}= \rho \tau } + \tau \frac{ 2 - \rho }{ \rho^2 } + \rho \frac{ 2 - \tau}{ \tau^2 } \tag{from $Q_X$ etc}\\
&= 1 - \frac2{\lambda} + \frac{ 2 - \rho }{ \rho^2 } \bigl( \rho + \tau - \rho\tau \bigr) + \frac{ 2 - \tau }{ \tau^2 } \bigl( \rho + \tau - \rho\tau \bigr) \\
&= 1 - \frac2{\lambda} + \frac{ 2 - \rho }{ \rho^2 } \lambda + \frac{ 2 - \tau }{ \tau^2 } \lambda
\end{align*}

Finally we can divide by the coefficient of $\mathbb{E}[ R^2 ]$ on the left hand side and obtain Eq.(8).
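The chain of equalities for $\lambda\,\mathbb{E}[R^2]$ can be spot-checked with exact fractions. In this Python sketch, `from_recursion` assembles the first line of the last display and `closed_form` is its final line; both names, and the particular test parameters, are illustrative assumptions.

```python
from fractions import Fraction

def second_moment_geo(p):
    """E[G^2] = (2 - p) / p^2 for G ~ Geometric(p) on 1, 2, ..."""
    return (2 - p) / p**2

def from_recursion(rho, tau):
    """rho*tau + (1-lam) + 2(1-lam)E[R] + rho(1-tau)Q_Y + tau(1-rho)Q_X."""
    lam = rho + tau - rho * tau
    mean_r = 1 / rho + 1 / tau - 1 / lam
    q_x = 1 + 2 / rho + second_moment_geo(rho)   # E[(1 + X)^2]
    q_y = 1 + 2 / tau + second_moment_geo(tau)   # E[(1 + Y)^2]
    return (rho * tau + (1 - lam) + 2 * (1 - lam) * mean_r
            + rho * (1 - tau) * q_y + tau * (1 - rho) * q_x)

def closed_form(rho, tau):
    """1 - 2/lam + lam * (E[X^2] + E[Y^2])."""
    lam = rho + tau - rho * tau
    return 1 - 2 / lam + (second_moment_geo(rho) + second_moment_geo(tau)) * lam

params = [(Fraction(1, 3), Fraction(2, 5)),
          (Fraction(1, 7), Fraction(5, 6)),
          (Fraction(1, 2), Fraction(1, 2))]
print([from_recursion(r, t) == closed_form(r, t) for r, t in params])
```

For $\rho = 1/3$, $\tau = 2/5$ both sides equal $38/3$, and dividing by $\lambda = 3/5$ recovers $\mathbb{E}[R^2] = 190/9$.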





$\large\textbf{Appendix.B.1: }\normalsize\text{In-Site Links to Related Questions}$



I found many existing posts about this topic (minimum of two independent non-identical Geometric distributions). In chronological order: 90782, 845706, 1040620, 1056296, 1169142, 1207241, and 2669667.



Unlike the min, only 2 posts about the max have been found so far: 971214 and 1983481. Neither post goes beyond the 1st moment, and only a couple of the answers address the 'real geometry' of the situation.



In particular, consider the trinomial random variable $T$ that splits the 2-dim plane into 3 regions.
\begin{equation*}
T \equiv \mathrm{Sign}(Y - X) = \begin{cases}
\hphantom{-{}} 1 & \text{if}~~Y > X \quad \text{, with probability}~~ \rho( 1 - \tau) / \lambda \\
\hphantom{-{}} 0 & \text{if}~~X = Y \quad \text{, with probability}~~ \rho \tau / \lambda \\
-1 & \text{if}~~Y < X \quad \text{, with probability}~~ \tau( 1 - \rho) / \lambda
\end{cases}
\end{equation*}

This trisection of above-diagonal, diagonal, and below-diagonal is fundamental to the calculation of both $W$ and $Z$.



It can be easily shown that $T \perp W$, that is, the trisection is independent of the minimum. For example, $\Pr\left\{ T = 1~ \middle|~~ W = k\right\} = \Pr\{ T = 1\}$ for all $k$.
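The independence of $T$ and $W$ can be verified exactly from the joint probabilities. This Python sketch assumes $X \sim \mathrm{Geo}(\rho)$ for Bob and $Y \sim \mathrm{Geo}(\tau)$ for Bub, so the event $Y > X$ means only Bob's coin shows heads in the first round with any heads; the names `joint`, `p_w` and `conditionals` are illustrative.

```python
from fractions import Fraction

rho, tau = Fraction(1, 3), Fraction(2, 5)
lam = rho + tau - rho * tau

def joint(sign, k):
    """P(T = sign, W = k) with T = Sign(Y - X), X ~ Geo(rho), Y ~ Geo(tau)."""
    both_fail = ((1 - rho) * (1 - tau)) ** (k - 1)   # no heads in rounds 1..k-1
    if sign == 1:      # X = k < Y: only Bob's coin shows heads in round k
        return both_fail * rho * (1 - tau)
    if sign == 0:      # X = Y = k: both coins show heads in round k
        return both_fail * rho * tau
    return both_fail * (1 - rho) * tau               # Y = k < X: only Bub's coin

def p_w(k):
    """P(W = k) for W ~ Geometric(lam)."""
    return lam * (1 - lam) ** (k - 1)

# For each value of T, collect P(T = s | W = k) over k = 1..10 into a set
conditionals = {s: {joint(s, k) / p_w(k) for k in range(1, 11)} for s in (1, 0, -1)}
print(conditionals)
```

Each set collapses to a single value that does not depend on $k$: $1/3$, $2/9$ and $4/9$ for $T = 1, 0, -1$ at the sample parameters.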



Note that $T$ is not independent of $Z$ (even with the corner $k=1$ removed).



In the continuous analogue, $\{X,Y,W\}$ are Exponential with all the nice properties, and $Z$ is neither Exponential nor Gamma (analogue of Negative Binomial).



The density of the $Z$ analogue is easy and commonly used, but it doesn't seem to have a name. At best one can categorize it as a special case of the phase-type (PH) distribution.
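For concreteness, here is that density written out, as a sketch under the assumption that $X$ and $Y$ are independent Exponential variables with rates $\alpha$ and $\beta$ (symbols introduced here only for illustration):

```latex
\begin{align*}
F_Z(z) &= \Pr\{ X \le z \} \Pr\{ Y \le z \} = \bigl( 1 - e^{-\alpha z} \bigr) \bigl( 1 - e^{-\beta z} \bigr), \qquad z \ge 0 \\
f_Z(z) &= \alpha e^{-\alpha z} + \beta e^{-\beta z} - (\alpha + \beta) e^{-(\alpha + \beta) z} \\
\mathbb{E}[ Z ] &= \frac1{\alpha} + \frac1{\beta} - \frac1{\alpha + \beta}
\end{align*}
```

The density is the sum of the two Exponential densities minus that of the minimum, whose rate is $\alpha + \beta$, mirroring the discrete relation $\mathbb{E}[Z] = \mu_{_X} + \mu_{_Y} - \mu_{_W}$.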





$\large\textbf{Appendix.B.2: }\normalsize\text{Covariance between $\min(X,Y)$ and $\max(X,Y)$, along with related discussions.}$



Recall the expectations: $\mathbb{E}[X] = \mu_{_X} = 1 / \rho$, and the similar
\begin{align*}
\mu_{_Y} &= \frac1{ \tau }~, & \mu_{_W} &= \frac1{ \lambda } = \frac1{\rho + \tau - \rho\tau}~, & &\text{and}\quad
\mu_{_Z} = \mu_{_X} + \mu_{_Y} - \mu_{_W}
\end{align*}

The covariance between the minimum and the maximum is
\begin{equation*}
C_{WZ} \equiv \mathbb{E}\left[ (W - \mu_{_W} ) (Z - \mu_{_Z} ) \right] = \mathbb{E}[ WZ ] - \mu_{_W} \mu_{_Z} = \mu_{_X} \mu_{_Y} - \mu_{_W} \mu_{_Z} \tag*{Eq.(B)} \label{EqB}
\end{equation*}

The last equal sign invokes the intriguing $WZ = XY$ mentioned just before Eq.(8.better).



A positive covariance (thus correlation) expresses the intuitive idea that when the min is large then the max also tends to be large.



Here we do a 'verbal proof' of $C_{WZ} > 0$ for all combinations of parameters. The purpose is not so much a technical proof as an illustration of the relations between $\{W, Z\}$ and $\{X, Y\}$.



Since $\mu_{_X} + \mu_{_Y} = \mu_{_W} + \mu_{_Z}$, the covariance $C_{WZ}$ compares two products (of non-negative numbers) with the same fixed sum. We know that for a fixed sum, the closer the two 'sides' (think rectangle) are, the larger the product.



Being the min and max, $\mu_{_W}$ and $\mu_{_Z}$ are more 'extreme' and can never be as close as $\mu_{_X}$ and $\mu_{_Y}$, throughout the entire parametric space $\{ \rho, \tau \} \in (0, 1)^2$. Boom, QED.



(the extreme cases where at least one of $\{\rho, \tau\}$ is zero or one are all ill-defined when it comes to min and max)



An algebraic proof for $C_{WZ} > 0$ (or for everything in this appendix) is easy. Let me emphasize the descriptive aspect.
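For the record, one way the algebraic route can go, starting from Eq.(B) and using $\lambda - \rho = \tau(1 - \rho)$ and $\lambda - \tau = \rho(1 - \tau)$:

```latex
\begin{align*}
C_{WZ} &= \frac1{ \rho \tau } - \frac1{ \lambda } \Bigl( \frac1{ \rho } + \frac1{ \tau } - \frac1{ \lambda } \Bigr)
= \frac{ \lambda^2 - (\rho + \tau) \lambda + \rho \tau }{ \lambda^2 \rho \tau } \\
&= \frac{ (\lambda - \rho)(\lambda - \tau) }{ \lambda^2 \rho \tau }
= \frac{ \tau (1 - \rho) \, \rho (1 - \tau) }{ \lambda^2 \rho \tau }
= \frac{ 1 - \lambda }{ \lambda^2 } > 0
\end{align*}
```

So the covariance equals $V_W$, the variance of the minimum; for $\rho = 1/3$, $\tau = 2/5$ it is $10/9$.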



Consider what it means for $W$ as a Geometric distribution to be the minimum of $X$ and $Y$.
\begin{align*}
\frac1{\lambda} &< \min( \frac1{\rho}, \frac1{\tau} ) &
&\text{faster to 'succeed' on average} \\
\lambda &> \max( \rho, \tau ) & &\text{better than the higher success parameter}
\end{align*}

That is, the mean of the 'minimum flip' $\mu_{_W} < \min( \mu_{_X}, \mu_{_Y} )$ is more extreme at the lower end.



On the other hand, $Z$ is NOT a Geometric distribution. However, one can define an auxiliary $Z' \sim \mathrm{Geo}[1 / \mu_{_Z}]$ with an equivalent mean $\mu_{_{Z'}} = \mu_{_Z}$. Once we have arrived at Eq.(B), the role of $Z$ can be replaced: $C_{WZ} = \mu_{_X} \mu_{_Y} - \mu_{_W} \mu_{_{Z'}} $



Now we can have similar descriptions for $\mu_{_Z}$, while it is actually $\mu_{_{Z'}}$ for the Geometric distribution that is being compared with $X$ and $Y$.
\begin{align*}
\mu_{_Z} &> \max( \mu_{_X}, \mu_{_Y} ) &
&\text{slower to 'succeed' on average} \\
\frac1{ \mu_{_Z} } &< \min( \rho, \tau ) & &\text{worse than the lower success parameter}
\end{align*}

That is, the mean of the 'maximum flip' is more extreme at the higher end.



This concludes the verbal argument for how $\mu_{_W}$ and $\mu_{_Z}$ are more dissimilar (than the relation between $\mu_{_X}$ and $\mu_{_Y}$) thus making a smaller product.
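The inequalities used in the verbal argument can be spot-checked on a grid of parameters. A minimal Python sketch, where the grid spacing and the function name `check` are arbitrary choices:

```python
from fractions import Fraction

def check(rho, tau):
    """True if the min/max means are more extreme and the covariance is positive."""
    lam = rho + tau - rho * tau
    mu_x, mu_y, mu_w = 1 / rho, 1 / tau, 1 / lam
    mu_z = mu_x + mu_y - mu_w
    cov = mu_x * mu_y - mu_w * mu_z
    return (mu_w < min(mu_x, mu_y)          # the min succeeds faster on average
            and mu_z > max(mu_x, mu_y)      # the max succeeds slower on average
            and mu_w + mu_z == mu_x + mu_y  # same sum ...
            and cov > 0)                    # ... smaller product

grid = [Fraction(i, 20) for i in range(1, 20)]   # 1/20, 2/20, ..., 19/20
all_ok = all(check(r, t) for r in grid for t in grid)
print(all_ok)
```

A grid check is of course not a proof, only an illustration that the claims hold at every interior point tried.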















edited Dec 2 at 2:49

























answered Mar 21 at 1:05









Lee David Chung Lin













  • Thank you for this comprehensive, inspiring answer. Solution One I don't have problem understanding, I'm upset a bit that I didn't complete my analysis. I turned back while trying to form this recursion involving nonlinear terms, like E[(1+X)^2]. Solution Two and the Appendices I am able to follow as well.
    – BoLe
    Mar 22 at 10:45



































