Intuitively use marginalization














Is it always true that if you sum over some variable, you can "remove" that variable from each factor of the product inside the sum? For example:



$$\sum_x P(x, y)\, P(y \mid x)\, P(y \mid x, z) = P(y)\, P(y)\, P(y \mid z)?$$



It seems that this is often the case, but I'm asking whether it is a general "rule" you can rely on.



























Tags: probability, probability-theory, conditional-probability, marginal-probability
















asked Dec 16 '18 at 19:28 by Ferus






















1 Answer



















The notation you use is a little ambiguous, but as you describe it, the rule is incorrect in general. For example, take $X, Y, Z$ to be independent discrete random variables, and let $X$ take values $x_1, \dots, x_n$. According to your conjecture we should find that



$$\sum_{i=1}^n \mathbb{P}(X = x_i, Y = y)\, \mathbb{P}(X = x_i, Z = z) = \mathbb{P}(Y = y)\, \mathbb{P}(Z = z).$$



But this is false, since if $X, Y, Z$ are independent we obtain



$$\sum_{i=1}^n \mathbb{P}(X = x_i)\, \mathbb{P}(Y = y)\, \mathbb{P}(X = x_i)\, \mathbb{P}(Z = z) = \mathbb{P}(Y = y)\, \mathbb{P}(Z = z) \sum_{i=1}^n \mathbb{P}(X = x_i)^2.$$



The sum of the squared probabilities is not $1$ in general. Take $X \sim \text{Bernoulli}(p)$, for example; that is, $X$ takes the value $1$ with probability $p$ and $0$ with probability $1-p$. We do have $(1-p) + p = 1$, but in general $(1-p)^2 + p^2 \neq 1$.
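
A quick numerical sanity check of this counterexample may help. The following minimal Python sketch (the probabilities `p_x`, `p_y`, `p_z` are arbitrary illustrative values, not taken from the question) evaluates both sides of the conjectured identity for independent variables:

```python
# Numerical check that the conjectured "rule" fails for independent X, Y, Z.
# All probabilities below are arbitrary illustrative choices.
p_x = [0.3, 0.7]   # P(X = x_1), P(X = x_2)
p_y = 0.4          # P(Y = y) for one fixed value y
p_z = 0.6          # P(Z = z) for one fixed value z

# Left-hand side of the conjecture:
#   sum_i P(X = x_i, Y = y) * P(X = x_i, Z = z)
# which, by independence, equals P(Y = y) P(Z = z) * sum_i P(X = x_i)^2.
lhs = sum((px * p_y) * (px * p_z) for px in p_x)

# Right-hand side claimed by the conjecture: P(Y = y) * P(Z = z).
rhs = p_y * p_z

print(lhs, rhs)                    # ~0.1392 vs 0.24 -- not equal
print(sum(px ** 2 for px in p_x))  # ~0.58 -- the sum of squares is not 1
```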



To see why marginalization works in the first place, go back to the following law of probability.



If $A$ and $B$ are disjoint events, then



$$\mathbb{P}(A \cup B) = \mathbb{P}(A) + \mathbb{P}(B). \qquad (*)$$



From this, if $B_1, \dots, B_n$ is a partition of the sample space, meaning



$$\bigcup_{i=1}^n B_i = \Omega, \qquad B_i \cap B_j = \emptyset \text{ for } i \neq j,$$



then, applying the additivity of probability $(*)$ to the disjoint union



$$A = A \cap \Omega = A \cap \bigcup_{i=1}^n B_i = \bigcup_{i=1}^n (A \cap B_i),$$



we obtain



$$\mathbb{P}(A) = \sum_{i=1}^n \mathbb{P}(A \cap B_i).$$



Now, coming back to the language of events as values of random variables: how is this marginalization? Consider a discrete random variable $X$ taking values $x_1, \dots, x_n$. Then the events $\{X = x_1\}, \{X = x_2\}, \dots, \{X = x_n\}$ constitute a partition of the sample space, since the cases are exhaustive and disjoint.



So if we let $A = \{Y = y\}$ and $B_i = \{X = x_i\}$, we recover the familiar procedure of marginalization:



$$\mathbb{P}(Y = y) = \sum_{i=1}^n \mathbb{P}(Y = y, X = x_i). \qquad (**)$$
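
To make $(**)$ concrete, here is a minimal Python sketch; the `joint` table and the `marginal_y` helper are made-up illustrations, not standard functions:

```python
# Marginalization: P(Y = y) = sum_i P(Y = y, X = x_i).
# joint[(x, y)] is an arbitrary example joint distribution; entries sum to 1.
joint = {
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.25, (1, 1): 0.45,
}

def marginal_y(y):
    """Sum the joint over all values of x, leaving a function of y alone."""
    return sum(p for (x, y2), p in joint.items() if y2 == y)

print(marginal_y(0))  # 0.10 + 0.25 = 0.35
print(marginal_y(1))  # 0.20 + 0.45 = 0.65
```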



Note that this also works in the presence of a conditional probability, as long as the conditioning does not involve the variable you are marginalizing over. Again, link it back to the basics: a conditional probability has the form



$$\mathbb{P}(A \mid C) = \frac{\mathbb{P}(A \cap C)}{\mathbb{P}(C)}.$$



Starting from our marginalization formula $(**)$, if we want to condition on $Z = z$, for example, we can write



$$\mathbb{P}(Y = y, Z = z) = \sum_{i=1}^n \mathbb{P}(Y = y, X = x_i, Z = z).$$



Dividing both sides by $\mathbb{P}(Z = z)$ gives



$$\frac{\mathbb{P}(Y = y, Z = z)}{\mathbb{P}(Z = z)} = \sum_{i=1}^n \frac{\mathbb{P}(Y = y, X = x_i, Z = z)}{\mathbb{P}(Z = z)},$$



so in fact we recover the conditional form of marginalization with simple arithmetic:



$$\mathbb{P}(Y = y \mid Z = z) = \sum_{i=1}^n \mathbb{P}(Y = y, X = x_i \mid Z = z).$$
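
The same arithmetic can be checked numerically. In the sketch below, an arbitrary three-variable joint table (again purely illustrative) is used to verify that summing the conditional joint over $x$ reproduces $\mathbb{P}(Y = y \mid Z = z)$:

```python
# Conditional marginalization:
#   P(Y = y | Z = z) = sum_i P(Y = y, X = x_i | Z = z).
# joint[(x, y, z)] is an arbitrary example joint distribution summing to 1.
joint = {
    (0, 0, 0): 0.05, (0, 0, 1): 0.10,
    (0, 1, 0): 0.15, (0, 1, 1): 0.10,
    (1, 0, 0): 0.10, (1, 0, 1): 0.15,
    (1, 1, 0): 0.20, (1, 1, 1): 0.15,
}

def p_z(z):
    """Marginal P(Z = z), summing the joint over x and y."""
    return sum(p for (x, y, z2), p in joint.items() if z2 == z)

def p_yz(y, z):
    """Joint P(Y = y, Z = z), summing the joint over x."""
    return sum(p for (x, y2, z2), p in joint.items() if y2 == y and z2 == z)

# Left-hand side: P(Y = 1 | Z = 0) computed directly from the (Y, Z) joint.
lhs = p_yz(1, 0) / p_z(0)

# Right-hand side: sum over x of P(Y = 1, X = x | Z = 0).
rhs = sum(p / p_z(0) for (x, y, z), p in joint.items() if y == 1 and z == 0)

print(lhs, rhs)  # both 0.35 / 0.50 = 0.7
```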



In summary, when thinking about laws of probability, try to link them back to the fundamental laws.






answered Dec 16 '18 at 21:40 (edited Dec 16 '18 at 23:48) by symchdmath













• Great explanation! Though I don’t understand how you say $A$ and $B$ are disjoint if $B$ is a partition of the sample space? Also I think there is a typo in the last formula. – Ferus, Dec 16 '18 at 23:44










• You're right, there was a typo in the last formula. As for $A$ and $B$ being disjoint: that only refers to formula $(*)$; if $A$ and $B$ are disjoint, then we can add the probabilities over them. $B$ cannot be a partition, as it is a single set (unless $B = \Omega$), but the sequence of events $B_1, B_2, \dots, B_n$ can form a partition. – symchdmath, Dec 16 '18 at 23:49










• Alright, thanks. – Ferus, Dec 17 '18 at 0:03










