### Introduction

The chain rule of conditional probabilities, also called the general product rule, allows any member of the joint distribution of a set of random variables to be calculated using only conditional probabilities.

The chain rule is very helpful in the study of Bayesian networks, which define a probability distribution in terms of conditional probabilities. In this article, we look at the chain rule in detail.

### Description

Conditional probability arises when the probability of an event changes once one or more conditions are satisfied. These conditions are themselves events.

In technical terms, if A and B are two events, the conditional probability of A with respect to B is denoted by P(A|B). It is read as "the probability of event A given that B has already happened".

**If A and B are independent events**

By the definition of independent events, the occurrence of event A does not depend on event B. Therefore, P(A|B) = P(A).

**If A and B are mutually exclusive**

As A and B are disjoint events, the probability that A occurs when B has already occurred is 0. Therefore, P(A|B) = 0.

The conditional probability P(A|B) is undefined when P(B) = 0. This is acceptable: if P(B) = 0, event B never occurs, so it does not make sense to talk about the probability of A given B.

If A and B are two events in a sample space S, then the conditional probability of A given B is defined as

P(A|B) = P(A∩B) / P(B), when P(B) > 0.
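The definition can be checked on a small example. The sketch below assumes a single roll of a fair six-sided die, with events modelled as sets of outcomes; the events `A`, `B`, `C`, `D` and the helper names `prob` and `cond_prob` are made up for illustration and are not part of the original text.

```python
from fractions import Fraction

# Illustrative assumption: sample space of one roll of a fair six-sided die.
S = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a subset of S) when all outcomes are equally likely."""
    return Fraction(len(event & S), len(S))

def cond_prob(a, b):
    """P(A|B) = P(A ∩ B) / P(B); raises an error when P(B) = 0."""
    if prob(b) == 0:
        raise ValueError("P(A|B) is undefined when P(B) = 0")
    return prob(a & b) / prob(b)

A = {2, 4, 6}  # the roll is even
B = {4, 5, 6}  # the roll is greater than 3
C = {1, 3, 5}  # the roll is odd, so A and C are mutually exclusive
D = {1, 2}     # the roll is at most 2; independent of A for this die

print(cond_prob(A, B))  # general case: P(A ∩ B) / P(B)
print(cond_prob(A, C))  # mutually exclusive events give 0
print(cond_prob(A, D))  # independent events give P(A)
```

Exact fractions are used instead of floats so that the comparisons with P(A) and 0 hold without rounding error.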

### The Chain Rule

We can rearrange the formula for conditional probability to get the chain rule:

P(A, B) = P(A|B) P(B)

We can extend this to three variables:

P(A,B,C) = P(A| B,C) P(B,C) = P(A|B,C) P(B|C) P(C)

and in general to n variables:

P(A1, A2, …, An) = P(A1 | A2, …, An) P(A2 | A3, …, An) ⋯ P(An-1 | An) P(An)

This is known as the chain rule.
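As a rough numerical check of the three-variable case, the sketch below builds a small joint distribution over binary variables A, B and C and confirms that P(A|B,C) P(B|C) P(C) reproduces every joint probability. The weights are invented for illustration, and `marginal` and `chain_rule` are hypothetical helper names.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint distribution over three binary variables (A, B, C).
# The weights are made up for illustration; they sum to 20 and are all positive,
# so every conditional probability below is well defined.
weights = [1, 2, 3, 4, 4, 3, 2, 1]
joint = {abc: Fraction(w, 20) for abc, w in zip(product([0, 1], repeat=3), weights)}

def marginal(**fixed):
    """Sum the joint over all outcomes matching the fixed values, e.g. marginal(b=1, c=0)."""
    total = Fraction(0)
    for (a, b, c), p in joint.items():
        values = {"a": a, "b": b, "c": c}
        if all(values[name] == value for name, value in fixed.items()):
            total += p
    return total

def chain_rule(a, b, c):
    """P(A|B,C) * P(B|C) * P(C), with each factor computed from the joint."""
    p_c = marginal(c=c)
    p_b_given_c = marginal(b=b, c=c) / p_c
    p_a_given_bc = marginal(a=a, b=b, c=c) / marginal(b=b, c=c)
    return p_a_given_bc * p_b_given_c * p_c

# The product of conditionals should equal the joint probability for every outcome.
for abc, p in joint.items():
    assert chain_rule(*abc) == p
```

In this check the factors are derived from the joint itself, so the equality is guaranteed. In a Bayesian network the direction is reversed: the conditional factors are specified first and the joint is obtained by multiplying them.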

This formula is particularly important for Bayesian belief networks, as it provides a means of calculating the full joint probability distribution. Conditional probability, as defined above, is itself a probability measure, so it satisfies the probability axioms. In particular, for a fixed event B with P(B) > 0:

Axiom 1: For any event A, P(A|B) ≥ 0.

Axiom 2: The conditional probability of B given B is 1, i.e., P(B|B) = 1.

Axiom 3: If A1, A2, A3, ⋯ are disjoint events, then P(A1∪A2∪A3∪⋯ | B) = P(A1|B) + P(A2|B) + P(A3|B) + ⋯
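The three axioms can be spot-checked on a finite example. The sketch below again assumes one roll of a fair six-sided die, where P(A|B) reduces to |A ∩ B| / |B|. The conditioning event `B` and the disjoint events `A1`, `A2`, `A3` are arbitrary choices for illustration, and a finite check like this is a sanity test, not a proof.

```python
from fractions import Fraction
from itertools import combinations

# Illustrative assumption: one roll of a fair six-sided die.
S = frozenset({1, 2, 3, 4, 5, 6})
B = frozenset({4, 5, 6})  # conditioning event, P(B) = 1/2 > 0

def cond_prob(a, b):
    """P(A|B) for equally likely outcomes: |A ∩ B| / |B| (b must be non-empty)."""
    return Fraction(len(a & b), len(b))

# Every subset of S is an event.
events = [frozenset(c) for r in range(len(S) + 1) for c in combinations(sorted(S), r)]

# Axiom 1: P(A|B) >= 0 for every event A.
assert all(cond_prob(a, B) >= 0 for a in events)

# Axiom 2: P(B|B) = 1.
assert cond_prob(B, B) == 1

# Axiom 3: additivity over pairwise disjoint events.
A1, A2, A3 = frozenset({1, 4}), frozenset({5}), frozenset({2, 6})
union = A1 | A2 | A3
assert cond_prob(union, B) == cond_prob(A1, B) + cond_prob(A2, B) + cond_prob(A3, B)
```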