The Ultimate Guide To Bayesian Network

by | Nov 19, 2019 | Data Science

In this guide, we shall explain the Bayesian Network. Bayesian networks are all about probability and statistics. When you talk of probability, this funny definition of probability comes to mind.

“The probability of meeting someone you know becomes higher when you are with someone you are not supposed to be seen with.”

A Bayesian network captures something similar. Bayesian networks belong to the family of probabilistic graphical models, and you can use them to reason about an uncertain domain.

What are Bayesian Networks?

If you ask students of statistics, “What are Bayesian networks?” you will get an answer something like this: a Bayesian network is a directed acyclic graph (DAG) that represents a Joint Probability Distribution (JPD) over a set of random variables.

Does this definition go over your head? It most probably will. We shall explain the concept by using a simple Bayesian network example.

Bayesian Network Example

Consider a person, say, James, who has a chance of suffering from a back injury. We shall represent this event by a variable ‘Back’ (B). An injury to your back can cause backache. Therefore, we have another variable with us, ‘Ache’ (A).

How can the back injury originate? James can fall down while performing a sporting activity. Now, you get another variable ‘Sport’ (S), which is connected directly to B and, through B, to A.

A back injury can also come from using an incorrect posture when sitting in an uncomfortable chair in your office. This event gives rise to two more variables, ‘Chair’ (C) and ‘Worker’ (W).

Let us now see how the Bayesian network example pans out.

There are five variables, a couple of which are not directly connected to each other. All the variables are binary, which means that each variable takes either a True (T) value or a False (F) value.

Here are some inferences that you can make.

(i) The variable Back has the nodes Chair and Sport as its parents.
(ii) The variable Ache is the child of Back.
(iii) Similarly, the parent of Worker is Chair.

Now, you can notice that Chair and Sport are independent variables, but when you bring in the variable Back, they become conditionally dependent. One can also refer to this relationship as ‘Explaining Away.’

When Chair is given, you find that Worker and Back are conditionally independent. Similarly, when Back is given, the variable Ache becomes conditionally independent of its ancestors, Chair and Sport.

Therefore, the conditional independence statements encoded in the Bayesian network offer a compact factorization of the Joint Probability Distribution (JPD).

Thus, instead of factorizing the joint distribution of all the variables using the chain rule,

P(C,S,W,B,A) = P(C)P(S|C)P(W|S,C)P(B|W,S,C)P(A|B,W,S,C)

the Bayesian network defines a unique JPD in the factored form

P(C,S,W,B,A) = P(C)P(S)P(W|C)P(B|S,C)P(A|B)

The Bayesian network form reduces the number of model parameters from 2^5 − 1 = 31 to 10, which makes the model easier to specify and to reason with.
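The parameter counts can be checked with a short Python sketch. It assumes, as in the example, that every variable is binary, so a full joint table over five variables needs 2^5 − 1 = 31 free probabilities, while each node in the network needs one free probability per combination of its parents' values. The dictionary and function names are illustrative only.

```python
# Parent sets read off the factorization P(C)P(S)P(W|C)P(B|S,C)P(A|B).
parents = {
    "C": [],
    "S": [],
    "W": ["C"],
    "B": ["S", "C"],
    "A": ["B"],
}


def full_joint_params(n_vars: int) -> int:
    """Free parameters of an unrestricted joint table over binary variables."""
    return 2 ** n_vars - 1


def network_params(parent_sets: dict) -> int:
    """One free probability per node per configuration of its binary parents."""
    return sum(2 ** len(p) for p in parent_sets.values())


full = full_joint_params(len(parents))
compact = network_params(parents)
print(full, compact)
```

The saving grows quickly with the number of variables, because the full table doubles with every variable added while the network count depends only on how many parents each node has.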

The backache example also hints at how Bayesian network applications work, and we shall look at them later on in the article. In the meantime, we shall discuss another Bayesian network example that is commonly used in classrooms when explaining the concept.

We shall now look at the probability of grass becoming wet or dry due to the occurrence of certain conditions.

Whether the grass gets wet or remains dry depends upon the weather (pun unintended). Now, the weather can be rainy, cloudy, or sunny. The grass has only two possibilities. It is either dry or wet. There is no third option.

However, there is a different variable that can play a crucial role in the grass becoming wet, even if it does not rain. Yes, lawns have sprinklers that spray jets of water to keep the grass wet.

The sprinkler also has two states. It can either be off or on. If it rains, you do not use the sprinkler at all. Hence, it remains off, but the grass becomes wet.

At the same time, the weather can be sunny, and therefore, you expect the grass to be dry. Suddenly, you find the gardener switching on the sprinkler, and the grass becomes wet again, even when it does not rain.

This is a classic Bayesian network example. Using a Bayesian network, we can work out how probable it is that the sprinkler, or the rain, is what made the grass wet.

Let us assume that the grass is wet now. What could be the reason? It could have rained, or the sprinklers could have been turned on. Given the conditional probabilities in the network, we can compute the probability of each cause and see which one is the more likely explanation. If rain is common and the sprinkler is rarely used, rain comes out as the most significant contributor to the grass becoming wet.
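To make this reasoning concrete, here is a minimal Python sketch that computes the probability of each cause given wet grass by summing over the joint distribution. Every number in it is an assumption made up for illustration, since the article gives no probabilities for this example. The structure is also an assumption: the weather influences both the sprinkler and the grass, and the sprinkler influences the grass.

```python
from itertools import product

# All probabilities below are illustrative assumptions, not values from the article.
p_weather = {"rainy": 0.3, "cloudy": 0.3, "sunny": 0.4}
p_sprinkler_on = {"rainy": 0.0, "cloudy": 0.2, "sunny": 0.5}  # P(on | weather)


def p_wet(weather: str, sprinkler_on: bool) -> float:
    """P(grass wet | weather, sprinkler)."""
    if weather == "rainy":
        return 0.95
    return 0.9 if sprinkler_on else 0.0


def joint(weather: str, sprinkler_on: bool, wet: bool) -> float:
    """P(weather, sprinkler, wet) as a product of the three local tables."""
    p_s = p_sprinkler_on[weather] if sprinkler_on else 1 - p_sprinkler_on[weather]
    p_w = p_wet(weather, sprinkler_on)
    return p_weather[weather] * p_s * (p_w if wet else 1 - p_w)


# Enumerate every (weather, sprinkler) combination with wet grass as evidence.
states = list(product(p_weather, [True, False]))
p_wet_total = sum(joint(w, s, True) for w, s in states)
p_rain_given_wet = sum(joint(w, s, True) for w, s in states if w == "rainy") / p_wet_total
p_sprinkler_given_wet = sum(joint(w, s, True) for w, s in states if s) / p_wet_total

print(round(p_rain_given_wet, 3), round(p_sprinkler_given_wet, 3))
```

With these made-up numbers rain is the more probable explanation, but a garden where the sprinkler runs most days would give the opposite answer. The conclusion always depends on the probabilities you put into the network.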

Now, we put forward a new question: “What are Bayesian networks in R?”

Bayesian Networks in R

Here are some characteristics of Bayesian networks in R.

1. Explaining away

We have seen the concept of explaining away in our earlier backache example. You can find the same in the sprinkler and rain example as well. In this case, the child is wet grass (W). The sprinkler (S) and the rain (R) are its parents, and they are independent of each other. However, once you observe W, S and R become dependent.
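Explaining away is easier to see with numbers. The sketch below assumes a three-node network in which S and R are independent parents of W, and it uses made-up probabilities purely for illustration. It compares the belief in the sprinkler before any evidence, after seeing wet grass, and after also learning that it rained.

```python
from itertools import product

# Illustrative assumptions only: priors for S and R, and P(W = 1 | S, R).
P_S, P_R = 0.3, 0.2
P_W = {(0, 0): 0.0, (1, 0): 0.9, (0, 1): 0.9, (1, 1): 0.99}


def joint(s: int, r: int, w: int) -> float:
    """P(S, R, W) = P(S) P(R) P(W | S, R)."""
    ps = P_S if s else 1 - P_S
    pr = P_R if r else 1 - P_R
    pw = P_W[(s, r)] if w else 1 - P_W[(s, r)]
    return ps * pr * pw


def prob_sprinkler(**evidence: int) -> float:
    """P(S = 1 | evidence), where evidence may fix 'r' and/or 'w'."""
    num = den = 0.0
    for s, r, w in product([0, 1], repeat=3):
        world = {"s": s, "r": r, "w": w}
        if any(world[name] != value for name, value in evidence.items()):
            continue  # skip worlds that contradict the evidence
        p = joint(s, r, w)
        den += p
        if s == 1:
            num += p
    return num / den


prior = prob_sprinkler()
given_wet = prob_sprinkler(w=1)
given_wet_and_rain = prob_sprinkler(w=1, r=1)
print(round(prior, 3), round(given_wet, 3), round(given_wet_and_rain, 3))
```

Seeing wet grass raises the belief that the sprinkler was on. Learning that it also rained lowers that belief again, because the rain already explains the wet grass.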

2. Bottom-Up and Top-Down Reasoning

Bayesian networks are generative models because they support both direct probability and inverse probability. When you go from the causes to the effects, it is a direct probability. You look at the sky and see that it is cloudy. You guess that the rain is going to pour down and cause the grass to become wet.

This is an example of top-down reasoning. Conversely, you see that the sun is shining, but the grass is still wet. You reason that the grass is wet because of the use of the sprinkler. Thus you go from the effect to the cause. It is also known as Bottom-up reasoning.

3. Enable Causality Discussion

Causality discussions explain the causes and effects of a phenomenon. They enable you to deduce what causes what. In the rain-sprinkler example, the cause is the rain, and the effect is the grass getting wet. Bayesian networks enable you to provide explanations in causality discussions without running an experiment.

Some of the other characteristics of Bayesian networks in R are conditional independence, temporal models, and so on.

What are Bayesian Network Applications?

Here are some typical Bayesian network applications in fields as diverse as medicine, computers, spam filtering, and semantic search.

1. Medicine

Bayesian networks have many applications in medicine and are handy in medical research. Here is a Bayesian network example in medicine. At first glance, cancer and heart attack look like two unrelated conditions with no direct link between them.

However, they share a common parent, say smoking. Because of this common cause, cancer and heart attack are statistically dependent, and they become conditionally independent once you know whether the person smokes. Therefore, if you stop smoking, you reduce the risk of cancer. At the same time, you also reduce the chance of a heart attack.

2. Spam Filtering

Emails are one of the most common methods of communication today. Just as you love to send and receive emails, hackers love them too. They use this mode of communication to send spam and junk emails. Spam not only clogs your inbox but also puts your documents and systems at risk.

The use of spam filters is an excellent way to get rid of such emails. Designing a spam filter is a typical Bayesian network application.

The filter looks at words like MONEY, FREE, CLICK HERE, and so on, and treats Spam as their common parent. The words are dependent on one another through this parent, and they become conditionally independent once you know whether the email is spam.

This common-parent structure is what brings the concept of Bayesian networks into the picture.
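One common way to build such a filter is a naive Bayes model, which is a Bayesian network where a Spam node is the parent of every word node. The Python sketch below is a minimal illustration under that assumption. The prior and the word probabilities are hypothetical numbers, not values learned from real email, and the function name is ours.

```python
import math

# Hypothetical numbers chosen for illustration only.
P_SPAM = 0.4
P_WORD_GIVEN_SPAM = {"money": 0.30, "free": 0.40, "click": 0.25}
P_WORD_GIVEN_HAM = {"money": 0.05, "free": 0.10, "click": 0.02}


def spam_posterior(words_present: set) -> float:
    """P(spam | which vocabulary words appear), treating words as independent given the class."""
    log_spam = math.log(P_SPAM)
    log_ham = math.log(1 - P_SPAM)
    for word in P_WORD_GIVEN_SPAM:
        p_s = P_WORD_GIVEN_SPAM[word]
        p_h = P_WORD_GIVEN_HAM[word]
        if word in words_present:
            log_spam += math.log(p_s)
            log_ham += math.log(p_h)
        else:
            log_spam += math.log(1 - p_s)
            log_ham += math.log(1 - p_h)
    spam, ham = math.exp(log_spam), math.exp(log_ham)
    return spam / (spam + ham)


print(round(spam_posterior({"money", "free", "click"}), 3))
print(round(spam_posterior(set()), 3))
```

An email containing all three words gets a posterior far above the prior, while an email containing none of them drops below it. A real filter would estimate these probabilities from labeled mail and use a much larger vocabulary.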

Similarly, there are many fields where you can use Bayesian network applications.

[Figure: Bayesian Network in R. Source: Data Flair]

Uses of Bayesian networks

Bayesian networks have many applications in fields where you need to predict something but the outcome is uncertain. After all, what would Bayesian networks be for if you still had to guess the outcome?

Bayesian networks obviate the need for guessing as they help the user make smart, well-informed, quantifiable, and justifiable decisions.

Bayesian network applications include diagnosing ailments in medicine, identifying financial risk in the insurance and banking sectors, and modeling ecosystems.

Other uses of Bayesian networks include monitoring and alerting, weather forecasting, sports betting, portfolio allocation, and so on.

Advantages of Bayesian networks

Here are some of the benefits of Bayesian networks.

(i) Since a Bayesian network encodes the dependencies among all the variables, it handles missing data well.
(ii) When you use Bayesian networks to study causal relationships, they help you understand a problem better and forecast the consequences of an intervention.
(iii) It becomes easy to represent prior data and knowledge by using causal and probabilistic semantics.
(iv) You can also help avoid overfitting of data when you use Bayesian networks.
(v) The best aspect of Bayesian networks is that they can predict the result of an intervention without intervening.

However, one should know how to use them correctly.

Bayesian Network Example in our Daily Life

Bayesian networks have significant applications in our daily lives. Here is one such Bayesian network example that we use every day, without probably realizing its importance.

Why do you turn on the AC? The answer is obvious. You need the room to be cool. If the ambient temperature is cold, the room will not need an AC. It will already be cool. Thus, we can say that both the ambient temperature and the AC influence whether the room is cool.

Let us look at the Bayesian network with conditional probability tables.

The ambient temperature influences both whether the AC is running and whether the room is cool, and the AC in turn influences whether the room is cool. The graph has no cycles. There are three variables, each with two possible values, True (T) or False (F).

Air conditioner (given Temperature)

Temperature    True    False
F              0.4     0.6
T              0.01    0.99

Temperature

True    False
0.2     0.8

Cool Room (given Air-conditioner and Temperature)

Air-conditioner    Temperature    True    False
False              False          0.0     1.0
False              True           0.8     0.2
True               False          0.9     0.1
True               True           0.99    0.01

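If you go through the conditional probability tables, you can deduce the joint probability function for this Bayesian network example.

(i) CR denotes – Cool Room
(ii) AC denotes – Air-conditioner running
(iii) AT denotes – Ambient Temperature is cool

P(CR,AC,AT) = P(CR|AC,AT)P(AC|AT)P(AT)

With all three variables True:

P(CR=T,AC=T,AT=T) = 0.99 × 0.01 × 0.2
= 0.00198

That is about 0.2%. The 35.77% figure is a different quantity: it is the probability that the ambient temperature is cool given that the room is cool, P(AT=T|CR=T) = 0.16038 / 0.44838 ≈ 0.3577, obtained by summing the joint probability over AC.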
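The arithmetic can be reproduced with a short Python sketch. It reads the tables above as P(AT), P(AC | AT), and P(CR | AC, AT), which is an interpretation of the flattened layout, although it agrees with the product 0.99 × 0.01 × 0.2 used in the calculation. The variable and function names are illustrative.

```python
from itertools import product

# CPT values taken from the tables above (True means cool / running).
P_AT = {True: 0.2, False: 0.8}             # P(AT)
P_AC_GIVEN_AT = {True: 0.01, False: 0.4}   # P(AC = True | AT)
P_CR_GIVEN = {                             # P(CR = True | AC, AT), keyed by (AC, AT)
    (False, False): 0.0,
    (False, True): 0.8,
    (True, False): 0.9,
    (True, True): 0.99,
}


def joint(cr: bool, ac: bool, at: bool) -> float:
    """P(CR, AC, AT) = P(CR | AC, AT) P(AC | AT) P(AT)."""
    p_ac = P_AC_GIVEN_AT[at] if ac else 1 - P_AC_GIVEN_AT[at]
    p_cr = P_CR_GIVEN[(ac, at)] if cr else 1 - P_CR_GIVEN[(ac, at)]
    return p_cr * p_ac * P_AT[at]


# Joint probability that the room is cool, the AC is on, and it is cold outside.
p_all_true = joint(True, True, True)

# Marginal P(CR = True): sum the joint over AC and AT.
p_cool = sum(joint(True, ac, at) for ac, at in product([True, False], repeat=2))

# Posterior P(AT = True | CR = True): sum over AC only, then normalize.
p_cold_given_cool = sum(joint(True, ac, True) for ac in (True, False)) / p_cool

print(p_all_true, p_cool, p_cold_given_cool)
```

The same `joint` function answers any other query on this network: fix the observed variables, sum over the rest, and divide by the probability of the evidence.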

What are the limitations of the Bayesian network?

Bayesian networks have significant advantages because of their power and potential for addressing inference problems. However, they have inherent limitations as well.

(i) The most significant problem with Bayesian networks is the computational difficulty of exploring a previously unknown network.
(ii) The extent and quality of the prior beliefs used in Bayesian inference can also be an issue. If the prior knowledge is not reliable, the Bayesian network will not provide reliable information either.

How do you use Bayesian networks for decision making?

What are Bayesian networks, and how can they help in decision making? We have seen what Bayesian networks are and the benefits that accrue from using them. Let us now see how Bayesian networks help in decision making.

Consider that you are the owner of an international soccer team. Before the start of the season, you have to decide how much to invest in new players. There are many factors to consider.

(i) The likely income from the sales of unwanted players
(ii) The amount of money that other teams spend on buying players
(iii) The negative impact on team performance of making several changes

This is a typical Bayesian network application. You model each factor as a variable in the network, and the network helps you weigh the uncertain outcomes before you make the decision.

Final Thoughts

Coming back to the quote that we had made at the start of the article, we can state that it is a classic Bayesian network example. There are high levels of uncertainty.

You have a limited amount of data and reliable information. However, you can deduce that the probability of meeting someone you know is always high when you are in the company of someone you should not be with.

To make a career in Bayesian statistics, you should be aware of the most common Bayesian Statistics interview questions.
