BNLearn Bayesian Networks – how is the structure decided

Tags: bayesian-network, bnlearn, python

I'm currently playing around with bnlearn in Python, creating BNs from pandas DataFrames. I'm not really sure why/how the package decides on the structure of the BNs it creates; see for example this BN generated from the Titanic dataset:

[Figure: Bayesian network structure learned by bnlearn from the Titanic dataset]

I got the above from an example on Kaggle, with the relevant code being

import bnlearn as bn

# Structure learning (hill climbing, with 'Survived' as root node)
DAG = bn.structure_learning.fit(dfnum, methodtype='hc', root_node='Survived', bw_list_method='nodes', verbose=3)

# Plot the learned structure
G = bn.plot(DAG)

# Parameter learning (fit the CPDs given the learned structure)
model = bn.parameter_learning.fit(DAG, dfnum, verbose=3)

I would have thought that all variables affect Survived (i.e. something more like a Naive Bayes classifier than a full-blown BN). How has this structure been generated, and what difference does it make (in terms of accuracy and applicability) compared to Naive Bayes?

Best Answer

First of all, bnlearn "only" learns Bayesian networks, so the arrows cannot be interpreted as causal directions. The documentation claims that causality "is incorporated in Bayesian graphical models", but that is only true for causal Bayesian graphical models. Bayesian networks are mainly used to describe stochastic dependencies and contain only limited causal information. E.g., if you give a dataset of two dependent binary variables $X$ and $Y$ to bnlearn, it will return either $X\to Y$ or $Y\to X$, regardless of whether $X$ caused $Y$ or $Y$ caused $X$: the two graphs are Markov equivalent, meaning they describe exactly the same set of distributions and receive the same likelihood-based score, so the causal direction cannot be deduced from observations of $X$ and $Y$ alone. A small numeric demonstration of this follows below.
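Here is a minimal numeric sketch of that point (my own illustration in plain numpy/pandas, not bnlearn code): with maximum-likelihood parameters, the factorizations $P(X)P(Y\mid X)$ and $P(Y)P(X\mid Y)$ achieve exactly the same likelihood on any dataset, so a score-based learner has no basis for preferring one direction over the other.

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=5000)                # X is the true cause
y = np.where(rng.random(5000) < 0.8, x, 1 - x)   # Y copies X 80% of the time
df = pd.DataFrame({"X": x, "Y": y})

def loglik(df, child, parent):
    """Maximized log-likelihood of the factorization P(parent) * P(child | parent)."""
    p_par = df[parent].value_counts(normalize=True)                  # MLE of P(parent)
    p_cond = df.groupby(parent)[child].value_counts(normalize=True)  # MLE of P(child | parent)
    return sum(np.log(p_par[a]) + np.log(p_cond[(a, b)])
               for a, b in zip(df[parent], df[child]))

print(loglik(df, child="Y", parent="X"))   # score of X -> Y
print(loglik(df, child="X", parent="Y"))   # score of Y -> X: identical value

Both calls print the same number, because each factorization reproduces the same empirical joint distribution of $(X, Y)$.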

Thus, when you say "I would have thought that all variables affect Survived", referring to the direction of the arrows, you are presuming that the arrows indicate causal effects, which is not the case.

Now to your question: "How has this structure been generated?" You have used hill climbing (methodtype='hc'), a greedy local search: it repeatedly applies the single edge change (add, remove, or reverse) that most improves a model score, and stops as soon as no single change helps. It is fast, but it can get stuck in a local optimum and return a wrong structure; the toy sketch below shows the idea.
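For intuition, here is a self-contained toy hill climber over discrete data. This is my own sketch, not bnlearn's actual implementation (the Python bnlearn package delegates structure learning to pgmpy, and real implementations cache local score deltas rather than rescoring the whole graph each step):

import itertools
import numpy as np
import pandas as pd

def bic_local(df, node, parents):
    """BIC contribution of one node given its parent set (discrete data, MLE)."""
    n = len(df)
    if parents:
        counts = df.groupby(list(parents))[node].value_counts()
        totals = counts.groupby(level=list(range(len(parents)))).transform("sum")
        ll = float((counts * np.log(counts / totals)).sum())
        q = int(np.prod([df[p].nunique() for p in parents]))
    else:
        counts = df[node].value_counts()
        ll = float((counts * np.log(counts / n)).sum())
        q = 1
    return ll - 0.5 * np.log(n) * q * (df[node].nunique() - 1)

def has_cycle(parents):
    """Directed-cycle check on a {node: set_of_parents} graph."""
    state = {}
    def visit(u):
        state[u] = 1                               # on the current path
        for p in parents[u]:
            if state.get(p) == 1 or (p not in state and visit(p)):
                return True
        state[u] = 2                               # finished
        return False
    return any(u not in state and visit(u) for u in parents)

def bic(df, parents):
    return sum(bic_local(df, v, sorted(parents[v])) for v in parents)

def hill_climb(df):
    parents = {v: set() for v in df.columns}       # start from the empty graph
    current = bic(df, parents)
    while True:
        best, best_score = None, current
        for u, v in itertools.permutations(df.columns, 2):
            cand = {w: set(ps) for w, ps in parents.items()}
            if u in cand[v]:
                cand[v].discard(u)                 # try removing u -> v
            else:
                cand[v].add(u)                     # try adding u -> v
                cand[u].discard(v)                 # (reverses v -> u if present)
            if has_cycle(cand):
                continue
            s = bic(df, cand)
            if s > best_score:
                best, best_score = cand, s
        if best is None:
            return parents                         # local optimum: stop
        parents, current = best, best_score

# Demo on a chain A -> B -> C: the skeleton is usually recovered, but the
# edge directions need not match the causal order (score equivalence again).
rng = np.random.default_rng(1)
a = rng.integers(0, 2, 2000)
b = (a ^ (rng.random(2000) < 0.1)).astype(int)
c = (b ^ (rng.random(2000) < 0.1)).astype(int)
print(hill_climb(pd.DataFrame({"A": a, "B": b, "C": c})))

Because each step only considers single-edge changes, the search can terminate at a graph that no local move improves but that a different search path (or an exhaustive search) would have beaten.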

Next, the Naive Bayes method: first, note that the documentation says: "Naive Bayes is a special case of Bayesian Model where the only edges in the model are from the feature variables to the dependent variable." If this were true, the dependent variable would be a collider, and conditioning on it would make all the features dependent. The defining property of Naive Bayes is exactly the other way around: conditioning on the dependent variable makes all the features independent, as the factorization below makes explicit.
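To spell it out: in Naive Bayes the arrows point from the class variable $C$ to the features, so the joint distribution factorizes as

$$P(C, X_1, \dots, X_n) = P(C)\prod_{i=1}^{n} P(X_i \mid C),$$

which implies $X_i \perp X_j \mid C$ for all $i \neq j$. With the arrows reversed (features pointing into the class), $C$ becomes a collider, and conditioning on it generally couples the features ("explaining away").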

Summary:

  • If you are interested in causation, don't use bnlearn; use dedicated causal-discovery software instead, see e.g. here, here, or here.
  • If you are (only) interested in the probabilistic graphical models described by Bayesian networks, then I would rather suggest something more trustworthy like the R package bnlearn (which, despite the shared name, is a different package from the Python one) and friends.