Throws an IndexError when it gets down to DiscreteFactor:
WARNING:pgmpy:Found unknown state name. Trying to switch to using all state names as state numbers
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
Cell In[1], line 21
18 predict_data = predict_data.mask(mask)
20 # predict throws error
---> 21 y_pred = model.predict(predict_data)
File ~/.conda/envs/bnlearn/lib/python3.10/site-packages/pgmpy/models/BayesianNetwork.py:730, in BayesianNetwork.predict(self, data, stochastic, n_jobs)
727 pred_values = []
729 # Send state_names dict from one of the estimated CPDs to the inference class.
--> 730 pred_values = Parallel(n_jobs=n_jobs)(
731 delayed(model_inference.map_query)(
732 variables=missing_variables,
733 evidence=data_point.to_dict(),
734 show_progress=False,
735 )
736 for index, data_point in tqdm(
737 data_unique.iterrows(), total=data_unique.shape[0]
738 )
739 )
741 df_results = pd.DataFrame(pred_values, index=data_unique.index)
742 data_with_results = pd.concat([data_unique, df_results], axis=1)
...
File ~/.conda/envs/bnlearn/lib/python3.10/site-packages/joblib/parallel.py:1918, in Parallel.__call__(self, iterable)
1916 output = self._get_sequential_output(iterable)
1917 next(output)
-> 1918 return output if self.return_generator else list(output)
1920 # Let's create an ID that uniquely identifies the current call. If the
1921 # call is interrupted early and that the same instance is immediately
1922 # re-used, this id will be used to prevent workers that were
1923 # concurrently finalizing a task from the previous call to run the
1924 # callback.
1925 with self._lock:
File ~/.conda/envs/bnlearn/lib/python3.10/site-packages/pgmpy/factors/discrete/DiscreteFactor.py:569, in DiscreteFactor.reduce(self, values, inplace, show_warnings)
567 phi.cardinality = phi.cardinality[var_index_to_keep]
568 phi.del_state_names([var for var, _ in values])
--> 569 phi.values = phi.values[tuple(slice_)]
571 if not inplace:
572 return phi
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
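For context, here is a minimal sketch of how NumPy produces this exact message. It assumes (this is not confirmed in the traceback) that a NaN evidence value ends up in the index tuple built inside DiscreteFactor.reduce after the "unknown state name" fallback. The array and variable names are made up for illustration.

```python
import numpy as np

# Toy 2x3 table standing in for a factor's values array (hypothetical data).
values = np.arange(6).reshape(2, 3)

# Integer state numbers index the array without trouble.
ok = values[tuple([1, slice(None)])]

# A missing evidence value (NaN) is a float, and NumPy refuses to use a
# float as an index, raising the same IndexError seen in the traceback.
try:
    values[tuple([float("nan"), slice(None)])]
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(ok)
print(error_message)
```

If that assumption holds, any row with a NaN left in its evidence dict would fail the same way, whichever column the NaN sits in.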
Solution
This is an easy fix. I just changed the following line:
Two things to note:
It needs to be changed for both the stochastic and non-stochastic cases, as well as in predict_probability, I'm assuming.
We probably want to state in the documentation that predict uses whatever evidence is available, but only predicts the missing_variables (the variables absent from the DataFrame altogether), not every missing value. Alternatively, this could be changed to fill in any values that are missing anywhere in the DataFrame.
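The idea of using only the available evidence per row can be sketched roughly as follows. This is not the actual patch, just an illustration in plain Python: rows are dicts standing in for DataFrame rows, and is_missing and split_row are hypothetical helper names. With a pandas Series, the evidence part would correspond to something like data_point.dropna().to_dict().

```python
import math


def is_missing(value):
    """Return True for None or a float NaN (how pandas marks missing cells)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def split_row(row):
    """Split one row into observed evidence and the variables left to predict."""
    evidence = {var: val for var, val in row.items() if not is_missing(val)}
    to_predict = [var for var, val in row.items() if is_missing(val)]
    return evidence, to_predict


# Hypothetical rows with different variables missing in each.
rows = [
    {"A": 0, "B": math.nan, "C": 1},
    {"A": math.nan, "B": math.nan, "C": 0},
]

splits = [split_row(row) for row in rows]
for evidence, to_predict in splits:
    print(evidence, to_predict)
```

The second element of each pair is what the "fill in any values that are missing" alternative would have to query per row, since the set of missing variables can differ from one row to the next.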
@vsocrates Thanks a lot for reporting this. This should indeed have been clearer in the documentation. I also really like your idea of just filling in all the missing values in the given dataframe. I will try to implement that and will try to figure out the best way to deal with that in case of predict_probability.
Subject of the issue
When there are multiple missing values in the DataFrame passed into the predict function for BayesianNetwork, it throws an error.
Your environment
Steps to reproduce
Using the documentation example:
Expected behaviour
Shouldn't throw an error.
Actual behaviour
Throws an IndexError when it gets down to DiscreteFactor.
Solution
This is an easy fix. I just changed line 710 of pgmpy/pgmpy/models/BayesianNetwork.py (at commit 5f52e03).