Error: Classification metrics can't handle a mix of multiclass-multioutput and multilabel-indicator targets
I was building the y array by hand, and that turned out to be my mistake. I now use MultiLabelBinarizer
to create it, as in the following example, and it works:
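import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import MultiLabelBinarizer
from skmultilearn.adapt import MLkNN

train_foo = [['sci-fi', 'thriller'], ['comedy'], ['sci-fi', 'thriller'], ['comedy']]
mlb = MultiLabelBinarizer()
mlb_label_train = mlb.fit_transform(train_foo)  # binary indicator matrix, one column per label
X = np.loadtxt("docvecs.txt", delimiter=",")
cv_scores = []
mlknn = MLkNN(k=3)
scores = cross_val_score(mlknn, X, mlb_label_train, cv=5, scoring='f1_macro')
cv_scores.append(scores)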
You can find the documentation for MultiLabelBinarizer in the scikit-learn docs, under sklearn.preprocessing.MultiLabelBinarizer.
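To see why the hand-built array caused trouble, it may help to check how scikit-learn classifies each target. The sketch below uses sklearn.utils.multiclass.type_of_target. The array y_manual is a hypothetical stand-in for a manually created y (the original one was not posted), so treat it only as an illustration of the kind of input that is reported as multiclass-multioutput.

```python
import numpy as np
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.utils.multiclass import type_of_target

# Hypothetical hand-built target: label ids stored per column (not the original y).
y_manual = np.array([[1, 2], [0, 3], [1, 2], [0, 3]])
manual_type = type_of_target(y_manual)
print(manual_type)  # multiclass-multioutput

# Same idea, built with MultiLabelBinarizer: one 0/1 column per label.
train_foo = [['sci-fi', 'thriller'], ['comedy'], ['sci-fi', 'thriller'], ['comedy']]
mlb = MultiLabelBinarizer()
y = mlb.fit_transform(train_foo)
binarized_type = type_of_target(y)

print(mlb.classes_)     # column order: comedy, sci-fi, thriller
print(y)
print(binarized_type)   # multilabel-indicator
```

If both y_true and the predictions are multilabel-indicator, the metric no longer sees a mix of target types.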
Can you show the first couple of elements of y? Are you using scikit-multilearn? Also, if you can, please use the 0.1.0 release candidate of scikit-multilearn. The second error is most likely a bug that was fixed in master, and a new version is planned for release in a couple of days.
You can install master via pip:
pip uninstall -y scikit-multilearn
pip install https://github.com/scikit-multilearn/scikit-multilearn/archive/master.zip