How to clone a scikit-learn estimator including its data?

  1. model.fit() returns the model itself (the same object), so assigning its return value to another variable only creates an alias, not a copy.

  2. You can use deepcopy to copy the object in a similar way to what loading a pickled object does.

So if you do something like:

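from copy import deepcopy

import numpy as np
from sklearn.naive_bayes import MultinomialNB

model = MultinomialNB()
model.fit(np.array(X), np.array(y))

# Independent copy of the fitted estimator
model2 = deepcopy(model)

# Continue training the copy only; classes must match the labels seen in fit()
model2.partial_fit(np.array(Z), np.array(w), classes=np.unique(y))
# ...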

model2 will be a distinct object, with the copied parameters of model, including the "trained" parameters.
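
Both points above can be checked directly. The following is a minimal sketch, assuming a small made-up count matrix (X and y here are invented for illustration and are not the data from the question). It checks that fit() returns the same object, and that both deepcopy and a pickle round trip produce separate objects that carry the fitted state.

```python
import pickle
from copy import deepcopy

import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Toy count features and labels, invented for this sketch.
X = np.array([[2, 1, 0], [0, 1, 3], [1, 0, 4], [3, 2, 0]])
y = np.array([0, 1, 1, 0])

model = MultinomialNB()
returned = model.fit(X, y)
same_object = returned is model              # fit() hands back the estimator itself

model2 = deepcopy(model)                     # in-memory copy, fitted state included
model3 = pickle.loads(pickle.dumps(model))   # pickle round trip

distinct_objects = model2 is not model and model3 is not model
same_predictions = (
    np.array_equal(model.predict(X), model2.predict(X))
    and np.array_equal(model.predict(X), model3.predict(X))
)
print(same_object, distinct_objects, same_predictions)
```

The pickle round trip is shown only for comparison; for an in-memory copy, deepcopy is the more direct choice.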


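To verify this, compare the learned parameters before and after the partial_fit() call:

from copy import deepcopy

import numpy as np
from sklearn.naive_bayes import MultinomialNB

model = MultinomialNB()
model.fit(np.array(X), np.array(y))

model2 = deepcopy(model)

# feature_log_prob_ holds the learned per-class feature log-probabilities
# (older scikit-learn releases also exposed them through coef_)
weight_vector_model_before = np.array(model.feature_log_prob_[0])
weight_vector_model2_before = np.array(model2.feature_log_prob_[0])  # equal to the line above

model2.partial_fit(np.array(Z), np.array(w), classes=np.unique(y))

weight_vector_model_after = np.array(model.feature_log_prob_[0])     # unchanged
weight_vector_model2_after = np.array(model2.feature_log_prob_[0])   # updated by partial_fit()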

model and model2 are now independent objects: calling partial_fit() on model2 has no effect on model. The two weight vectors are identical right after deepcopy() and differ once partial_fit() has been called on model2.
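
For contrast, scikit-learn's own sklearn.base.clone() copies only the hyperparameters and returns an unfitted estimator, so it does not help when the learned state has to come along. A minimal sketch of the difference, again with made-up toy data and an arbitrary alpha value chosen only for illustration:

```python
from copy import deepcopy

import numpy as np
from sklearn.base import clone
from sklearn.naive_bayes import MultinomialNB

# Toy count data, invented for this sketch.
X = np.array([[2, 1, 0], [0, 1, 3], [1, 0, 4], [3, 2, 0]])
y = np.array([0, 1, 1, 0])

model = MultinomialNB(alpha=0.5).fit(X, y)

unfitted = clone(model)    # same hyperparameters, no learned attributes
copied = deepcopy(model)   # same hyperparameters and learned attributes

clone_keeps_params = unfitted.get_params() == model.get_params()
clone_is_fitted = hasattr(unfitted, "feature_log_prob_")
copy_is_fitted = hasattr(copied, "feature_log_prob_")
print(clone_keeps_params, clone_is_fitted, copy_is_fitted)
```

So clone() fits the case where you want a fresh model with the same settings, while deepcopy fits the case in the question, where training should continue from the fitted state.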