TypeError: get_params() missing 1 required positional argument: 'self'
This error is almost always misleading, and actually means that you're calling an instance method on the class, rather than the instance (like calling dict.keys() instead of d.keys() on a dict named d).*
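To see the message in isolation, here is a minimal sketch. The Estimator class below is made up for illustration (it is not a scikit-learn class), and call_get_params is just a hypothetical helper that captures the error text:

```python
class Estimator:
    """Hypothetical stand-in for an estimator; not a scikit-learn class."""

    def __init__(self, C=1.0):
        self.C = C

    def get_params(self):
        return {"C": self.C}


def call_get_params(target):
    """Call target.get_params() and return the result, or the TypeError text."""
    try:
        return target.get_params()
    except TypeError as exc:
        return str(exc)


# Called on an instance: Python fills in self automatically.
on_instance = call_get_params(Estimator(C=0.1))

# Called on the class itself: nothing is passed for self, so the call fails.
on_class = call_get_params(Estimator)

print(on_instance)
print(on_class)
```

The second call is the same mistake as in the question, just without any library in the way.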
And that's exactly what's going on here. The docs imply that the best_estimator_ attribute, like the estimator parameter to the initializer, is not an estimator instance but an estimator type, and "A object of that type is instantiated for each grid point."
So, if you want to call methods, you have to construct an object of that type, for some particular grid point.
However, from a quick glance at the docs, if you're trying to get the params that were used for the particular instance of the best estimator that returned the best score, isn't that just going to be best_params_? (I apologize that this part is a bit of a guess…)
For the Pipeline call, you definitely have an instance there. And the only documentation for that method is a param spec which shows that it takes one optional argument, deep. But under the covers, it's probably forwarding the get_params() call to one of its attributes. And with ('clf', LogisticRegression), it looks like you're constructing it with the class LogisticRegression, rather than an instance of that class, so if that's what it ends up forwarding to, that would explain the problem.
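The forwarding guess above can be illustrated with a toy model. ToyPipeline and ToyClassifier are hypothetical classes written for this sketch; they only mimic the general idea of a pipeline asking each step for its parameters, and are not how scikit-learn is actually implemented:

```python
class ToyClassifier:
    """Hypothetical final step; stands in for something like LogisticRegression."""

    def __init__(self, C=1.0):
        self.C = C

    def get_params(self, deep=True):
        return {"C": self.C}


class ToyPipeline:
    """Hypothetical pipeline that forwards get_params() to each named step."""

    def __init__(self, steps):
        self.steps = steps

    def get_params(self, deep=True):
        params = {}
        for name, step in self.steps:
            params[name] = step
            if deep:
                # This call fails if `step` is a class rather than an instance.
                for key, value in step.get_params().items():
                    params[f"{name}__{key}"] = value
        return params


# Step given as an instance: forwarding works.
good = ToyPipeline([("clf", ToyClassifier())]).get_params()

# Step given as a class: the forwarded call has no self to bind to.
try:
    ToyPipeline([("clf", ToyClassifier)]).get_params()
    error = None
except TypeError as exc:
    error = str(exc)

print(good["clf__C"])
print(error)
```

Note that the outer call is made on a perfectly good ToyPipeline instance; the error comes from the inner, forwarded call.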
* The reason the error says "missing 1 required positional argument: 'self'" instead of "must be called on an instance" or something is that in Python, d.keys() is effectively turned into dict.keys(d), and it's perfectly legal (and sometimes useful) to call it that way explicitly, so Python can't really tell you that dict.keys() is illegal, just that it's missing the self argument.
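A quick sketch of that equivalence, using a small example dict. One caveat: for built-in types like dict, the exact wording of the TypeError can differ from the one in the question, so the sketch only checks that a TypeError is raised:

```python
d = {"a": 1, "b": 2}

# Calling through the instance: Python supplies d as self.
via_instance = list(d.keys())

# Calling through the class: you pass the instance yourself.
via_class = list(dict.keys(d))

# Calling through the class with no instance at all is a TypeError.
try:
    dict.keys()
    message = None
except TypeError as exc:
    message = str(exc)

print(via_instance == via_class)
print(message)
```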
I finally got the problem solved. The reason is exactly what abarnert said.
First I tried:
pipeline = LogisticRegression()
parameters = {
    'penalty': ('l1', 'l2'),
    'C': (0.01, 0.1, 1, 10)
}
and it works well.
With that intuition I modified the pipeline to be:
pipeline = Pipeline([
    ('vect', TfidfVectorizer(stop_words='english')),
    ('clf', LogisticRegression())
])
Note that there is a () after LogisticRegression.
This time it works.
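For anyone who wants to catch a missing () before running a long grid search, here is a small sketch of a check. The helper uninstantiated_steps and the two Fake classes are hypothetical names made up for this example; the only real library call is inspect.isclass from the standard library:

```python
import inspect


def uninstantiated_steps(steps):
    """Return the names of steps whose value is a class rather than an instance."""
    return [name for name, step in steps if inspect.isclass(step)]


class FakeVectorizer:
    """Hypothetical placeholder for a vectorizer step."""


class FakeClassifier:
    """Hypothetical placeholder for a classifier step."""


# 'clf' is missing its parentheses here, so it is a class, not an instance.
bad_steps = [("vect", FakeVectorizer()), ("clf", FakeClassifier)]
good_steps = [("vect", FakeVectorizer()), ("clf", FakeClassifier())]

print(uninstantiated_steps(bad_steps))
print(uninstantiated_steps(good_steps))
```

The same list of (name, step) tuples that would be passed to Pipeline can be run through the check first.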