Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

0 votes
572 views
in Technique by (71.8m points)

python - AttributeError: 'str' object has no attribute 'decode' in fitting Logistic Regression Model

I am trying to build a binary classifier using logistic regression, and I am currently working on determining feature importance. I have already done the data preprocessing (one-hot encoding and sampling) and ran the data through XGBoost and RandomForestClassifier with no problems.

However, when I tried to fit a LogisticRegression model (below is my code in Notebook),

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression

# Logistic Regression
# define the model
model = LogisticRegression()
# fit the model (X_over is a DataFrame, y_over holds the labels; both come from preprocessing)
model.fit(np.array(X_over), np.array(y_over))
# get importance: one coefficient per feature
importance = model.coef_[0]
# summarize feature importance
df_imp = pd.DataFrame({'feature': list(X_over.columns), 'importance': importance})
display(df_imp.sort_values('importance', ascending=False).head(20))

# plot feature importance
plt.bar(list(X_over.columns), importance)
plt.show()

it raised the following error:

...
~\AppData\Local\Continuum\anaconda3\lib\site-packages\joblib\parallel.py in <listcomp>(.0)
    223         with parallel_backend(self._backend, n_jobs=self._n_jobs):
    224             return [func(*args, **kwargs)
--> 225                     for func, args, kwargs in self.items]
    226 
    227     def __len__(self):

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\linear_model\_logistic.py in _logistic_regression_path(X, y, pos_class, Cs, fit_intercept, max_iter, tol, verbose, solver, coef, class_weight, dual, penalty, intercept_scaling, multi_class, random_state, check_input, max_squared_sum, sample_weight, l1_ratio)
    762             n_iter_i = _check_optimize_result(
    763                 solver, opt_res, max_iter,
--> 764                 extra_warning_msg=_LOGISTIC_SOLVER_CONVERGENCE_MSG)
    765             w0, loss = opt_res.x, opt_res.fun
    766         elif solver == 'newton-cg':

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\utils\optimize.py in _check_optimize_result(solver, result, max_iter, extra_warning_msg)
    241                 "    https://scikit-learn.org/stable/modules/"
    242                 "preprocessing.html"
--> 243             ).format(solver, result.status, result.message.decode("latin1"))
    244             if extra_warning_msg is not None:
    245                 warning_msg += "\n" + extra_warning_msg

AttributeError: 'str' object has no attribute 'decode'    

I searched for this error, and most of the responses say it happens because scikit-learn tries to decode a string that is already decoded. But I don't know how to solve it in my case. I made sure all my data is either integer or float64, with no strings.
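To make the "already decoded string" explanation concrete, here is a minimal sketch of the failure shown on the last traceback frame, where `.decode("latin1")` is called on `result.message`. In Python 3 only `bytes` has a `.decode()` method, so the call fails whenever the message arrives as `str`. The helper `message_to_text` and the sample message text are hypothetical and only illustrate the mechanism; they are not part of scikit-learn.

```python
def message_to_text(message, encoding="latin1"):
    """Return the message as str, whether it arrives as bytes or as str.

    Hypothetical helper for illustration; not a scikit-learn function.
    """
    if isinstance(message, bytes):
        return message.decode(encoding)
    return message


# Illustrative message text (an assumption, not copied from a real run).
as_bytes = b"example optimizer message"
as_text = "example optimizer message"

# bytes can be decoded ...
print(as_bytes.decode("latin1"))

# ... but str has no .decode() in Python 3, which reproduces the error type.
try:
    as_text.decode("latin1")
except AttributeError as exc:
    print("AttributeError:", exc)

# A type check handles both cases.
print(message_to_text(as_bytes) == message_to_text(as_text))
```

This also suggests why the input data types do not matter here: the string being decoded is the optimizer's status message, not anything from the feature matrix.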

Please help, and let me know if I missed anything in my question



1 Answer

0 votes
by (71.8m points)

I encountered the same problem. Unfortunately, I work with sensitive data and cannot easily share it, but maybe my description can help:

  • My data is all integer/numeric.
  • It is also unlikely that a weird data point is the problem, since I can shrink my data to 10 rows and 20 columns and still encounter the same error. I can't even single out a specific row or column that causes it; it seems to depend on the complexity of the optimization problem.
  • It only happens with the lbfgs solver.
  • I encountered the problem with sklearn version 0.22.2 and 0.23.0
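Since the error seems tied to particular versions, it may help to report exactly what is installed when comparing notes. The sketch below uses only the standard library (`importlib.metadata`, Python 3.8+). Including scipy, numpy, and joblib in the list is my assumption: the message being decoded comes from the optimizer result, so the versions of the surrounding packages are plausibly relevant, but I have not confirmed which one is responsible.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Packages that appear in the traceback, plus scipy and numpy (assumed relevant).
PACKAGES = ("scikit-learn", "scipy", "numpy", "joblib")

report = {name: installed_version(name) for name in PACKAGES}
for name, found in report.items():
    print(f"{name}: {found or 'not installed'}")
```

Running this in the same environment as the notebook gives a version list that can be posted alongside the traceback.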

My parameters (most of them are defaults):

log_reg_parameters = {
    "penalty": 'l2',
    "fit_intercept": True,
    "random_state": 1,
    "solver": "lbfgs",
    "max_iter": 100
}
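Because the error only shows up with lbfgs, and the failing line in the traceback is the one that formats the solver's "failed to converge" warning, two variations of the parameters above seem worth trying: another solver, or more iterations so that lbfgs converges and the warning is never built. This is a sketch of how I would set up those variants, not a confirmed fix. `with_overrides` is a hypothetical helper, and `"liblinear"` and `max_iter=1000` are just example choices.

```python
# Parameters as listed above.
log_reg_parameters = {
    "penalty": "l2",
    "fit_intercept": True,
    "random_state": 1,
    "solver": "lbfgs",
    "max_iter": 100,
}


def with_overrides(params, **overrides):
    """Return a copy of params with selected entries replaced (hypothetical helper)."""
    return {**params, **overrides}


# Candidate settings to try; the original dict is left unchanged.
candidates = [
    with_overrides(log_reg_parameters, solver="liblinear"),  # a different solver
    with_overrides(log_reg_parameters, max_iter=1000),       # give lbfgs more iterations
]

for params in candidates:
    print(params)
    # With your own data X, y (not defined here), each variant would be used as:
    # model = LogisticRegression(**params).fit(X, y)
```

Scaling the features before fitting is another option in the same spirit, since the warning text in the traceback itself points to the scikit-learn preprocessing documentation.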
