Thanks for the idea, Jacob. I am going to consider writing that article.

To answer your question: Random Forest in sklearn exposes the feature_importances_ attribute on a fitted model, and LightGBM has the plot_importance function. Both report feature importance.
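As a rough illustration of reading feature importance from a Random Forest, here is a minimal sketch. The synthetic dataset, the feature names f0 to f4, and the hyperparameters are assumptions made for the example, not something from a real project.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic data (an assumption for illustration): 5 features, of which
# only the first 2 actually drive the target (shuffle=False keeps them first).
X, y = make_regression(
    n_samples=500, n_features=5, n_informative=2, shuffle=False, random_state=0
)
feature_names = [f"f{i}" for i in range(X.shape[1])]  # hypothetical names

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# feature_importances_ exists only after fit; the impurity-based scores sum to 1.
importances = dict(zip(feature_names, model.feature_importances_))
ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)

for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

With a fitted LightGBM model, lightgbm.plot_importance(model) draws a comparable ranking as a bar chart (it needs matplotlib installed).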

When working on feature engineering, you look at feature importance and try to engineer new features similar to the most important one. By similar I don't mean correlated, but features that look at the problem from the same perspective as that feature.

When I add a new feature, I also check how the model responds to it: how do lower or higher values of the feature influence the target? Let me know if you would like to learn more about these tools.
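One simple way to check how a model responds to a feature is to force that feature to a few fixed values and compare the average prediction at each. This is only a sketch of that idea (a hand-rolled partial dependence), not necessarily the tool referred to above; the data, the target formula, and the helper name average_response are made up for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(400, 3))
# Hypothetical target: rises with feature 0, depends on feature 1 squared,
# and ignores feature 2.
y = 3.0 * X[:, 0] + X[:, 1] ** 2 + rng.normal(0.0, 0.1, size=400)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)


def average_response(model, X, feature_idx, grid):
    """Mean prediction when one feature is forced to each value in grid."""
    means = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature_idx] = value  # overwrite the feature for every row
        means.append(model.predict(X_mod).mean())
    return np.array(means)


grid = np.linspace(-1.0, 1.0, 5)
response = average_response(model, X, feature_idx=0, grid=grid)

for value, mean_pred in zip(grid, response):
    print(f"feature 0 = {value:+.1f} -> mean prediction {mean_pred:.2f}")
```

If the mean prediction climbs as the feature value increases, higher values of that feature push the target up, which is the kind of response worth checking after adding a feature.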

Written by

Senior Data Scientist, tweeting twitter.com/romanorac.
