Another nice Medium post from Benjamin Obi Tayo has a good summary of the types of issues you should always be mindful of when you get a new data set:
When we train a machine-learning model, we almost always report some performance metric, such as accuracy, recall, or F1-score.
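As a quick illustration of the metrics named above, here is a minimal sketch using scikit-learn's `sklearn.metrics` functions. The labels are made-up toy values chosen for this example, not data from the linked post.

```python
from sklearn.metrics import accuracy_score, f1_score, recall_score

# Toy binary labels, invented purely for illustration.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Fraction of all predictions that match the true label.
accuracy = accuracy_score(y_true, y_pred)
# Fraction of true positives that the model recovered.
recall = recall_score(y_true, y_pred)
# Harmonic mean of precision and recall.
f1 = f1_score(y_true, y_pred)

print(f"accuracy={accuracy:.2f} recall={recall:.2f} f1={f1:.2f}")
```

A single number like this says nothing about how much the metric would vary across different train/test splits, which is worth keeping in mind when reporting it.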
Pandas introduced pipe() in version 0.16.2. pipe() lets you use user-defined functions inside method chains.
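A minimal sketch of what this looks like in practice. The helper functions `drop_missing` and `add_ratio` and the column names are hypothetical, made up for this example; only `DataFrame.pipe` itself comes from pandas.

```python
import pandas as pd


def drop_missing(df):
    """Hypothetical helper: drop rows that contain any missing value."""
    return df.dropna()


def add_ratio(df, numerator, denominator, name="ratio"):
    """Hypothetical helper: add a column holding numerator / denominator."""
    out = df.copy()  # avoid mutating the caller's frame
    out[name] = out[numerator] / out[denominator]
    return out


raw = pd.DataFrame({"a": [1.0, 2.0, None, 4.0], "b": [2.0, 4.0, 8.0, 8.0]})

# pipe() passes the DataFrame as the first argument to each function,
# followed by any extra positional or keyword arguments.
result = (
    raw
    .pipe(drop_missing)
    .pipe(add_ratio, "a", "b", name="a_over_b")
)

print(result)
```

Without pipe(), the same steps would have to be written inside-out as `add_ratio(drop_missing(raw), "a", "b", name="a_over_b")`, which gets harder to read as the chain grows.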
The plot_tree() function allows you to create a diagram of steps present in a decision tree model:
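A minimal sketch, assuming the function meant here is scikit-learn's `sklearn.tree.plot_tree` and using the bundled iris data set as a stand-in; the tree depth and figure size are arbitrary choices for this example.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, plot_tree

iris = load_iris()

# Keep the tree shallow so the diagram stays readable.
clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(iris.data, iris.target)

fig, ax = plt.subplots(figsize=(8, 5))

# plot_tree draws one box per node and returns the matplotlib annotations.
annotations = plot_tree(
    clf,
    feature_names=iris.feature_names,
    class_names=list(iris.target_names),
    filled=True,  # color each node by its majority class
    ax=ax,
)
```

In a notebook, drop the `matplotlib.use("Agg")` line and the figure renders inline; in a script, `fig.savefig(...)` writes it to disk.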
Estimators can be displayed with an HTML representation when shown in a Jupyter notebook. This can be useful to diagnose or visualize a Pipeline with many ...
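A minimal sketch of how this can be tried out, assuming scikit-learn 0.23 or later. The two pipeline steps are arbitrary picks for illustration; `estimator_html_repr` is used here only to get at the same HTML outside a notebook.

```python
from sklearn import set_config
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.utils import estimator_html_repr

# Ask scikit-learn to render estimators as diagrams
# (this is already the default in recent releases).
set_config(display="diagram")

# Arbitrary two-step pipeline, chosen only for illustration.
pipe = make_pipeline(StandardScaler(), LogisticRegression())

# In a Jupyter notebook, leaving `pipe` as the last expression of a cell
# shows the diagram. Outside a notebook, the HTML can be produced as a string.
html = estimator_html_repr(pipe)
print(len(html), "characters of HTML")
```

The resulting string can be written to an `.html` file and opened in a browser if no notebook is at hand.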