Scikit Learn Cheat Sheet
09.07.2019 Scikit-Learn Cheat Sheet: Python Machine Learning (article) - DataCamp
Most of you who are learning data science with Python will definitely have heard about
scikit-learn, the open source Python library that implements a wide variety of machine learning,
preprocessing, cross-validation and visualization algorithms with the help of a unified interface.
If you're still quite new to the field, you should be aware that machine learning, and thus also this
Python library, belong to the must-knows for every aspiring data scientist.
That's why DataCamp has created a scikit-learn cheat sheet for those of you who have already
started learning about the Python package, but who still want a handy reference sheet. Or, if you still
have no idea about how scikit-learn works, this machine learning cheat sheet might come in handy
to get a quick first idea of the basics that you need to know to get started.
Either way, we're sure that you're going to find it useful when you're tackling machine learning
problems!
This scikit-learn cheat sheet will introduce you to the basic steps that you need to go through to
implement machine learning algorithms successfully: you'll see how to load in your data, how to
preprocess it, how to create your own model to which you can fit your data and predict target labels,
how to validate your model and how to tune it further to improve its performance.
In short, this cheat sheet will kickstart your data science projects: with the help of code examples, you'll
have created, validated and tuned your machine learning models in no time.
A Basic Example
Your data needs to be numeric and stored as NumPy arrays or SciPy sparse matrices. Other types that
are convertible to numeric arrays, such as Pandas DataFrame, are also acceptable.
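To make the data requirement above concrete, here is a minimal sketch that builds a small numeric NumPy array, converts it to a SciPy sparse matrix and splits it into training and test sets. The random data and the names `X`, `y` and `X_sparse` are illustrative assumptions, not part of the original cheat sheet.

```python
import numpy as np
from scipy import sparse
from sklearn.model_selection import train_test_split

# Illustrative data: 10 samples with 5 numeric features and binary labels.
rng = np.random.default_rng(0)
X = rng.random((10, 5))
y = np.array([0, 1] * 5)

# The same data stored as a SciPy sparse matrix (small values zeroed first).
X_zeroed = np.where(X < 0.7, 0.0, X)
X_sparse = sparse.csr_matrix(X_zeroed)

# Hold out a quarter of the samples (the default) for evaluation.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
```

Most estimators accept either `X` or `X_sparse` as input, although a few only support dense arrays.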
Standardization
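As a minimal sketch of standardization, assuming a tiny hand-made data set, you could fit a `StandardScaler` on the training data and reuse it on the test data. The arrays below are illustrative only.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Illustrative data: two features on very different scales.
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_test = np.array([[2.0, 40.0]])

# Learn the per-feature mean and standard deviation from the training set only.
scaler = StandardScaler().fit(X_train)

# Apply the same shift and scale to both sets.
standardized_X = scaler.transform(X_train)
standardized_X_test = scaler.transform(X_test)
```

After the transform, each training feature has zero mean and unit variance.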
Normalization
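Normalization rescales each sample (row) to unit norm. A minimal sketch, assuming a small illustrative array and the default L2 norm:

```python
import numpy as np
from sklearn.preprocessing import Normalizer

# Illustrative data: each row is one sample.
X_train = np.array([[3.0, 4.0], [1.0, 0.0]])

# Normalizer is stateless, but fit() keeps the usual estimator workflow.
scaler = Normalizer().fit(X_train)

# Each row is divided by its own Euclidean length.
normalized_X = scaler.transform(X_train)
```

Here the first row `[3, 4]` has length 5, so it becomes `[0.6, 0.8]`.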
Binarization
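Binarization maps every feature value to 0 or 1 depending on a threshold. A minimal sketch, assuming an illustrative array and a threshold of 0.0:

```python
import numpy as np
from sklearn.preprocessing import Binarizer

# Illustrative data with positive, negative and zero values.
X = np.array([[0.5, -1.0, 2.0], [0.0, 3.0, -0.2]])

# Values strictly greater than the threshold become 1, all others become 0.
binarizer = Binarizer(threshold=0.0).fit(X)
binary_X = binarizer.transform(X)
```

Note that a value equal to the threshold is mapped to 0.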
Linear Regression
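A minimal sketch of creating a linear regression estimator, assuming illustrative data that follows y = 2x + 1 exactly. The name `lr` matches the fitting snippet later in this cheat sheet; the data is made up.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative data on a perfect line: y = 2x + 1.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 2.0 * X.ravel() + 1.0

# Create the estimator, then fit it to recover slope and intercept.
lr = LinearRegression()
lr.fit(X, y)
```

On this noise-free data, `lr.coef_` and `lr.intercept_` recover the slope and offset of the line.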
Naive Bayes
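A minimal sketch of a Gaussian Naive Bayes classifier, assuming two well-separated illustrative classes. The name `gnb` and the data are assumptions for this example.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Illustrative data: class 0 sits around -1.5, class 1 around +1.5.
X = np.array([[-2.0], [-1.5], [-1.0], [1.0], [1.5], [2.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# Create the estimator and fit one Gaussian per class and feature.
gnb = GaussianNB()
gnb.fit(X, y)

# Classify two new points, one near each class.
pred = gnb.predict(np.array([[-1.2], [1.8]]))
```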
KNN
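A minimal sketch of a k-nearest neighbors classifier, assuming illustrative one-dimensional data and `n_neighbors=3`:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Illustrative data: two groups far apart on a line.
X = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# Each prediction is a majority vote among the 3 closest training points.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X, y)

pred = knn.predict(np.array([[1.5], [10.5]]))
```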
K Means
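A minimal sketch of a k-means clustering estimator, assuming illustrative data with two obvious groups. The values chosen for `n_init` and `random_state` are assumptions that keep the run reproducible.

```python
import numpy as np
from sklearn.cluster import KMeans

# Illustrative data: two groups far apart on a line.
X = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])

# No labels are passed: k-means only looks at the features.
k_means = KMeans(n_clusters=2, n_init=10, random_state=0)
k_means.fit(X)

# Cluster index assigned to each training sample.
labels = k_means.labels_
```

The cluster indices themselves are arbitrary; only the grouping is meaningful.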
Model Fitting
Supervised learning
>>> lr.fit(X, y)
>>> knn.fit(X_train, y_train)
>>> svc.fit(X_train, y_train)
Unsupervised learning
>>> k_means.fit(X_train)
https://www.datacamp.com/commun ty/blog/sc k t-learn-cheat-sheet
>>> pca_model = pca.fit_transform(X_train)
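The fitting calls above assume that the estimators and the training data already exist. A minimal runnable sketch, assuming the bundled iris data set and illustrative hyperparameters, could look like this:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Illustrative data: the iris data set that ships with scikit-learn.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=33)

# Supervised estimators need both features and labels.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
svc = SVC(kernel="linear").fit(X_train, y_train)

# Unsupervised estimators are fitted on the features alone.
k_means = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_train)

# fit_transform() fits PCA and returns the projected training data.
pca = PCA(n_components=0.95)
pca_model = pca.fit_transform(X_train)
```

Despite its name, `pca_model` holds the transformed data rather than the fitted estimator; the fitted estimator is `pca`.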
Prediction
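Once a model is fitted, you call `predict` (and, for some classifiers, `predict_proba`) on unseen data. A minimal sketch, assuming the iris data set and illustrative estimator settings:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=33)

svc = SVC(kernel="linear").fit(X_train, y_train)
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
k_means = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_train)

# Supervised estimators: predicted class labels for the test set.
y_pred = svc.predict(X_test)

# Per-class probability estimates, one row per test sample.
y_proba = knn.predict_proba(X_test)

# Unsupervised estimators: index of the closest cluster for each sample.
cluster_pred = k_means.predict(X_test)
```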
Classification Metrics
Accuracy Score
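The accuracy score is the fraction of predictions that match the true labels. A minimal sketch with hand-written, illustrative labels:

```python
from sklearn.metrics import accuracy_score

# Illustrative labels: 4 of the 5 predictions are correct.
y_test = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

acc = accuracy_score(y_test, y_pred)
```

Fitted classifiers also expose a `score(X_test, y_test)` method that returns the mean accuracy directly.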
Confusion Matrix
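A confusion matrix counts, for each true class (rows), how often each class was predicted (columns). A minimal sketch with illustrative labels:

```python
from sklearn.metrics import confusion_matrix

# Illustrative labels: one sample of class 1 is misclassified as class 0.
y_test = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

# Row i, column j: number of samples of true class i predicted as class j.
cm = confusion_matrix(y_test, y_pred)
```

For this data the matrix is `[[2, 0], [1, 2]]`: both class-0 samples are correct, and one of the three class-1 samples is predicted as class 0.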
Regression Metrics
https://www.datacamp.com/commun ty/blog/sc k t-learn-cheat-sheet 6/9
09.07.2019 Sc k t-Learn Cheat Sheet: Python Mach ne Learn ng (art cle) - DataCamp
R2 Score
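The R2 score compares the squared error of the predictions with the squared error of always predicting the mean of the true values. A minimal sketch with illustrative numbers:

```python
from sklearn.metrics import r2_score

# Illustrative regression targets and predictions.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

# 1.0 is a perfect fit; 0.0 is no better than predicting the mean.
r2 = r2_score(y_true, y_pred)
```

Here the residual sum of squares is 1.5 and the total sum of squares is 29.1875, so the score is 1 - 1.5 / 29.1875, roughly 0.949.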
Clustering Metrics
Homogeneity
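Homogeneity checks whether each cluster contains only members of a single class. A minimal sketch with illustrative labels, where the clusters match the classes but use swapped indices:

```python
from sklearn.metrics import homogeneity_score

# Illustrative labels: same grouping, different cluster indices.
labels_true = [0, 0, 1, 1]
labels_pred = [1, 1, 0, 0]

# The score ignores how clusters are numbered, so this is a perfect 1.0.
h = homogeneity_score(labels_true, labels_pred)
```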
V-measure
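With its default settings, the V-measure is the harmonic mean of homogeneity and completeness. A minimal sketch with illustrative labels, in which one class is split over two clusters:

```python
from sklearn.metrics import completeness_score, homogeneity_score, v_measure_score

# Illustrative labels: class 1 is split into clusters 1 and 2.
labels_true = [0, 0, 1, 1]
labels_pred = [0, 0, 1, 2]

h = homogeneity_score(labels_true, labels_pred)   # every cluster is pure
c = completeness_score(labels_true, labels_pred)  # class 1 is split up
v = v_measure_score(labels_true, labels_pred)     # harmonic mean of h and c
```

Splitting a class keeps homogeneity at its maximum but lowers completeness, so the V-measure drops below 1.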
Cross-Validation
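Cross-validation fits and scores a model on several train/test splits of the same data. A minimal sketch, assuming the iris data set, a KNN classifier and 4 folds (all illustrative choices):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
knn = KNeighborsClassifier(n_neighbors=5)

# One accuracy value per fold; the estimator is re-fitted for every fold.
scores = cross_val_score(knn, X, y, cv=4)
mean_score = scores.mean()
```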
Grid Search
>>> print(grid.best_score_)
>>> print(grid.best_estimator_.n_neighbors)
>>> print(rsearch.best_score_)
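The `grid` and `rsearch` objects printed above have to be created and fitted first. A minimal sketch, assuming the iris data set, a KNN classifier and illustrative parameter ranges:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Grid search: try every combination of the listed values.
params = {"n_neighbors": [1, 2, 3], "metric": ["euclidean", "cityblock"]}
grid = GridSearchCV(estimator=KNeighborsClassifier(), param_grid=params, cv=5)
grid.fit(X, y)

# Randomized search: sample n_iter combinations instead of trying all of them.
rparams = {"n_neighbors": [1, 2, 3, 4], "weights": ["uniform", "distance"]}
rsearch = RandomizedSearchCV(
    estimator=KNeighborsClassifier(),
    param_distributions=rparams,
    cv=4,
    n_iter=8,
    random_state=5,
)
rsearch.fit(X, y)

print(grid.best_score_)
print(grid.best_estimator_.n_neighbors)
print(rsearch.best_score_)
```

Randomized search is usually the cheaper option when the parameter space is too large to enumerate.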
Going Further
Begin with our scikit-learn tutorial for beginners, in which you'll learn in an easy, step-by-step way how
to explore handwritten digits data, how to create a model for it, how to fit your data to your model and
how to predict target values. In addition, you'll make use of Python's data visualization library
matplotlib to visualize your results.