April 6, 2022 10:31 pm GMT

Introduction to Scikit-Learn

In this article, we will discuss scikit-learn, an open-source Python library that provides a range of supervised and unsupervised machine learning algorithms. scikit-learn is built on top of several powerful packages, including:
NumPy
Matplotlib
SciPy
These packages must be installed (from the terminal) and imported before using scikit-learn, and scikit-learn itself is imported in the same way. In particular, scikit-learn is built upon SciPy (Scientific Python), which must also be installed.
To install SciPy, type the command below in the terminal:
pip install scipy
scikit-learn itself is installed the same way:
pip install scikit-learn
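To confirm that the packages are installed and importable, a quick check like the one below can help. This is a minimal sketch, not part of the original walkthrough; the `versions` dictionary is just an illustrative name, and note that scikit-learn is imported under the module name `sklearn`.

```python
import importlib

# Import each package by its module name and record its version string.
versions = {}
for name in ("numpy", "scipy", "matplotlib", "sklearn"):
    module = importlib.import_module(name)
    versions[name] = module.__version__

for name, version in versions.items():
    print(name, version)
```

If any of the imports fails with a ModuleNotFoundError, that package still needs to be installed with pip.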
scikit-learn comes with a few sample datasets, such as iris and digits. To classify them, we import svm, scikit-learn's Support Vector Machine module. An SVM is a supervised learning method that can be used for classification and regression.
We can take the digits dataset and have a classifier categorize the numbers for us. Let's start by loading the data with the code below:

import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn import svm

digits = datasets.load_digits()
print(digits.data)

The Output of the above code will be:
[[ 0. 0. 5. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 10. 0. 0.]
[ 0. 0. 0. ..., 16. 9. 0.]
...,
[ 0. 0. 1. ..., 6. 0. 0.]
[ 0. 0. 2. ..., 12. 0. 0.]
[ 0. 0. 10. ..., 12. 1. 0.]]

digits.data gives us access to the features that can be used to classify the digit samples. The labels and the original images can be accessed in the same way. Let's consider the following lines of code:

import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn import svm

digits = datasets.load_digits()
print(digits.target)
print(digits.images[0])

OUTPUT:
[0 1 2 ..., 8 9 8] // target of the data
[[ 0. 0. 5. 13. 9. 1. 0. 0.] // image of the data
[ 0. 0. 13. 15. 10. 15. 5. 0.]
[ 0. 3. 15. 2. 0. 11. 8. 0.]
[ 0. 4. 12. 0. 0. 8. 8. 0.]
[ 0. 5. 8. 0. 0. 9. 8. 0.]
[ 0. 4. 11. 0. 1. 12. 7. 0.]
[ 0. 2. 14. 5. 10. 12. 0. 0.]
[ 0. 0. 6. 13. 10. 0. 0. 0.]]

In the output above, both the target labels and the first image of the dataset are printed.
digits.target gives the ground truth for the digits dataset, i.e., the number corresponding to each digit image. Note that data is always a 2D array of shape (n_samples, n_features), even though the original data may have had a different shape. In the case of digits, each original sample is an image of shape (8, 8), which can be accessed using digits.images.
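The relationship between data and images can be checked directly. The short sketch below prints both shapes and verifies that flattening each 8x8 image yields the corresponding 64-feature row of digits.data; the variable name `flattened` is only illustrative.

```python
import numpy as np
from sklearn import datasets

digits = datasets.load_digits()

print(digits.data.shape)    # (1797, 64): n_samples rows, n_features columns
print(digits.images.shape)  # (1797, 8, 8): one 8x8 image per sample

# Flatten every 8x8 image into a row of 64 values and compare with data.
flattened = digits.images.reshape(len(digits.images), -1)
print(np.array_equal(flattened, digits.data))  # True
```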
A look at learning and predicting
The digits dataset has 10 possible classes (the digits 0 to 9), and the task is to predict which digit a given image represents. To achieve this, we need an estimator, which predicts the classes to which unseen samples belong. In scikit-learn, an estimator for classification is a Python object that implements the methods fit(X, y) and predict(T). Let's consider the following example:

import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn import svm

digits = datasets.load_digits()  # load the dataset
clf = svm.SVC(gamma=0.001, C=100)
print(len(digits.data))

# Train on every sample except the last one
x, y = digits.data[:-1], digits.target[:-1]
clf.fit(x, y)

# predict() expects a 2D array, so slice with [-1:] to keep the last sample as one row
print('Prediction:', clf.predict(digits.data[-1:]))

plt.imshow(digits.images[-1], cmap=plt.cm.gray_r, interpolation="nearest")
plt.show()

OUTPUT:
1797
Prediction: [8]

Image here

In the above example, we first printed the size of the dataset, which holds 1797 samples. Next, we used all but the last sample (1796 examples) as training data, selected with the slice [:-1], and kept the last sample, at index -1, for testing. To check whether the model predicted the right digit, we used Matplotlib to display the image of that last sample.
In a nutshell: we loaded the digits data, took the targets, fit the estimator, and predicted. That's all.
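A single prediction says little about how well the classifier handles unseen data in general. One common way to check this, sketched below as an addition to the original walkthrough, is to hold out part of the dataset and measure accuracy on it. The 50/50 split and the use of train_test_split and accuracy_score are choices made for this illustration; the SVC parameters are the ones used above.

```python
from sklearn import datasets, svm
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()

# Hold out half of the samples for testing; shuffle=False keeps the original order.
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.5, shuffle=False
)

clf = svm.SVC(gamma=0.001, C=100)
clf.fit(X_train, y_train)

# Predict the held-out samples and compare against the ground truth.
predicted = clf.predict(X_test)
accuracy = accuracy_score(y_test, predicted)
print("Accuracy on held-out samples:", accuracy)
```

The accuracy is the fraction of held-out images whose predicted digit matches digits.target, so a value close to 1.0 means the estimator is classifying most unseen digits correctly.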
Now, we can go ahead and visualize the target labels with an image as in the code below:

import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn import svm

digits = datasets.load_digits()
# Join the images and target labels in a list
images_and_labels = list(zip(digits.images, digits.target))

# For every element in the list
for index, (image, label) in enumerate(images_and_labels[:8]):
    # Initialize a subplot in a 2x4 grid at position index + 1
    plt.subplot(2, 4, index + 1)
    # Display the image in the subplot
    plt.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    # Add a title to each subplot
    plt.title('Training: ' + str(label))

# Show the plot
plt.show()

OUTPUT:

Images here

As can be seen in the code above, we used the zip function to join the images and target labels in a list and saved it in the variable images_and_labels.
We then looped over the first eight elements, placed each one in a 2 by 4 grid of subplots, displayed the image with Matplotlib, and added a title of the form 'Training: <label>'.
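To see what the zipped list actually contains, it can help to inspect it without plotting. The sketch below is an illustrative addition; the names `first_image`, `first_label`, and `first_labels` are not from the original code.

```python
from sklearn import datasets

digits = datasets.load_digits()
images_and_labels = list(zip(digits.images, digits.target))

# Each element is an (image, label) pair.
first_image, first_label = images_and_labels[0]
print(first_image.shape)  # shape of one image array
print(first_label)        # the digit that image represents

# Labels of the eight images that the plotting loop displays.
first_labels = [int(label) for _, label in images_and_labels[:8]]
print(first_labels)
```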


Original Link: https://dev.to/mukumbuta/introduction-to-scikit-learn-3eji
