Machine Learning Hello World in Less than 50 lines

October 22, 2018

Supervised learning has proved useful for learning from labeled data. This article aims to help beginners and intermediate students implement their own machine learning models. Machine learning models can be grouped into three main classes: supervised, unsupervised, and reinforcement learning.

What is Supervised Learning?

Imagine having a pile of oranges and apples mixed together. You want to sort them, but there are so many that doing it by hand would be tedious. You do, however, have a machine that could do the sorting, except that it needs to know what distinguishes an apple from an orange. So all you need is to build a classifier and embed it in your machine. Let's say you succeed in building the classifier using some features of both fruits, and your machine now sorts the apples from the oranges. It's not by magic that your machine sorts the fruit; it learned from labeled examples through a process called supervised learning.
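To make the fruit example concrete, here is a minimal sketch of such a classifier. The measurements (weight in grams and a smooth/bumpy skin flag) are made up for illustration, and the choice of a decision tree is an assumption, not something the sorting machine above prescribes.

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical labeled examples: [weight in grams, skin texture (0 = smooth, 1 = bumpy)]
features = [[150, 0], [170, 0], [140, 0], [130, 1], [160, 1], [180, 1]]
labels = ["apple", "apple", "apple", "orange", "orange", "orange"]

# "Supervised" means the model is shown the features together with the correct labels
classifier = DecisionTreeClassifier(random_state=0)
classifier.fit(features, labels)

# Ask the trained classifier about two fruits it has not seen before
predictions = classifier.predict([[155, 0], [165, 1]])
print(predictions)
```

With this toy data the skin texture separates the two fruits perfectly, so the smooth fruit should come back as an apple and the bumpy one as an orange.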

First Machine Learning Project

Now that you have a general idea of how supervised learning works, we are going to implement a model that classifies the samples in a given dataset into their labels.

Things you will need installed:

Python (2 or 3)
pandas
NumPy
scikit-learn

If you have Anaconda installed, you don't need to install these, as Anaconda comes with Python and the libraries above already installed. Having problems? Feel free to send me a message describing the problem and I will be glad to help.
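A quick way to confirm that everything is installed is to import each library and print its version. This is only a sanity check sketched for convenience; the exact versions you see will depend on your setup.

```python
import sys

import numpy as np
import pandas as pd
import sklearn

# Collect the version of Python and of each required library
versions = {
    "python": sys.version.split()[0],
    "numpy": np.__version__,
    "pandas": pd.__version__,
    "scikit-learn": sklearn.__version__,
}

for name, version in versions.items():
    print("{}: {}".format(name, version))
```

If any of the imports fails with an ImportError, that library still needs to be installed.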

I will be using Google's hosted version of the open source Jupyter Notebook for this tutorial. Our dataset will be the classic Iris dataset. The first thing we need to do is import the data. Fortunately, scikit-learn already comes with this dataset, so there is no need to download it. The rest of the project is pure Python code, a snippet of which can be found below. The full code is on my GitHub repository.


# Import libraries needed
from sklearn import datasets
import pandas as pd
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Get the iris Dataset
ds = datasets.load_iris()
print('label_names: {}'.format(ds.target_names))
print('feature_names: {}'.format(ds.feature_names))
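To get a feel for the data before training anything, it can help to view it as a table. The following is a small sketch using pandas; the variable name df and the added species column are illustrative choices, not part of the original snippet.

```python
import pandas as pd
from sklearn import datasets

ds = datasets.load_iris()

# One row per flower, one column per measured feature
df = pd.DataFrame(ds.data, columns=ds.feature_names)

# Add the label as a readable name instead of the integer code
df["species"] = [ds.target_names[i] for i in ds.target]

print(df.shape)
print(df.head())
```

The Iris dataset has 150 flowers with four measurements each, so the table has 150 rows and five columns once the species column is added.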

The full version of the code, with outputs, can be found here.
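The article's snippet stops after loading the data, so here is a minimal sketch of how the remaining steps might look using the classes it imports. This is not the full code from the repository: the 30% test split, the random seed, and the model settings are all assumptions chosen for illustration.

```python
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

ds = datasets.load_iris()

# Hold out 30% of the flowers for testing (split ratio and seed are assumptions)
X_train, X_test, y_train, y_test = train_test_split(
    ds.data, ds.target, test_size=0.3, random_state=42, stratify=ds.target
)

# Two models to compare; the hyperparameters here are illustrative, not tuned
models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "logistic_regression": LogisticRegression(max_iter=1000),
}

scores = {}
for name, model in models.items():
    # Learn from the labeled training data, then score on the unseen test data
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print('{} accuracy: {:.3f}'.format(name, scores[name]))
```

Scoring on held-out data rather than on the training data gives a more honest estimate of how the model would do on flowers it has never seen.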

Felix Agbavor