Machine learning: your first object detection

Today we talk about machine learning.

Apple recently introduced Core ML. Core ML is built to work with a trained model and can easily be used in a mobile app. But how do you create this model?

A few weeks ago, Apple released Turi Create, an open source framework that makes it easy to create models for Core ML.

Machine learning can be used for recommendations, object detection, image classification, image similarity, or activity classification, for example.

The goal of this tutorial

In this tutorial we will learn how to create an object detection script. We will train the model to find the head of a cat 🐱. At the end, given an image containing a cat 🐈, the model will give us the position of the head along with a prediction confidence.

Object detection

Object detection is the task of simultaneously classifying and localizing object instances in an image.

Coding!

First step: install Turi Create

All steps were carried out on a Mac, but you can also try them on Ubuntu, for example.

Turi Create requires Python to train your model. We use pip and virtualenv to set it up.

Open a terminal and run:

sudo easy_install pip
sudo pip install virtualenv

Now, create a new home for this amazing project 😸

mkdir theevilcat
cd theevilcat

Create and activate the virtualenv:

virtualenv venv
source venv/bin/activate

You are now in your virtualenv

Let’s install turicreate

pip install -U turicreate

Second part? Coding!

First, download the image archive, cat.zip, and unzip it in the root folder of the project. This is your image database.

https://www.dropbox.com/sh/rgm8cfsyhvgw1xg/AACsdIgrM05oj4uMAhsAv9Pfa?dl=0

Now for your first machine learning script. We start by defining the annotations. Each annotation requires the label and the position of the object: height and width are the size of the bounding box, and x and y are the center of the rectangle.
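Many image editors report a selection by its corners rather than its center, so a small conversion helper can be handy when writing annotations by hand. The sketch below is only an illustration: the function names are made up for this post, and it assumes pixel coordinates with the origin at the top-left corner of the image.

```python
# Hypothetical helpers (not part of Turi Create) for converting between
# the center-based annotation format and corner-based pixel edges.
# Assumption: pixel coordinates, origin at the top-left of the image.

def center_to_corners(coordinates):
    """Convert {'x', 'y', 'width', 'height'} to (left, top, right, bottom)."""
    half_w = coordinates['width'] / 2
    half_h = coordinates['height'] / 2
    left = coordinates['x'] - half_w
    top = coordinates['y'] - half_h
    right = coordinates['x'] + half_w
    bottom = coordinates['y'] + half_h
    return left, top, right, bottom


def corners_to_center(left, top, right, bottom):
    """Convert corner edges back to the center-based annotation format."""
    return {
        'x': (left + right) / 2,
        'y': (top + bottom) / 2,
        'width': right - left,
        'height': bottom - top,
    }


# The first annotation used in this tutorial
box = {'height': 1287, 'width': 1226, 'x': 1561, 'y': 1952}
print(center_to_corners(box))  # (948.0, 1308.5, 2174.0, 2595.5)
```

If you measured a box by its corners, corners_to_center gives you the dictionary to put under 'coordinates'.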


Now create the file build.py:

# First, import the turicreate module
import turicreate as tc
# Define where the cat can be found in each image
annotations = tc.SArray([
 [
  {'label': 'cat', 'coordinates': {'height': 1287, 'width': 1226, 'x': 1561, 'y': 1952}}
 ],
 [
  {'label': 'cat', 'coordinates': {'height': 2049, 'width': 2077, 'x': 2212, 'y': 1379}}
 ],
 [
  {'label': 'cat', 'coordinates': {'height': 1871, 'width': 1892, 'x': 1890, 'y': 1121}}
 ],
 [
  {'label': 'cat', 'coordinates': {'height': 1344, 'width': 1427, 'x': 2601, 'y': 814}}
 ],
 [
  {'label': 'cat', 'coordinates': {'height': 2716, 'width': 2728, 'x': 4429, 'y': 1802}}
 ]
])
# Load the images (same order as the annotations)
images = tc.SArray([
 tc.Image('cat1.jpg'),
 tc.Image('cat2.jpg'),
 tc.Image('cat3.jpg'),
 tc.Image('cat4.jpg'),
 tc.Image('cat5.jpg')
])
# Merge images and annotations
data = tc.SFrame({'image': images, 'annotations': annotations})
# This part works only on Mac.
#data['image_with_ground_truth'] = \
#    tc.object_detector.util.draw_bounding_boxes(data['image'], data['annotations'])
#data.explore()
# Split the data randomly (about 80% training, 20% test)
train_data, test_data = data.random_split(0.8)
# Create a model
model = tc.object_detector.create(train_data, max_iterations=1000)
# Save predictions to an SArray
predictions = model.predict(test_data)
# Evaluate the model and save the results into a dictionary
metrics = model.evaluate(test_data)
# Save the model for later use in Turi Create
model.save('mymodel.model')
# Export for use in Core ML
model.export_coreml('MyCustomObjectDetector.mlmodel')

Now you can build your model, but this part takes a while, so grab a ☕️. It is really long: it took 20 hours on my MacBook Pro 😴😴😴

This step can be shortened if you use CUDA with an NVIDIA GPU.

python build.py

You can reduce the build time with a smaller max_iterations, but the accuracy will drop!

After this build, you get your first trained model, along with a version in Core ML format. If you want to skip this long build, you can download the model from Dropbox.

Try your dataset

Now we write a first test script for your trained model. Download test.jpg into the root folder of the project: https://www.dropbox.com/sh/rgm8cfsyhvgw1xg/AACsdIgrM05oj4uMAhsAv9Pfa?dl=0

import turicreate as tc
# Load the image
test = tc.SFrame({'image': [tc.Image('test.jpg')]})
# Load the model
model = tc.load_model('mymodel.model')
# Run the detector and print what it found
test['predictions'] = model.predict(test)
print(test['predictions'])
# This part works only on Mac.
#test['image_with_predictions'] = \
#    tc.object_detector.util.draw_bounding_boxes(test['image'], test['predictions'])
#test.explore()

If all goes well, the script prints the coordinates where the cat was found in the image, along with the confidence.
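A detector can return several candidate boxes for one image, so you may want to keep only the confident ones. The sketch below is a plain Python illustration: the function name is made up for this post, the sample values are invented (they are not real output from the model), and it assumes each prediction is a dictionary with 'label', 'confidence' and 'coordinates' keys, like the annotations above plus a confidence.

```python
# Hypothetical helper (not part of Turi Create) for filtering detections.
# Assumption: each detection is a dict with a 'confidence' key in [0, 1].

def confident_detections(predictions, threshold=0.5):
    """Keep detections at or above the threshold, most confident first."""
    kept = [p for p in predictions if p['confidence'] >= threshold]
    return sorted(kept, key=lambda p: p['confidence'], reverse=True)


# Invented sample predictions for a single image, for illustration only
sample = [
    {'label': 'cat', 'confidence': 0.31,
     'coordinates': {'x': 400, 'y': 300, 'width': 120, 'height': 110}},
    {'label': 'cat', 'confidence': 0.92,
     'coordinates': {'x': 820, 'y': 610, 'width': 500, 'height': 480}},
    {'label': 'cat', 'confidence': 0.55,
     'coordinates': {'x': 790, 'y': 640, 'width': 450, 'height': 430}},
]

for detection in confident_detections(sample, threshold=0.5):
    print(detection['label'], detection['confidence'], detection['coordinates'])
```

In the test script you could pass the list of predictions for the first image, for example test['predictions'][0], instead of the invented sample. The 0.5 threshold is arbitrary, so tune it for your own images.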

That’s fun and easy, isn’t it?