Raspberry Pi robot: real-time object detection using remote TensorFlow server

Finished product first:

The video above shows real-time object detection using a neural network model called Single Shot MultiBox Detector (SSD). The neural network model was trained on (and runs with) TensorFlow.

The “AI” model works much better than older Computer Vision (OpenCV) techniques. For comparison, here’s a previous attempt to track objects with the color “tomato”. It worked by filtering each camera frame for a specific color, then using the resulting mask to find the contour of an object.
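To make the color-filter idea concrete, here is a minimal sketch of it in plain Python. It is not the original OpenCV code: the tolerance value, the function names and the tiny test frame are all made up for illustration, and a bounding box stands in for a real contour.

```python
# Pure-Python stand-in for the color-mask idea; real code would use OpenCV.
TOMATO_RGB = (255, 99, 71)  # CSS "tomato"; the tolerance below is an assumed value

def color_mask(frame, target=TOMATO_RGB, tolerance=40):
    """Return a 2D list of booleans: True where a pixel is close to target."""
    return [
        [all(abs(c - t) <= tolerance for c, t in zip(pixel, target)) for pixel in row]
        for row in frame
    ]

def bounding_box(mask):
    """Return (left, top, right, bottom) of the True pixels, or None if empty."""
    hits = [(x, y) for y, row in enumerate(mask) for x, on in enumerate(row) if on]
    if not hits:
        return None
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    return min(xs), min(ys), max(xs), max(ys)

# A 4x4 black frame with a 2x2 tomato-colored patch.
frame = [[(0, 0, 0)] * 4 for _ in range(4)]
for y in (1, 2):
    for x in (2, 3):
        frame[y][x] = (250, 100, 70)

box = bounding_box(color_mask(frame))
```

Notice that anything within the tolerance passes the filter, whether or not it is the object you care about.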

There were many problems with this approach. For example, color-based detection is sensitive to lighting changes. It performs differently depending on whether it’s day or night time. Also, we get false detections because many other objects in the scene may have the same color.

But one advantage of classic OpenCV techniques is that they’re simple and fast: fast enough to run in real time on a live camera feed on a tiny Raspberry Pi processor.

On the other hand, AI models are more accurate and powerful. They can detect many different objects in various orientations (even if they’re partially hidden or chopped out of frame). They’re less sensitive to lighting changes. 

But the downside is they’re expensive to run. The Raspberry Pi doesn’t have quite enough horsepower. I tried it and it’s possible… but I wasn’t happy with the performance. I found some interesting USB AI accelerators like Google’s Coral but they’re pricey. 

So I decided to go with the “Mars Rover” approach and use a remote inference server (running on a PC in my living room). This buys us “infinite” computing power for a fixed cost of network latency. As long as your network is fast, this scheme works well.

One thing going for us is even though the Raspberry Pi camera can capture HD images, the SSD AI model only needs 300×300 pixel inputs. So we can downsize the images before sending them over the network to improve performance.
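Some back-of-the-envelope arithmetic shows how much the downsizing saves. This sketch assumes "HD" means 1920×1080 and counts raw, uncompressed RGB bytes. Real frames are usually compressed before sending, so the absolute sizes will be smaller, but the ratio is the point.

```python
import math

def raw_bytes(width, height, channels=3):
    """Size of an uncompressed image with one byte per channel."""
    return width * height * channels

def base64_len(n_bytes):
    """Length of the Base64 text for n_bytes of input (with padding)."""
    return 4 * math.ceil(n_bytes / 3)

hd = raw_bytes(1920, 1080)  # assumed HD resolution
small = raw_bytes(300, 300)  # the SSD input size

print(hd, small, round(hd / small, 2))  # 6220800 270000 23.04
print(base64_len(small))                # 360000
```

So the small frame is about 23 times lighter than the HD one, and Base64 adds roughly a third on top of whatever you send.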

Here’s the Raspberry Pi code. All it does is capture frames, encode them as Base64 strings and make requests to our remote inference server:
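For a rough idea of what the client side looks like, here is a minimal sketch using only the Python standard library. The server address, the /predict route and the "image" JSON field are all hypothetical. Camera capture is left out, so the function takes already-encoded image bytes.

```python
import base64
import json
import urllib.request

SERVER_URL = "http://192.168.1.10:5000/predict"  # hypothetical address and route

def build_request(image_bytes, url=SERVER_URL):
    """Wrap encoded image bytes in a JSON POST request (not yet sent)."""
    payload = {"image": base64.b64encode(image_bytes).decode("ascii")}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def send_frame(image_bytes, url=SERVER_URL, timeout=2.0):
    """POST one frame and return the server's decoded JSON reply."""
    with urllib.request.urlopen(build_request(image_bytes, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Building a request needs no network, so the encoding can be checked offline.
req = build_request(b"\xff\xd8fake-jpeg-bytes")
decoded = base64.b64decode(json.loads(req.data)["image"])
```

A short timeout is worth having here: for a live feed, a dropped frame is better than a robot that stalls waiting on a slow reply.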

Now for the object detection neural network. Google provides a set of pre-trained models for object detection in their Model Zoo. The model I picked was ssdlite_mobilenet_v2_coco. You’ll also need this file to convert the detection IDs to text labels: coco-labels-paper.txt 
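The label lookup itself can be tiny. This sketch assumes the labels file has one label per line, in ID order, and that the model's class IDs start at 1. Both are assumptions to check against your copy of the file. The function names are made up for illustration.

```python
def load_labels(path):
    """Read one label per line; line 1 is assumed to match class ID 1."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f]

def label_for(class_id, labels):
    """Map a 1-based class ID to its text label, or 'unknown' if out of range."""
    if 1 <= class_id <= len(labels):
        return labels[class_id - 1]
    return "unknown"

# Small in-memory list for illustration; real code would call
# load_labels("coco-labels-paper.txt") instead.
demo_labels = ["person", "bicycle", "car"]
print(label_for(3, demo_labels))   # car
print(label_for(99, demo_labels))  # unknown
```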

I made a Python module to load and run the pre-trained TensorFlow object detection model. The class Predictor in coco_predictor.py below loads the saved model and provides a predict() function for running inference on new input images.

coco_predictor.py

Finally, the Flask inference server that runs on a PC. The Raspberry Pi sends it images and it replies with detections:
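The request handling at the heart of a server like this is short. The sketch below is framework-agnostic plain Python rather than actual Flask code, so a Flask route would just wrap handle_predict(). The "image" field, the reply format and stub_predict (standing in for the real predictor) are all hypothetical.

```python
import base64
import json

def stub_predict(image_bytes):
    """Hypothetical stand-in for the real predictor; returns a fixed detection."""
    return [{"label": "person", "score": 0.9, "box": [0.1, 0.2, 0.5, 0.8]}]

def handle_predict(body, predict=stub_predict):
    """Decode a JSON request body and return (status_code, JSON reply string)."""
    try:
        image_bytes = base64.b64decode(json.loads(body)["image"], validate=True)
    except (ValueError, KeyError, TypeError):
        # Covers malformed JSON, a missing field and invalid Base64.
        return 400, json.dumps({"error": "expected JSON with a Base64 'image' field"})
    return 200, json.dumps({"detections": predict(image_bytes)})

# Simulate one request from the Pi without any network involved.
body = json.dumps({"image": base64.b64encode(b"frame").decode("ascii")})
status, reply = handle_predict(body)
```

Keeping the decoding separate from the web framework also makes it easy to test the server logic on its own.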

coco_flask_server.py

Oh, and if you’re curious about the Robot Kit I’m using, it’s made by Adeept. They call it the Mars Rover PiCar-B:

Mars Rover PiCar-B

The hardware is excellent and well-designed. All the pieces fit together perfectly and mount onto a solid acrylic chassis with bolts. There is a steering rack and a RWD drivetrain that sends power from one motor to both rear wheels. You can pretty much follow the instructions and everything fits together like a Lego set. And when you’re not using it in robot mode, it’s a good looking “case” that lets you use it like a regular Raspberry Pi on your desk.

The software, however, is so-so. You definitely need programming experience. Some things might not work right out of the box, and some things need fiddling around with. On the plus side, they do provide tons of working code for all the various sensors, servos and components. So as long as you are comfortable with Python, you can use their code as an excellent reference. You can pick and choose and mix and match what you need.

I like this robot kit because it uses a Raspberry Pi (which I already have). So it uses standard Linux stuff. You can use state-of-the-art software and write regular programs like a civilized person. And because it uses a Pi, you get WiFi, Ethernet, Bluetooth, USB, HDMI, etc. for free. No painful caveman Arduino programming. It runs on CR 18650 lithium-ion batteries (which I also have from salvaging old laptop batteries). It can also run on the regular Pi USB power source.

“Behind the scenes”: Driving school. Programming the head.

I hope to make it autonomous and self-recharging one day!

Happy hacking!
aaron@secretsciencelab.com