Inspecting TensorFlow Lite image classification model

What to know before implementing a TFLite model in a mobile app

In previous posts, whether about building a machine learning model or using transfer learning to retrain an existing one, we could look closely at the architecture directly in the code. But what if we get a *.tflite model from an external source? How do we know how to handle it properly? In this blog post, we’ll take a closer look at what we can do to gather enough knowledge to plug a TensorFlow Lite image classification model into an Android application.

We’ll investigate two different models:

  • The MNIST model created in one of the previous blog posts,
  • The MobileNet v2 model, taken from the TensorFlow hosted models website.

Example Android app

First, let’s create a simple Android app that can handle all of our models. For a simplified camera preview setup we will use CameraView – an open-source library that lets us process the camera output with about 10 lines of code.

The MainActivity we’ll use in our app mostly just wires the camera preview to a frame processor.
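A minimal Kotlin sketch of that wiring could look like this (the view IDs and the no-argument ClassificationFrameProcessor constructor are assumptions; the real app may differ):

```kotlin
import android.os.Bundle
import androidx.appcompat.app.AppCompatActivity
import com.otaliastudios.cameraview.CameraView

class MainActivity : AppCompatActivity() {

    // Our own class (described below); the constructor is illustrative.
    private val frameProcessor = ClassificationFrameProcessor()

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)

        val cameraView = findViewById<CameraView>(R.id.camera_view)
        cameraView.setLifecycleOwner(this)
        // Preview frames are delivered here on a background thread.
        cameraView.addFrameProcessor { frame -> frameProcessor.process(frame) }
    }
}
```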

ClassificationFrameProcessor is our own class that will run the inference process with the TensorFlow Lite model.

Because the process() method runs on a background thread, all we have to do is (see the sketch after this list):

  1. Get a bitmap representing the camera preview at the appropriate size (the model’s input size),
  2. Transform the bitmap into bytes,
  3. Interpret the bytes with our machine learning model,
  4. Translate the inference output (label probabilities) into human-readable results.
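
A hedged Kotlin sketch of those four steps, assuming an RGB model and a [0, 1] input range (all names here are illustrative, not the repository’s actual code):

```kotlin
import android.graphics.Bitmap
import java.nio.ByteBuffer
import java.nio.ByteOrder
import org.tensorflow.lite.Interpreter

fun classify(bitmap: Bitmap, interpreter: Interpreter,
             inputSize: Int, labels: List<String>): List<Pair<String, Float>> {
    // 1. Scale the preview bitmap to the model's input size.
    val scaled = Bitmap.createScaledBitmap(bitmap, inputSize, inputSize, true)

    // 2. Copy pixels into a direct ByteBuffer of floats, applying the
    //    normalization the model expects (assumed [0, 1] here).
    val input = ByteBuffer.allocateDirect(4 * inputSize * inputSize * 3)
        .order(ByteOrder.nativeOrder())
    val pixels = IntArray(inputSize * inputSize)
    scaled.getPixels(pixels, 0, inputSize, 0, 0, inputSize, inputSize)
    for (pixel in pixels) {
        input.putFloat(((pixel shr 16) and 0xFF) / 255f) // R
        input.putFloat(((pixel shr 8) and 0xFF) / 255f)  // G
        input.putFloat((pixel and 0xFF) / 255f)          // B
    }

    // 3. Run inference; the output shape must match the model (e.g. 1x10).
    val output = Array(1) { FloatArray(labels.size) }
    interpreter.run(input, output)

    // 4. Pair probabilities with label names, best guesses first.
    return labels.zip(output[0].toList()).sortedByDescending { it.second }
}
```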

Now we need to provide a valid configuration for our frame processor, so the TensorFlow Lite model receives data in the expected shape and type. In our example app there are 2 models already saved in the assets/ directory:

  • mnist.tflite and labels_mnist.txt
  • mobilenet_v2_1.0_224.tflite and labels_mobilenet.txt

Investigating the model

There are a couple of different ways of gathering information about a *.tflite model. In our test project we have a base class that we would like to configure for the MNIST and MobileNet v2 models.
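
As a sketch of what that configuration needs to capture (a hypothetical shape, not the repository’s actual class):

```kotlin
// Illustrative: the values we need to discover for each model.
abstract class ModelConfig {
    abstract val modelPath: String           // e.g. "mnist.tflite"
    abstract val labelsPath: String          // e.g. "labels_mnist.txt"
    abstract val inputSize: Int              // e.g. 28 or 224
    abstract val inputChannels: Int          // 1 (grayscale) or 3 (RGB)
    abstract val normalizationRange: ClosedFloatingPointRange<Float> // 0f..1f or -1f..1f
}
```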

Documentation

If you use a well-known model like MobileNet v2, it’s pretty likely that all the information is already available. You can find it e.g. here: https://keras.io/applications/#mobilenetv2. MobileNet can have different input sizes, but the default one is 224×224 pixels with 3 channels. The naming convention puts the input size at the end of the filename.
When it comes to input value normalization, there are two conventions, not always well-documented. The default range for Keras and TensorFlow is [-1, 1] – each channel value, originally in the range 0-255, is rescaled to a value between -1 and 1. This information can be found, among other places, in the Keras utility source code.
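
In code, the [-1, 1] convention boils down to a simple rescaling (a minimal sketch; Keras ships an equivalent as the preprocess_input utility for MobileNet):

```python
import numpy as np

def normalize_to_minus_one_one(pixels: np.ndarray) -> np.ndarray:
    """Rescale uint8 pixel values from [0, 255] to [-1, 1]."""
    return pixels.astype(np.float32) / 127.5 - 1.0
```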

But if your model comes from a retrained MobileNet model taken from TensorFlow Hub, it’s pretty likely that the values are in the range [0, 1] instead – TensorFlow Hub’s conventions for image models describe this.

Colab, python code

Another way to learn about the model is to load it with Python’s tf.lite.Interpreter, either on your machine or in a Colab notebook. For our mnist.tflite model, we can do:
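
A minimal sketch of that inspection, assuming mnist.tflite sits in the working directory:

```python
import tensorflow as tf

# Load the model and allocate its tensors.
interpreter = tf.lite.Interpreter(model_path="mnist.tflite")
interpreter.allocate_tensors()

# Inspect input and output tensor metadata (name, shape, dtype).
print(interpreter.get_input_details())
print(interpreter.get_output_details())
```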

More about tf.lite.Interpreter can be found in TensorFlow documentation.

Running this prints the input and output tensor details.
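
The exact print-out depends on the TensorFlow version, but for our MNIST model the relevant parts look roughly like this (abridged; omitted fields are marked with …):

```
[{'name': '…', 'shape': array([ 1, 28, 28]), 'dtype': <class 'numpy.float32'>, …}]
[{'name': 'Softmax', 'shape': array([ 1, 10]), 'dtype': <class 'numpy.float32'>, …}]
```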

It means that as the input of the model we need to pass a 1×28×28 array of floats, and as the output we’ll get a 1×10 array of floats. What can this tell us?

  • There is only 1 color channel, so it’s pretty likely the images should be grayscale,
  • There are 10 output labels,
  • The outputs are float values and the output layer’s name is “Softmax”, which suggests the array contains a list of probabilities that sum to 1 (see how the softmax function works, sketched below).
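
For reference, softmax turns raw scores into probabilities that sum to 1 (a minimal sketch):

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    """Map raw scores to probabilities that sum to 1."""
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()
```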

What we don’t know for sure:

  • What the value ranges are – they don’t have to be [0, 1] or [-1, 1],
  • How the input image should be preprocessed. E.g. MNIST uses inverted colors, so 0 means 100% white and the max value means 100% black (see the sketch below).
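
A hedged sketch of such preprocessing – the color inversion is documented for MNIST, while the [0, 1] range is only an assumption:

```python
import numpy as np

def preprocess_mnist(pixels: np.ndarray) -> np.ndarray:
    """Grayscale uint8 image -> inverted floats (0.0 = white, 1.0 = black)."""
    return 1.0 - pixels.astype(np.float32) / 255.0
```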

What about our MobileNet model?

The input shape matches the information from the Keras documentation – 224×224 px with 3 channels (which doesn’t always have to mean RGB!) – but there is no information about the input value range. The output tells us about 1001 labels (MobileNet has 1000 classes plus one called “background” for everything else), but we can’t be sure about the values – whether they sum up to 1 or not.
What else can we do to get to know our model better? We can simply run the inference process on tf.lite.Interpreter, as the Colab notebook does.
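
A sketch of that step, assuming a local test image (test_image.jpg is a placeholder) and pixels scaled to [0, 1]:

```python
import numpy as np
import tensorflow as tf
from PIL import Image

interpreter = tf.lite.Interpreter(model_path="mobilenet_v2_1.0_224.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Load a test image and scale pixel values to [0, 1].
image = Image.open("test_image.jpg").resize((224, 224))
input_data = np.expand_dims(np.asarray(image, dtype=np.float32) / 255.0, axis=0)

# Run the inference and read back the 1x1001 probability array.
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
output = interpreter.get_tensor(output_details[0]['index'])

print("top label index:", np.argmax(output), "confidence:", np.max(output))
```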

Running this twice – once with input data scaled to [0, 1] and once to [-1, 1] – shows that both runs return the correct label, but the noticeably higher confidence for data in the range [-1, 1] suggests that this is the correct scale for the input data.

The entire notebook is available on GitHub: TensorFlow_Lite_models_overview.ipynb, and you can also run it on Colaboratory.

Netron

If you don’t want to write any additional Python code to get to know your *.tflite model better, fortunately there is another great option: Netron. It is an open-source application that visualizes deep learning and machine learning models from almost all popular frameworks, including TensorFlow, Keras, Core ML, Caffe2 and many more. It is available on GitHub: https://github.com/lutzroeder/netron.
To start using Netron, all you have to do is install it on your computer (Mac, Linux and Windows are supported) or run the browser version at https://lutzroeder.github.io/netron/.
Now just drag and drop a *.tflite file and learn more from the really nice graph describing your model. Here is the visualization of our mnist.tflite file:

Netron visualization of Mnist, TensorFlow Lite model

You can see there not only the input and output but the entire machine learning model, layer by layer. Similarly to the notebook, we can see the input and output data shapes and their types. And if we look closer at the MobileNet v2 visualization, we can see the mentioned Softmax layer, which suggests that the output array sums up to 1.

Netron visualization of MobileNet v2, TensorFlow Lite model

Is that everything? Not really. As you can see, even with visualization or investigation in Python code, there are still some uncertainties, like the input data scale, the image color format, or the nature of the output data (whether it is normalized and sums up to 1, or comes from a logit function with unknown boundaries). But for sure, the techniques presented here are helpful for starting the implementation and for quick debugging.

Source code for this blog post is available on GitHub (Android app and Colab notebook): https://github.com/frogermcs/TFLite-Checker

Thanks for reading! 🙂
Please share your feedback below. 👇
