TFLite Conversion — Edge AI: Mobile and IoT (Chapter 5)

TensorFlow Lite (TFLite) produces compact models optimized for mobile and embedded devices. Converting a TensorFlow SavedModel or Keras model yields a .tflite FlatBuffer file; when quantization is enabled, the file also stores the quantization parameters (scale and zero point) of each quantized tensor.

TensorFlow 2.x SavedModel conversion uses the TFLiteConverter API:

import tensorflow as tf

# The converter reads the SavedModel directly from disk, so the model
# does not need to be loaded first.
# For an in-memory Keras model, use instead:
#   converter = tf.lite.TFLiteConverter.from_keras_model(model)

# Convert to TFLite
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# Use the MLIR-based converter (already the default in recent TF 2.x releases)
converter.experimental_new_converter = True

tflite_model = converter.convert()

# Save the converted model
with open("model.tflite", "wb") as f:
    f.write(tflite_model)

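Post-training quantization is enabled through converter.optimizations. With Optimize.DEFAULT alone, the converter applies dynamic range quantization (INT8 weights, floating-point activations); supplying a representative dataset additionally lets it calibrate and quantize activations: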

import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")

# Define a representative dataset for calibration.
# Random tensors are placeholders; use real input samples in practice.
def representative_dataset_fn():
    for _ in range(100):
        yield [tf.random.normal([1, 224, 224, 3])]

# Optimize.DEFAULT alone gives dynamic range quantization (INT8 weights,
# floating-point activations). Adding a representative dataset and the
# INT8 op set also quantizes activations.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset_fn
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS_INT8
]

tflite_quantized_model = converter.convert()

Post-training integer quantization with full INT8 inference requires calibration data:

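import numpy as np
import tensorflow as tf

def representative_dataset():
    # Use real samples from the target domain
    dataset = np.load("calibration_data.npz")
    for image in dataset["images"]:
        yield [image.reshape(1, 224, 224, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Restrict the model to INT8 builtin ops; conversion fails if an op
# has no integer implementation
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
# Integer input and output tensors
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

tflite_int8_model = converter.convert()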
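
With uint8 input and output types, the caller has to map float values into the quantized domain and back. The following is a minimal sketch of the affine mapping used by TFLite quantization, real_value = scale * (quantized_value - zero_point). The function names and the scale and zero_point values are made up for illustration; in practice the parameters would be read from the "quantization" entry of the interpreter's input and output details. Python's built-in round is used for brevity and may differ from the runtime's rounding at exact halves.

```python
def quantize(value, scale, zero_point, qmin=0, qmax=255):
    """Map a float to the uint8 domain and clamp to [qmin, qmax]."""
    q = round(value / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map a quantized integer back to an approximate float value."""
    return scale * (q - zero_point)


# Hypothetical parameters, chosen so the arithmetic is exact
scale, zero_point = 0.5, 128

q = quantize(1.0, scale, zero_point)          # 1.0 / 0.5 + 128
restored = dequantize(q, scale, zero_point)   # 0.5 * (q - 128)
saturated = quantize(1000.0, scale, zero_point)  # clamped to qmax
```

Values outside the representable range saturate at 0 or 255, which is one reason the calibration data should cover the range of inputs seen in production.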

Interpreting error messages requires context. Errors about invalid tensor dimensions indicate a shape mismatch between the input shapes built into the model and the inputs supplied at runtime. If a quantized model converts cleanly but loses accuracy, double-check that the calibration data statistics match the production input distribution.
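
A shape mismatch is easier to diagnose when it is checked before the data reaches the interpreter. The sketch below is a hypothetical helper, not part of the TFLite API. It assumes the expected shape is available as a list of integers (for example from the "shape_signature" entry of the input details, where -1 marks a dynamic dimension) and reports the first axis that disagrees.

```python
def shape_mismatch(expected, actual):
    """Return a description of the first mismatch, or None if compatible.

    A dimension of -1 in `expected` is treated as dynamic and matches
    any size.
    """
    if len(expected) != len(actual):
        return f"rank mismatch: expected {len(expected)} dims, got {len(actual)}"
    for axis, (want, got) in enumerate(zip(expected, actual)):
        if want != -1 and want != got:
            return f"axis {axis}: expected {want}, got {got}"
    return None


# NHWC model fed with NCHW data: the first disagreement is on axis 1
problem = shape_mismatch([1, 224, 224, 3], [1, 3, 224, 224])
```

A common cause of this kind of error is channel ordering, as in the example: the model expects channels last while the input pipeline produces channels first.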

Testing TFLite models requires the TensorFlow Lite interpreter:

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

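# Resize the input tensor (needed only when changing the batch size or shape)
interpreter.resize_tensor_input(input_details[0]["index"], [1, 224, 224, 3])
interpreter.allocate_tensors()

# Random float32 input; a model converted with uint8 inputs expects
# data of dtype input_details[0]["dtype"] instead
input_data = np.random.random((1, 224, 224, 3)).astype(np.float32)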
interpreter.set_tensor(input_details[0]["index"], input_data)

interpreter.invoke()

output = interpreter.get_tensor(output_details[0]["index"])
print(f"Output shape: {output.shape}")
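
Checking the output shape confirms that the model runs, but not that it still agrees with the original. One simple sanity check is to run the same input through both models and compare the results. The sketch below assumes the two outputs have already been flattened into plain Python lists of floats; the helper names and the sample values are illustrative only.

```python
def max_abs_diff(reference, candidate):
    """Largest element-wise absolute difference between two outputs."""
    return max(abs(r - c) for r, c in zip(reference, candidate))


def top1_agrees(reference, candidate):
    """True if both outputs rank the same class highest."""
    return reference.index(max(reference)) == candidate.index(max(candidate))


# Made-up scores from a float model and its converted counterpart
float_scores = [0.25, 0.5, 0.125]
tflite_scores = [0.25, 0.375, 0.125]

diff = max_abs_diff(float_scores, tflite_scores)
same_class = top1_agrees(float_scores, tflite_scores)
```

For a quantized model, some numerical difference is expected; whether the predicted class still matches across a held-out set is usually the more informative signal.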