RUNLOCALAI

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
© 2026 runlocalai.co · Independently operated
Edge AI: Mobile and IoT

05. TFLite Conversion

Chapter 5 of 18 · 20 min
KEY INSIGHT

Integer quantization in TFLite requires representative calibration data, and the distribution of that data directly determines the accuracy of the quantized model.

TensorFlow Lite (TFLite) is a model format and runtime designed for mobile and embedded devices. Converting a TensorFlow SavedModel or Keras model produces a .tflite file, which also stores the quantization parameters when quantization is enabled.

TensorFlow 2.x SavedModel conversion uses the TFLiteConverter API:

import tensorflow as tf

# Build the converter directly from the SavedModel directory
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
# Or, for a Keras model:
# model = tf.keras.models.load_model("model.h5")
# converter = tf.lite.TFLiteConverter.from_keras_model(model)

# Default optimizations: with no representative dataset supplied,
# this applies dynamic range quantization
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# convert() returns the serialized model as bytes
tflite_model = converter.convert()

# Save the converted model
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
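A quick sanity check after saving is to confirm that the file looks like a TFLite FlatBuffer and to note its size. The sketch below is plain Python rather than part of the TensorFlow API: the helper name `describe_tflite_file` is hypothetical, and it relies only on the `TFL3` file identifier that the TFLite schema declares, which sits at bytes 4 to 7 of the file. The demo writes a stand-in file so the snippet runs without a real model.

```python
import os
import tempfile

TFLITE_IDENTIFIER = b"TFL3"  # FlatBuffer file identifier declared in the TFLite schema


def describe_tflite_file(path):
    """Return (looks_like_tflite, size_in_bytes) for the file at `path`.

    Hypothetical helper: it inspects the header only and does not validate the model.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        header = f.read(8)
    # Bytes 0-3 hold the root table offset; bytes 4-7 hold the file identifier.
    return header[4:8] == TFLITE_IDENTIFIER, size


# Demo on a stand-in file (not a real model) so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "model.tflite")
    with open(demo_path, "wb") as f:
        f.write(b"\x1c\x00\x00\x00" + TFLITE_IDENTIFIER + b"\x00" * 24)
    looks_ok, size_bytes = describe_tflite_file(demo_path)
    print(f"looks like TFLite: {looks_ok}, size: {size_bytes} bytes")
```

Comparing the reported size before and after enabling quantization gives a rough first signal that the optimization was actually applied.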

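Post-training quantization is enabled through the converter's optimizations setting. With only optimizations set, the converter applies dynamic range quantization (INT8 weights, floating-point activations). Adding a representative dataset and restricting the op set to TFLITE_BUILTINS_INT8 quantizes the activations as well:

import tensorflow as tf

# Define the representative dataset before handing it to the converter.
# Random tensors are a placeholder only; calibrate with real samples.
def representative_dataset_fn():
    for _ in range(100):
        yield [tf.random.normal([1, 224, 224, 3])]

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")

# Integer quantization: INT8 weights and activations, calibrated on the dataset
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset_fn
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS_INT8
]

tflite_quantized_model = converter.convert()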
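Calibration matters because each tensor's quantization range is derived from the values seen in the representative samples. The following is a minimal sketch of affine INT8 quantization in plain Python, not TFLite's actual implementation (rounding and per-channel details differ). All function names and the example ranges are assumptions made for illustration. It compares the reconstruction error on the same "production" values when the calibration range matches, is too narrow, or is too wide.

```python
def quant_params(min_val, max_val, qmin=-128, qmax=127):
    """Derive an affine scale and zero point from an observed value range."""
    # Extend the range to include zero so that 0.0 is exactly representable.
    min_val = min(min_val, 0.0)
    max_val = max(max_val, 0.0)
    scale = (max_val - min_val) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0  # degenerate range: avoid division by zero
    zero_point = round(qmin - min_val / scale)
    return scale, max(qmin, min(qmax, zero_point))


def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a float to an integer code, clipping to the representable range."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an integer code back to its float value."""
    return scale * (q - zero_point)


def mean_abs_error(values, scale, zero_point):
    """Average round-trip error over a list of floats."""
    errors = [
        abs(v - dequantize(quantize(v, scale, zero_point), scale, zero_point))
        for v in values
    ]
    return sum(errors) / len(errors)


# Assumed production values spanning [0, 6].
production = [i * 0.05 for i in range(121)]

matched = quant_params(0.0, 6.0)      # calibration saw the same range
too_narrow = quant_params(0.0, 1.0)   # calibration missed large values: clipping
too_wide = quant_params(-60.0, 60.0)  # calibration saw outliers: coarse steps

for name, (scale, zp) in [("matched", matched), ("too narrow", too_narrow), ("too wide", too_wide)]:
    err = mean_abs_error(production, scale, zp)
    print(f"{name:>10}: scale={scale:.4f} zero_point={zp} mean_abs_error={err:.4f}")
```

The matched range gives the smallest error. The narrow range clips every value above 1.0, and the wide range spends most of its 256 codes on values that never occur.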

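Post-training integer quantization with full INT8 inference, including integer input and output tensors, requires calibration data:

import numpy as np
import tensorflow as tf

def representative_dataset():
    # Use real samples from the target domain
    dataset = np.load("calibration_data.npz")
    for image in dataset["images"]:
        yield [image.reshape(1, 224, 224, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Fail conversion if any op lacks an INT8 implementation
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
# Integer input and output tensors (tf.int8 is also accepted)
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

tflite_int8_model = converter.convert()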
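Once the input and output types are integers, the caller has to quantize inputs and dequantize outputs. The sketch below shows that arithmetic with NumPy. The helper names are hypothetical and the scale and zero point are illustrative values: a real model reports its own through the `"quantization"` entry of `interpreter.get_input_details()` and `get_output_details()`, which holds a `(scale, zero_point)` pair.

```python
import numpy as np


def quantize_input(float_input, scale, zero_point, dtype=np.uint8):
    """Map float values to the integer type a quantized model expects."""
    info = np.iinfo(dtype)
    q = np.round(float_input / scale) + zero_point
    # Clip so out-of-range values saturate instead of wrapping around.
    return np.clip(q, info.min, info.max).astype(dtype)


def dequantize_output(int_output, scale, zero_point):
    """Map a quantized model's integer output back to float values."""
    return scale * (int_output.astype(np.float32) - zero_point)


# Illustrative parameters for inputs normalized to [0, 1] (an assumption).
scale, zero_point = 1.0 / 255.0, 0
image = np.array([[0.0, 0.25, 1.0]], dtype=np.float32)

q = quantize_input(image, scale, zero_point)
print(q, q.dtype)
print(dequantize_output(q, scale, zero_point))
```

Feeding unquantized float32 data to a uint8 input is a common cause of dtype errors when calling `set_tensor`, so checking the `"dtype"` field of the input details first is a reasonable habit.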

Interpreting error messages requires context. An "Invalid tensors dims" error indicates a shape mismatch between the model's built-in input shapes and the inputs supplied at runtime. If the model converts and runs but accuracy drops, double-check that the calibration data statistics match the production input distribution.
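Shape mismatches are easier to diagnose when the expected and actual shapes are compared axis by axis before the input is handed to the runtime. The helper below is a hypothetical sketch in plain Python, not part of TFLite. It treats -1 in the expected shape as a dynamic dimension, which is the convention used by the `"shape_signature"` field of the interpreter's input details.

```python
def check_input_shape(expected, actual):
    """Return a list of human-readable mismatches between two shapes.

    A dimension of -1 in `expected` is treated as dynamic and matches anything.
    """
    if len(expected) != len(actual):
        return [f"rank mismatch: expected {len(expected)} dims, got {len(actual)}"]
    problems = []
    for axis, (want, got) in enumerate(zip(expected, actual)):
        if want != -1 and want != got:
            problems.append(f"axis {axis}: expected {want}, got {got}")
    return problems


# Matching NHWC input: no problems reported.
print(check_input_shape((1, 224, 224, 3), (1, 224, 224, 3)))
# Channels-first data passed to a channels-last model.
print(check_input_shape((1, 224, 224, 3), (1, 3, 224, 224)))
# Dynamic batch dimension accepts any batch size.
print(check_input_shape((-1, 224, 224, 3), (8, 224, 224, 3)))
```

The second call illustrates a frequent cause of this error: data laid out channels-first being fed to a model that expects channels-last.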

Testing TFLite models requires the TensorFlow Lite interpreter:

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Resize the input tensor for a different batch size, then reallocate
interpreter.resize_tensor_input(input_details[0]["index"], [1, 224, 224, 3])
interpreter.allocate_tensors()

# Random float32 input; a model converted with integer inputs needs
# data in the dtype reported by input_details[0]["dtype"] instead
input_data = np.random.random((1, 224, 224, 3)).astype(np.float32)
interpreter.set_tensor(input_details[0]["index"], input_data)

interpreter.invoke()

output = interpreter.get_tensor(output_details[0]["index"])
print(f"Output shape: {output.shape}")
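Beyond checking the output shape, it helps to measure latency and to compare the TFLite output with the original model's output. The sketch below is one possible approach, with hypothetical helper names and stand-in data. In practice `run_once` would wrap `interpreter.invoke()`, and the two arrays would be the original model's predictions and the TFLite predictions for the same inputs. The tolerance is an assumption that should be tuned per model, since quantized outputs are not expected to match exactly.

```python
import time

import numpy as np


def measure_latency_ms(run_once, warmup=5, runs=50):
    """Time a zero-argument callable and return (median_ms, p95_ms)."""
    for _ in range(warmup):  # warm-up runs are excluded from the statistics
        run_once()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    return float(np.median(samples)), float(np.percentile(samples, 95))


def compare_outputs(reference, candidate, atol=1e-2):
    """Summarize how closely `candidate` tracks `reference` ([batch, classes])."""
    reference = np.asarray(reference, dtype=np.float32)
    candidate = np.asarray(candidate, dtype=np.float32)
    same_top1 = reference.argmax(axis=-1) == candidate.argmax(axis=-1)
    return {
        "max_abs_diff": float(np.max(np.abs(reference - candidate))),
        "within_tolerance": bool(np.allclose(reference, candidate, atol=atol)),
        "top1_agreement": float(np.mean(same_top1)),
    }


# Stand-in class probabilities for two images (illustrative values only).
ref = np.array([[0.10, 0.70, 0.20], [0.60, 0.30, 0.10]])
out = np.array([[0.11, 0.69, 0.20], [0.58, 0.31, 0.11]])
report = compare_outputs(ref, out, atol=0.05)
print(report)

# Stand-in workload; replace the lambda with a call to interpreter.invoke().
median_ms, p95_ms = measure_latency_ms(lambda: np.dot(ref, out.T))
print(f"median: {median_ms:.4f} ms, p95: {p95_ms:.4f} ms")
```

Reporting the median and a high percentile rather than the mean keeps one slow run, for example from a scheduler hiccup, from distorting the result.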
EXERCISE

Convert a Keras image classification model to TFLite with dynamic range quantization, measure inference latency, and verify output correctness against the original model.
