Model Conversion to ONNX — Edge AI: Mobile and IoT (Chapter 4)

Converting models to ONNX format creates portable files that ONNX Runtime can execute across hardware platforms. PyTorch ships an exporter in torch.onnx, while TensorFlow models are converted with the separate tf2onnx package, so the two workflows differ significantly.

PyTorch conversion uses torch.onnx.export(), which runs the model once on an example input and records the operations it executes (tracing):

import torch
import torch.nn as nn

class SimpleClassifier(nn.Module):
    def __init__(self, input_dim=784, hidden_dim=256, num_classes=10):
        super().__init__()
        self.fc1 = nn.Linear(input_dim, hidden_dim)
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(0.2)
        self.fc2 = nn.Linear(hidden_dim, num_classes)
    
    def forward(self, x):
        x = x.flatten(1)
        x = self.fc1(x)
        x = self.relu(x)
        x = self.dropout(x)
        x = self.fc2(x)
        return x

# Instantiate the model; eval mode disables dropout so the exported graph is deterministic
model = SimpleClassifier()
model.eval()

# Create dummy input matching real input shape
dummy_input = torch.randn(1, 1, 28, 28)

torch.onnx.export(
    model,
    dummy_input,
    "classifier.onnx",
    export_params=True,
    opset_version=14,
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={
        "input": {0: "batch_size"},
        "output": {0: "batch_size"}
    }
)
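
To make the effect of dynamic_axes concrete, the sketch below mimics how an input signature with a symbolic batch dimension accepts or rejects concrete shapes. It is plain Python for illustration only: the helper name shape_matches is hypothetical and is not part of torch or onnxruntime, and real runtimes perform their own shape validation.

```python
def shape_matches(declared, actual):
    """Return True if a concrete shape fits a declared signature.

    Strings in `declared` stand for symbolic (dynamic) dimensions and
    accept any size; integers must match exactly.
    """
    if len(declared) != len(actual):
        return False
    for want, got in zip(declared, actual):
        if isinstance(want, str):
            continue  # dynamic axis: any size is allowed
        if want != got:
            return False
    return True

# Signature corresponding to dynamic_axes={"input": {0: "batch_size"}}
dynamic_input = ["batch_size", 1, 28, 28]
# Signature if dynamic_axes were omitted: the dummy input's shape is fixed
fixed_input = [1, 1, 28, 28]

print(shape_matches(dynamic_input, (32, 1, 28, 28)))  # True
print(shape_matches(fixed_input, (32, 1, 28, 28)))    # False
print(shape_matches(dynamic_input, (32, 3, 28, 28)))  # False
```

Only the axes named in dynamic_axes become flexible. Every other dimension stays locked to the shape of the dummy input used at export time.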

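TensorFlow conversion uses the separate tf2onnx converter, run from the command line. For a TF1-style checkpoint, pass the .meta file with --checkpoint and name the graph's input and output tensors (input:0 and output:0 below are placeholders; substitute the tensor names from your own graph). For a SavedModel directory, use --saved-model instead, in which case --inputs and --outputs are optional.

pip install tf2onnx
python -m tf2onnx.convert \
    --checkpoint model_checkpoint.meta \
    --inputs input:0 \
    --outputs output:0 \
    --opset 14 \
    --output classifier.onnx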

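Common conversion failures stem from data-dependent control flow. The dynamic_axes argument marks dimensions such as batch size as variable, but it does not capture Python-level branches or loops whose behavior depends on tensor values: tracing records only the path taken for the dummy input. Recurrent models such as torch.nn.LSTM with variable sequence lengths need particular care, and the exported graph may contain ops that certain backends don't support. Compiling the model with torch.jit.script before export preserves control flow in the exported graph instead of freezing a single path:

# Script the model so data-dependent control flow is preserved during export.
# Here `model` and `dummy_input` stand for an LSTM-based module and a matching example input.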
scripted_model = torch.jit.script(model)
scripted_model.eval()

torch.onnx.export(
    scripted_model,
    dummy_input,
    "lstm_classifier.onnx",
    opset_version=14
)
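
The difference between tracing and scripting can be illustrated without PyTorch. The toy recorder below is a deliberately simplified stand-in for a tracer, and every name in it is hypothetical. It runs a function once, records the operations that actually executed, and replays them. A data-dependent branch is frozen to whichever path the example input took, which is the failure mode that scripting is meant to avoid.

```python
def forward(x, log=None):
    """Toy model with a data-dependent branch."""
    if sum(x) > 0:
        op = "double"
        out = [v * 2 for v in x]
    else:
        op = "negate"
        out = [-v for v in x]
    if log is not None:
        log.append(op)  # record which operation actually ran
    return out

# Operations the toy tracer knows how to replay
OPS = {
    "double": lambda x: [v * 2 for v in x],
    "negate": lambda x: [-v for v in x],
}

def trace(fn, example):
    """Run fn once on `example` and return a replay of the recorded ops."""
    recorded = []
    fn(example, log=recorded)

    def replay(x):
        for name in recorded:
            x = OPS[name](x)
        return x

    return replay

# The example input has a positive sum, so only "double" is recorded
traced = trace(forward, [1.0, 2.0])

positive = [3.0, 4.0]
negative = [-3.0, -4.0]

print(traced(positive) == forward(positive))  # True: same branch as the example
print(traced(negative) == forward(negative))  # False: the trace replays "double" only
```

A traced export can pass verification on the dummy input and still be wrong for inputs that take another branch, so verification inputs should cover every path the model can follow.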

Verifying converted models requires comparing outputs between the original and converted versions:

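import numpy as np
import onnxruntime as ort

# Run the original PyTorch model (already in eval mode) on the dummy input
original_outputs = model(dummy_input).detach().numpy()

# Load the ONNX model and run it on the same input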
session = ort.InferenceSession("classifier.onnx")
onnx_outputs = session.run(
    None, 
    {"input": dummy_input.numpy()}
)[0]

# Verify numerical equivalence
np.testing.assert_allclose(original_outputs, onnx_outputs, rtol=1e-4, atol=1e-5)
print("Model conversion verified successfully")
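
The tolerances passed to assert_allclose combine as |actual - desired| <= atol + rtol * |desired| for every element. The sketch below restates that rule in plain Python to show how rtol and atol interact. The function name within_tolerance and the sample values are illustrative assumptions, not part of NumPy.

```python
def within_tolerance(actual, desired, rtol=1e-4, atol=1e-5):
    """Elementwise check: |actual - desired| <= atol + rtol * |desired|."""
    return all(
        abs(a - d) <= atol + rtol * abs(d)
        for a, d in zip(actual, desired)
    )

reference = [0.5, -2.0, 10.0]

# Small drift that scales with magnitude: passes
print(within_tolerance([0.50001, -2.0001, 10.0005], reference))  # True

# One element off by 2e-3, above the allowed 1.01e-3 at magnitude 10: fails
print(within_tolerance([0.5, -2.0, 10.002], reference))  # False

# Near zero the relative term vanishes, so only atol applies: fails
print(within_tolerance([5e-5], [0.0]))  # False
```

Because the relative term shrinks with the reference value, outputs close to zero (for example, logits of confidently rejected classes) are governed almost entirely by atol.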

Local verification checkpoint

Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.