RUNLOCALAI

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
© 2026 runlocalai.co · Independently operated
Edge AI: Mobile and IoT

01. Edge AI Overview

Chapter 1 of 18 · 10 min
KEY INSIGHT

Edge devices trade raw throughput for inference with no network round trip, offline operation, and far lower bandwidth use. These tradeoffs must drive deployment architecture decisions.

Edge AI moves inference workloads from cloud servers to devices physically located near data sources. This proximity eliminates network latency, enables offline operation, and reduces bandwidth costs. For production deployments, these factors often determine whether a system is economically viable.

The fundamental constraint driving edge AI is the compute-to-connectivity ratio. A cloud GPU cluster can deliver thousands of TOPS (tera operations per second) but requires consistent 100 Mbps+ bandwidth and introduces 50-200 ms of round-trip latency. A Raspberry Pi 4 delivers approximately 0.4 TOPS, a small fraction of cloud throughput, but operates with zero network dependency and processes data as it arrives.
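The tradeoff above can be put into rough numbers. The sketch below is a back-of-envelope estimate, not a benchmark: it assumes peak TOPS is fully usable, and the 4 GOP model size and 1 Mbit payload are hypothetical values chosen only for illustration. The throughput, bandwidth, and round-trip figures come from the paragraph above.

```python
def edge_latency_ms(model_gops: float, device_tops: float) -> float:
    """Compute-only latency on the device, in milliseconds.

    1 TOPS = 1000 GOPS per second, so GOPs / TOPS works out to milliseconds.
    Assumes the device sustains its peak rate (an idealization).
    """
    return model_gops / device_tops


def cloud_latency_ms(
    model_gops: float,
    cloud_tops: float,
    rtt_ms: float,
    payload_bits: float,
    bandwidth_bps: float,
) -> float:
    """Round trip + upload time + compute time, in milliseconds."""
    upload_ms = payload_bits / bandwidth_bps * 1000.0
    compute_ms = model_gops / cloud_tops
    return rtt_ms + upload_ms + compute_ms


# Hypothetical workload: 4 GOPs per inference, 1 Mbit of input data.
MODEL_GOPS = 4.0
PAYLOAD_BITS = 1_000_000

edge = edge_latency_ms(MODEL_GOPS, device_tops=0.4)  # Raspberry Pi 4 figure
cloud = cloud_latency_ms(
    MODEL_GOPS,
    cloud_tops=1000.0,          # "thousands of TOPS", low end
    rtt_ms=50.0,                # best case of the 50-200 ms range
    payload_bits=PAYLOAD_BITS,
    bandwidth_bps=100_000_000,  # 100 Mbps
)
print(f"edge:  {edge:.1f} ms")
print(f"cloud: {cloud:.1f} ms")
```

Under these assumptions the small device finishes first because the cloud path is dominated by the network, not by compute. A larger model shifts the balance back toward the cloud.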

Three primary device categories define the edge landscape. Microcontrollers (Cortex-M class) deliver under 1 TOPS at milliwatts of power draw, suitable for simple signal classification. Single-board computers like Raspberry Pi and Jetson Nano typically operate at 1-10 TOPS within 5-15 W thermal envelopes, though a Raspberry Pi 4 falls below that range at about 0.4 TOPS. Mobile systems-on-chip in flagship smartphones reach 20-40 TOPS but must manage thermal throttling.
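One way to use these categories is as a first-pass filter: given a workload's required throughput, which tiers are even candidates? The sketch below is illustrative only. The `DeviceTier` type and `tiers_meeting` helper are hypothetical names, and the ceilings are the approximate upper ends of the ranges quoted above, not measured values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceTier:
    name: str
    peak_tops: float  # approximate upper end of the range quoted in the text


# Ceilings taken from the chapter's ranges; treat them as rough guides.
TIERS = [
    DeviceTier("microcontroller", 1.0),
    DeviceTier("single-board computer", 10.0),
    DeviceTier("mobile SoC", 40.0),
]


def tiers_meeting(required_tops: float) -> list[str]:
    """Return the names of tiers whose ceiling covers the requirement."""
    return [t.name for t in TIERS if t.peak_tops >= required_tops]


print(tiers_meeting(0.5))   # small signal-classification workload
print(tiers_meeting(5.0))   # mid-size vision model
print(tiers_meeting(50.0))  # beyond every tier listed here
```

An empty result is a signal that the workload belongs in the cloud or needs a smaller model. Throughput alone is not sufficient, since power and thermal limits also apply.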

Real-world failure modes cluster around thermal constraints and memory bandwidth. Mobile neural processing units (NPUs) can throttle from 100% to 40% of peak throughput within 90 seconds when ambient temperature exceeds 30°C. Memory-bound models, those whose parameters exceed on-chip cache by more than a factor of two, can suffer 10x slowdowns as swap operations activate. Awareness of these constraints shapes every aspect of edge deployment.
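To see how much throttling erodes a spec-sheet number, the sketch below averages throughput over a run. It is a simplification under stated assumptions: throttling is modeled as a single step at 90 seconds (real devices ramp down gradually and may throttle sooner), and the function names are hypothetical. The 40% floor and the two-times-cache rule are the figures from the paragraph above.

```python
def sustained_tops(
    peak_tops: float,
    duration_s: float,
    throttle_after_s: float = 90.0,
    throttled_fraction: float = 0.4,
) -> float:
    """Average throughput over a run, with a step drop once throttling starts."""
    if duration_s <= throttle_after_s:
        return peak_tops
    full_speed_work = peak_tops * throttle_after_s
    throttled_work = peak_tops * throttled_fraction * (duration_s - throttle_after_s)
    return (full_speed_work + throttled_work) / duration_s


def is_memory_bound(param_bytes: int, cache_bytes: int) -> bool:
    """Rule of thumb from the text: parameters exceed twice the on-chip cache."""
    return param_bytes > 2 * cache_bytes


# Hypothetical 30 TOPS mobile NPU running for five minutes in a warm room.
print(round(sustained_tops(30.0, duration_s=300.0), 2))

# Hypothetical 20 MB model on a device with 8 MB of on-chip cache.
print(is_memory_bound(param_bytes=20_000_000, cache_bytes=8_000_000))
```

Plan capacity around the sustained figure, not the peak, for any workload that runs longer than the throttle window.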

Budget allocation for edge projects typically splits into 40% compute, 30% memory/storage, 20% power delivery, and 10% thermal management. Neglecting power delivery causes brownout resets under peak inference load. Skimping on thermal management triggers throttling that invalidates performance predictions.
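As a quick worked example of that split, the sketch below divides a hardware budget by the percentages above. The `split_budget` name and the 1,000-unit budget are hypothetical, and the split itself is the chapter's rule of thumb, not a fixed requirement.

```python
# Percentages from the chapter's typical allocation.
BUDGET_SPLIT_PCT = {
    "compute": 40,
    "memory_storage": 30,
    "power_delivery": 20,
    "thermal_management": 10,
}


def split_budget(total: float) -> dict[str, float]:
    """Divide a total budget across the four categories."""
    return {name: total * pct / 100 for name, pct in BUDGET_SPLIT_PCT.items()}


for category, amount in split_budget(1000).items():
    print(f"{category}: {amount:.2f}")
```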

Local verification checkpoint

Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.
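A minimal way to capture that record is sketched below using only the Python standard library. The `capture_run_record` helper, its field names, and the example path and package name are hypothetical, so adapt them to whatever you actually ran.

```python
import json
import platform
from importlib import metadata


def capture_run_record(packages, data_path, observed_output, notes=""):
    """Collect the details the checkpoint asks for into one dictionary."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return {
        "python": platform.python_version(),
        "machine": platform.machine(),
        "system": platform.system(),
        "packages": versions,
        "data_path": data_path,
        "observed_output": observed_output,
        "notes": notes,  # e.g. model size, backend, available memory
    }


record = capture_run_record(
    packages=["numpy"],              # hypothetical: list what the exercise used
    data_path="data/sample.bin",     # hypothetical path
    observed_output="(paste output here)",
    notes="CPU backend",
)
print(json.dumps(record, indent=2))
```

Saving the printed JSON next to the exercise keeps the constraints and the result together.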

EXERCISE

Profile three current device categories (microcontroller, single-board computer, mobile SoC) by their TOPS, memory capacity, and thermal design power. Document which workloads justify cloud versus edge processing.
