RUNLOCALAI · v38
Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
© 2026 runlocalai.co · Independently operated
Edge AI: Mobile and IoT

12. Power Optimization

Chapter 12 of 18 · 20 min
KEY INSIGHT

Batching amortizes the energy consumed per inference sample by reducing idle time between operations. A batch size of 4-8 typically gives the best power efficiency for edge deployment.

Power consumption determines battery life for mobile and IoT deployments. Neural network inference power draw varies dramatically based on hardware utilization, memory bandwidth, and compute intensity. Understanding power profiles enables informed tradeoffs between performance and battery life.
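To make the tradeoff concrete, a rough battery-life estimate can be derived from the fraction of time the device spends running inference. This is a minimal sketch under a simple two-state model; the function name and all numbers (a 10 Wh battery, 5 W active, 0.5 W idle, 10% duty cycle) are illustrative assumptions, not measurements.

```python
def estimate_battery_life_hours(capacity_wh, p_active_w, p_idle_w, duty_cycle):
    """Estimate runtime in hours from average power in a two-state model.

    duty_cycle is the fraction of time spent in the active (inference) state.
    All inputs are assumed values supplied by the caller.
    """
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    # Time-weighted average of active and idle power
    avg_power_w = duty_cycle * p_active_w + (1.0 - duty_cycle) * p_idle_w
    return capacity_wh / avg_power_w


# Hypothetical device: 10 Wh battery, 5 W active, 0.5 W idle, active 10% of the time
hours = estimate_battery_life_hours(10.0, 5.0, 0.5, 0.10)
print(f"Estimated battery life: {hours:.1f} h")
```

Under these assumed figures the average draw is 0.95 W, so halving the duty cycle extends runtime far more than a small reduction in active power would.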

Measuring power consumption on Raspberry Pi:

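# Using vcgencmd (Broadcom GPU tools)
# Note: available subcommands vary by board and firmware; run
# `vcgencmd commands` to list what your device supports.
import subprocess
import time

def read_power_draw():
    """Return the raw power status string from vcgencmd, or None on failure"""
    try:
        result = subprocess.run(
            ['vcgencmd', 'power_status'],
            capture_output=True, text=True
        )
    except FileNotFoundError:
        # vcgencmd is not installed on this system
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip()

# Continuous power monitoring
def monitor_power(duration_sec=60, interval_ms=100):
    """Sample read_power_draw() every interval_ms for duration_sec seconds"""
    measurements = []
    start = time.monotonic()

    while time.monotonic() - start < duration_sec:
        measurements.append(read_power_draw())
        time.sleep(interval_ms / 1000)

    return measurements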

# Parse power measurements
# Expected output: "power_on=1 is_rail_enabled=1 rail=AXP173-lifeext"
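
The parsing step mentioned above could look like the following. This is a minimal sketch that assumes each sample is a space-separated `key=value` string in the format of the expected output shown; `parse_status_line` and `parse_measurements` are hypothetical helper names, not part of any Raspberry Pi tooling.

```python
def parse_status_line(line):
    """Split a 'key=value key=value' status string into a dict of strings."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that contain no '='
            fields[key] = value
    return fields


def parse_measurements(measurements):
    """Parse every non-empty sample, skipping None entries from failed reads."""
    return [parse_status_line(m) for m in measurements if m]


# Sample string taken from the expected output format above
sample = "power_on=1 is_rail_enabled=1 rail=AXP173-lifeext"
print(parse_status_line(sample))
```

Values are kept as strings because the fields mix flags and rail names; convert individual fields to numbers only once the format on your device is confirmed.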

Android power profiling uses Battery Historian:

# Reset battery statistics and enable full wake lock history
adb shell dumpsys batterystats --reset
adb shell dumpsys batterystats --enable full-wake-history
# Run inference workload, then dump the collected statistics
adb shell dumpsys batterystats > battery_stats.txt
# Capture a bug report for Battery Historian (Android 7.0 and later)
adb bugreport bugreport.zip

# Start the Battery Historian docker image, then upload bugreport.zip
# in the web UI at http://localhost:9999
docker run -p 9999:9999 gcr.io/android-battery-historian/stable:3.0 --port 9999

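CPU frequency scaling has a strong, typically superlinear effect on power, because dynamic power grows with frequency times voltage squared and higher frequencies require higher voltage: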

# Check available governors
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors
# Output: conservative ondemand userspace powersave performance schedutil

# Set to performance for maximum throughput
sudo bash -c 'for f in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
    echo performance > "$f"
done'

# Set to powersave for minimum power draw
# (pins cores at the lowest frequency; inference takes longer, so energy
# per inference is not necessarily minimized)
sudo bash -c 'for f in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
    echo powersave > "$f"
done'
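
To see how governor choice affects energy rather than just power, a common first-order CMOS approximation puts dynamic power at roughly C·V²·f, with a fixed static (leakage) term on top. The sketch below applies that approximation; the operating points, effective capacitance, static power, and cycle count are all hypothetical values chosen for illustration, not figures for any real board.

```python
def total_power_w(freq_hz, volt_v, c_eff_f, p_static_w):
    """First-order model: dynamic power C * V^2 * f plus constant static power."""
    return c_eff_f * volt_v ** 2 * freq_hz + p_static_w


def energy_per_inference_j(freq_hz, volt_v, cycles, c_eff_f, p_static_w):
    """Energy for a workload of `cycles` CPU cycles at one operating point."""
    t_s = cycles / freq_hz  # runtime shrinks as frequency rises
    return total_power_w(freq_hz, volt_v, c_eff_f, p_static_w) * t_s


# Hypothetical (frequency in Hz, core voltage in V) operating points
OPERATING_POINTS = [(600e6, 0.80), (1000e6, 0.90), (1500e6, 1.05)]
C_EFF_F = 1e-9      # assumed effective switched capacitance, farads
P_STATIC_W = 0.3    # assumed static power, watts
CYCLES = 3e8        # assumed cycles per inference

for freq_hz, volt_v in OPERATING_POINTS:
    power = total_power_w(freq_hz, volt_v, C_EFF_F, P_STATIC_W)
    energy = energy_per_inference_j(freq_hz, volt_v, CYCLES, C_EFF_F, P_STATIC_W)
    print(f"{freq_hz / 1e6:6.0f} MHz  {power:.3f} W  {energy:.3f} J per inference")
```

With these assumed numbers the middle operating point uses the least energy per inference: the lowest frequency pays static power for longer, while the highest pays for the voltage increase. Real devices need measurement to locate that point.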

Power optimization through batching reduces per-sample energy:

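def measure_energy_inference(interpreter, input_data, batch_size,
                             p_active_w=5.0, p_idle_w=0.5, t_idle_s=0.5):
    """Power model: E = P_active * t_processing + P_idle * t_idle"""
    from time import perf_counter

    # Load the batch into the first input tensor (TFLite-style interpreter assumed)
    input_index = interpreter.get_input_details()[0]['index']
    interpreter.set_tensor(input_index, input_data)

    # Time the computation for the whole batch
    t_start = perf_counter()
    interpreter.invoke()
    t_compute = perf_counter() - t_start

    # Estimate energy from assumed power levels (defaults: 5W active, 0.5W idle for 500ms)
    energy_active = p_active_w * t_compute  # Joules
    energy_idle = p_idle_w * t_idle_s       # Joules
    energy_total = energy_active + energy_idle
    # Idle energy is shared by every sample in the batch
    energy_per_sample = energy_total / batch_size

    return energy_per_sample, energy_total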
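
The same power model can be used to compare batch sizes before measuring on hardware. The sketch below is an illustration under stated assumptions: it reuses the 5 W active, 0.5 W idle, 500 ms idle figures from above, and adds a hypothetical fixed overhead per invocation, a hypothetical per-sample compute time, and a hypothetical latency budget. `modeled_energy_per_sample` and `pick_batch_size` are made-up helper names.

```python
def modeled_energy_per_sample(batch_size, t_fixed_s, t_per_sample_s,
                              p_active_w=5.0, p_idle_w=0.5, t_idle_s=0.5):
    """Per-sample energy when fixed overhead and idle time are shared by the batch."""
    t_compute_s = t_fixed_s + batch_size * t_per_sample_s
    energy_total_j = p_active_w * t_compute_s + p_idle_w * t_idle_s
    return energy_total_j / batch_size


def pick_batch_size(candidates, t_fixed_s, t_per_sample_s, latency_budget_s):
    """Lowest-energy candidate whose batch compute time fits the latency budget."""
    feasible = [b for b in candidates
                if t_fixed_s + b * t_per_sample_s <= latency_budget_s]
    if not feasible:
        return None
    return min(feasible,
               key=lambda b: modeled_energy_per_sample(b, t_fixed_s, t_per_sample_s))


# Assumed timings: 20 ms fixed overhead, 10 ms per sample, 120 ms latency budget
T_FIXED_S, T_PER_SAMPLE_S, BUDGET_S = 0.02, 0.01, 0.12
for b in (1, 2, 4, 8, 16, 32):
    e = modeled_energy_per_sample(b, T_FIXED_S, T_PER_SAMPLE_S)
    print(f"batch {b:2d}: {e:.4f} J per sample")
print("chosen batch size:", pick_batch_size((1, 2, 4, 8, 16, 32),
                                            T_FIXED_S, T_PER_SAMPLE_S, BUDGET_S))
```

In this model per-sample energy keeps falling as the batch grows, but with diminishing returns, so the practical optimum is set by the latency budget (or memory limit) rather than by energy alone.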

Dark wake prevention for mobile:

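import android.content.Context
import androidx.work.Constraints
import androidx.work.CoroutineWorker
import androidx.work.OneTimeWorkRequestBuilder
import androidx.work.WorkManager
import androidx.work.WorkerParameters

// Use WorkManager for inference to avoid app suspension
val constraints = Constraints.Builder()
    .setRequiresBatteryNotLow(true)
    .build()

val inferenceRequest = OneTimeWorkRequestBuilder<InferenceWorker>()
    .setConstraints(constraints)
    .build()

WorkManager.getInstance(context)
    .enqueue(inferenceRequest)

// Worker implementation to avoid wake locks
class InferenceWorker(
    context: Context,
    params: WorkerParameters
) : CoroutineWorker(context, params) {
    override suspend fun doWork(): Result {
        return try {
            runInference()  // app-defined inference routine
            Result.success()
        } catch (e: Exception) {
            // Ask WorkManager to reschedule with its backoff policy
            Result.retry()
        }
    }
}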
EXERCISE

Profile power consumption: measure idle versus inference-active states on an edge device, calculate energy per inference sample, and determine the optimal batch size for power efficiency.
