Autonomous Vehicles
Autonomous vehicles are self-driving systems that use AI to perceive their environment, plan routes, and control vehicle motion without human intervention. Operators encounter this term in the context of running perception models (e.g., YOLO for object detection) or planning algorithms on edge hardware, where latency and power constraints matter. Local AI inference is relevant because autonomous vehicles often require real-time processing on embedded GPUs (e.g., NVIDIA Jetson) rather than cloud servers, making model quantization and optimization critical for meeting safety-critical latency budgets.
Deeper dive
Autonomous vehicles rely on a pipeline of perception (camera, LiDAR, radar), localization (GPS, SLAM), prediction (of other agents), planning (path and behavior), and control (steering, throttle). Deep learning models handle object detection, semantic segmentation, and end-to-end driving. Operators running local AI for autonomous vehicle research use frameworks like TensorRT or ONNX Runtime to optimize models for embedded GPUs. Quantization (FP16, INT8) reduces model size and latency but may affect accuracy. Safety standards like ISO 26262 and SOTIF impose strict validation requirements. The term also appears in datasets (nuScenes, Waymo Open Dataset) and simulation environments (CARLA, AirSim) used for training and testing.
Practical example
An operator deploying YOLOv8 for pedestrian detection on an NVIDIA Jetson Orin (32 GB shared memory) might use TensorRT to convert the model to FP16, achieving ~30 FPS at 640x640 input. Quantizing to INT8 could push that to ~60 FPS but requires calibration data to avoid accuracy loss. The operator must balance detection range and frame rate against the vehicle's braking distance at highway speeds.
Workflow example
In a typical workflow, an operator downloads a pretrained model (e.g., from Hugging Face), exports it to ONNX, then uses trtexec --onnx=model.onnx --fp16 to build a TensorRT engine. They test inference on a Jetson device using a custom C++ or Python script that reads camera frames, runs the engine, and overlays bounding boxes. Latency is measured per frame; if it exceeds 50 ms, the operator may reduce input resolution or switch to a lighter model.
Reviewed by Fredoline Eruo. See our editorial policy.