RUNLOCALAIv38

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
© 2026 runlocalai.co · Independently operated
Computer vision

Instance Segmentation

Instance segmentation is a computer vision task that assigns a pixel-level mask to each distinct object instance in an image and classifies each one. Unlike semantic segmentation, which gives every pixel of a class the same label (e.g., all cars are 'car'), instance segmentation separates overlapping or adjacent objects of the same class into individual masks. Operators encounter this in models like YOLOv8-seg, and in SAM (Segment Anything Model), which produces masks from prompts but does not assign class labels. The output is a list of masks, each with a confidence score and, for class-aware models such as YOLOv8-seg, a class label. VRAM matters because high-resolution images and many instances require more memory for mask decoding.
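
To make the difference in output shape concrete, here is a minimal sketch in plain Python. It is not how a segmentation model works: it assumes a tiny hand-written label map and uses a flood fill (the hypothetical helper split_instances) to turn one semantic class into per-instance masks. Flood fill only separates objects that do not touch, whereas real instance segmentation models also separate touching or overlapping objects.

```python
from collections import deque


def split_instances(semantic_map, target_class):
    """Split one class of a semantic label map into per-instance binary masks.

    Toy illustration only: 4-connected flood fill, so touching objects of the
    same class would be merged into a single mask.
    """
    height, width = len(semantic_map), len(semantic_map[0])
    seen = [[False] * width for _ in range(height)]
    masks = []
    for y in range(height):
        for x in range(width):
            if semantic_map[y][x] != target_class or seen[y][x]:
                continue
            # Start a new instance mask at this unvisited pixel.
            mask = [[0] * width for _ in range(height)]
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                mask[cy][cx] = 1
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and not seen[ny][nx] and semantic_map[ny][nx] == target_class:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            masks.append(mask)
    return masks


# Semantic segmentation output: one label map, 0 = background, 1 = "car".
semantic_map = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 1],
]

# Instance segmentation output: one binary mask per car.
instance_masks = split_instances(semantic_map, target_class=1)
print(len(instance_masks))  # 2
```

The semantic map says only "these pixels are car"; the instance output says "this is car 1, this is car 2", which is what counting or tracking objects needs.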

Deeper dive

Instance segmentation combines object detection and semantic segmentation. In the two-stage approach used by Mask R-CNN, the model first proposes regions and predicts a bounding box and class label for each object; then, within each box, a mask head predicts a binary mask for that instance. Single-stage models such as YOLOv8-seg predict boxes and masks in one pass, which is faster. SAM uses a prompt-based approach: given a point or box, it segments the corresponding object. For operators, inference speed varies widely: YOLOv8-seg runs at ~30 FPS on an RTX 3060 for 640x640 images (throughput depends on the model variant, and smaller variants run faster), while SAM (ViT-H) needs ~1-2 seconds per image on the same GPU. Running in lower precision (FP16) or quantizing to INT8 reduces VRAM usage and speeds up inference, but may slightly reduce mask accuracy.
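
The second step of the two-stage approach can be sketched as follows. This is a simplified illustration, not Mask R-CNN's actual code: it assumes the mask head's output has already been resized to the box dimensions, and the function name paste_box_mask, the 0.5 threshold, and the sample values are all hypothetical.

```python
def paste_box_mask(box, box_probs, image_height, image_width, threshold=0.5):
    """Place a box-level mask prediction into a full-image binary mask.

    box is (x1, y1, x2, y2) in pixels; box_probs holds per-pixel foreground
    probabilities for the box region, assumed to match the box size.
    """
    x1, y1, _, _ = box
    full_mask = [[0] * image_width for _ in range(image_height)]
    for row_offset, row in enumerate(box_probs):
        for col_offset, prob in enumerate(row):
            y, x = y1 + row_offset, x1 + col_offset
            inside = 0 <= y < image_height and 0 <= x < image_width
            if inside and prob >= threshold:
                full_mask[y][x] = 1  # pixel belongs to this instance
    return full_mask


# One hypothetical detection: a 2x2 box at (1, 1) in a 4x4 image.
detection = {"label": "car", "score": 0.91, "box": (1, 1, 3, 3)}
box_probs = [
    [0.9, 0.2],
    [0.7, 0.8],
]
mask = paste_box_mask(detection["box"], box_probs, image_height=4, image_width=4)
for row in mask:
    print(row)
```

Each detection produces its own full-size mask this way, which is why the final output is a list of masks rather than a single label map.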

Practical example

On an RTX 3060 (12 GB VRAM), running YOLOv8n-seg (nano) on a 640x640 image uses ~1.5 GB VRAM and processes ~100 images per second. Running SAM (ViT-B) on the same image uses ~3 GB and takes ~0.5 seconds per image. For a 4K image, SAM may need 8+ GB and take several seconds. Operators often resize inputs to balance accuracy and speed.
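
A rough way to see why resolution and instance count drive memory is to size the mask tensor alone. The sketch below is back-of-the-envelope arithmetic under stated assumptions: one dense float32 value per pixel per instance, 100 instances (a hypothetical count), and a resize that scales the long side to 640. It ignores model weights and activations, so it will not reproduce the VRAM figures above.

```python
def mask_memory_bytes(num_instances, height, width, bytes_per_value=4):
    """Bytes for a dense (instances, height, width) mask tensor; float32 by default."""
    return num_instances * height * width * bytes_per_value


def resized_shape(height, width, long_side):
    """Scale (height, width) so the longer side equals long_side, keeping aspect ratio."""
    longest = max(height, width)
    return round(height * long_side / longest), round(width * long_side / longest)


# Hypothetical workload: 100 instances in a 4K (3840x2160) frame.
full_res = mask_memory_bytes(100, 2160, 3840)
small_h, small_w = resized_shape(2160, 3840, long_side=640)
resized = mask_memory_bytes(100, small_h, small_w)

print(f"4K masks:      {full_res / 2**30:.2f} GiB")
print(f"{small_w}x{small_h} masks: {resized / 2**20:.2f} MiB")
```

Under these assumptions the full-resolution masks take about 3.09 GiB, while the resized ones take about 88 MiB, which is why resizing inputs is usually the first lever to pull.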

Workflow example

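In a Python script using Ultralytics YOLO: from ultralytics import YOLO; model = YOLO('yolov8n-seg.pt'); results = model('image.jpg') returns a list of Results objects, one per input image. The masks for the first image are in results[0].masks (None when nothing is detected), and the matching class IDs and confidence scores are in results[0].boxes. To visualize, call results[0].plot(), which returns the annotated image as an array that you can save or display. For SAM, use import numpy as np; from segment_anything import sam_model_registry, SamPredictor; predictor = SamPredictor(sam_model_registry['vit_b'](checkpoint='sam_vit_b_01ec64.pth')); predictor.set_image(image), where image is an RGB uint8 array; masks, scores, logits = predictor.predict(point_coords=np.array([[500, 375]]), point_labels=np.array([1])). The prompt arrays must be NumPy arrays, and the point label 1 marks a foreground point.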
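
The same workflow, laid out as a script. Treat it as a sketch rather than a reference implementation: the function names segment_with_yolo, segment_with_sam, and best_mask_index are hypothetical wrappers, the weights and checkpoint filenames are the ones named above and must be available locally (Ultralytics normally downloads its own weights on first use), and the heavy libraries are imported inside the functions so the file loads without them installed.

```python
def best_mask_index(scores):
    """Return the index of the highest-scoring candidate mask."""
    return max(range(len(scores)), key=lambda i: scores[i])


def segment_with_yolo(image_path, weights="yolov8n-seg.pt"):
    """Return (class_name, confidence, mask) triples for one image."""
    from ultralytics import YOLO  # lazy import: heavy dependency

    result = YOLO(weights)(image_path)[0]  # one Results object per input image
    if result.masks is None:  # nothing was detected
        return []
    masks = result.masks.data.cpu().numpy()  # array of shape (N, H, W)
    class_ids = result.boxes.cls.tolist()
    confidences = result.boxes.conf.tolist()
    return [
        (result.names[int(class_id)], confidence, mask)
        for class_id, confidence, mask in zip(class_ids, confidences, masks)
    ]


def segment_with_sam(image_rgb, point_xy, checkpoint="sam_vit_b_01ec64.pth"):
    """Return the best-scoring SAM mask for a single foreground point prompt."""
    import numpy as np
    from segment_anything import SamPredictor, sam_model_registry

    predictor = SamPredictor(sam_model_registry["vit_b"](checkpoint=checkpoint))
    predictor.set_image(image_rgb)  # expects an HxWx3 uint8 RGB array
    masks, scores, _ = predictor.predict(
        point_coords=np.array([point_xy]),  # shape (1, 2), in (x, y) pixels
        point_labels=np.array([1]),         # 1 = foreground point
        multimask_output=True,              # several candidates, each with a score
    )
    return masks[best_mask_index(scores)]
```

For example, segment_with_yolo('image.jpg') yields one triple per detected object, and segment_with_sam(image, (500, 375)) yields a single boolean mask for whatever object sits under that point. SAM returns no class label, so pairing it with a detector is a common way to get labeled masks.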

Reviewed by Fredoline Eruo. See our editorial policy.
