RADSeg: Unleashing Parameter and Compute Efficient Zero-Shot Open-Vocabulary Segmentation Using Agglomerative Models

RADSeg is a dense, language-aligned feature encoder that enables low-parameter, low-latency open-vocabulary semantic segmentation in 2D and 3D. By enhancing the spatial locality of RADIO features, RADSeg outperforms previous state-of-the-art methods in accuracy while remaining parameter- and compute-efficient.
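
To illustrate how dense, language-aligned features can yield a zero-shot segmentation, here is a minimal sketch of the generic labeling step: each spatial location takes the class whose text embedding has the highest cosine similarity with that location's feature vector. This is not RADSeg's actual API or pipeline. The names `cosine` and `segment_open_vocab` and the toy vectors are hypothetical, standing in for the outputs of an image encoder and a text encoder.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Small epsilon avoids division by zero for all-zero vectors.
    return dot / (norm_u * norm_v + 1e-12)

def segment_open_vocab(feature_map, text_embeds):
    """Label every location with the index of its most similar text embedding.

    feature_map: H x W grid of D-dimensional feature vectors (assumed encoder output).
    text_embeds: list of C D-dimensional vectors, one per class prompt (assumed).
    Returns an H x W grid of class indices.
    """
    num_classes = len(text_embeds)
    return [
        [max(range(num_classes), key=lambda c: cosine(feat, text_embeds[c]))
         for feat in row]
        for row in feature_map
    ]

# Toy data: two classes in a 3-dimensional embedding space (made up for illustration).
text_embeds = [
    [1.0, 0.0, 0.0],  # class 0
    [0.0, 1.0, 0.0],  # class 1
]
feature_map = [
    [[2.0, 0.1, 0.0], [0.1, 3.0, 0.0]],
    [[0.5, 0.4, 0.0], [0.0, 1.0, 1.0]],
]

labels = segment_open_vocab(feature_map, text_embeds)
print(labels)  # [[0, 1], [0, 1]]
```

Because the class list is just a set of text embeddings, the vocabulary can be changed at inference time without retraining, which is what makes this kind of segmentation open-vocabulary. The quality of the resulting label map depends on how spatially precise the dense features are.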