Co-Me: Confidence-Guided Token Merging for Visual Geometry Transformer

Published by Yutian Chen

Co-Me is an acceleration mechanism that uses a lightweight confidence predictor to selectively merge low-confidence tokens in visual geometry transformers, enabling substantial speedups without any model retraining.
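
To make the idea of confidence-guided merging concrete, here is a minimal sketch in plain Python. It is not the Co-Me implementation: the function names, the `threshold` and `group` parameters, and the choice to average runs of consecutive low-confidence tokens are all assumptions made for illustration. The confidence scores, which in Co-Me would come from the lightweight predictor, are passed in here as plain numbers.

```python
from typing import List, Tuple

Token = List[float]


def merge_low_confidence(
    tokens: List[Token],
    confidence: List[float],
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the source
    group: int = 2,          # hypothetical maximum merge group size
) -> Tuple[List[Token], List[int]]:
    """Average runs of up to `group` consecutive low-confidence tokens.

    Returns the shortened token list and, for every original token,
    the index of the merged token that now represents it.
    """
    if len(tokens) != len(confidence):
        raise ValueError("tokens and confidence must have the same length")

    merged: List[Token] = []
    assignment: List[int] = []
    run: List[int] = []  # indices of pending low-confidence tokens

    def flush() -> None:
        # Replace the pending run with a single averaged token.
        if not run:
            return
        dim = len(tokens[run[0]])
        merged.append(
            [sum(tokens[i][d] for i in run) / len(run) for d in range(dim)]
        )
        run.clear()

    for i, (tok, conf) in enumerate(zip(tokens, confidence)):
        if conf < threshold:
            run.append(i)
            assignment.append(len(merged))  # slot the next flush will fill
            if len(run) == group:
                flush()
        else:
            flush()
            merged.append(list(tok))  # high-confidence tokens pass through
            assignment.append(len(merged) - 1)
    flush()
    return merged, assignment


def unmerge(merged: List[Token], assignment: List[int]) -> List[Token]:
    """Restore the original sequence length by copying merged tokens back."""
    return [list(merged[a]) for a in assignment]


tokens = [[1.0, 1.0], [3.0, 3.0], [10.0, 0.0], [2.0, 4.0]]
confidence = [0.1, 0.2, 0.9, 0.3]
merged, assignment = merge_low_confidence(tokens, confidence)
restored = unmerge(merged, assignment)
```

In this toy run the first two tokens fall below the threshold and are averaged into one, so later layers would process three tokens instead of four. The `assignment` list is what allows the full-length sequence to be restored afterwards. No weights are touched at any point, which is consistent with the claim that no retraining is needed.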

Learning, Perception