arXiv cs.AI

Vision Transformer-Conditioned UNet for Domain-Adaptive Semantic Segmentation

Signal: 72
Hype: 18
In three lines
ViTC-UNet conditions a UNet on representations from a frozen, pre-trained Vision Transformer (ViT) via learnable tokens and a two-way attention decoder. The approach improves biomedical semantic segmentation on MRI and CT data without end-to-end fine-tuning of the ViT. It combines the ViT's global priors with the UNet's local inductive bias and high-resolution decoding.
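The summary does not spell out how the tokens and the two-way attention interact, so the following is only a minimal sketch of one plausible reading: learnable tokens first read from the frozen ViT features, then the UNet features read from those tokens. Everything here is an assumption for illustration, including the function names (`cross_attend`, `two_way_block`), the feature sizes, and the use of plain single-head attention with no learned projections. It is not the paper's implementation.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys_values):
    """Single-head scaled dot-product attention with a residual connection.

    Simplification (assumption): no learned Q/K/V projections; the same
    vectors serve as keys and values.
    """
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in keys_values]
        weights = softmax(scores)
        context = [sum(w * kv[j] for w, kv in zip(weights, keys_values))
                   for j in range(d)]
        out.append([qi + ci for qi, ci in zip(q, context)])
    return out


def two_way_block(tokens, vit_feats, unet_feats):
    """Hypothetical two-way step: tokens <- ViT, then UNet <- tokens."""
    # Direction 1: tokens gather global context from the frozen ViT features.
    tokens = cross_attend(tokens, vit_feats)
    # Direction 2: UNet features are conditioned on the updated tokens.
    unet_feats = cross_attend(unet_feats, tokens)
    return tokens, unet_feats


rng = random.Random(0)


def rand_matrix(n, d):
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]


DIM = 8                            # illustrative embedding size
vit_feats = rand_matrix(16, DIM)   # stand-in for frozen ViT patch embeddings
unet_feats = rand_matrix(64, DIM)  # stand-in for a flattened UNet feature map
tokens = rand_matrix(4, DIM)       # stand-in for the learnable tokens

vit_before = [row[:] for row in vit_feats]
new_tokens, conditioned = two_way_block(tokens, vit_feats, unet_feats)
```

In this sketch the ViT features are only read, never updated, which mirrors the "frozen" constraint in the summary. A real model would add learned projections, multiple heads, and normalization, and would train the tokens and the UNet while keeping the ViT weights fixed.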
Tags: Vision, Papers, Benchmarks

Summary generated by Claude — human-verified