The Problem of Segmentation Tasks

Great work! I'd like to know how to apply ChannelViT to image segmentation tasks. Because each channel is embedded separately, every channel gets its own spatial grid of tokens. For example, given an input of shape (B, 3, 64, 64) with a patch size of 8, ChannelViT produces a sequence of shape (B, 3*8*8), whereas a typical ViT produces a sequence of shape (B, 8*8) that can simply be reshaped into a feature map and fed into a decoder. Since the sequence produced by ChannelViT is made up of per-channel tokens, how should I arrange it before passing it to a decoder? Looking forward to your reply!
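To make the shape question concrete, here is a minimal NumPy sketch of one workaround I could imagine. It is not taken from the ChannelViT code. It assumes the tokens are ordered channel by channel (all 8*8 patches of channel 0, then channel 1, and so on), that each token has an embedding dimension D, and that an optional leading CLS token can be dropped. The function name `channel_tokens_to_feature_map` and its parameters are hypothetical, and averaging over channels is just one possible fusion choice.

```python
import numpy as np


def channel_tokens_to_feature_map(tokens, num_channels, grid_h, grid_w,
                                  has_cls_token=False):
    """Fold per-channel tokens into one spatial feature map.

    tokens: array of shape (B, C * grid_h * grid_w, D), optionally with one
            extra leading CLS token. Channel-major ordering is an ASSUMPTION.
    Returns an array of shape (B, D, grid_h, grid_w).
    """
    if has_cls_token:
        tokens = tokens[:, 1:, :]  # drop the CLS token (assumed to be first)

    batch, seq_len, dim = tokens.shape
    expected = num_channels * grid_h * grid_w
    if seq_len != expected:
        raise ValueError(f"expected {expected} tokens, got {seq_len}")

    # Split the sequence axis into (channel, row, column).
    per_channel = tokens.reshape(batch, num_channels, grid_h, grid_w, dim)

    # Fuse the channel axis by averaging. Concatenation or a learned
    # projection would be alternatives; this is only an illustration.
    fused = per_channel.mean(axis=1)  # (B, grid_h, grid_w, D)

    # Move the embedding dimension forward, as conv decoders usually expect.
    return fused.transpose(0, 3, 1, 2)  # (B, D, grid_h, grid_w)


B, C, grid, D = 2, 3, 8, 32
rng = np.random.default_rng(0)
tokens = rng.normal(size=(B, C * grid * grid, D))  # stand-in for encoder output

feature_map = channel_tokens_to_feature_map(tokens, C, grid, grid)
print(feature_map.shape)  # (2, 32, 8, 8)
```

The result has the same (B, D, 8, 8) layout that a typical ViT feature map would have after reshaping, so in principle it could go into the same decoder. Whether averaging over channels is appropriate, or whether the channel tokens should be kept separate, is exactly what I am unsure about.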

The Problem of Segmentation Tasks #17
