Structural downsampling and static token sparsification #5

@Yeez-lee

Hi, this is solid and promising work, but I have some questions.
(1) In the paper, you perform average pooling with a 2 × 2 kernel after the sixth block for structural downsampling. Table 3, however, reports results for both structural downsampling and static token sparsification. What is the difference between the two, given that their accuracies are not the same?
(2) I'm interested in the 2 × 2 average pooling. Did you run extra experiments on where the structural downsampling is placed, for example after the seventh or the tenth block of the ViT?
(3) Could you provide the code for reproducing the structural downsampling and static token sparsification results in Table 3 and the probability heat map in Figure 6?
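
To make clear what I mean by structural downsampling in (1) and (2), here is a minimal sketch of how I understand the 2 × 2 average pooling over patch tokens. This is only my assumption, not your implementation: the function name `avg_pool_tokens`, the class token sitting at index 0 and bypassing the pooling, the 14 × 14 patch grid, and the embedding size 384 are all illustrative choices of mine, and I use NumPy just to keep it short.

```python
import numpy as np

def avg_pool_tokens(tokens, grid_h, grid_w):
    """2x2 average pooling over patch tokens (hypothetical helper).

    tokens: array of shape (batch, 1 + grid_h * grid_w, dim); index 0 is
    assumed to be the class token, which passes through unchanged.
    """
    batch, n, dim = tokens.shape
    assert n == 1 + grid_h * grid_w, "expected class token + full patch grid"
    assert grid_h % 2 == 0 and grid_w % 2 == 0, "grid must be divisible by 2"

    cls_tok, patches = tokens[:, :1], tokens[:, 1:]
    # Split the row-major patch grid into non-overlapping 2x2 windows.
    grid = patches.reshape(batch, grid_h // 2, 2, grid_w // 2, 2, dim)
    # Average inside each window (the two axes of size 2).
    pooled = grid.mean(axis=(2, 4))
    pooled = pooled.reshape(batch, (grid_h // 2) * (grid_w // 2), dim)
    return np.concatenate([cls_tok, pooled], axis=1)

# Example: 14 x 14 = 196 patch tokens become 7 x 7 = 49 after pooling.
x = np.random.rand(2, 1 + 14 * 14, 384)
y = avg_pool_tokens(x, 14, 14)
print(y.shape)  # (2, 50, 384)
```

If this matches what you did, the pooling keeps a quarter of the patch tokens at fixed grid positions for every input, which is why I am unsure how it differs from static token sparsification.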

Thanks for your help!
