Details of zero-shot performance on SSv2 #44

@bpiyush

Dear authors,

Great work!

Do you have a script to reproduce the zero-shot numbers on SSv2 (Table 7)?

In my experiments, a frozen CLIP with mean pooling over per-frame features reaches only 2.7% accuracy on the 174 SSv2 classes, which is in line with what other papers report [1, 2]. Could you please elaborate on where the discrepancy with Table 7 comes from, or on what I may be missing?
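
For concreteness, the baseline described above can be sketched as follows. This is a minimal illustration on precomputed features, not the exact evaluation script: the function names (`l2_normalize`, `mean_pool`, `zero_shot_predict`, `top1_accuracy`) and the toy 2-dimensional features are hypothetical, and in a real run the per-frame features and the 174 class-prompt features would come from CLIP's image and text encoders. Plain Python lists are used only to keep the sketch dependency-free.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if its norm is zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def mean_pool(frame_feats):
    """Average unit-normalized per-frame features into one unit-length video feature."""
    normed = [l2_normalize(f) for f in frame_feats]
    dim = len(normed[0])
    pooled = [sum(f[d] for f in normed) / len(normed) for d in range(dim)]
    return l2_normalize(pooled)


def zero_shot_predict(frame_feats, text_feats):
    """Return the index of the class whose text feature has the highest cosine similarity."""
    video = mean_pool(frame_feats)
    scores = [
        sum(v * t for v, t in zip(video, l2_normalize(text)))
        for text in text_feats
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def top1_accuracy(videos, labels, text_feats):
    """Fraction of videos whose predicted class index matches the label."""
    correct = sum(
        zero_shot_predict(frames, text_feats) == label
        for frames, label in zip(videos, labels)
    )
    return correct / len(labels)


# Toy data (assumption): 2 classes, 2 videos with 2 frames each, 2-dim features.
toy_text_feats = [[1.0, 0.0], [0.0, 1.0]]
toy_videos = [
    [[0.9, 0.1], [0.8, 0.3]],
    [[0.2, 0.7], [0.1, 1.0]],
]
toy_labels = [0, 1]
print(top1_accuracy(toy_videos, toy_labels, toy_text_feats))
```

Note that this sketch pools features before comparing against the text embeddings; averaging per-frame similarity scores instead is a different design choice and can give slightly different numbers.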

[1] Videoprompter: an ensemble of foundational models for zero-shot video understanding. https://arxiv.org/pdf/2310.15324
[2] GPT4Vis: What Can GPT-4 Do for Zero-shot Visual Recognition? https://arxiv.org/pdf/2311.15732
