-
Notifications
You must be signed in to change notification settings - Fork 68
Open
Description
Love the project, and looks super interesting. Works almost too well (very surprisingly) given how noisy dense labeling data is. I'm surprised the model can still infer!
I was looking through the paper, code and demo, but it wasn't clear to me how the video summarization worked. Can you elaborate? Do you break down the video frame by frame and treat it as an image problem?
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels