The role of calibration data #85

@hxu105

Hello, thank you for this amazing work! I have a couple of questions regarding the use and role of calibration data in Wanda:

  1. In the paper, calibration data is used to estimate the metric defined in Equation (4), which then ranks each weight entry and determines the pruning mask. Does this mean the mask inherently depends on the specific calibration dataset? In other words, would different calibration datasets produce different masks and potentially different downstream performance?

  2. When evaluating a Wanda-pruned model in a zero-shot setting, could Wanda produce an effective mask using only data from the zero-shot task itself, following Algorithm 1? More generally, could Wanda be extended to work as an online pruning method?
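
To make question 1 concrete, here is a minimal toy sketch of how I understand the dependence on calibration data. It assumes the score for each weight is its magnitude times the L2 norm of the corresponding input feature over the calibration tokens, and that the lowest-scoring weights are pruned within each output row. The function names (`column_norms`, `wanda_style_mask`) and the toy numbers are hypothetical and are not taken from the repository; this is pure Python for illustration, not the actual implementation.

```python
import math

def column_norms(X):
    """L2 norm of each input feature over all calibration tokens.

    X is a list of token rows, each row a list of feature activations.
    """
    n_features = len(X[0])
    return [math.sqrt(sum(row[j] ** 2 for row in X)) for j in range(n_features)]

def wanda_style_mask(W, X, sparsity=0.5):
    """Return a keep-mask (True = keep) for weight matrix W.

    Assumed score: |W[i][j]| * ||X[:, j]||_2, compared within each output row.
    """
    norms = column_norms(X)
    mask = []
    for w_row in W:
        scores = [abs(w) * n for w, n in zip(w_row, norms)]
        n_prune = int(len(w_row) * sparsity)
        # Indices of the n_prune smallest scores in this row are pruned.
        order = sorted(range(len(scores)), key=scores.__getitem__)
        pruned = set(order[:n_prune])
        mask.append([j not in pruned for j in range(len(w_row))])
    return mask

# One output row with four input features (toy values).
W = [[0.5, -0.4, 0.3, 0.2]]

# Two hypothetical calibration sets with different activation statistics.
calib_a = [[1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0]]
calib_b = [[0.1, 0.1, 1.0, 5.0], [0.1, 0.1, 1.0, 5.0]]

mask_a = wanda_style_mask(W, calib_a)
mask_b = wanda_style_mask(W, calib_b)

print(mask_a)  # [[True, True, False, False]]
print(mask_b)  # [[False, False, True, True]]
```

In this toy case the same weights yield opposite masks, because `calib_b` has much larger activations on the last two features. Whether real calibration sets differ enough to change the mask meaningfully is exactly what I am asking about.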

Thanks again for your work and for any insights you can share!
