The role of calibration data

Hello, thank you for this amazing work! I have a couple of questions regarding the use and role of calibration data in Wanda:

1. In the paper, calibration data is used to estimate the metric defined in Equation (4), which is then used to rank the weights and determine the pruning mask. Does this mean that the mask inherently depends on the specific calibration dataset used? In other words, would different calibration datasets produce different masks and potentially lead to different downstream performance?

2. When evaluating a Wanda-pruned model in a zero-shot setting, could Wanda generate an effective mask using only data from the zero-shot task itself, following Algorithm 1? More generally, could Wanda be extended to work as an online pruning method?
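
To make the questions concrete, here is a minimal NumPy sketch of how I understand the metric: the score of each weight is its magnitude times the L2 norm of the matching input feature over the calibration tokens, and the lowest-scoring weights are removed within each output row. The function name `wanda_mask`, the toy shapes, and the random data are my own assumptions for illustration. This is not code from the repository.

```python
import numpy as np


def wanda_mask(W, X, sparsity=0.5):
    """Return a boolean keep-mask for W (hypothetical helper, illustration only).

    W: weight matrix of shape (out_features, in_features)
    X: calibration activations of shape (n_tokens, in_features)
    """
    # L2 norm of each input feature, taken over all calibration tokens.
    act_norm = np.linalg.norm(X, axis=0)              # (in_features,)
    # Score = |weight| * norm of the corresponding input feature.
    score = np.abs(W) * act_norm[None, :]             # (out_features, in_features)
    # Prune the lowest-scoring weights within each output row.
    n_prune = int(W.shape[1] * sparsity)
    prune_idx = np.argsort(score, axis=1)[:, :n_prune]
    mask = np.ones_like(W, dtype=bool)
    np.put_along_axis(mask, prune_idx, False, axis=1)
    return mask


# Toy comparison: same weights, two calibration sets with different feature scales.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))
X_a = rng.normal(size=(128, 8))
X_b = rng.normal(size=(128, 8)) * np.linspace(0.1, 3.0, 8)

mask_a = wanda_mask(W, X_a)
mask_b = wanda_mask(W, X_b)
print("fraction of mask entries that differ:", (mask_a != mask_b).mean())
```

In this sketch the calibration data enters only through `act_norm`, so two datasets with different per-feature norms can yield different masks for the same weights. The squared norm is a sum over tokens, so in principle it could be accumulated batch by batch. Whether a mask estimated that way from task data alone would be effective is exactly what I am asking.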

Thanks again for your work and for any insights you can share!

The role of calibration data #85
