Hi! I've noticed that the quantization layer packs the quantized weights using the class Quant3Linear, as shown below:

However, it seems to me that this only works for 2-bit and 3-bit weights. If the original weights in `intweight` are 4-bit, some bits would be lost.
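To illustrate my concern, here is a simplified toy sketch of this style of bit packing (my own code, not the actual Quant3Linear implementation): values are masked to a fixed bit width and shifted into 32-bit words, so any bits above that width are silently dropped.

```python
def pack(values, bits):
    """Pack a list of small non-negative ints into 32-bit words, `bits` bits each.
    Toy illustration only, not the repo's packing code."""
    words = []
    word, used = 0, 0
    for v in values:
        # The mask keeps only the low `bits` bits; higher bits are discarded.
        word |= (v & ((1 << bits) - 1)) << used
        used += bits
        if used + bits > 32:  # no room left for another full value
            words.append(word)
            word, used = 0, 0
    if used:
        words.append(word)
    return words

# With bits=3, a 4-bit value such as 0b1010 (10) is masked down to 0b010 (2):
print(pack([0b1010], 3)[0])  # prints 2, not 10 -- the top bit is lost
```

This is the kind of silent truncation I'm worried about when 4-bit weights go through a 3-bit packing path.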
Could you explain the logic behind this? Thanks!