Hi! I've noticed that the quantization layer packs the quantized weights using the class Quant3Linear, as shown below:

However, it seems to me that this only works for 2-bit and 3-bit weights. If the original weights in `intweight` are 4-bit, some bits would be lost.
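To illustrate my concern, here is a simplified toy sketch of this style of bit packing (my own code, not the actual Quant3Linear implementation): values are masked to a fixed bit width and shifted into 32-bit words, so any bits above that width are silently dropped.

```python
def pack(values, bits):
    """Pack a list of small non-negative ints into 32-bit words, `bits` bits each.
    Toy illustration only, not the repo's packing code."""
    words = []
    word, used = 0, 0
    for v in values:
        # The mask keeps only the low `bits` bits; higher bits are discarded.
        word |= (v & ((1 << bits) - 1)) << used
        used += bits
        if used + bits > 32:  # no room left for another full value
            words.append(word)
            word, used = 0, 0
    if used:
        words.append(word)
    return words

# With bits=3, a 4-bit value such as 0b1010 (10) is masked down to 0b010 (2):
print(pack([0b1010], 3)[0])  # prints 2, not 10 -- the top bit is lost
```

This is the kind of silent truncation I'm worried about when 4-bit weights go through a 3-bit packing path.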
Could you explain the logic behind this? Thanks!