For example, the Kcat dataset has over 10,000 sequences and SMILES entries, but the PDB contains only a little over 7,000. Why?