Hi, thanks for the great work. After comparing this code with the original SAM code, I noticed that because the original SAM cannot predict a class category (man, head, truck, etc.) for each mask, the NMS in SamAutomaticMaskGenerator is applied across all masks rather than within each category. This is likely to hurt performance, since heavily overlapping masks of different categories can suppress one another. Can your Semantic-SAM model fix this issue by providing a predicted class category for each output mask?
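
To make the concern concrete, here is a minimal sketch in plain Python of class-agnostic versus per-category NMS. It is not the actual SAM or Semantic-SAM code: the `nms` and `box_iou` helpers, the boxes, the scores, and the labels are all made up for illustration, and it works on boxes only.

```python
from typing import List, Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float,
    labels: Optional[Sequence[str]] = None,
) -> List[int]:
    """Greedy NMS; returns kept indices in order of decreasing score.

    With labels=None every box competes with every other box
    (class-agnostic). With labels given, a box can only be suppressed
    by a kept box that has the same label.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        suppressed = False
        for j in keep:
            same_group = labels is None or labels[i] == labels[j]
            if same_group and box_iou(boxes[i], boxes[j]) > iou_threshold:
                suppressed = True
                break
        if not suppressed:
            keep.append(i)
    return keep


# Hypothetical example: a "man" box, a heavily overlapping "head" box,
# and an unrelated "truck" box.
boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
labels = ["man", "head", "truck"]

print(nms(boxes, scores, 0.7))          # class-agnostic: [0, 2]
print(nms(boxes, scores, 0.7, labels))  # per-category:   [0, 1, 2]
```

In this toy case the "head" box is dropped by class-agnostic NMS because it overlaps the higher-scoring "man" box, while per-category NMS keeps both.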