Skip to content

Conversation

@ashmi8
Copy link

@ashmi8 ashmi8 commented Jan 20, 2026

added all the missing type hints for the funciton in python files solution for issue no #1927

Copilot AI review requested due to automatic review settings January 20, 2026 16:54
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request adds return type hints to Python functions across test files, scripts, and examples. While the PR aims to address missing type hints (issue #1927), it only partially completes the type annotation effort by adding return types without including parameter type hints.

Changes:

  • Added return type hints to fixture functions and utility functions in test files
  • Added return type hints to converter methods and utility functions in scripts
  • Added typing imports where needed for type annotations

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 41 comments.

Show a summary per file
File Description
bindings/python/tests/utils.py Added return type hints to download function and pytest fixtures (roberta_files, bert_files, openai_files, train_files, albert_base, doc_wiki_tokenizer, doc_pipeline_bert_tokenizer)
bindings/python/tests/bindings/test_trainers.py Added return type hint to split method in GoodCustomPretok test class
bindings/python/tests/bindings/test_tokenizer.py Added return type hint to format helper function
bindings/python/tests/bindings/test_processors.py Added return type hints to get_bert, get_roberta, and get_t5_squad helper methods
bindings/python/tests/bindings/test_pre_tokenizers.py Added return type hints to split and pre_tokenize methods in custom pre-tokenizer test classes
bindings/python/scripts/spm_parity_check.py Added return type hints to check_diff, check_details, and check_encode functions
bindings/python/scripts/convert.py Added return type hints to vocab, normalizer, post_processor, unk_id methods across multiple converter classes and check function
bindings/python/examples/example.py Added typing import and return type hints to tokenize_r and tokenize_p functions
bindings/python/benches/test_tiktoken.py Added return type hint to test function

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


@pytest.fixture(scope="session")
def bert_files(data_dir):
def bert_files(data_dir) -> dict[str, str]:
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return type hint has been added, but the parameter type hint for data_dir is missing. Consider adding the appropriate type hint for the parameter to make the function signature complete.

Copilot uses AI. Check for mistakes.
]

def normalizer(self, proto):
def normalizer(self, proto) -> Sequence:
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return type hint has been added, but the parameter type hint for proto is missing. Consider adding an appropriate type hint for the parameter to make the function signature complete.

Copilot uses AI. Check for mistakes.
return Sequence(normalizers)

def post_processor(self, tokenizer):
def post_processor(self, tokenizer) -> TemplateProcessing:
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return type hint has been added, but the parameter type hint for tokenizer is missing. Consider adding an appropriate type hint for the parameter to make the function signature complete.

Suggested change
def post_processor(self, tokenizer) -> TemplateProcessing:
def post_processor(self, tokenizer: Tokenizer) -> TemplateProcessing:

Copilot uses AI. Check for mistakes.
return 3

def post_processor(self, tokenizer):
def post_processor(self, tokenizer) -> TemplateProcessing:
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return type hint has been added, but the parameter type hint for tokenizer is missing. Consider adding an appropriate type hint for the parameter to make the function signature complete.

Copilot uses AI. Check for mistakes.

class XLMRobertaConverter(SpmConverter):
def vocab(self, proto):
def vocab(self, proto) -> list[tuple[str, float]]:
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return type hint has been added, but the parameter type hint for proto is missing. Consider adding an appropriate type hint for the parameter to make the function signature complete.

Copilot uses AI. Check for mistakes.
offset = 103

def vocab(self, proto):
def vocab(self, proto) -> list[tuple[str, float]]:
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return type hint has been added, but the parameter type hint for proto is missing. Consider adding an appropriate type hint for the parameter to make the function signature complete.

Copilot uses AI. Check for mistakes.


def download(url, with_filename=None):
def download(url, with_filename=None) -> str:
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return type hint has been added, but the parameter type hints are missing. Consider adding type hints for url (str) and with_filename (Optional[str] or str | None) to make the function signature complete.

Copilot uses AI. Check for mistakes.
return "rest"

def split(self, n, normalized):
def split(self, n, normalized) -> list:
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return type hint has been added, but the parameter type hints for n and normalized are missing. Consider adding appropriate type hints for the parameters to make the function signature complete.

Copilot uses AI. Check for mistakes.


def check_details(line, spm_ids, tok_ids, sp, tok):
def check_details(line, spm_ids, tok_ids, sp, tok) -> bool:
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return type hint has been added, but the parameter type hints for line, spm_ids, tok_ids, sp, and tok are missing. Consider adding appropriate type hints for the parameters to make the function signature complete.

Copilot uses AI. Check for mistakes.


def check(pretrained, filename):
def check(pretrained, filename) -> tuple[str, datetime.timedelta]:
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return type hint has been added, but the parameter type hints for pretrained and filename are missing. Consider adding appropriate type hints for the parameters to make the function signature complete.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant