@pandu-rao

No description provided.

amonshiz commented Jun 30, 2025

@simonw any chance you would accept this? Running ttok is pretty noisy with this warning from click.

amonshiz@m3-air ; uv tool install ttok
Resolved 9 packages in 17ms
Installed 9 packages in 6ms
 + certifi==2025.6.15
 + charset-normalizer==3.4.2
 + click==8.2.1
 + idna==3.10
 + regex==2024.11.6
 + requests==2.32.4
 + tiktoken==0.9.0
 + ttok==0.3
 + urllib3==2.5.0
Installed 1 executable: ttok

amonshiz@m3-air ; ttok --help
/Users/amonshiz/.local/share/uv/tools/ttok/lib/python3.13/site-packages/click/core.py:1193: UserWarning: The parameter --tokens is used more than once. Remove its duplicate as parameters should be unique.
  parser = self.make_parser(ctx)
/Users/amonshiz/.local/share/uv/tools/ttok/lib/python3.13/site-packages/click/core.py:1186: UserWarning: The parameter --tokens is used more than once. Remove its duplicate as parameters should be unique.
  self.parse_args(ctx, args)
/Users/amonshiz/.local/share/uv/tools/ttok/lib/python3.13/site-packages/click/core.py:1002: UserWarning: The parameter --tokens is used more than once. Remove its duplicate as parameters should be unique.
  pieces = self.collect_usage_pieces(ctx)
/Users/amonshiz/.local/share/uv/tools/ttok/lib/python3.13/site-packages/click/core.py:1104: UserWarning: The parameter --tokens is used more than once. Remove its duplicate as parameters should be unique.
  self.format_options(ctx, formatter)
Usage: ttok [OPTIONS] [PROMPT]...

  Count and truncate text based on tokens

  To count tokens for text passed as arguments:

      ttok one two three

  To count tokens from stdin:

      cat input.txt | ttok

  To truncate to 100 tokens:

      cat input.txt | ttok -t 100

  To truncate to 100 tokens using the gpt2 model:

      cat input.txt | ttok -t 100 -m gpt2

  To view token integers:

      cat input.txt | ttok --encode

  To convert tokens back to text:

      ttok 9906 1917 --decode

  To see the details of the tokens:

      ttok "hello world" --tokens

  Outputs:

      [b'hello', b' world']

Options:
  --version               Show the version and exit.
  -i, --input FILENAME
  -t, --truncate INTEGER  Truncate to this many tokens
  -m, --model TEXT        Which model to use
  --encode, --tokens      Output token integers
  --decode                Convert token integers to text
  --tokens                Output full tokens
  --allow-special         Do not error on special tokens
  --help                  Show this message and exit.
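The Options list above shows `--tokens` twice: once as an alias on the `--encode` line and once as its own flag, which is what the Click warning is complaining about. A minimal sketch for confirming that from the help text alone, using only the standard library. `HELP_OPTIONS` and `duplicate_flags` are hypothetical names made up for this illustration and are not part of ttok or Click; the option text is copied from the output above.

```python
import re
from collections import Counter

# Options section copied from the `ttok --help` output above.
HELP_OPTIONS = """\
  --version               Show the version and exit.
  -i, --input FILENAME
  -t, --truncate INTEGER  Truncate to this many tokens
  -m, --model TEXT        Which model to use
  --encode, --tokens      Output token integers
  --decode                Convert token integers to text
  --tokens                Output full tokens
  --allow-special         Do not error on special tokens
  --help                  Show this message and exit.
"""


def duplicate_flags(options_text):
    """Return the long flags that appear on more than one option line."""
    counts = Counter()
    for line in options_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        # The flags column is separated from the description by 2+ spaces.
        flags_column = re.split(r"\s{2,}", stripped, maxsplit=1)[0]
        # Count each long flag once per line, ignoring metavars like INTEGER.
        counts.update(set(re.findall(r"--[a-z][a-z0-9-]*", flags_column)))
    return sorted(flag for flag, n in counts.items() if n > 1)


print(duplicate_flags(HELP_OPTIONS))
```

Run against the text above, this reports `--tokens` as the only repeated flag. Dropping the alias from the `--encode` line (an assumption about the shape of the fix, since this PR has no description) leaves no duplicates.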
