diff --git a/README.md b/README.md index 6cc3903..262836c 100644 --- a/README.md +++ b/README.md @@ -7,27 +7,35 @@ ### Latest experimental +#### **Features** +
-#### **Features** - LlamaCPP Python wrapper support ([#116](https://github.com/epfl-dlab/transformers-CFG/pull/116)) +
+ #### **Bug fixes** + +
+ - `pip show` license ([#117](https://github.com/epfl-dlab/transformers-CFG/pull/117))
### Latest stable -#### **[v0.2.7 Latest](https://github.com/epfl-dlab/transformers-CFG/releases/tag/v0.2.7)** (2025-03-02) +#### **[v0.2.7](https://github.com/epfl-dlab/transformers-CFG/releases/tag/v0.2.7)** (2025-03-02) #### **Features** - Types and MLX ([#93](https://github.com/epfl-dlab/transformers-CFG/pull/93)) -- Negation, wildcards, repetition brackets ([#94](https://github.com/epfl-dlab/transformers-CFG/pull/94), [#95](https://github.com/epfl-dlab/transformers-CFG/pull/95), [#96](https://github.com/epfl-dlab/transformers-CFG/pull/96), [#104](https://github.com/epfl-dlab/transformers-CFG/pull/104)) +- Negation ([#94](https://github.com/epfl-dlab/transformers-CFG/pull/94)) +- Wildcards ([#95](https://github.com/epfl-dlab/transformers-CFG/pull/95)) +- Repetition brackets ([#96](https://github.com/epfl-dlab/transformers-CFG/pull/96), [#104](https://github.com/epfl-dlab/transformers-CFG/pull/104)) - Qwen2 and Qwen2.5 ([#97](https://github.com/epfl-dlab/transformers-CFG/pull/97)) -- Resuable `GrammarConstrainedLogitsProcessor` for efficiency ([#100](https://github.com/epfl-dlab/transformers-CFG/pull/100)) -- Pytest for testing ([#109](https://github.com/epfl-dlab/transformers-CFG/pull/109)) -- GitHub Actions workflow for automation ([#110](https://github.com/epfl-dlab/transformers-CFG/pull/110)) +- Reusable logits processor ([#100](https://github.com/epfl-dlab/transformers-CFG/pull/100)) +- Pytest ([#109](https://github.com/epfl-dlab/transformers-CFG/pull/109)) +- GitHub Actions workflow ([#110](https://github.com/epfl-dlab/transformers-CFG/pull/110)) #### **Bug fixes** @@ -47,11 +55,11 @@ - **[Online demo](http://saibo-creator.xyz:7860/)** (2024-04-10) - **Unicode and foreign text** (2024-02-29) - **Text-Generation-WebUI** (2023-12-17) - - We are pleased to announce that `transformers-cfg` has been integrated into the [Text-Generation-WebUI](https://github.com/oobabooga/text-generation-webui) project, allowing users to leverage CFG capabilities within this widely used text-generation interface ([Pull](https://github.com/oobabooga/text-generation-webui/pull/4953)). + - We are pleased to announce that `transformers-cfg` has been integrated into the [Text-Generation-WebUI](https://github.com/oobabooga/text-generation-webui) project, allowing users to leverage CFG capabilities within this widely used text-generation interface ([PR](https://github.com/oobabooga/text-generation-webui/pull/4953)). ## 🚀 Introduction -Initially developed as a pull request to the [Hugging Face Transformers](https://github.com/huggingface/transformers) library ([Pull](https://github.com/huggingface/transformers/pull/27557)), `transformers-cfg` extends the Hugging Face Transformers library to support constrained decoding through context-free grammars (CFG), offering a Transformers parellel for LlamaCPP's GBNF support, but with stricter generation rules. +Initially developed as a pull request to the [Hugging Face Transformers](https://github.com/huggingface/transformers) library ([PR](https://github.com/huggingface/transformers/pull/27557)), `transformers-cfg` extends the Hugging Face Transformers library to support constrained decoding through context-free grammars (CFG), offering a Transformers parallel to LlamaCPP's GBNF support, but with stricter generation rules. ## 💻 Installation @@ -71,6 +79,29 @@ For the latest updates, install directly from GitHub: pip install git+https://github.com/epfl-dlab/transformers-CFG.git@main ``` +## 💡 Why use `transformers-cfg`? 
+ +- **EBNF Grammar Support**: Uses Extended Backus-Naur Form (EBNF) for grammar description. +- **Seamless Integration**: Compatible with the llama-cpp project for easy replacement. +- **Broad Model Compatibility**: Works with all models in the 🤗 Transformers library. +- **Multilingual Grammar Support**: Enables grammars in various languages, including Chinese (中文), Japanese (日本語), Korean (한국어), Hindi (हिन्दी), Hebrew (עברית), Arabic (العربية), and emoji (🤗). + +## 🤔 What is a grammar? + +Think of it as an enhanced version of regular expressions. + +### Valid JSON object + +```bnf +root ::= object +object ::= "{" pair ("," pair)* "}" +pair ::= string ":" value +string ::= '"' [a-zA-Z0-9]* '"' +value ::= string | object | "true" | "false" | "null" +``` + +For advanced grammar debugging, see our [debugging guide](docs/debugging_custom_grammars.md). + ## 🔧 Grammar quickstart Let's set up a predictable generation method where the model would usually reply with "The animal is a dog." However, we'll force the model to say either "The animal is a cat" or "The animal is a fish," two other common domestic pets that contradict the initial text. @@ -80,13 +111,11 @@ The `transformers-cfg-cli` tool enables text generation using a model and a spec ```bash transformers-cfg-cli generate \ - -m "microsoft/Phi-3-mini-4k-instruct" \ - -g "examples/grammars/json.ebnf" \ - -p "This is a valid JSON string for an HTTP request:" \ - --use_4bit \ - --max_new_tokens 60 \ - --repetition_penalty 1.1 -# {"name":"John","age":30,"car":null} + -m "facebook/opt-125m" \ + -g "examples/grammars/animal.ebnf" \ + -p 'The text says, "The animal is a dog." The answer is obvious. ' \ + --max_new_tokens 50 +# The animal is a cat. ``` Run `transformers-cfg-cli generate --help` for available options. @@ -100,37 +129,39 @@ from transformers_cfg.grammar_utils import IncrementalGrammarConstraint from transformers_cfg.generation.logits_process import GrammarConstrainedLogitsProcessor if __name__ == "__main__": - # Detect if GPU is available, otherwise use CPU + # Set device: use GPU if available, else CPU. device = torch.device("cuda" if torch.cuda.is_available() else "cpu") print(f"Using device: {device}") + # Model identifier model_id = "facebook/opt-125m" # Load model and tokenizer tokenizer = AutoTokenizer.from_pretrained(model_id) tokenizer.pad_token = tokenizer.eos_token - model = AutoModelForCausalLM.from_pretrained(model_id).to(device) model.generation_config.pad_token_id = model.generation_config.eos_token_id # Define grammar string - json_grammar = """ - + grammar_str = """ root ::= "The animal is a " animal "." - animal ::= "cat" | "fish" - """ - grammar = IncrementalGrammarConstraint(json_grammar, "root", tokenizer) + # Create grammar constraint and logits processor + grammar = IncrementalGrammarConstraint(grammar_str, "root", tokenizer) grammar_processor = GrammarConstrainedLogitsProcessor(grammar) - # Generate + # Define prompts prompts = [ - 'The text says, "The animal is a dog." The answer is obvious. ', 'I\'m going to say "The animal is a dog." Here I go! ' - ] + 'The text says, "The animal is a dog." The answer is obvious. ', + 'I\'m going to say "The animal is a dog." Here I go! 
' + ] + + # Tokenize prompts input_ids = tokenizer(prompts, add_special_tokens=False, return_tensors="pt", padding=True)["input_ids"].to(device) + # Generate constrained text output = model.generate( input_ids, max_length=50, @@ -139,13 +170,12 @@ if __name__ == "__main__": num_return_sequences=1, ) - # Decode output + # Decode and print generated text generations = tokenizer.batch_decode(output, skip_special_tokens=True) - - # Print all generations in for loop for generation in generations: print(generation) +# The animal is a cat. ``` #### Stream @@ -159,41 +189,42 @@ from transformers_cfg.grammar_utils import IncrementalGrammarConstraint from transformers_cfg.generation.logits_process import GrammarConstrainedLogitsProcessor if __name__ == "__main__": - # Detect if GPU is available, otherwise use CPU + # Set device: use GPU if available, else CPU device = torch.device("cuda" if torch.cuda.is_available() else "cpu") print(f"Using device: {device}") + # Model identifier model_id = "facebook/opt-125m" # Load model and tokenizer tokenizer = AutoTokenizer.from_pretrained(model_id) tokenizer.pad_token = tokenizer.eos_token - model = AutoModelForCausalLM.from_pretrained(model_id).to(device) model.generation_config.pad_token_id = model.generation_config.eos_token_id - # Define grammar as a string + # Define grammar string grammar_str = """ - root ::= "The animal is a " animal "." - animal ::= "cat" | "fish" - """ + # Create grammar constraint and logits processor grammar = IncrementalGrammarConstraint(grammar_str, "root", tokenizer) grammar_processor = GrammarConstrainedLogitsProcessor(grammar) - # Generate + # Define prompt prompts = [ - 'The text says, "The animal is a dog." The answer is obvious. ', #'I\'m going to say "The animal is a dog." Here I go! ' - ] + 'The text says, "The animal is a dog." The answer is obvious. ' + ] + + # Tokenize prompt input_ids = tokenizer(prompts, add_special_tokens=False, return_tensors="pt", padding=True)["input_ids"].to(device) # Set up streaming streamer = TextStreamer(tokenizer) - output = model.generate( + # Generate constrained text with streaming. + model.generate( input_ids, max_length=50, logits_processor=[grammar_processor], @@ -202,6 +233,7 @@ if __name__ == "__main__": streamer=streamer ) +# The animal is a cat. ``` @@ -216,30 +248,26 @@ from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline from transformers_cfg.grammar_utils import IncrementalGrammarConstraint from transformers_cfg.generation.logits_process import GrammarConstrainedLogitsProcessor -# Load model and tokenizer +# Model identifier model_id = "facebook/opt-125m" +# Load model and tokenizer tokenizer = AutoTokenizer.from_pretrained(model_id) tokenizer.pad_token = tokenizer.eos_token - -# Detect if GPU is available, otherwise use CPU device = torch.device("cuda" if torch.cuda.is_available() else "cpu") - model = AutoModelForCausalLM.from_pretrained(model_id).to(device) # Define grammar string -json_grammar = """ - +grammar_str = """ root ::= "The animal is a " animal "." 
- animal ::= "cat" | "fish" - """ -grammar = IncrementalGrammarConstraint(json_grammar, "root", tokenizer) +# Create grammar constraint and logits processor +grammar = IncrementalGrammarConstraint(grammar_str, "root", tokenizer) grammar_processor = GrammarConstrainedLogitsProcessor(grammar) -# Initialize pipeline +# Initialize text generation pipeline pipe = pipeline( "text-generation", model=model, @@ -249,20 +277,25 @@ pipe = pipeline( batch_size=2, ) -# Generate text +# Define prompts +prompts = [ + 'The text says, "The animal is a dog." The answer is obvious. ', + 'I\'m going to say "The animal is a dog." Here I go! ' +] + +# Generate constrained text using the pipeline. generations = pipe( - [ - 'The text says, "The animal is a dog." The answer is obvious. ', - 'I\'m going to say "The animal is a dog." Here I go! ' - ], + prompts, do_sample=False, logits_processor=[grammar_processor], ) -# Print results +# Print generated texts for generation_group in generations: for generation in generation_group: print(generation['generated_text']) + +# The animal is a cat. ``` @@ -272,7 +305,6 @@ Use the `llama-cpp-python` adapter, automatically loadable with the `adapter` pa ```py import io -import torch import logging from contextlib import redirect_stderr from llama_cpp import Llama @@ -282,70 +314,89 @@ from transformers import AutoTokenizer logging.basicConfig(level=logging.INFO) -# Define your EBNF grammar (you can replace this with your own) -ebnf_grammar = """ - - root ::= "The animal is a " animal "." - - animal ::= "cat" | "fish" - - """ +# Define grammar string. +grammar_str = """ +root ::= "The animal is a " animal "." +animal ::= "cat" | "fish" +""" -# Load the tokenizer matching your model +# Load the tokenizer matching the model. tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5b") -# Redirect stderr and load the model via llama-cpp-python -f = io.StringIO() -with redirect_stderr(f): +# Redirect stderr and load the model via llama-cpp-python. +with redirect_stderr(io.StringIO()): model = Llama(model_path="qwen2.5-1.5b-q8_0.gguf", n_ctx=8000, verbose=False) -# Create the grammar constraint and the logits processor with the new parameter. -grammar_constraint = IncrementalGrammarConstraint(ebnf_grammar, "root", tokenizer) -grammar_processor = GrammarConstrainedLogitsProcessor(grammar_constraint, adapter="llama-cpp-python") +# Create grammar constraint and logits processor using the adapter. +grammar = IncrementalGrammarConstraint(grammar_str, "root", tokenizer) +grammar_processor = GrammarConstrainedLogitsProcessor(grammar, adapter="llama-cpp-python") -# Define a prompt. -prompt = """The text says, "The animal is a dog." The answer is obvious. """ +# Define prompt. +prompt = 'The text says, "The animal is a dog." The answer is obvious. ' -# Use the text completion API with the logits processor. +# Generate constrained text (non-streaming). response = model.create_completion( - stream=True, prompt=prompt, logits_processor=[grammar_processor], max_tokens=100, ) -for token in response: - token_text = token["choices"][0]["text"] - print(token_text, end="", flush=True) +# Print generated text. +print(response["choices"][0]["text"]) +# The animal is a cat. ``` -## 💡 Why use `transformers-cfg`? +#### Stream +
-- **EBNF Grammar Support**: Uses Extended Backus-Naur Form (EBNF) for grammar description. -- **Seamless Integration**: Compatible with the llama-cpp project for easy replacement. -- **Broad Model Compatibility**: Works with all models in the 🤗 Transformers library. -- **Multilingual Grammar Support**: Enables grammars in various languages, including Chinese (中文), Japanese (日本語), Korean (한국어), Hindi (हिन्दी), Hebrew (עברית), Arabic (العربية), and emoji (🤗). +```py +import io +import logging +from contextlib import redirect_stderr +from llama_cpp import Llama +from transformers_cfg.grammar_utils import IncrementalGrammarConstraint +from transformers_cfg.generation.logits_process import GrammarConstrainedLogitsProcessor +from transformers import AutoTokenizer -## 🤔 What is a grammar? +logging.basicConfig(level=logging.INFO) -Think of it as an enhanced version of regular expressions. +# Define grammar string +grammar_str = """ +root ::= "The animal is a " animal "." +animal ::= "cat" | "fish" +""" -### Valid JSON object +# Load the tokenizer matching the model +tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5b") -```bnf -root ::= object -object ::= "{" pair ("," pair)* "}" -pair ::= string ":" value -string ::= '"' [a-zA-Z0-9]* '"' -value ::= string | object | "true" | "false" | "null" -``` +# Redirect stderr and load the model via llama-cpp-python +with redirect_stderr(io.StringIO()): + model = Llama(model_path="qwen2.5-1.5b-q8_0.gguf", n_ctx=8000, verbose=False) -For advanced grammar debugging, see our [debugging guide](docs/debugging_custom_grammars.md). +# Create grammar constraint and logits processor using the adapter +grammar = IncrementalGrammarConstraint(grammar_str, "root", tokenizer) +grammar_processor = GrammarConstrainedLogitsProcessor(grammar, adapter="llama-cpp-python") -## 🛠 JSON schema +# Define prompt. +prompt = 'The text says, "The animal is a dog." The answer is obvious. ' -Learn to create grammars for complex JSON objects in our [documentation](examples/grammars/custom_json_grammars/README.md). +# Generate constrained text with streaming +response = model.create_completion( + stream=True, + prompt=prompt, + logits_processor=[grammar_processor], + max_tokens=100, +) + +# Stream and print generated text +for token in response: + print(token["choices"][0]["text"], end="", flush=True) + +# The animal is a cat. +``` + +
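+As a small illustration of the multilingual grammar support listed under "Why use `transformers-cfg`?", grammar terminals can contain non-ASCII text. The sketch below is illustrative only and is not one of the grammars shipped in `examples/grammars`:
+
+```bnf
+root ::= "动物是" animal "。"
+animal ::= "猫" | "鱼"
+```
+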
## 📜 Grammar collection @@ -357,21 +408,26 @@ We maintain a collection of grammars in `examples/grammars`, aligned with llama- - [chess.ebnf](examples/grammars/chess.ebnf): Valid chess moves. - [arithmetic.ebnf](examples/grammars/arithmetic.ebnf): Valid arithmetic expressions. -## ✅ Supported models +## 🛠 JSON schema + +Learn to create grammars for complex JSON objects in our [documentation](examples/grammars/custom_json_grammars/README.md). + +## ✅ Supported tokenizers + + +### 🤖 Tested models -### Qwen
-Qwen +Qwen (≤ 2.5) -- [Qwen](https://huggingface.co/collections/Qwen/qwen2-6659360b33528ced941e557f) ≤ 2.5 +- [Qwen2](https://huggingface.co/collections/Qwen/qwen2-6659360b33528ced941e557f) +- [Qwen2.5]()
-### Meta (LLaMa)
-Meta (LLaMa) +LLaMa (≤ 3.3) -- [LLaMa](https://huggingface.co/baffo32/decapoda-research-llama-7B-hf) ≤ 3.0 - [huggyllama/llama-7b](https://huggingface.co/huggyllama/llama-7b) - [TinyPixel/Llama-2-7B-bf16-sharded](https://huggingface.co/TinyPixel/Llama-2-7B-bf16-sharded) - [OpenAssistant/llama2-13b-orca-8k-3319](https://huggingface.co/OpenAssistant/llama2-13b-orca-8k-3319) @@ -393,11 +449,9 @@ We maintain a collection of grammars in `examples/grammars`, aligned with llama-
-### GPT
-GPT +GPT (≤ 2) -- [GPT](https://huggingface.co/openai-community/gpt2) ≤ 2 - [gpt2](https://huggingface.co/gpt2) - [distilgpt2](https://huggingface.co/distilgpt2) - [openai-community/gpt2-large](https://huggingface.co/openai-community/gpt2-large) @@ -407,31 +461,25 @@ We maintain a collection of grammars in `examples/grammars`, aligned with llama-
-### Mistral
-Mistral +Mistral (≤ 0.3) -- [Mistral](https://huggingface.co/mistralai/Mistral-7B-v0.1) ≤ 0.3 - [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1) - [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
-### Falcon
-Falcon +Falcon (≤ 3.0) -- [Falcon](https://huggingface.co/tiiuae/falcon-7b) - [tiiuae/falcon-40b-instruct](https://huggingface.co/tiiuae/falcon-40b-instruct) - [tiiuae/falcon-7b-instruct](https://huggingface.co/tiiuae/falcon-7b-instruct)
-### OPT
OPT -- [OPT](https://huggingface.co/collections/facebook/opt-66ed00e15599f02966818844) - [facebook/opt-125m](https://huggingface.co/facebook/opt-125m) - [facebook/opt-2.7b](https://huggingface.co/facebook/opt-2.7b) - [facebook/opt-350m](https://huggingface.co/facebook/opt-350m) @@ -440,8 +488,6 @@ We maintain a collection of grammars in `examples/grammars`, aligned with llama-
-See [supported_models.yaml](docs/supported_models.yaml) for the full list whose extent is constantly being updated. - If you encounter an unsupported model, please open an issue or submit a pull request. ## 📖 Citation diff --git a/examples/grammars/animal.ebnf b/examples/grammars/animal.ebnf new file mode 100644 index 0000000..a8c1c3a --- /dev/null +++ b/examples/grammars/animal.ebnf @@ -0,0 +1,2 @@ +root ::= "The animal is a " animal "." +animal ::= "cat" | "fish"