This project implements a pseudo "chain-of-thought" prompt-engineering approach in a small language model (TinyLlama 1.1B). The chatbot breaks its reasoning process into explicit steps when answering questions, similar to how larger models like ChatGPT approach problems. This project is:
- A demonstration of prompt engineering techniques for structured output
- An exploration of how formatting prompts affect small LLM responses
- A learning tool for understanding the limitations of model scale
- A pseudo chain-of-thought implementation that mimics reasoning structure
`comparision_demo.py` is a comparison bot that answers the same question twice, once in the prompt-engineered style and once without it, to show the difference; a minimal sketch of the idea appears below.
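The following is an illustrative sketch of the timing comparison, not the project's actual code: the model id is TinyLlama's public Hugging Face checkpoint, while the prompts and the `timed_generate` helper are assumptions.

```python
# Illustrative sketch of comparision_demo.py's idea: time the same
# question with and without the "thinking" scaffold.
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16
).to("cuda")

def timed_generate(prompt: str) -> tuple[str, float]:
    """Generate a response and report wall-clock time."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    start = time.time()
    output = model.generate(**inputs, max_new_tokens=256)
    return tokenizer.decode(output[0], skip_special_tokens=True), time.time() - start

question = "What are the implications of AI on job opportunities?"
engineered = f"Let me think about this step by step:\n{question}"
for label, prompt in [("Direct", question), ("Thinking", engineered)]:
    text, seconds = timed_generate(prompt)
    print(f"{label} ({seconds:.2f}s):\n{text}\n")
```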
Looking at the evidence from the output files and code, here's the assessment:

**CoT here is mostly pseudo/coincidence. Here's why:**

**1. The Model Is Too Small**

TinyLlama (1.1B parameters) is fundamentally limited:
- It doesn't have the capacity for genuine "chain-of-thought" reasoning
- Research suggests true CoT reasoning emerges only around 10B+ parameters
- At 1.1B, it's mostly pattern matching, not reasoning
**2. Evidence from the Outputs**

Looking at `comparison_20241217_191529.txt`:
"Thinking Process" Response (9.31s):
implications of ai on job opportunities:
1. Automation: the rise of ai will lead to more automation...
2. Changes in skills: ai will require humans to have specific skills...
```
**Direct Response (3.77s):**
```
I don't have access to current events, but according to recent research...
```
**What actually happened:**
- The "thinking" version just generated MORE text (longer output = longer time)
- The content isn't actually better - it's just more verbose
- The direct response is actually more coherent and concise!
From `alice_analysis_20241217_202008.txt`:
```
"Alice followed the White Rabbits because they were the only ones who spoke to her"
"The White Ravens' voices were soothing..."
"The White rabbits' voices and presence..."
```
Notice:
- Inconsistent naming (Rabbit → Ravens → rabbits)
- Contradictory statements
- No actual reasoning chain visible
- It's just generating text that looks like analysis
**3. The Prompts Don't Actually Change Behavior**

The structured prompts like:

```python
"Let me think about this step by step:
- First, I need to understand what's being asked
- Then, I'll break down the key components..."
```

are just priming the model to generate numbered lists. The model sees:
- "step by step" → generates "1), 2), 3)"
- "analyze" → generates "analysis:", "implications:"

But there's no actual reasoning happening underneath. A sketch of how such a priming prompt might be assembled follows.
✅ **What IS Working**

**1. Output Structure**

The prompts DO successfully make the output more structured:
- Forces numbered formatting
- Creates section headers
- Makes responses longer and more "essay-like"
**2. Temperature/Sampling Effects**

The generation parameters matter more than the prompt:

```python
temperature=0.7,
top_p=0.9,
no_repeat_ngram_size=3,
repetition_penalty=1.2
```

These actually affect quality by reducing repetition; the sketch below shows where they plug in.
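For context, here is a sketch of where these parameters plug into Hugging Face's `generate()`; the loading code mirrors the earlier sketch and is an assumption rather than the project's exact implementation.

```python
# Sampling parameters passed to Hugging Face generate(); do_sample must
# be True for temperature/top_p to take effect.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16
).to("cuda")

inputs = tokenizer("What makes a good leader?", return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,          # softens the token distribution
    top_p=0.9,                # nucleus sampling
    no_repeat_ngram_size=3,   # blocks repeated trigrams
    repetition_penalty=1.2,   # discourages loops
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```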
**3. Psychological Effect on Humans**

Seeing "Let me think step by step..." makes readers think the response is more thoughtful, even if the model isn't actually reasoning differently.
**Features**

- Step-by-step reasoning process
- Multiple thinking patterns (general, analysis, problem-solving)
- GPU acceleration support
- Response time tracking
- Customizable prompt templates
- Memory-efficient implementation suitable for consumer GPUs (make sure your CUDA version matches your PyTorch build)
**Requirements**

- Python 3.12
- CUDA-capable GPU with CUDA 11.8 (tested on an RTX 3070)
- 8GB+ GPU VRAM
- Windows
**Installation**

- Create and activate a virtual environment in a PowerShell terminal:

```powershell
# Create virtual environment
python -m virtualenv venv

# Activate virtual environment
.\venv\Scripts\activate
```
- Install dependencies:

```powershell
pip install -r requirements.txt
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
```
Note: the commands above work for CUDA 11.8. For other CUDA versions, find the matching PyTorch install command on the official site: https://pytorch.org/get-started/locally/

To check your CUDA version, start the Python interpreter in the terminal (type `python` and press Enter), then run:
```python
import torch
print(torch.version.cuda)
```
The above will print the CUDA version PyTorch is using.
It is crucial to install the three torch libraries with the CUDA version tags matching your setup so that your GPU is actually utilized; otherwise this project, although it uses the lower end of LLMs, will still be infeasible on consumer-grade CPUs. A GPU with correct drivers and compatible Python libraries is crucial for this project. A quick verification sketch follows.
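As a sanity check that the CUDA build of PyTorch can actually see your GPU, you can run the following (standard `torch.cuda` calls, not project code):

```python
import torch

# Confirm the CUDA build of PyTorch is installed and a GPU is visible.
if torch.cuda.is_available():
    print("CUDA version:", torch.version.cuda)
    print("Device:", torch.cuda.get_device_name(0))
else:
    print("CUDA not available; generation will fall back to the CPU and be very slow.")
```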
**Usage**

Run the demo script:

```powershell
python PromptEngineering.py
```
The script will:
- Load the TinyLlama model
- Run through the test questions (you can edit these to add your own)
- Display thinking process and generation time for each response
**Example Output**

```
Initializing ThinkingLLM (this may take a moment)...
Test Type: general
Question: What makes a good leader?
- Understanding the question: What makes a good leader? - Breaking down key
components: Leadership, character, skills, and experience - Clarifying the
question: What makes a good leader who can inspire and motivate people? -
Providing a clear answer: A leader who has a strong character, a proven track
record of success, and a deep understanding of their team's needs. Example: A
company is looking for a new CEO. The CEO is asked, "What makes a good leader?"
Breaking down key components: Leadership, character, skills, and experience 1.
Leadership: The CEO needs to have the ability to inspire and motivate their
team. They need to have a strong personality and be able to connect with their
team members on a personal level. 2. Character: The CEO needs to be a role
model for their team. They need to have strong values and a commitment to doing
what is right for their team. 3. Skills: The CEO needs to be able to lead their
team through complex challenges. They need to have a deep understanding of their
team's needs and be able to bring together the right resources to overcome
obstacles. 4. Experience: The CEO needs to have a proven track record of
success. They need to have experience in leading a successful team and in
developing and executing strategies. Clarifying the question: A leader who has
a strong character, a proven track record of success, and a deep understanding
of their team's needs. My answer: A leader who has a strong character, a proven
track record of success, and a deep understanding of their team's needs. This
includes having a strong personality, commitment to doing what is right for
their team, ability to lead their team through complex challenges, and proven
track record of success.
Generation time: 12.16 seconds
==========================================================
Press Enter for next test...
```
**Customization**

You can modify the thinking patterns in the `generate_thinking_prompt` method of the `ThinkingLLM` class; a hypothetical example of adding a pattern follows.
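For instance, a new pattern could be registered alongside the existing ones. This continues the hypothetical `THINKING_PATTERNS` table sketched earlier; the actual structure inside `generate_thinking_prompt` may differ.

```python
# Hypothetical: add a problem-solving scaffold to the pattern table
# sketched earlier; the real method's structure may differ.
THINKING_PATTERNS["problem-solving"] = (
    "Let me work through this problem:\n"
    "- Restate the problem in my own words\n"
    "- List the constraints a solution must satisfy\n"
    "- Propose a solution and check it against the constraints\n"
)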
The project uses:
- TinyLlama 1.1B Chat model
- PyTorch with CUDA support
- Hugging Face Transformers library
- Half-precision (FP16) for efficient memory usage (rough footprint estimate below)
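As a rough back-of-envelope check of why FP16 matters (assumption: weights only at 2 bytes per parameter; activations and the KV cache add more on top):

```python
# Weights-only VRAM estimate for a 1.1B-parameter model in FP16.
params = 1.1e9
bytes_per_param = 2  # FP16
print(f"~{params * bytes_per_param / 1e9:.1f} GB")  # ~2.2 GB
```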
**Project Files**

- `PromptEngineering.py`: Main implementation
- `comparision_demo.py`: Comparison feature
- `requirements.txt`: Required Python packages
- `README.md`: Project documentation
- `alice_analysis_...txt`: Example test case
- `comparision_20...txt`: Example cases with answers in prompt-engineered and non-engineered responses
- `example_test_case.txt`: A test run of PromptEngineering.py
**Notes**

- Recommended: 8GB+ VRAM for comfortable operation
- Response quality limited by model size (1.1B parameters)
- May require further prompt engineering for best results
- Generation times vary based on input complexity
- A significantly faster experience requires specific, mutually compatible CUDA and torch versions.