This repository contains tools to load test your RunPod BERTopic handler with different document sizes.
- `load_test.py` - Comprehensive Python load testing script
- `generate_test_files.py` - Generates individual JSON test files
- `test_with_curl.sh` - Simple bash script using curl
- `create_test_inpot.py` - Original test file generator (fixed)
- Install dependencies:

      pip install -r requirements.txt

- Generate test files:

      python generate_test_files.py

  This will create:

  - `test_input_100.json` - 100 documents
  - `test_input_1000.json` - 1,000 documents
  - `test_input_10000.json` - 10,000 documents
  - `test_input_100000.json` - 100,000 documents
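For reference, here is a minimal sketch of what one of these generated files likely contains, assuming the payload simply wraps the handler's input fields (described further below) in RunPod's standard `input` envelope; the exact structure written by `generate_test_files.py` may differ:

```python
import json

# Assumed payload shape: the handler's documented fields wrapped in
# RunPod's standard "input" envelope. generate_test_files.py may add
# or rename fields.
payload = {
    "input": {
        "num_docs": 100,     # number of documents to process
        "num_topics": 10,    # topics to reduce to (handler default: 10)
        "random_seed": 42,   # reproducibility (handler default: 42)
    }
}

with open("test_input_100.json", "w") as f:
    json.dump(payload, f, indent=2)
```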
`load_test.py` is the most comprehensive option and reports detailed statistics:

    python load_test.py --url "https://api.runpod.ai/v2/your-endpoint-id/run" --sizes 100 1000 10000 --requests 5 --concurrent 2

With API key authentication:

    python load_test.py --url "https://api.runpod.ai/v2/your-endpoint-id/run" --api-key "your-api-key-here" --sizes 100 1000 10000 --requests 5 --concurrent 2

Options:

- `--url`: Your RunPod endpoint URL (required)
- `--api-key`: RunPod API key for authentication (optional)
- `--sizes`: Document sizes to test (default: 100 1000 10000)
- `--requests`: Number of requests per size (default: 3)
- `--concurrent`: Number of concurrent requests (default: 1)
- `--output`: Output file for results (default: load_test_results.json)
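Under the hood, each request is a POST to the `/run` endpoint, optionally with a Bearer token. Below is a minimal single-request sketch using `requests`; the payload shape is an assumption based on the input format described later, not necessarily exactly what `load_test.py` sends:

```python
import requests

ENDPOINT_URL = "https://api.runpod.ai/v2/your-endpoint-id/run"
API_KEY = "your-api-key-here"  # leave empty to skip authentication

headers = {"Content-Type": "application/json"}
if API_KEY:
    headers["Authorization"] = f"Bearer {API_KEY}"

# Assumed payload shape (see the input format section below).
payload = {"input": {"num_docs": 100, "num_topics": 10, "random_seed": 42}}

response = requests.post(ENDPOINT_URL, json=payload, headers=headers, timeout=300)
response.raise_for_status()
print(response.json())  # the asynchronous API returns a job id, not the final result
```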
Example output:

    Testing with 100 documents
    ==================================================
    Data size: 45678 bytes
    Running load test: 3 requests, 1 concurrent
    Request 1/3
    Request 2/3
    Request 3/3
    Results for 100 documents:
    Success Rate: 100.00%
    Avg Response Time: 12.34s
    Min Response Time: 11.89s
    Max Response Time: 13.01s
    Median Response Time: 12.45s
    Std Response Time: 0.56s
    Avg Response Size: 12345 bytes
Simple testing with curl:

    ./test_with_curl.sh "https://api.runpod.ai/v2/your-endpoint-id/run"

With API key authentication:

    ./test_with_curl.sh "https://api.runpod.ai/v2/your-endpoint-id/run" "your-api-key-here"

This will test with 100, 1000, and 10000 documents sequentially.
You can also test manually using the generated JSON files:

    curl -X POST \
      -H "Content-Type: application/json" \
      -d @test_input_1000.json \
      "https://api.runpod.ai/v2/your-endpoint-id/run"

With API key authentication:

    curl -X POST \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer your-api-key-here" \
      -d @test_input_1000.json \
      "https://api.runpod.ai/v2/your-endpoint-id/run"

Based on typical BERTopic performance:
- 100 documents: ~30-90 seconds (including polling)
- 1,000 documents: ~2-5 minutes (including polling)
- 10,000 documents: ~5-15 minutes (including polling)
- 100,000 documents: ~15-60 minutes (including polling)
Note: RunPod uses an asynchronous API. The load testing tools:
- Submit the job and receive a job ID immediately
- Poll the status endpoint every 30 seconds
- Report completion when the job finishes
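A minimal submit-and-poll sketch of that flow, assuming the standard RunPod `/status/{job_id}` endpoint and its `COMPLETED`/`FAILED` status values; the real logic lives in `load_test.py`:

```python
import time
import requests

BASE_URL = "https://api.runpod.ai/v2/your-endpoint-id"  # note: without the trailing /run
HEADERS = {"Authorization": "Bearer your-api-key-here"}

# Submit the job; the response contains the job id immediately.
job = requests.post(f"{BASE_URL}/run",
                    json={"input": {"num_docs": 100}},
                    headers=HEADERS, timeout=300).json()
job_id = job["id"]

# Poll the status endpoint every 30 seconds until the job finishes.
while True:
    status = requests.get(f"{BASE_URL}/status/{job_id}",
                          headers=HEADERS, timeout=60).json()
    if status.get("status") in ("COMPLETED", "FAILED"):
        break
    time.sleep(30)

print(status.get("status"))
print(status.get("output"))  # handler result on success
```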
Input Format: The handler now accepts:

- `num_docs`: Number of documents to process
- `num_topics`: Number of topics to reduce to (default: 10)
- `random_seed`: Random seed for reproducibility (default: 42)
The dataset is downloaded and processed within the handler to avoid payload size limits.
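For context, a rough sketch of how a handler might consume those fields; `load_documents` is a hypothetical placeholder for the dataset download the real handler performs, and the `BERTopic` call shows standard library usage rather than the handler's exact configuration:

```python
import runpod
from bertopic import BERTopic

def load_documents(num_docs, random_seed):
    """Hypothetical placeholder: the real handler downloads and samples
    its dataset here so large payloads never cross the network."""
    raise NotImplementedError

def handler(job):
    params = job["input"]
    num_docs = params.get("num_docs", 100)
    num_topics = params.get("num_topics", 10)
    random_seed = params.get("random_seed", 42)

    docs = load_documents(num_docs, random_seed)
    topic_model = BERTopic(nr_topics=num_topics)
    topics, _probs = topic_model.fit_transform(docs)
    # Return shape is illustrative only.
    return {"num_docs": len(docs), "num_topics_found": len(set(topics))}

runpod.serverless.start({"handler": handler})
```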
Performance depends on:
- GPU availability and type
- Model complexity
- Network latency
- RunPod instance specifications
- Queue position (if other jobs are running)
Troubleshooting:

- Timeout errors: Increase the timeout in `load_test.py` (default: 300s)
- Memory errors: Reduce the document size or use smaller batches
- Network errors: Check your RunPod endpoint URL and connectivity
The Python script saves detailed results to a JSON file with:
- Success/failure rates
- Response time statistics (min, max, mean, median, std)
- Response sizes
- Error messages
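A quick way to inspect the saved results afterwards; the key names below are assumptions about the file's structure and may not match what `load_test.py` actually writes:

```python
import json

# Key names are assumed; adjust them to match load_test_results.json.
with open("load_test_results.json") as f:
    results = json.load(f)

for size, stats in results.items():
    print(f"{size} docs: "
          f"success={stats.get('success_rate')} "
          f"avg={stats.get('avg_response_time')}s "
          f"errors={stats.get('errors')}")
```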
Use this data to:
- Identify performance bottlenecks
- Determine optimal batch sizes
- Plan capacity requirements
- Monitor performance over time