A Discord search scraper that uses the Discord API to search for and extract messages from guilds.
- Search Discord messages across guilds
- Filter messages by date ranges
- Export results to JSONL/CSV format
- Docker support for containerized deployment
- Test suite (66% coverage)
- Pre-commit hooks for code quality
- Fast development tooling with Justfile

Requirements:
- Python 3.13+
- uv package manager
# Install dependencies
uv sync
# Install dev dependencies
uv sync --all-extras

# Run with uv
uv run python scraper.py
# Or activate the virtual environment first
source .venv/bin/activate # On Linux/macOS
python scraper.py

# Build the image
docker build -t discord-scraper .
# Run the container
docker run --rm -v "$(pwd)/output:/out" discord-scraper

The project includes a Justfile for convenient command execution:
just install-dev # Install all dependencies
just run # Run the scraper
just lint # Lint code
just format # Format code
just test # Run tests
just test-cov # Run tests with coverage
just clean # Clean generated files
just docker-build # Build Docker image
just pre-commit # Run all pre-commit checks
just setup # Full development setup
# Or use uv directly
uv run python scraper.py
uv run ruff check .
uv run pytest

Run `just` (with no arguments) to see all available commands.
This project uses Ruff for linting and formatting:
# Lint code
uv run ruff check .
# Format code
uv run ruff format .
# Fix linting issues automatically
uv run ruff check --fix .

Pre-commit hooks are configured to run linting and formatting automatically before each commit:
# Install pre-commit hooks (one-time setup)
just hooks-install
# or: uv run pre-commit install
# Run hooks manually
just hooks-run
# or: uv run pre-commit run --all-files

The pre-commit hooks will:
- Check and fix code style with Ruff
- Detect merge conflicts
- Check for trailing whitespace
- Validate JSON, YAML, and TOML syntax
- Detect accidentally committed private keys
The test suite currently covers 66% of the code:
# Run tests
just test
# or: uv run pytest
# Run tests with coverage report
just test-cov
# or: uv run pytest --cov=scraper --cov-report=html
# Run tests in watch mode
just test-watch

Tests cover:
- Discord snowflake conversion utilities
- DiscordSearcher class initialization and configuration
- Query formation with various parameters
- File output handling
- Message appending
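
For readers unfamiliar with the snowflake utilities mentioned in the list above, here is a minimal sketch of converting between timestamps and Discord snowflake IDs. It is not the code in scraper.py: the function names `datetime_to_snowflake` and `snowflake_to_datetime` are assumptions for illustration. It relies only on the documented snowflake layout, in which the upper bits hold milliseconds since the Discord epoch (2015-01-01 UTC). Date range filters are presumably turned into snowflake bounds in a similar way.

```python
from datetime import datetime, timedelta, timezone

# Discord epoch: first millisecond of 2015, in UTC.
DISCORD_EPOCH = datetime(2015, 1, 1, tzinfo=timezone.utc)


def datetime_to_snowflake(dt: datetime) -> int:
    """Return the smallest snowflake whose timestamp equals dt (hypothetical helper)."""
    if dt.tzinfo is None:
        raise ValueError("dt must be timezone-aware")
    # Integer division by a timedelta avoids float rounding.
    ms_since_epoch = (dt - DISCORD_EPOCH) // timedelta(milliseconds=1)
    # The timestamp occupies the bits above the low 22 bits.
    return ms_since_epoch << 22


def snowflake_to_datetime(snowflake: int) -> datetime:
    """Return the UTC creation time encoded in a snowflake (hypothetical helper)."""
    ms_since_epoch = snowflake >> 22
    return DISCORD_EPOCH + timedelta(milliseconds=ms_since_epoch)


if __name__ == "__main__":
    start = datetime(2024, 1, 1, tzinfo=timezone.utc)
    lower_bound = datetime_to_snowflake(start)
    print(lower_bound, snowflake_to_datetime(lower_bound).isoformat())
```

Because the low 22 bits are zeroed, the value returned by `datetime_to_snowflake` is a lower bound: every real message ID created at or after that millisecond compares greater than or equal to it.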
discord-search-api-scraper/
├── tests/ # Test suite
│ ├── __init__.py
│ ├── conftest.py
│ ├── test_discord_searcher.py
│ └── test_snowflake_utils.py
├── scraper.py # Main scraper script
├── jsonl-to-csv.py # Utility for converting JSONL to CSV
├── pyproject.toml # Project configuration
├── uv.lock # Dependency lock file
├── Justfile # Just commands
├── .pre-commit-config.yaml # Pre-commit hooks configuration
├── .dockerignore # Docker build exclusions
└── Dockerfile # Container configuration
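
The tree above lists jsonl-to-csv.py as the JSONL to CSV converter. Its implementation is not shown here, so the following is only a sketch of how such a conversion can be done with the standard library. The function name `jsonl_to_csv` and the sample field names are assumptions, not the project's actual interface.

```python
import csv
import json
from pathlib import Path


def jsonl_to_csv(src: Path, dst: Path) -> int:
    """Convert a JSONL file to CSV and return the number of rows written."""
    rows = []
    with src.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            # Nested objects and arrays are kept as JSON strings in a single cell.
            rows.append({
                key: json.dumps(value) if isinstance(value, (dict, list)) else value
                for key, value in record.items()
            })

    # Build the header from every key seen, in first-seen order.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    with dst.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

This version reads the whole file into memory so it can compute the header before writing. For very large exports, a two-pass approach (one pass to collect keys, one to write rows) would keep memory use flat.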