Playfast ⚡

Lightning-Fast Google Play Store Scraper

License: MIT | Python 3.11+ | Built with Rust

Playfast is a high-performance Google Play Store scraper built with Rust + PyO3. Its batch API processes requests in parallel and runs 5-10x faster than fetching the same items sequentially.

✨ Features

Play Store Scraping

  • 🚀 Blazingly Fast: Batch API is 5-10x faster than sequential fetching
  • True Parallel: Rust core fully releases the GIL
  • 🦀 Pure Rust: HTTP + parsing all in Rust for maximum performance
  • 🔒 Type Safe: Full Pydantic validation and type hints
  • 💾 Memory Efficient: Only 1.5 KB per app, linear scaling
  • 🌍 Multi-Country: 247 countries, 93 unique Play Stores
  • 📦 Batch API: High-level functions for easy parallel processing

APK Download (NEW!)

  • ⬇️ Direct Download: Download APKs directly from Google Play Store
  • 🔐 Smart Authentication: OAuth → AAS token exchange with auto-retry
  • 💾 Credential Management: Save and reuse authentication tokens
  • 🎯 Version Control: Download specific versions or latest
  • Parallel Downloads: Efficient batch downloading with ThreadPoolExecutor

APK/DEX Analysis

  • 🔍 Entry Point Analysis: Identify Activities, Services, deeplink handlers
  • 📊 Call Graph: Method-to-method relationship tracking
  • 🌐 WebView Flow: Track paths from entry points to WebView APIs
  • 🔗 Data Flow: Intent → WebView.loadUrl() data tracking
  • 🛡️ Security Analysis: Deeplink vulnerability detection

📊 Performance

Batch Processing makes bulk operations 5-10x faster through true Rust parallelism!

| Method                   | Time    | Speedup |
|--------------------------|---------|---------|
| Batch API                | ~3s     | 6-8x 🚀 |
| RustClient + ThreadPool  | ~3-4s   | 6-7x    |
| AsyncClient (concurrent) | ~3-5s   | 5-7x    |
| Sequential               | ~20-30s | 1x      |

Benchmark: Fetching 3 apps across 3 countries (9 requests total)
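
Most of the gap in the table comes from overlapping network waits. The sketch below is not the Playfast benchmark: it simulates the 9 requests with `time.sleep` standing in for a network call, then compares a plain loop against a thread pool. The `fake_fetch` helper, the third app ID, and the 0.05 s latency are assumptions chosen for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from itertools import product

APP_IDS = ["com.spotify.music", "com.netflix.mediaclient", "com.whatsapp"]
COUNTRIES = ["us", "kr", "jp"]
LATENCY = 0.05  # assumed per-request latency in seconds (illustrative only)


def fake_fetch(job):
    """Hypothetical stand-in for one network request; sleeps instead of fetching."""
    app_id, country = job
    time.sleep(LATENCY)
    return f"{app_id}@{country}"


def timed(run):
    """Return (result, elapsed seconds) for a zero-argument callable."""
    start = time.perf_counter()
    result = run()
    return result, time.perf_counter() - start


jobs = list(product(APP_IDS, COUNTRIES))  # 3 apps x 3 countries = 9 requests

# Sequential: each request waits for the previous one to finish.
sequential, t_seq = timed(lambda: [fake_fetch(job) for job in jobs])

# Thread pool: the waits overlap because sleeping releases the GIL.
with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
    parallel, t_par = timed(lambda: list(pool.map(fake_fetch, jobs)))

print(f"sequential: {t_seq:.2f}s, thread pool: {t_par:.2f}s")
```

Real speedups depend on network latency and server behavior, so treat the numbers this prints as a demonstration of the mechanism rather than a reproduction of the table.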

🚀 Quick Start

Installation

Using pip (traditional):

pip install playfast

Using uv (recommended - faster):

uv add playfast

Using poetry:

poetry add playfast

Option 1: Batch API (Recommended - Easiest & Fastest)

import time

from playfast import fetch_apps

# Fetch multiple apps across countries (parallel!)
start = time.perf_counter()
apps = fetch_apps(
    app_ids=["com.spotify.music", "com.netflix.mediaclient"],
    countries=["us", "kr", "jp"],
)
elapsed = time.perf_counter() - start
print(f"Fetched {len(apps)} apps in {elapsed:.1f} seconds")

Option 2: RustClient (Maximum Performance)

from playfast import RustClient

client = RustClient()

# Get app information (GIL-free!)
app = client.get_app("com.spotify.music")
print(f"{app.title}: {app.score}⭐ ({app.ratings:,} ratings)")

# Get reviews
reviews, next_token = client.get_reviews("com.spotify.music")
for review in reviews[:5]:
    print(f"{review.user_name}: {review.score}⭐")
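
`get_reviews` returns a page of reviews together with a `next_token`, which suggests token-based pagination. This README does not show how the token is passed back, so the sketch below uses a hypothetical `fetch_page(token)` callable and an in-memory fake instead of the real client. It only illustrates the loop shape: keep requesting until the token is `None` or a page cap is reached.

```python
from typing import Callable, Iterator, Optional

Page = tuple[list[str], Optional[str]]


def iter_all(
    fetch_page: Callable[[Optional[str]], Page], max_pages: int = 10
) -> Iterator[str]:
    """Yield items page by page until the token is None or max_pages is reached."""
    token: Optional[str] = None
    for _ in range(max_pages):
        items, token = fetch_page(token)
        yield from items
        if token is None:
            break


# Hypothetical in-memory "API": three pages keyed by the token that requests them.
FAKE_PAGES: dict[Optional[str], Page] = {
    None: (["r1", "r2"], "t1"),
    "t1": (["r3", "r4"], "t2"),
    "t2": (["r5"], None),
}


def fake_fetch_page(token: Optional[str]) -> Page:
    """Return the fake page for a token; None requests the first page."""
    return FAKE_PAGES[token]


all_reviews = list(iter_all(fake_fetch_page))
print(all_reviews)
```

The `max_pages` cap is a design choice to keep a long review history from turning into an unbounded crawl.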

Option 3: AsyncClient (Easy Async)

import asyncio
from playfast import AsyncClient


async def main():
    async with AsyncClient() as client:
        app = await client.get_app("com.spotify.music")
        print(f"{app.title}: {app.score}⭐")


asyncio.run(main())
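
An async client pays off when several requests are in flight at once. The following is a minimal sketch of that pattern using only `asyncio`; `fake_get_app` is a hypothetical stand-in for an awaitable fetch, and the concurrency limit of 5 is an arbitrary assumption rather than a Playfast setting.

```python
import asyncio


async def fake_get_app(app_id: str) -> dict:
    """Hypothetical stand-in for an awaitable fetch; sleeps instead of doing I/O."""
    await asyncio.sleep(0.01)
    return {"app_id": app_id}


async def fetch_many(app_ids: list[str], limit: int = 5) -> list[dict]:
    """Fetch all app_ids concurrently, with at most `limit` in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(app_id: str) -> dict:
        async with semaphore:
            return await fake_get_app(app_id)

    # gather returns results in the same order as its arguments
    return await asyncio.gather(*(guarded(a) for a in app_ids))


results = asyncio.run(fetch_many(["com.spotify.music", "com.netflix.mediaclient"]))
print(results)
```

Bounding concurrency with a semaphore is one simple way to stay polite toward the remote server while still overlapping requests.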

Option 4: APK Download (NEW!)

from playfast import ApkDownloader

# First-time setup with OAuth token
downloader = ApkDownloader(
    email="user@gmail.com",
    oauth_token="oauth2_4/...",  # Get from Google embedded setup
)
downloader.login()
downloader.save_credentials("~/.playfast/credentials.json")

# Subsequent use - just load credentials
downloader = ApkDownloader.from_credentials("~/.playfast/credentials.json")

# Download APK
apk_path = downloader.download("com.instagram.android")
print(f"Downloaded to: {apk_path}")

# Download specific version
apk_path = downloader.download("com.whatsapp", version_code=450814)
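
The feature list mentions batch downloading with ThreadPoolExecutor. A minimal sketch of that pattern is shown below with a hypothetical `fake_download` function in place of a real downloader. Whether one `ApkDownloader` instance can safely be shared across threads is not stated in this README, so treat that as something to verify before swapping in the real call.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_download(package: str) -> str:
    """Hypothetical stand-in for a blocking download; returns a file path."""
    if package == "com.example.missing":
        raise FileNotFoundError(package)
    return f"/tmp/{package}.apk"


def download_all(packages, download=fake_download, max_workers=4):
    """Run downloads in a thread pool; return (paths by package, errors by package)."""
    paths, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(download, pkg): pkg for pkg in packages}
        for future in as_completed(futures):
            pkg = futures[future]
            try:
                paths[pkg] = future.result()
            except Exception as exc:  # record the failure and keep going
                errors[pkg] = exc
    return paths, errors


paths, errors = download_all(
    ["com.instagram.android", "com.whatsapp", "com.example.missing"]
)
print(sorted(paths), sorted(errors))
```

Collecting errors per package means one failed download does not abort the rest of the batch.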

Option 5: APK/DEX Analysis

from playfast import ApkAnalyzer

# High-level API
analyzer = ApkAnalyzer("app.apk")
manifest = analyzer.manifest
classes = analyzer.classes

print(f"Package: {manifest.package_name}")
print(f"Activities: {len(manifest.activities)}")
print(f"Classes: {len(classes)}")

# Advanced: WebView flow analysis (low-level API)
from playfast.core import analyze_webview_flows_from_apk

flows = analyze_webview_flows_from_apk("app.apk", max_depth=10)
for flow in flows:
    print(f"{flow.entry_point} → {flow.webview_method}")
    if flow.is_deeplink_handler:
        print("  ⚠️  DEEPLINK HANDLER")
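
To make the idea of a WebView flow concrete, here is a toy sketch of a depth-limited search over a call graph, from entry points to WebView sink methods. It is not Playfast's implementation: the graph, the method names, and the `find_paths` helper are all made up for illustration, and `max_depth` here simply caps the number of calls in a path.

```python
from collections import deque

# Toy call graph: caller -> callees. All names are made up for illustration.
CALL_GRAPH = {
    "MainActivity.onCreate": ["Router.handle"],
    "Router.handle": ["WebHelper.open", "Logger.log"],
    "WebHelper.open": ["WebView.loadUrl"],
    "SettingsActivity.onCreate": ["Logger.log"],
}
SINKS = {"WebView.loadUrl"}


def find_paths(graph, entry_points, sinks, max_depth=10):
    """Breadth-first search for call paths from entry points to sink methods.

    A path is reported when it reaches a sink within max_depth calls.
    """
    found = []
    queue = deque([entry] for entry in entry_points)
    while queue:
        path = queue.popleft()
        current = path[-1]
        if current in sinks:
            found.append(path)
            continue
        if len(path) - 1 >= max_depth:
            continue  # depth budget used up; do not expand further
        for callee in graph.get(current, []):
            if callee not in path:  # skip cycles
                queue.append(path + [callee])
    return found


flows = find_paths(
    CALL_GRAPH, ["MainActivity.onCreate", "SettingsActivity.onCreate"], SINKS
)
for path in flows:
    print(" → ".join(path))
```

A smaller `max_depth` trades completeness for speed: in this toy graph the only flow needs three calls, so a limit of 2 finds nothing.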

Complete Workflow: Download → Analyze

from playfast import ApkDownloader, ApkAnalyzer

# Download APK from Google Play
downloader = ApkDownloader.from_credentials("~/.playfast/credentials.json")
apk_path = downloader.download("com.instagram.android")

# Analyze the downloaded APK
analyzer = ApkAnalyzer(apk_path)
manifest = analyzer.manifest

print(f"📦 {manifest.package_name}")
print(f"🔢 Version: {manifest.version_name} ({manifest.version_code})")
print(f"📱 Activities: {len(manifest.activities)}")
print(f"🔐 Permissions: {len(manifest.permissions)}")

📚 Examples

See the examples/ directory for more examples covering Play Store scraping, APK download, and APK/DEX analysis.

📖 Documentation

Documentation is split into three areas: Play Store scraping, APK download, and APK/DEX analysis.

🏗️ Architecture

Playfast does its heavy lifting in a Rust core that is exposed to Python through PyO3 bindings:

┌─────────────────────────────────────────────────────┐
│   Python High-level API                             │
│   - ApkDownloader (APK download)                    │
│   - ApkAnalyzer (APK/DEX analysis)                  │
│   - Batch API (Play Store scraping)                 │
│   - RustClient / AsyncClient                        │
│   - Pydantic Models                                 │
└────────────────────┬────────────────────────────────┘
                     │ PyO3 Bindings
                     ▼
┌─────────────────────────────────────────────────────┐
│   Rust Core (playfast.core)                         │
│   - Google Play API (gpapi - APK download)          │
│   - HTTP Client (reqwest)                           │
│   - HTML Parser (scraper)                           │
│   - DEX Parser (custom)                             │
│   - Parallel Processing (rayon + tokio)             │
│   - Complete GIL Release                            │
└─────────────────────────────────────────────────────┘

API Layers

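| Layer      | Components                            | Use Case                             |
|------------|---------------------------------------|--------------------------------------|
| High-level | ApkDownloader, ApkAnalyzer, Batch API | General users (90% of use cases)     |
| Mid-level  | RustClient, AsyncClient               | Direct scraping control              |
| Low-level  | playfast.core.*                       | Security research, advanced analysis |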

Client Options for Play Store Scraping

| Method      | Speed | Ease | Best For       |
|-------------|-------|------|----------------|
| Batch API   | ⚡⚡⚡   | ⭐⭐⭐  | Multiple items |
| RustClient  | ⚡⚡⚡   | ⭐⭐   | Single items   |
| AsyncClient | ⚡⚡    | ⭐⭐   | Async code     |

🌍 Multi-Country Optimization

Many countries share the same Play Store, so Playfast can collapse 247 country codes into 93 unique stores and skip redundant requests:

from playfast import get_unique_countries, get_representative_country

# Instead of 247 countries, query 93 unique stores (about 2.7x fewer requests)
unique = get_unique_countries()  # 93 unique Play Stores

# Get representative for any country
rep = get_representative_country("fi")  # Finland → Vanuatu store (shared by 138 countries)
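
The saving comes from fetching each distinct store once and reusing the result for every country that maps to it. The sketch below shows that grouping step with a toy mapping. Only the `fi` → `vu` pair comes from the text above; the other entries and the `dedupe_by_store` helper are placeholders, not Playfast data or API.

```python
# Toy mapping: country code -> representative store code.
# Only "fi" -> "vu" comes from the README; the rest are placeholders.
REPRESENTATIVE = {"fi": "vu", "vu": "vu", "us": "us", "kr": "kr", "jp": "jp"}


def dedupe_by_store(countries, representative=REPRESENTATIVE):
    """Group countries by representative store, preserving first-seen order."""
    groups = {}
    for country in countries:
        # Unknown codes are treated as their own store.
        store = representative.get(country, country)
        groups.setdefault(store, []).append(country)
    return groups


groups = dedupe_by_store(["us", "fi", "vu", "kr"])
to_fetch = list(groups)  # one request per distinct store
print(to_fetch, groups)

# Request reduction quoted above: 247 countries collapsed into 93 stores.
reduction = 247 / 93
print(f"{reduction:.1f}x fewer requests")
```

Each fetched result can then be fanned back out to every country in its group.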

🔧 Development

# Clone repository
git clone https://github.com/taeyun16/playfast.git
cd playfast

# Install dependencies
uv sync

# Build Rust extension
uv run maturin develop --release

# Run tests
uv run pytest

# Run examples
uv run python examples/basic.py

# Run benchmarks
uv run python benchmarks/batch_apps_benchmark.py

See Development Setup for detailed instructions.

🤝 Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Submit a pull request

📝 License

MIT License - see LICENSE for details.

⚠️ Disclaimer

This tool is for educational and research purposes only. Please respect Google Play Store's Terms of Service. Use responsibly with appropriate rate limiting.


Made with ❤️ using Rust + Python
