DecryptX Helper Library

A Python library for the DecryptX Round 3 Data Cleaning Contest.

Installation

Install directly from GitHub:

pip install git+https://github.com/keanesc/decryptx-helper.git

Or for development:

git clone https://github.com/keanesc/decryptx-helper.git
cd decryptx-helper
pip install -e .

Dataset Access

The library automatically downloads the FIFA dataset from the DecryptX server when you call load_data(). The dataset is cached locally in ~/.cache/decryptx/ and refreshed every 24 hours.
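The 24-hour refresh rule can be pictured as a check on the cached file's modification time. The sketch below is not the library's actual implementation: the helper name `is_cache_fresh` and the file name `fifa_raw_data.csv` inside the cache directory are assumptions made for illustration.

```python
import os
import time
from pathlib import Path

# The cache directory and the 24-hour interval come from the text above;
# everything else in this sketch is hypothetical.
CACHE_DIR = Path(os.path.expanduser("~/.cache/decryptx"))
MAX_AGE_SECONDS = 24 * 60 * 60


def is_cache_fresh(path, max_age_seconds=MAX_AGE_SECONDS, now=None):
    """Return True if `path` exists and was modified less than max_age_seconds ago."""
    path = Path(path)
    if not path.is_file():
        return False
    current = time.time() if now is None else now
    return (current - path.stat().st_mtime) < max_age_seconds


cached_file = CACHE_DIR / "fifa_raw_data.csv"  # hypothetical file name
print("fresh" if is_cache_fresh(cached_file) else "stale or missing")
```

A check like this can tell you whether the next load_data() call is likely to trigger a download, assuming the library uses file modification time to decide.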

Environment Variables (Optional)

You can customize the dataset location using an environment variable:

DECRYPTX_DATA_PATH: Direct path to a local dataset file (bypasses download)

Example:

# Use a local file (no download)
export DECRYPTX_DATA_PATH="/path/to/your/fifa_raw_data.csv"
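
In a notebook, where a shell export does not carry over to the running kernel, the same variable can be set from Python. This sketch assumes the library reads DECRYPTX_DATA_PATH at the time load_data() is called, so the variable is set first; the path is the placeholder from the example above.

```python
import os

# Assumption: the variable must be set before load_data() is called.
os.environ["DECRYPTX_DATA_PATH"] = "/path/to/your/fifa_raw_data.csv"

# With the library installed, the next call would then use the local file:
# from decryptx import load_data
# df = load_data()
```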

Google Colab Note: The dataset will be automatically downloaded on first use. No manual upload required!

Quick Start

from decryptx import login, load_data, submit

# 1. Login with your team credentials
session = login(team_name="YourTeamName", password="your_password")

# 2. Load the raw FIFA dataset
df = load_data()

# 3. Clean your data (YOUR WORK GOES HERE)
df_clean = your_cleaning_function(df)

# 4. Submit your cleaned dataset
# This will automatically split data, train a fixed model, evaluate performance, and submit the score.
result = submit(session, df_clean)
print(f"Remaining attempts: {result['remainingAttempts']}/5")

API Reference

`login(team_name: str, password: str) -> dict`

Authenticate with the DecryptX server.

Returns: Session dictionary containing teamId, sessionId, and qualification status.

`load_data() -> pd.DataFrame`

Load the raw FIFA dataset that needs cleaning.

Returns: pandas DataFrame with the raw data.

`submit(session: dict, df: pd.DataFrame) -> dict`

Submit your cleaned dataset for evaluation.

This function:

1. Splits your data into train/test sets (fixed random seed)
2. Trains a standardized Random Forest model
3. Evaluates RMSE on the test set
4. Submits the score to the leaderboard

Returns: Submission result dictionary containing remainingAttempts and status.
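
The RMSE step can be sketched in plain Python so you can sanity-check predictions locally before spending an attempt. This is a minimal illustration of the standard formula, not the library's evaluation code; the `rmse` helper and the sample values are made up for the example.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical OVA values for three players and a model's predictions.
actual = [93, 92, 91]
predicted = [90, 92, 94]
print(rmse(actual, predicted))
```

Because errors are squared before averaging, a few badly parsed rows can dominate the score, which is one reason careful cleaning matters here.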

Rules

Fixed Parameters: The train/test split uses random_state=42 and test_size=0.2. Do not modify these.
5 Submission Limit: You have exactly 5 submission attempts total (lifetime limit).
1 Minute Cooldown: Wait at least 1 minute between submissions.
RMSE Scoring: Lower RMSE is better. The target is the player's Overall Rating (OVA).
Data Cleaning Focus: The competition is about data cleaning, not model architecture. A simple model on well-cleaned data often beats a complex model on dirty data.
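
Since attempts are limited, it can help to track the attempt count and cooldown locally before calling submit(). The class below is a hypothetical helper, not part of the library; the server remains the authority on both limits, and the defaults simply mirror the rules above.

```python
import time


class SubmissionGuard:
    """Local bookkeeping for the 5-attempt limit and the 1-minute cooldown."""

    def __init__(self, max_attempts=5, cooldown_seconds=60, clock=time.monotonic):
        self.max_attempts = max_attempts
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock  # injectable so the guard can be tested without waiting
        self._attempts_used = 0
        self._last_submit = None

    @property
    def remaining_attempts(self):
        return self.max_attempts - self._attempts_used

    def seconds_until_ready(self):
        """Seconds left in the cooldown, or 0.0 if no cooldown is active."""
        if self._last_submit is None:
            return 0.0
        elapsed = self._clock() - self._last_submit
        return max(0.0, self.cooldown_seconds - elapsed)

    def can_submit(self):
        return self.remaining_attempts > 0 and self.seconds_until_ready() == 0.0

    def record_submission(self):
        """Call right after a submission; raises if a limit would be violated."""
        if not self.can_submit():
            raise RuntimeError("cooldown active or no attempts remaining")
        self._attempts_used += 1
        self._last_submit = self._clock()
```

A typical pattern would be to check `guard.can_submit()` before calling submit() and call `guard.record_submission()` right after it. The guard's state lives only in the current Python process, so it resets when the kernel restarts.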

Tips

Handle missing values appropriately
Parse numeric values from strings (e.g., "€103.5M" → 103500000)
Handle height/weight formats (e.g., "170cm" → 170)
Remove or encode special characters
Consider feature engineering from the available columns
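
The two parsing tips above can be sketched with regular expressions. The helper names are hypothetical, and the "K" suffix is an assumption beyond the "M" example given; check the actual column values before relying on these patterns.

```python
import re

# "M" comes from the example above; "K" is an assumed additional suffix.
_MULTIPLIERS = {"K": 1_000, "M": 1_000_000}


def parse_money(value):
    """Convert a string such as "€103.5M" to an integer number of euros."""
    match = re.fullmatch(r"€?\s*(\d+(?:\.\d+)?)\s*([KM]?)", value.strip())
    if match is None:
        raise ValueError(f"unrecognized money format: {value!r}")
    number, suffix = match.groups()
    return round(float(number) * _MULTIPLIERS.get(suffix, 1))


def parse_cm(value):
    """Convert a string such as "170cm" to an integer number of centimetres."""
    match = re.fullmatch(r"(\d+)\s*cm", value.strip())
    if match is None:
        raise ValueError(f"unrecognized height format: {value!r}")
    return int(match.group(1))


print(parse_money("€103.5M"))  # 103500000
print(parse_cm("170cm"))       # 170
```

Raising on unrecognized input, rather than returning a default, makes unexpected formats visible early. With pandas, these helpers could be applied to a column via `Series.map` once the formats in the data are confirmed.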

License

MIT License
