feature_schema

feature_schema is a lightweight Python package that automatically extracts and documents feature metadata from a pandas DataFrame.
It’s designed for machine learning workflows where you need to understand, validate, or dynamically generate user inputs for model features.

Features

Extract feature names
Auto-detect feature types (int, float, string, bool, datetime)
Numeric metadata: min, max, range
Categorical metadata: unique values & counts
Nullability check: detect if features contain missing values
Human-readable docs (__str__) for quick schema inspection
Exportable schema to dict / DataFrame for further use
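
The package's internals are not shown in this README, so the following is only a sketch of how this kind of type detection and per-column metadata could be computed with pandas. The helper names `infer_feature_type` and `describe_column` are hypothetical and are not part of feature_schema; the dictionary keys simply mirror the sample output in the Usage section.

```python
import pandas as pd
from pandas.api import types as ptypes


def infer_feature_type(series: pd.Series) -> str:
    """Map a Series to a coarse label (hypothetical helper, not the package API)."""
    # bool is tested first to keep the precedence between checks explicit.
    if ptypes.is_bool_dtype(series):
        return "bool"
    if ptypes.is_integer_dtype(series):
        return "int"
    if ptypes.is_float_dtype(series):
        return "float"
    if ptypes.is_datetime64_any_dtype(series):
        return "datetime"
    # Anything else (object, string, category, ...) is treated as a string here.
    return "string"


def describe_column(series: pd.Series) -> dict:
    """Collect name, type, nullability, and basic stats for one column."""
    info = {
        "column_name": series.name,
        "dtype": str(series.dtype),
        "type": infer_feature_type(series),
        "nullable": bool(series.isna().any()),
        "unique_values": int(series.nunique()),
    }
    # min and max are only added for numeric columns.
    if info["type"] in ("int", "float"):
        info["min"] = float(series.min())
        info["max"] = float(series.max())
    return info
```

Calling `describe_column(df["age"])` on a DataFrame column would return one dictionary in the same general shape as the entries shown under Usage.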

Installation

pip install feature_schema

Usage

1. Create the Schema for a DataFrame

import pandas as pd
from feature_schema import FeatureSchema

# Sample dataset
df = pd.DataFrame({
    "age": [25, 30, 40, 22],
    "salary": [50000.0, 60000.5, 80000.2, 45000.0],
    "city": ["NY", "SF", "LA", "NY"]
})

# Create Feature schema object
fs = FeatureSchema(df)

# Print the schema as a list of per-column dictionaries
print(fs.to_dict())

Output:

[
    {'column_name': 'age', 'dtype': 'int64', 'type': 'int', 'nullable': np.False_, 'min': 22.0, 'max': 40.0, 'unique_values': 4},
    {'column_name': 'salary', 'dtype': 'float64', 'type': 'float', 'nullable': np.False_, 'min': 45000.0, 'max': 80000.2, 'unique_values': 4},
    {'column_name': 'city', 'dtype': 'object', 'type': 'string', 'nullable': np.False_, 'unique_values': 3, 'unique_list': ['NY', 'SF', 'LA']}
]
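
Because the schema is a plain list of dictionaries, it can be queried with ordinary Python. The sketch below assumes the exact shape shown in the output above and copies it as a literal, with plain `False` standing in for `np.False_`; the variable names are illustrative only.

```python
# Literal copy of the sample output (False stands in for np.False_).
schema = [
    {"column_name": "age", "dtype": "int64", "type": "int",
     "nullable": False, "min": 22.0, "max": 40.0, "unique_values": 4},
    {"column_name": "salary", "dtype": "float64", "type": "float",
     "nullable": False, "min": 45000.0, "max": 80000.2, "unique_values": 4},
    {"column_name": "city", "dtype": "object", "type": "string",
     "nullable": False, "unique_values": 3, "unique_list": ["NY", "SF", "LA"]},
]

# Index the entries by column name for direct lookups.
by_name = {feat["column_name"]: feat for feat in schema}

# Columns that carry min/max metadata.
numeric_columns = [f["column_name"] for f in schema if f["type"] in ("int", "float")]

# Allowed values for columns that list their categories.
categorical_choices = {
    f["column_name"]: f["unique_list"] for f in schema if "unique_list" in f
}

print(numeric_columns)         # ['age', 'salary']
print(categorical_choices)     # {'city': ['NY', 'SF', 'LA']}
print(by_name["age"]["max"])   # 40.0
```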

2. Export Schema as Dictionary / DataFrame

# As dictionary
schema_dict = fs.to_dict()
print(schema_dict)

# As object (the raw schema attribute)
schema_obj = fs.schema
print(schema_obj)

# As DataFrame
schema_df = fs.to_dataframe()
print(schema_df)
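
The sample output in step 1 contains NumPy scalars such as `np.False_`, which `json.dumps` cannot serialize on its own. If the schema needs to be written to JSON, one possible approach is a `default` hook that unwraps such scalars through their `.item()` method. The helpers `to_builtin` and `schema_to_json` below are hypothetical and not part of feature_schema.

```python
import json


def to_builtin(value):
    """Fallback for json.dumps: unwrap NumPy-style scalars via .item()."""
    if hasattr(value, "item"):
        return value.item()
    raise TypeError(f"Cannot serialize {type(value).__name__}")


def schema_to_json(schema: list) -> str:
    """Serialize a list-of-dicts schema; `default` only runs for unknown types."""
    return json.dumps(schema, default=to_builtin, indent=2)
```

With these in place, `schema_to_json(fs.to_dict())` should produce a JSON string, assuming `to_dict()` returns the list shown in step 1.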

3. Save the Model with Feature Schema

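import pickle
from sklearn.linear_model import LinearRegression

# Illustrative split, assuming the df from step 1: predict salary from age
X = df[["age"]]
y = df["salary"]

model = LinearRegression()
model.fit(X, y)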

# Extract the schema from the model's input features only (not the target)
fs = FeatureSchema(X)

# Bundle model + schema
package = {
    "model": model,
    "schema": fs.to_dict(),
}

# Save with pickle
with open("model_with_schema.pkl", "wb") as f:
    pickle.dump(package, f)

print("✅ Model + schema saved!")

4. Use the Pre-trained Model with Schema for Dynamic Feature Input

Load the pickled package (model + schema)

import pickle
import streamlit as st

uploaded_file = st.file_uploader("Upload your trained ML model (.pkl)", type=["pkl","pickle"])

if uploaded_file is not None:
    package = pickle.load(uploaded_file)
    model = package["model"]
    schema = package["schema"]

    st.success("✅ Model loaded successfully!")
    st.subheader("Enter Input Features:")

    feature_values = []

    # Dynamically generate input widgets based on schema
    for feat in schema:
        col_name = feat["column_name"]
        dtype = feat["type"]
        min_val = feat.get("min", 0)   # falls back to 0 when the schema has no "min" (e.g. string columns)
        max_val = feat.get("max", 100) # falls back to 100 when the schema has no "max"

        # Unique key to avoid Streamlit widget conflicts
        key = f"input_{col_name}"

        # Render input widgets based on feature type
        if dtype == "int":
            val = st.number_input(
                col_name, min_value=int(min_val), max_value=int(max_val),
                value=int(min_val), step=1, key=key
            )
        elif dtype == "float":
            val = st.number_input(
                col_name, min_value=float(min_val), max_value=float(max_val),
                value=float(min_val), key=key
            )
        else:  # string or other types
            val = st.text_input(col_name, key=key)

        feature_values.append(val)
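
The loop above falls back to a free-text box for every non-numeric feature. Since the schema also carries `unique_list` for string columns, a dropdown is one possible refinement. The sketch below keeps the decision logic in a small Streamlit-free function so it can be tested on its own; `widget_spec` is a hypothetical helper, and the defaults of 0 and 100 are carried over from the loop above.

```python
def widget_spec(feat: dict) -> tuple:
    """Return (widget_kind, kwargs) for one schema entry (hypothetical helper)."""
    name = feat["column_name"]
    kwargs = {"label": name, "key": f"input_{name}"}

    if feat["type"] in ("int", "float"):
        # Cast bounds so min_value, max_value, and value share one numeric type.
        cast = int if feat["type"] == "int" else float
        low = cast(feat.get("min", 0))
        high = cast(feat.get("max", 100))
        kwargs.update(min_value=low, max_value=high, value=low)
        return "number_input", kwargs

    if feat.get("unique_list"):
        # Known categories: offer them as dropdown options.
        kwargs["options"] = list(feat["unique_list"])
        return "selectbox", kwargs

    # No usable metadata: fall back to free text.
    return "text_input", kwargs
```

Inside the Streamlit loop this could be used as `kind, kwargs = widget_spec(feat)` followed by `val = getattr(st, kind)(**kwargs)`, which dispatches to `st.number_input`, `st.selectbox`, or `st.text_input`.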

Why Use feature_schema?

Eliminate hardcoding of feature names, types, and value ranges.
Automatically generate dynamic input forms for Streamlit or validation schemas for FastAPI.
Save and bundle schema with ML models for reproducibility and consistency.
Instantly document datasets for your team or project.
Validate incoming data to prevent type or value mismatches before predictions.
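
As a rough illustration of the validation use case, the sketch below checks one incoming record against a schema in the list-of-dicts shape shown in step 1. It is not part of feature_schema: `validate_row` is a hypothetical helper, and treating the observed min/max and `unique_list` as hard limits is an assumption that may be too strict for some models.

```python
def validate_row(schema: list, row: dict) -> list:
    """Return a list of problems found in `row`; an empty list means it passed."""
    problems = []
    for feat in schema:
        name = feat["column_name"]
        value = row.get(name)

        # Missing values are only acceptable for nullable features.
        if value is None:
            if not feat.get("nullable", False):
                problems.append(f"{name}: missing value")
            continue

        if feat["type"] in ("int", "float"):
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                problems.append(f"{name}: expected a number, got {type(value).__name__}")
                continue
            if "min" in feat and value < feat["min"]:
                problems.append(f"{name}: {value} is below min {feat['min']}")
            if "max" in feat and value > feat["max"]:
                problems.append(f"{name}: {value} is above max {feat['max']}")
        elif "unique_list" in feat and value not in feat["unique_list"]:
            problems.append(f"{name}: unseen category {value!r}")
    return problems
```

A caller could run `validate_row(schema, incoming)` before `model.predict` and reject or log the request when the returned list is not empty.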
