ncxlib Dataset storage and loader

This repository provides a structured approach for loading, processing, and saving datasets in a binary format using Python. It is designed to work with popular datasets (such as MNIST) stored in binary formats and allows for easy serialization with pickle. The code processes images and labels into structured data, which can be loaded into memory as needed.

This repo is mainly for internal use, but it also provides permalinks to popular datasets that have already been preprocessed and saved as pickle files.
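As a rough illustration of the processing step described above, the sketch below parses a byte string in the IDX layout used by the original MNIST files (a 4-byte header, big-endian 32-bit dimension sizes, then unsigned-byte values) into plain Python lists. The function name `parse_idx` and the choice to flatten each image into a single list are assumptions made for this example; they are not necessarily what `mnist.py` in this repository does.

```python
import math
import struct


def parse_idx(raw: bytes) -> list:
    """Parse IDX-formatted unsigned-byte data into Python lists (hypothetical helper)."""
    # Header: two zero bytes, one type code byte, one dimension-count byte.
    zero, dtype_code, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0 or dtype_code != 0x08 or ndim < 1:
        raise ValueError("expected an IDX file of unsigned bytes")

    # Each dimension size is a big-endian 32-bit unsigned integer.
    header_end = 4 + 4 * ndim
    dims = struct.unpack(f">{ndim}I", raw[4:header_end])
    payload = raw[header_end:]

    count = dims[0]
    item_size = math.prod(dims[1:])  # 1 for label files, rows * cols for images
    if len(payload) != count * item_size:
        raise ValueError("payload length does not match the header")

    if ndim == 1:
        # Label files: one byte per label.
        return list(payload)
    # Image files: one flat list of pixel values per image (an assumption).
    return [
        list(payload[i * item_size:(i + 1) * item_size])
        for i in range(count)
    ]
```

Calling `parse_idx` on the bytes of an image file would give one list of pixel values per image, and on a label file a flat list of integer labels.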

Storage Format

Each data file is named ncxlib..data and stored inside the data// folder (compare ncxlib.mnist.data in the download command under Getting started). Once loaded, every pickle file contains data in the following structure:

    {
        "X_train": list[],
        "X_test": list[],
        "y_train": list[],
        "y_test": list[],
    }
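The structure above can be written and read back with the standard `pickle` module. The following is a minimal sketch, assuming the file holds a single pickled dictionary with exactly these four keys; the helper names `save_dataset` and `load_dataset` are hypothetical and are not taken from this repository.

```python
import pickle

# Keys documented in the storage format above.
EXPECTED_KEYS = {"X_train", "X_test", "y_train", "y_test"}


def _check_keys(dataset: dict) -> None:
    """Raise KeyError if any documented key is missing."""
    missing = EXPECTED_KEYS - set(dataset)
    if missing:
        raise KeyError(f"dataset is missing keys: {sorted(missing)}")


def save_dataset(path: str, dataset: dict) -> None:
    """Pickle a dataset dictionary to `path` after checking its keys."""
    _check_keys(dataset)
    with open(path, "wb") as f:
        pickle.dump(dataset, f)


def load_dataset(path: str) -> dict:
    """Load a pickled dataset dictionary from `path` and check its keys."""
    with open(path, "rb") as f:
        dataset = pickle.load(f)
    _check_keys(dataset)
    return dataset
```

Checking the keys on both sides is a design choice for this sketch: it surfaces a malformed file as a clear `KeyError` at load time rather than later during training.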

Getting started

You can directly download the dataset using curl:

    curl -o ncxlib.mnist.data <perma-link>
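If you would rather fetch and load the file from Python instead of curl, something along these lines should work. This is a sketch only: `fetch_dataset` is a hypothetical helper, and the URL must be the permanent link from the Datasets table, which is not reproduced here. Note that unpickling can execute arbitrary code, so only load files from a source you trust.

```python
import pickle
import urllib.request


def fetch_dataset(url: str) -> dict:
    """Download a pickled dataset from `url` and return the loaded dictionary."""
    # Read the whole response into memory, then unpickle it.
    with urllib.request.urlopen(url) as response:
        payload = response.read()
    return pickle.loads(payload)


def split_sizes(dataset: dict) -> dict:
    """Return the number of entries stored under each key of the dataset."""
    return {name: len(values) for name, values in dataset.items()}
```

For example, `split_sizes(fetch_dataset(url))` gives a quick check that the train and test splits have the sizes you expect before you start using the data.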

Datasets

Dataset	Description	Permanent Link
MNIST	Handwritten digit images and their labels, derived from data collected by NIST.	Link
