Skip to content

MNIST dataset with multiple digits. This dataset can be use for learning number (more than 1 digit) regconizer model.

License

Notifications You must be signed in to change notification settings

vndee/multi-mnist

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

24 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

multi-mnist

MNIST data set with multiple digits. We have generated a dataset for multiple digits recognition task from MNIST (http://yann.lecun.com/exdb/mnist/index.html).

You can download generated data in release tab. See in examples folder:

examples
    - train
        + labels.csv
        + 1/
        + 2/
        ...
        + 8/
        + 9/
        
    - test
        + labels.csv
        + 1/
        + 2/
        ...
        + 8/
        + 9/    

Each folder 1, 2, 3, 4, ..., contains generated images with exactly number of digits as the name of folder. labels.csv list name of image and ground truth number respectively.

labels.csv
1.png,1
2.png,4
3.png,45
4.png,785,
5.png,1479,
...

Create your own dataset

Clone this repository:

git clone https://github.com/vndee/multi-mnist

Requirements:

  • python 3
  • numpy
  • idx2numpy
  • tqdm
  • opencv-python

Install requirements:

pip3 install -r requirements.txt

Change some parameter in main.py:

  • output_dir: Path to your expected output directory.
  • number_of_samples_per_class: Number of samples for each number of digit.

Run python3 main.py and take a look at output_dir.

About

MNIST dataset with multiple digits. This dataset can be use for learning number (more than 1 digit) regconizer model.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages