Added --input-file-encoding as a command line argument #10

JohanNorberg · 2023-07-22T22:14:05Z

I wanted to train the program to generate more Swedish names. They contain special characters like Å and Ö, so the input file has to be read as UTF-8. On Windows (at least on my machine) this is a problem because the default encoding is cp1252, so those characters are decoded incorrectly. I added a command line argument that lets me specify the encoding.
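For illustration, here is a minimal sketch of how such an option could be wired up. It is not the actual diff: it assumes the script uses `argparse`, and `build_parser`, `read_words`, the `dest` name and the `names.txt` default are hypothetical names chosen for this example. Only the `-i` and `--input-file-encoding` flags come from the commands in this PR.

```python
import argparse


def build_parser():
    # Hypothetical parser; only -i and --input-file-encoding appear in this PR.
    parser = argparse.ArgumentParser(description="input encoding option sketch")
    parser.add_argument("-i", dest="input_file", type=str, default="names.txt",
                        help="input file with one word per line")
    parser.add_argument("--input-file-encoding", type=str, default=None,
                        help="encoding used to read the input file; "
                             "None keeps the platform default")
    return parser


def read_words(path, encoding=None):
    # encoding=None lets open() fall back to the platform default,
    # which is what happens when the flag is not given.
    with open(path, "r", encoding=encoding) as f:
        data = f.read()
    words = [w.strip() for w in data.splitlines()]
    return [w for w in words if w]  # drop empty lines


if __name__ == "__main__":
    args = build_parser().parse_args()
    words = read_words(args.input_file, args.input_file_encoding)
    print(f"number of words: {len(words)}")
```

Defaulting to `None` keeps the old behaviour for anyone who does not pass the flag, so the change is opt-in.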

Wrong (default encoding, cp1252 on my machine):
python .\makemore.py -i .\swe_names.txt -o swe_names

number of unique characters in the vocabulary: 55
vocabulary:
 -ABCDEFGHIJKLMNOPRSTUVWYabcdefghijklmnopqrstuvxy¥©¶Ã–…

Correct (encoding set to utf-8):
python .\makemore.py -i .\swe_names.txt -o swe_names --input-file-encoding utf-8

number of unique characters in the vocabulary: 54
vocabulary:
 -ABCDEFGHIJKLMNOPRSTUVWYabcdefghijklmnopqrstuvxyÅÖåéö
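The difference between 55 and 54 characters can be reproduced without the names file. The snippet below is a small sketch, with `vocabulary` as a hypothetical helper that mimics building a sorted character set. It encodes the five non-ASCII characters from the correct vocabulary as UTF-8 and decodes the bytes as cp1252, which is what happens when the file is read with the wrong default.

```python
def vocabulary(text):
    # Hypothetical helper: sorted set of characters, joined into one string.
    return "".join(sorted(set(text)))


correct = "ÅÖåéö"
# Simulate reading UTF-8 bytes with the cp1252 codec.
garbled = correct.encode("utf-8").decode("cp1252")

print(vocabulary(correct))  # ÅÖåéö
print(vocabulary(garbled))  # ¥©¶Ã–…
```

Each of the five characters is two bytes in UTF-8, all starting with the same byte (decoded as Ã), so five real characters turn into six wrong ones. That accounts for the vocabulary growing from 54 to 55.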

swe_names.txt

Btw, I'm watching all of your videos on YT, they are great!

thbz · 2024-04-06T16:49:31Z

I agree. The first thing I did when experimenting with makemore was adding that option to let it generate French words.

Added --input-file-encoding as a command line argument

7e61719
