Feature: fix Korean mojibake #29

@martinblech

It would be great if ftfy could fix cases like this:

>>> s = u'¼Ò¸®¿¤ - »ç¶ûÇÏ´Â ÀÚ¿©'
>>> print(s.encode('latin1').decode('euc_kr'))
소리엘 - 사랑하는 자여

but it doesn't:

>>> import ftfy
>>> print(ftfy.fix_text_segment(s))
14Ò ̧®¿¤ - »çûÇÏ ́ ÀÚ¿©

Source: http://media.yohan.net/7.html
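
For reference, here is a minimal Python 3 sketch of the manual round trip shown above, wrapped so that it fails safely. The name `fix_euc_kr_mojibake` is hypothetical and is not part of ftfy's API; it only illustrates the behaviour being requested, assuming the text was EUC-KR bytes mistakenly decoded as Latin-1.

```python
def fix_euc_kr_mojibake(text):
    """Hypothetical helper (not part of ftfy): undo EUC-KR text that was
    mistakenly decoded as Latin-1. Returns the input unchanged when the
    round trip is not possible."""
    try:
        # Re-encode as Latin-1 to recover the original bytes,
        # then decode those bytes as EUC-KR.
        return text.encode('latin1').decode('euc_kr')
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text has characters outside Latin-1, or the
        # recovered bytes are not valid EUC-KR: leave it alone.
        return text


s = '¼Ò¸®¿¤ - »ç¶ûÇÏ´Â ÀÚ¿©'
print(fix_euc_kr_mojibake(s))
```

This sketch has no heuristic for deciding whether the input really is mojibake. Genuine Latin-1 text whose bytes happen to form valid EUC-KR sequences would be converted incorrectly, so a real fix in ftfy would need some way to judge which reading is more plausible.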
