ipa2tipa is a script to convert International Phonetic Alphabet (IPA) into TeX IPA (TIPA), an IPA notation for
from ipa2tipa import IPA
# Create IPA string and convert to TIPA
ipa = IPA("ˈtʰiː")
tipa = ipa.to_tipa()
print(tipa) # "t\super{h}i:
# IPA and TIPA are str subclasses
print(isinstance(ipa, str)) # True
print(isinstance(tipa, str)) # True
print(len(ipa)) # 4 (Unicode character count)from ipa2tipa import IPA
# Aspiration and length
ipa = IPA("tʰiː")
print(ipa.to_tipa()) # t\super{h}i:
# Nasalization
ipa = IPA("nãɪ̃")
print(ipa.to_tipa()) # n\~{a}\~{I}
# Tone marks
ipa = IPA("tʰjɛn˧˥")
print(ipa.to_tipa()) # t\super{h}jEn\tone{35}
# String operations work as expected
ipa = IPA("ko̞ko̞")
print(ipa.upper()) # KO̞KO̞
print(ipa + " test") # ko̞ko̞ testThe library provides two main classes:
IPA(str) → .to_tipa() → TIPA(str)
Both IPA and TIPA are subclasses of str, representing actual IPA and TIPA strings respectively.
| Component | Description |
|---|---|
IPA |
IPA string class with .to_tipa() method |
TIPA |
TIPA string class (str subclass) |
UnicodeUnit |
Internal representation of a Unicode character with base and modifiers |
UnicodeUnit is used internally to represent the structure of Unicode characters:
base: The base character codepoint (e.g.,'0074'for 't')modifiers: List of modifier codepoints (e.g., aspiration, nasalization)
Internally, the conversion follows these steps:
- Decompose IPA string into Unicode codepoints
- Group codepoints into base + modifiers (
UnicodeUnit) - Convert to TIPA notation
| Name | Content |
|---|---|
README.md |
what you are reading right now |
LICENSE.md |
this script is distributed under MIT License |
ipa2tipa.py |
main script |
ipa2tipa_test.py |
brief unittests implemented with a standard library unittest |
uni2tipa/uni2tipa-supsub.csv |
data in the format of UTF-8 (hex), tipa macro denoting next super/subscript |
uni2tipa/uni2tipa-tone.csv |
data in the format of UTF-8 (hex), tipa macro of tone letters |
uni2tipa/uni2tipa0.csv |
data in the format of UTF-8 (hex), tipa macro taking 0 args |
uni2tipa/uni2tipa1.csv |
data in the format of UTF-8 (hex), tipa macro taking 1 arg |
uni2tipa/uni2tipa2.csv |
data in the format of UTF-8 (hex), tipa macro taking 2 args |