Back to Dave Coffin's Home Page

Converting English text to Shavian script

To learn about the Shavian Alphabet, check out Shavian.info.

To transliterate English-language text files and web pages into Shavian script, I've written a C program, a Python program, and a dictionary used by both.

Note: This dictionary is intended as a guide to Shavian spelling, not to orthodox English spelling or usage. I've packed it with all sorts of ridiculous coinages and misspellings that people use on the Internet these days, rendering their supposed pronunciations in Shavian.

To use shaw.c, do:

bunzip2 dave.dict.bz2
gcc -o shaw shaw.c
./shaw dave.dict < english.txt

Or run "./shaw dave.dict" without redirecting input, type some text at the terminal, and press Enter.

For shaw.py you'll need Python version 3.5 or above and the Natural Language Toolkit (NLTK):

bunzip2 dave.dict.bz2
pip3 install nltk
python3
>>> import nltk
>>> nltk.download('punkt')
>>> nltk.download('averaged_perceptron_tagger')
>>> exit()
python3 shaw.py dave.dict < english.txt

Because shaw.py runs the entire text through NLTK's part-of-speech tagger in one piece, you can't use it interactively: it outputs nothing until it receives end-of-file on input. The benefit of NLTK is that part-of-speech tagging resolves most heteronyms, e.g. "live" as an adjective is "𐑤𐑲𐑝" while as a verb it is "𐑤𐑦𐑝". shaw.c can't do this because NLTK is a Python library with no C interface.
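
To illustrate how a part-of-speech tag can pick between two spellings of a heteronym, here is a minimal sketch. It is not taken from shaw.py: the HETERONYMS table, the coarse_pos helper, and the transliterate_tagged function are all hypothetical names, and the table layout is an assumption, not the format of dave.dict. The tags are Penn Treebank tags, the kind nltk.pos_tag returns by default, written by hand here so the sketch runs without NLTK installed.

```python
# Hypothetical mini-dictionary: word -> {coarse part of speech -> Shavian}.
# This layout is an assumption for illustration, not the dave.dict format.
HETERONYMS = {
    "live": {"ADJ": "𐑤𐑲𐑝", "VERB": "𐑤𐑦𐑝"},
}


def coarse_pos(tag):
    """Collapse a Penn Treebank tag into a coarse class."""
    if tag.startswith("JJ"):   # JJ, JJR, JJS: adjectives
        return "ADJ"
    if tag.startswith("VB"):   # VB, VBD, VBG, VBN, VBP, VBZ: verbs
        return "VERB"
    return "OTHER"


def transliterate_tagged(tagged):
    """Replace known heteronyms in a list of (word, tag) pairs.

    Words missing from the table, or tagged with a class the table
    does not cover, are passed through unchanged in this sketch.
    """
    out = []
    for word, tag in tagged:
        forms = HETERONYMS.get(word.lower())
        if forms is None:
            out.append(word)
        else:
            out.append(forms.get(coarse_pos(tag), word))
    return out


# Hand-written tags standing in for the output of a real tagger.
tagged = [("We", "PRP"), ("live", "VBP"), ("near", "IN"),
          ("a", "DT"), ("live", "JJ"), ("wire", "NN")]
print(" ".join(transliterate_tagged(tagged)))
```

A tagger needs the surrounding words to decide whether "live" is a verb or an adjective, so the lookup can only happen once the context is available. That fits with shaw.py reading all of its input before producing any output.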