dictionary that sorts results by word frequency and character count, with character stroke order lookup. the application is contained in a single file under compiled/ and also works offline. it is also hosted here.
see under data/
- words-by-frequency.csv
- words-by-frequency-with-pinyin.csv
- table-of-general-standard-chinese-characters.csv
- additional-characters.csv
- characters-strokes-decomposition.csv
- characters-pinyin-count.csv
- cedict.csv: filtered csv version of cedict with one translation per line
- hsk.csv: hsk 1-9
- words-by-frequency-with-pinyin-translation.csv
- hsk-pinyin-translation.csv
- characters-by-pinyin-learning.csv
- characters-by-pinyin-learning-rare.csv
- characters-by-pinyin.csv
- characters-by-pinyin-by-count.csv
- characters-by-pinyin-common.csv
- characters-overlap.csv
- characters-overlap-common.csv
- syllables-tones-character-counts.csv
- pinyin-learning.csv
- characters-learning.csv
- characters-learning-reduced.csv
- syllables-character-counts.csv
- syllables-tones-character-counts-common.csv
- extra-components.csv
- extra-stroke-counts.csv
- characters-strokes-decomposition-new.csv
- characters-composition.csv
- composition-hierarchy.txt
- words-by-type/
- characters-svg-animcjk-simple.json: contains svg for the individual strokes as simple lines and the directions of strokes
  - field 1: paths ordered by stroke order
  - field 2: direction vectors for each stroke
- anki decks
  - hanzi.apkg: character, words -> pinyin, example words with translation, components with pinyin
  - pinyin.apkg: pinyin -> word, translation
  - rares.apkg
- ... and more
- character decompositions
- character graphics
- chinese to english translations
- hsk3 word list
- table of general standard chinese characters
- word frequency
creative commons share-alike
- ./exe/update-dictionary builds html/hanyu-dictionary.html from html/hanyu-dictionary-template.html
- update-characters-data collects the character data
- update-svg-graphics regenerates the character svg graphics. it is usually run with the sub-command "simplify_parallel" and then with "merge", which merges the result files from ./tmp into data/characters-svg-animcjk-simple.json.
- the main code file is js/main.coffee
a command-line utility to convert text. at this point, some of the conversions might be quite slow.
convert from pinyin to hanzi:
echo fa1shao1 shi4 yin1 | ./exe/hanzi-convert --hanzi
发烧 是/事/试/市/式/室/世/仕/侍/势/嗜/噬/士/奭/弑/忕/恃/戺/拭/揓/柿/栻/氏/澨/示/筮/舐/莳/螫/视/誓/谥/贳/轼/逝/适/释/铈/饰/𬤊 因/阴/喑/垔/堙/姻/愔/慇/殷/氤/洇/瘖/禋/筃/茵/裀/铟/音/骃/𬘡/𬮱
alternatives are sorted by word frequency.
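the lookup behind this kind of output can be sketched as follows. this is a minimal illustration, not the code of hanzi-convert: the function names and the small in-memory word list are hypothetical, and the rows are assumed to be (word, pinyin) pairs already ordered from most to least frequent, similar in spirit to words-by-frequency-with-pinyin.csv.

```python
from collections import defaultdict

def build_index(rows):
    """Map each pinyin string to its words, keeping the frequency order of rows.

    rows is assumed to be an iterable of (word, pinyin) pairs, most frequent first.
    """
    index = defaultdict(list)
    for word, pinyin in rows:
        if word not in index[pinyin]:
            index[pinyin].append(word)
    return index

def pinyin_to_hanzi(text, index):
    """Replace each whitespace-separated pinyin token by its candidates joined with "/".

    Tokens without any candidate are passed through unchanged.
    """
    return " ".join("/".join(index.get(token, [token])) for token in text.split())

# hypothetical sample data, ordered by descending frequency
rows = [
    ("是", "shi4"),
    ("事", "shi4"),
    ("发烧", "fa1shao1"),
    ("试", "shi4"),
    ("因", "yin1"),
    ("音", "yin1"),
]
index = build_index(rows)
print(pinyin_to_hanzi("fa1shao1 shi4 yin1", index))
# prints: 发烧 是/事/试 因/音
```

because the index is filled in row order, the most frequent candidate always comes first without a separate sorting step.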
convert from hanzi to pinyin:
echo 发烧试音 | ./exe/hanzi-convert --pinyin
fa1shao1 shi4 yin1
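one common way to get word-level pinyin from unsegmented text is greedy longest-match against a word list. the sketch below assumes that approach; it is not necessarily what hanzi-convert does internally, and the function name, the max_len parameter and the sample dictionary are hypothetical.

```python
def hanzi_to_pinyin(text, words, max_len=4):
    """Segment text by greedy longest match and join the pinyin of the matches.

    words maps a hanzi word to its pinyin. Characters that match nothing
    are copied to the output unchanged.
    """
    out = []
    i = 0
    while i < len(text):
        # try the longest candidate first, then shorter ones
        for n in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + n]
            if chunk in words:
                out.append(words[chunk])
                i += n
                break
        else:
            # no dictionary entry starts here
            out.append(text[i])
            i += 1
    return " ".join(out)

# hypothetical sample dictionary
words = {"发烧": "fa1shao1", "发": "fa1", "烧": "shao1", "试": "shi4", "音": "yin1"}
print(hanzi_to_pinyin("发烧试音", words))
# prints: fa1shao1 shi4 yin1
```

greedy matching is simple and fast but can pick the wrong split when a longer word overlaps the intended one. a frequency-aware segmenter would handle those cases better.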
convert traditional to simplified:
echo 發燒試音 | ./exe/hanzi-convert --simplify
发烧试音
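for characters with a one-to-one mapping, simplification can be sketched as a plain character translation table. the three-entry table below is only an illustration and the function name is hypothetical. a real table would be far larger, and some traditional characters need context to be mapped correctly, which this sketch ignores.

```python
# hypothetical, tiny traditional -> simplified table for illustration only
TRADITIONAL_TO_SIMPLIFIED = str.maketrans({"發": "发", "燒": "烧", "試": "试"})

def simplify(text):
    """Replace every traditional character found in the table.

    Characters without an entry (already simplified, or unknown) are kept as-is.
    """
    return text.translate(TRADITIONAL_TO_SIMPLIFIED)

print(simplify("發燒試音"))
# prints: 发烧试音
```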
convert marks to numbers:
echo fāshāo shì yīn | ./exe/hanzi-convert --numbers
fa1shao1 shi4 yin1
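the mark-to-number conversion can be approximated with unicode decomposition: after NFD normalization each tone mark is a separate combining character, which can be replaced by a digit placed at the end of its syllable. the following is a sketch under simplifying assumptions, not the implementation used by hanzi-convert. the function name is hypothetical, input is assumed to be lowercase, and syllables without a tone mark (neutral tone) are left without a number.

```python
import unicodedata

# combining marks for tones 1-4 as they appear after NFD decomposition
TONE_MARKS = {"\u0304": "1", "\u0301": "2", "\u030c": "3", "\u0300": "4"}
VOWELS = set("aeiouü")

def marks_to_numbers(text):
    """Convert tone-marked pinyin such as 'fāshāo' to numbered pinyin 'fa1shao1'."""
    # decompose, then recompose u + diaeresis so that ü stays a single letter
    chars = unicodedata.normalize("NFD", text).replace("u\u0308", "ü")
    out = []
    i = 0
    while i < len(chars):
        c = chars[i]
        if c not in TONE_MARKS:
            out.append(c)
            i += 1
            continue
        tone = TONE_MARKS[c]
        i += 1
        # the rest of the syllable: any further vowels of the final ...
        while i < len(chars) and chars[i] in VOWELS:
            out.append(chars[i])
            i += 1
        # ... then a closing ng, n or r, unless a vowel follows, in which
        # case the consonant is the initial of the next syllable
        for final in ("ng", "n", "r"):
            if chars.startswith(final, i):
                after = chars[i + len(final):i + len(final) + 1]
                if after not in VOWELS:
                    out.append(final)
                    i += len(final)
                    break
        out.append(tone)
    return "".join(out)

print(marks_to_numbers("fāshāo shì yīn"))
# prints: fa1shao1 shi4 yin1
```

the boundary rule relies on standard pinyin spelling, where a syllable starting with a, o or e inside a word is preceded by an apostrophe, so a vowel directly after n, ng or r means the consonant belongs to the next syllable.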