usage: genki2anki.py [-h] [--filter-kanji {y,n}] genki_dir output_file
positional arguments:
genki_dir Path to the extracted GENKI Vocab apk contents.
output_file Desired path to output apkg file. THIS WILL BE
OVERWRITTEN IF IT EXISTS!
optional arguments:
-h, --help show this help message and exit
--filter-kanji {y,n} Attempt to filter out kanji images that contain only
kana. Default value is y. See README for more
information.
This is a script to convert the assets from the GENKI Vocab Cards Android app into a deck for the Anki flashcard software. The assets themselves are not included in this repo. You will have to buy the application and extract them yourself.
The resulting notes will have each word in English, Japanese in Kana, Japanese in Kanji (as an image), the reading audio, and the illustration if one exists. It will also tag each card with its Genki lesson (chapter).
Prerequisites: python3 with pip
- Clone this repo and install the dependencies with
pip3 install -r requirements.txt
- Install adb on your computer and enable USB debugging on your Android device.
- Purchase the Genki Vocab App from the Play Store and install it on your device.
- Determine where the apk is stored on your device with the command
adb shell pm path jp.co.japantimes.genkivocab
. This should produce output that looks likepackage:<PATH_TO_APK>
. Copy everything after the colon. - Copy the apk to your computer with the command
adb pull <PATH_TO_APK>
, where<PATH_TO_APK>
is the path from the previous step. - Unzip the apk. Depending on how you do this you might need to change the
extension from
.apk
to.zip
. - Run this program with the path to the extracted apk contents as the first
argument and the desired output file as the second:
./genki2anki.py /path/to/genki ./anki.apkg
.
If everything goes well you should see a long list of words scroll down the screen and the file you specified will be created. You can then import it into Anki.
--filter-kanji
: Kanji in the Genki app are stored as plain png images that
include both the Kanji and furigana. Unfortunately some words don't have kanji
associated with them so the included image just contains kana, resulting in anki
cards that are duplicates (we already have cards for the kana alone). If this
flag is turned on (it is on by default) then the script will try to check for
the presence of furigana in each kanji image, and exclude any where it is not
found. This effectively filters out the 375 images that contain kana instead of
kanji, but does slow down deck generation a bit and could break if a significant
app update changes how the images are laid out. I recommend leaving it turned on
unless your generated deck turns out to be missing a bunch of kanji cards (also
if that happens, please file a bug so I can fix it).
This is in case you want to add unicode versions of the Kanji to replace the images from the Genki app. The cards are written to show the UKanji field if it is not empty, the Kanji field otherwise, and not generate the card at all if both are empty. Also, if you do fill in the UKanji field you can toggle between the two fields by tapping or clicking on the kanji on the cards. Personally I've used this feature to change some of the kanji (友だち to 友達), add some extra cards from Tae Kim's grammar guide in a way that doesn't stand out from the rest of the cards, and just to have more recognizable unicode characters instead of stylized calligraphy while reviewing. Feel free to ignore or even delete this field if it doesn't sound useful to you.
While it's possible to hack the python script to change how the deck is generated, it's probably not worth the effort right now since a lot of it exists in json embedded in sql queries and a handful of unique ids are hardcoded. A far easier option is to just generate the deck and then customize the notes and cards inside of Anki. You can find extensive documentation on how to do so on Anki's own website here. Maybe one day I'll break out the templates into their own HTML files for easier customization, but probably not.
If you don't like having certain cards then you should delete the card type after importing the cards into Anki. For reasons discussed in the previous question, it is not practical to support generating decks with some cards included or excluded, so instead the script tries to include every possible card with every piece of information from the Genki app so you can customize it however you'd like. For example, I removed the sentence card types but moved the example sentences to the backs of other cards so I could still use them for practice.
- Some of the readings have hints in parenthesis that give away the English meaning (for example "あの (um . . .)"). Your best option is just do edit the cards as they come up during study sessions.