NHK Web Easy Reader with direct dictionary content retrieval

As a Japanese language learner, I welcome any tool that makes this difficult journey easier. I do not want to debate the learning process, so I will just say that, as with any other language, reading is very useful as proof of understanding both grammar and vocabulary. For this reason, every day I have to say “thank you” to the people at NHK Web Easy, who upload news in very simple Japanese (the grammar is very simple; the vocabulary is around “medium level”).

When I go to the university, I always read as many news articles as I can on the metro, but very often I have to switch to my dictionary because I do not know certain words. This makes reading more difficult because, during the “long” process of memorizing the word (reading + writing), minimizing the browser, opening the dictionary, writing the word and understanding it, I usually forget what I was reading. And it is not a matter of memory. When you are reading in a language you are not good at, it is extremely difficult to keep track of everything, especially in Japanese, where the grammar is completely different from that of any European language (even more different than Finnish).

Therefore, I decided to make a tool that avoids all of those steps (except for the “understanding” part, of course). This tool allows me to read much faster and makes reading far more pleasant. I called it “NHK Reader” and, in a few words, it is a language parsing tool.

Steps:

  1. Takes the text from the news
  2. Uses jisho.org to separate the words (POS tagging)
  3. Uses jisho.org to get the meanings of the words
  4. Pastes that into the webpage (using jQuery)

POS tagging is not easy, so the tool does not detect all words correctly and, sadly, I cannot do anything about that, since I take the output from jisho.org.
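To make the four steps more concrete, here is a minimal sketch in Python. It is not the code of NHK Reader: the real tool gets both the word boundaries and the meanings from jisho.org, while this sketch assumes a tiny hard-coded dictionary (`TOY_DICTIONARY`) and swaps in a naive greedy longest-match segmenter instead of real POS tagging. All names (`segment`, `annotate`, `to_html`) are hypothetical and only serve the illustration.

```python
import html
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for the jisho.org lookups (steps 2 and 3).
TOY_DICTIONARY: Dict[str, str] = {
    "猫": "cat",
    "魚": "fish",
    "食べる": "to eat",
}


def segment(text: str, vocabulary: Dict[str, str]) -> List[str]:
    """Split text with greedy longest-match; unknown characters become
    single-character tokens. A crude substitute for real POS tagging."""
    longest = max((len(word) for word in vocabulary), default=1)
    tokens: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, falling back to one character.
        for size in range(min(longest, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in vocabulary or size == 1:
                tokens.append(candidate)
                i += size
                break
    return tokens


def annotate(text: str, vocabulary: Dict[str, str]) -> List[Tuple[str, Optional[str]]]:
    """Pair every token with its meaning, or None when it is unknown."""
    return [(token, vocabulary.get(token)) for token in segment(text, vocabulary)]


def to_html(pairs: List[Tuple[str, Optional[str]]]) -> str:
    """Render known words as <span> elements carrying the meaning as a
    tooltip (step 4 would paste something like this into the page)."""
    parts: List[str] = []
    for word, meaning in pairs:
        if meaning is None:
            parts.append(html.escape(word))
        else:
            parts.append('<span title="{}">{}</span>'.format(
                html.escape(meaning, quote=True), html.escape(word)))
    return "".join(parts)


if __name__ == "__main__":
    print(to_html(annotate("猫が魚を食べる", TOY_DICTIONARY)))
```

Like the real tool, this sketch can only annotate words it manages to detect: anything missing from the dictionary is passed through without a meaning.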

A couple of screenshots:

[Screenshot: GHJWAXNq] [Screenshot: YYX98bS3]

This tool parses the HTML of jisho.org and NHK directly, so the regular expressions are hardcoded and it might fail if the owners decide to change their HTML, but that should not be difficult to fix.
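The following sketch illustrates why hardcoded regular expressions are fragile. The markup in `SAMPLE_PAGE` is made up for the example and is not the actual HTML of NHK or jisho.org; `PARAGRAPH_RE` and `extract_paragraphs` are likewise hypothetical names, not taken from the tool.

```python
import re
from typing import List

# Made-up markup, only for illustration.
SAMPLE_PAGE = '<div id="article"><p>猫が魚を食べる。</p><p>今日は暑い。</p></div>'

# Hardcoded pattern: matches only a bare <p> tag without attributes.
PARAGRAPH_RE = re.compile(r"<p>(.*?)</p>", re.DOTALL)


def extract_paragraphs(page: str) -> List[str]:
    """Return the text of every paragraph matched by the hardcoded pattern."""
    return [match.strip() for match in PARAGRAPH_RE.findall(page)]


if __name__ == "__main__":
    print(extract_paragraphs(SAMPLE_PAGE))
    # A small markup change, such as adding a class, makes the pattern miss everything.
    changed = SAMPLE_PAGE.replace("<p>", '<p class="body">')
    print(extract_paragraphs(changed))
```

When the site changes its markup this way, the fix is usually a one-line update to the pattern, which is why such breakage should not be hard to repair.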

The code will be available on my GitHub when version 1.0 is ready.
