Show HN: I trained a neural network to write Kanji

166 points by hardmaru 6 years ago

zhte415 6 years ago

Edit: Reading some comments here - this doesn't seem to be about 'recognising' or 'predicting' existing characters, but using a dataset of characters to create a character by itself (which probably isn't an existing character).

This is quite 'clever'.

I don't understand the example shapes at the beginning. They're not correct strokes. How does that work?

The about page has some neat made up characters.

But after trying a few strokes, and doing so more carefully, it seems if you put in a clear radical, the character is well formed, kinda; if you put in a squiggle, all you get is a doodle... that makes sense.

Inputting 口 or 艹 for example, vs a random squiggle. Take care to make it reasonably accurate.

About characters, in case anyone doesn't know: A character is basically a 2x2 grid where 4 radicals get placed (there are (about) 201 radicals in modern Chinese, Japanese kanji too I guess?). Sometimes 'cells get merged' so the left column of 2 rows is merged to contain 1 radical, and the right contains 1 or 2 radicals. Or 'add a row' can happen at the top, for example adding a 艹 above the 2x2.

jfries 6 years ago

I don't think it's helpful to think of the structure as being based on a 2x2 grid. A bit down on this page (random Google hit) are ways to structure a character: http://www.guavarama.com/2015/03/07/learning-about-character...
Note that this structure can also apply recursively, further subdividing an area.
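
A minimal sketch of that recursive subdivision, added here for illustration and not taken from the linked page. It assumes the Unicode Ideographic Description Characters ⿰ (left-right) and ⿱ (top-bottom) as the layout operators and handles only those two; the function names `parse_ids` and `leaves` are made up for this example.

```python
# Illustrative only: a tiny parser for Ideographic Description Sequences (IDS).
# Only two of the Unicode layout operators are handled here (an assumption
# made to keep the sketch short).
BINARY_LAYOUTS = {
    "\u2ff0": "left-right",   # ⿰
    "\u2ff1": "top-bottom",   # ⿱
}


def parse_ids(seq, pos=0):
    """Parse a prefix-notation IDS string into a nested tuple.

    Returns (tree, next_pos). A leaf is a single component character;
    an inner node is (layout_name, first_part, second_part).
    """
    ch = seq[pos]
    if ch in BINARY_LAYOUTS:
        # The operator is followed by its two parts, each of which may
        # itself start with another operator (the recursive case).
        first, pos = parse_ids(seq, pos + 1)
        second, pos = parse_ids(seq, pos)
        return (BINARY_LAYOUTS[ch], first, second), pos
    return ch, pos + 1


def leaves(tree):
    """Flatten the tree into its component characters, in reading order."""
    if isinstance(tree, str):
        return [tree]
    _, first, second = tree
    return leaves(first) + leaves(second)


# 語 splits left-right into 言 and 吾; 吾 in turn splits top-bottom into 五 and 口.
tree, _ = parse_ids("⿰言⿱五口")
print(tree)
print(leaves(tree))
```

The right-hand area of 語 is subdivided again, which a fixed 2x2 grid cannot express directly.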
- zhte415 6 years ago
  
  I think it is quite useful.
  A character is a combination of radicals, of which there aren't that many.
  You're correct, obviously there is more to it than sticking a radical in a 2x2 grid; however, the square framework is always there and emphasises symmetry.
Pamar 6 years ago

"a character is basically a 2x2 grid"... Do you have any reference for this type of classification/composition?
N.B.: I am not implying that it is incorrect - I am a dabbler in Japanese Brush Calligraphy (without any fluency in the language, only as an art) and I would like to read more about this so if you have links or books (preferably in English) I would be very happy to learn more.
- jcl 6 years ago
  
  The practice paper given to children learning kanji is commonly partitioned into a 2x2 grid, to help with proportions -- kind of like how children learning to write English are given paper with guidelines for baseline, ascenders, descenders, and mean line. You can see an example of the 2x2 grid in the Nintendo kanji game on the article's "Info" page, or by searching for "kanji practice paper".
  The lines are meant more to help relative placement than to exactly divide characters, but many characters do divide into left-and-right or top-and-bottom portions -- see Jack Halpern's SKIP system for example:
  https://en.wikipedia.org/wiki/Kodansha_Kanji_Learner%27s_Dic...
- zhte415 6 years ago
  
  Characters are written, not drawn. First learn to write.

peterburkimsher 6 years ago

I just finished making a Chinese traditional characters dataset, with 15 million items, 52,000 characters.

It's "only" 9.3 GB compressed; 13.47 GB uncompressed.

https://drive.google.com/open?id=18I-5wU54CG1lty0udBOpptAG49...

Soon I'll write an article explaining how I made it, and then try experimenting with TensorFlow.

sova 6 years ago

Fake* Kanji! Some of them had me questioning my knowledge.

hardmaru 6 years ago

This is related to some work I did a few years ago; I recently had time to retrain the models to make it work inside the browser in an interactive setting.
The dataset used to train the network had to be refined a bit as well to match how humans write on a tablet.
Some previous discussion from a while back about the original non-interactive TensorFlow version:
https://news.ycombinator.com/item?id=10801712
- jason_slack 6 years ago
  
  I was trying something like this for Chinese characters to help my learning. Could I contact you to ask a few questions?
  
  hardmaru 6 years ago
  
  Sure, Jason.
  Btw this is the dataset used for training. It is a part of a larger open source project used for educational purposes, that might help you learn:
  http://kanjivg.tagaini.net
  
  peterburkimsher 6 years ago
  
  Hi Jason! Please get in touch with me too; I have a lot of Chinese data to share and discuss.
  I made https://pingtype.github.io - a program to break up sentences into words, pinyin and parallel translation, and typing characters by breaking them into glyphs.
  I also just finished making a large dataset of glyph images of 52,000 characters from 1200 fonts - see my other comment for the download link.
  
  jason_slack 6 years ago
  
  Thanks for reaching out! I will reply. I have been learning for years and I love it. I want others to love it as well.

SuperNinKenDo 6 years ago

Did not work well at all for me, and now I'm left wondering if it's the neural network's fault, or if I'm just very bad at writing kanji now.

kazinator 6 years ago

Hit or miss for me. It doesn't want to fill in anything if I draw the enclosure of "wind": 風. It also fails to find some of the simplest kanji; completions for just a stroke or two are quite complex. E.g. a simple downstroke could suggest mouth 口, which needs just two more strokes, or 土 and whatnot, but .. 出てきてくれない ("they just won't come up"). :) It is also stumped if I draw the first two strokes of 山; it wasn't trained on this character, so it can't complete that middle stroke, and it basically doesn't want to do anything else, either.
Some out-of-order inference would be cool. E.g. draw the bottom four dots (fire) of 煎, and have various top parts emerge. For that purpose, it would be good if there were a reference frame. That is to say: an underlying square box to serve as a target for the supplied input. If you draw something near the bottom of the empty box, then it's understood by the neural network to be a bottom part of the kanji requiring a top. I think the whole concept could really benefit from a precise agreement between the user and the neural net about the bounding box.
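
A minimal sketch of the reference-frame idea, added for illustration. It is not how the demo works; the 200x200 canvas, the one-third thresholds, and all function names are assumptions. It contrasts scaling strokes by a fixed box, which keeps their position, with scaling by the drawing's own bounding box, which throws the position away.

```python
def normalize_to_frame(strokes, frame_size):
    """Scale points by a fixed square canvas so position inside the box is kept."""
    return [[(x / frame_size, y / frame_size) for x, y in s] for s in strokes]


def normalize_to_own_bbox(strokes):
    """Scale points by the drawing's own bounding box, discarding where it sat."""
    xs = [x for s in strokes for x, _ in s]
    ys = [y for s in strokes for _, y in s]
    min_x, min_y = min(xs), min(ys)
    # Use the larger side so the aspect ratio is preserved; avoid dividing by 0.
    span = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    return [[((x - min_x) / span, (y - min_y) / span) for x, y in s] for s in strokes]


def vertical_region(normalized_strokes):
    """Guess which part of the box the input occupies from its mean y.

    y grows downward, as on a typical drawing canvas.
    """
    ys = [y for s in normalized_strokes for _, y in s]
    mean_y = sum(ys) / len(ys)
    if mean_y > 2 / 3:
        return "bottom"
    if mean_y < 1 / 3:
        return "top"
    return "middle"


# Four short dots drawn near the bottom of a hypothetical 200x200 canvas.
dots = [[(30 + 40 * i, 170), (35 + 40 * i, 185)] for i in range(4)]

print(vertical_region(normalize_to_frame(dots, 200)))
print(vertical_region(normalize_to_own_bbox(dots)))
```

With the fixed frame the dots are recognised as sitting at the bottom of the box; after bounding-box normalisation the same dots look like a top component, which is the ambiguity an agreed bounding box would remove.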
prewett 6 years ago

I don't think it's your kanji writing; I can't figure out what it's supposed to do. I'm assuming it's supposed to autocomplete my kanji, but it failed at that for all of 国, 本, and 三. I'm pretty confident I wrote them right, especially the last one :)
It seems to just write random kanji based on the last stroke or something. But, writing random kanji has a certain coolness factor. Especially if I could copy-paste a kanji and get it to write it for me so that I know what the stroke order is supposed to be.

mbeissinger 6 years ago

Great work as always from hardmaru :)

sbierwagen 6 years ago

Neat, it's like if A Book From The Sky was implemented as a neural network.

https://en.wikipedia.org/wiki/A_Book_from_the_Sky

SerLava 6 years ago

This is great! It gives off the feeling of a bizarre kind of intelligence.

Can you please input things like 感覚的 and 意識 and 私は部屋 and 自殺 and 何私 to freak some people out?

peterburkimsher 6 years ago

Please add the date to the title: (2015)

wei_jok 6 years ago

I think this demo was created recently, and the old article linked in the demo is there only for background info, as the author explained in one of the comments below.
TensorFlow.js only came out this year and the interactive sketch-rnn JavaScript browser demo that this was based off of is also quite recent.

vat 6 years ago

this website uses non-free javascript

hardmaru 6 years ago

Interesting
I think the reason you believe the JS code is obfuscated is the part of the code that contains the “weights” of the neural network: 4-5 million floating point numbers for an LSTM recurrent neural network.
In fact I trained the neural network using the open source version of Sketch-RNN and encoded the weights using base64 to save you some bandwidth (https://github.com/tensorflow/magenta/tree/master/magenta/mo...)
Welcome to “Software 2.0”, I guess!
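
A minimal sketch of why base64-encoded weights look like obfuscated code, added for illustration. This is not the demo's actual encoding: the little-endian float32 layout and the names `encode_weights` and `decode_weights` are assumptions.

```python
import base64
import struct


def encode_weights(weights):
    """Pack floats as little-endian float32 and return a base64 ASCII string."""
    raw = struct.pack(f"<{len(weights)}f", *weights)
    return base64.b64encode(raw).decode("ascii")


def decode_weights(blob):
    """Reverse encode_weights: base64 text back to a list of floats."""
    raw = base64.b64decode(blob)
    count = len(raw) // 4  # 4 bytes per float32
    return list(struct.unpack(f"<{count}f", raw))


# Values chosen to be exactly representable in float32, so the round trip is exact.
weights = [0.5, -1.25, 3.0, 0.0]
blob = encode_weights(weights)
print(blob)
print(decode_weights(blob))
```

The encoded string is opaque to a human reader, but it is data rather than hidden program logic: decoding it yields the original numbers.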
SuperNinKenDo 6 years ago

Thanks for bringing this to my attention, although a link to the Stallman article which someone linked below would have been helpful. I've installed LibreJS, I'm curious to see how many things stop working now.
gitgud 6 years ago

It's a software firm that does machine learning projects...
Aren't they allowed to obfuscate their IP?
baylearn 6 years ago

How so?
- mappu 6 years ago
  
  Even in the FLOSS world this is a somewhat extremist position - see https://www.gnu.org/philosophy/javascript-trap.en.html
  
  SuperNinKenDo 6 years ago
  
  Interesting, I'm inclined to support some way of removing code obfuscation. I wonder if a neural network could be trained to do that.
  
  mappu 6 years ago
  
  It can be done reasonably well without a neural network (look at the '{}' button in Chrome, or Hex-rays for C/C++).
  There is some work in applying an NN to make the result look more like "realistic" code samples: http://www.cs.unm.edu/~eschulte/data/katz-saner-2018-preprin...