How to Automatically Generate Textual Descriptions for Photographs with Deep Learning

Jason Brownlee, Machine Learning Mastery, Nov 16, 2017
Commentary by Stephen Downes

OK, you're not actually going to learn how to do this simply by reading the article, but you will learn how it's done and, more importantly, that it can be done. The task breaks down into three parts: classifying images (do you see a cat, a rabbit?), describing images (providing a natural language summary of the content), and annotating images (generating text descriptions for specific parts of the image). So basically we're associating object recognition with language strings (in English, in French, whatever). Going further, neural networks can act as feature extractors, which map images to "an internal representation of the image, not something directly intelligible." Language generation algorithms, encoder-decoder architectures, and an attention mechanism round out the picture. It's pretty interesting.
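
To make the attention idea a bit more concrete, here is a minimal sketch in plain Python. It is not the article's code, and it makes some loud assumptions: the image has already been turned into a few small "region" feature vectors by some feature extractor, the decoder has some current state vector, and the scoring is a simple dot product (one common choice among several). The names `attend`, `region_features`, and `decoder_state` are hypothetical, chosen for illustration only.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(region_features, decoder_state):
    """Weight each image region by how well it matches the decoder state.

    Assumption: dot-product scoring. Real captioning models often use a
    small learned network here instead.
    """
    scores = [
        sum(f * h for f, h in zip(region, decoder_state))
        for region in region_features
    ]
    weights = softmax(scores)
    dim = len(decoder_state)
    # The context vector is the weighted average of the region features.
    context = [
        sum(w * region[i] for w, region in zip(weights, region_features))
        for i in range(dim)
    ]
    return weights, context


# Toy data: three image regions, each a 2-dimensional feature vector.
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
state = [2.0, 0.0]  # hypothetical decoder state while generating a word

weights, context = attend(regions, state)
print(weights)
print(context)
```

In this toy case the first region lines up best with the decoder state, so it gets the largest weight. In a real system the language generator would use the resulting context vector when choosing the next word of the description.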
