This post was co-written by Soumith Chintala and Wojciech Zaremba of Facebook AI Research.
Language is the medium of human communication. Giving machines the ability to learn and understand language enables products and possibilities that are not imaginable today.
One can understand language at varying granularities. When you learn a new language, you start with words: understanding their meaning, identifying similar and dissimilar words, and developing a sense of contextual appropriateness of a word. You start with a small dictionary of words, building up your dictionary over time, mentally mapping each newly learned word close to similar words in your dictionary. Once you get familiar with your dictionary of words, you put them together into small sentences, learning grammar and structure. You eventually combine sentences in a sensible way, to write paragraphs and pages. Once you get to this stage, you are comfortable with expressing complicated thoughts in language, letting others understand your thoughts and expression.
As an example, language understanding gives one the ability to understand that the sentences “I’m on my way home.” and “I’m driving back home.” both convey that the speaker is going home.
Word Maps and Language Models
For a machine to understand language, it first has to develop a mental map of words, their meanings and interactions with other words. It needs to build a dictionary of words, and understand where they stand semantically and contextually, compared to other words in their dictionary. To achieve this, each word is mapped to a set of numbers in a high-dimensional space, which are called “word embeddings”. Similar words are close to each other in this number space, and dissimilar words are far apart. Some word embeddings encode mathematical properties such as addition and subtraction (For some examples, see Table 1).
Word embeddings can either be learned in a general-purpose fashion before-hand by reading large amounts of text (like Wikipedia), or specially learned for a particular task (like sentiment analysis). We go into a little more detail on learning word embeddings in a later section.
|Paris – France + Italy||Rome|
|bigger – big + cold||colder|
|sushi – Japan + Germany||bratwurst|
|Cu – copper + gold||Au|
|Windows – Microsoft + Google||Android|
|Montral Canadiens – Montreal + Toronto||Toronto Maple Leafs|
After the machine has learned word embeddings, the next problem to tackle is the ability to string words together appropriately in small, grammatically correct sentences which make sense. This is called language modeling. Language modeling is one part of quantifying how well the machine understands language. Continue reading