=Word2Vec=

==Intro==
“Just as Van Gogh’s painting of sunflowers is a two-dimensional mixture of oil on canvas that represents vegetable matter in a three-dimensional space in Paris in the late 1880s, so 500 numbers arranged in a vector can represent a word or group of words.” --DL4J

Word2Vec can guess a word’s association with other words, or cluster documents and define them by topic. It turns qualities into quantities: similar words and ideas end up “close” to one another in its 500-dimensional vector space.

Word2Vec is not classified as "deep learning" because it is only a 2-layer neural net.

===Examples===

  Rome - Italy = Beijing - China, so Rome - Italy + China = Beijing

  king : queen :: man : woman

  house : roof :: castle : [dome, bell_tower, spire, crenellations, turrets]

  China : Taiwan :: Russia : [Ukraine, Moscow, Moldova, Armenia]
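
These analogies can be reproduced with pretrained vectors. A minimal sketch using the gensim library (gensim 4.x and its downloadable "word2vec-google-news-300" vectors are assumptions here, and fetching them requires a network connection and a large download):

<code python>
import gensim.downloader as api

# Load pretrained 300-dimensional Google News word vectors (large download).
wv = api.load("word2vec-google-news-300")

# king - man + woman ≈ queen: positive terms are added, negative subtracted.
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))

# Rome - Italy + China ≈ Beijing
print(wv.most_similar(positive=["Rome", "China"], negative=["Italy"], topn=3))
</code>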
  
==Notation==

Input -> text corpus
Output -> set of vectors (neural word embeddings)

More research: cosine similarity, dot product equation
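
Cosine similarity measures how closely two word vectors point in the same direction: the dot product of the vectors divided by the product of their lengths. A small sketch (numpy assumed):

<code python>
import numpy as np

def cosine_similarity(a, b):
    # cos(theta) = (a · b) / (|a| * |b|); 1.0 = same direction, 0.0 = unrelated
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])
print(cosine_similarity(a, b))  # 1.0, since b is a scaled copy of a
</code>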
  
  
==Models==
===Continuous bag of words (CBOW) model===
Uses a context to predict a target word. Faster to train.
===Skip-gram model===
Uses a word to predict a target context. Produces more accurate results on large datasets.
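
Both models are available in the gensim library, selected by the sg flag. A minimal training sketch (gensim 4.x assumed; the tiny corpus is made up for illustration):

<code python>
from gensim.models import Word2Vec

# Toy corpus: a real one would be millions of tokenized sentences.
sentences = [
    ["king", "rules", "the", "castle"],
    ["queen", "rules", "the", "castle"],
    ["man", "walks", "to", "the", "house"],
    ["woman", "walks", "to", "the", "house"],
]

# sg=0 selects CBOW (context -> word); sg=1 selects skip-gram (word -> context).
model = Word2Vec(sentences, vector_size=500, window=5, min_count=1, sg=1)

# The output: one 500-dimensional vector (embedding) per vocabulary word.
print(model.wv["king"].shape)  # (500,)
</code>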
  

Each word is a point in a 500-dimensional vector space.
  
A neural network with more than three layers (counting the input and output layers) qualifies as “deep” learning; that is, “deep” means more than one hidden layer.
==Links==
*http://deeplearning4j.org/word2vec
  