==Intro==
“Just as Van Gogh’s painting of sunflowers is a two-dimensional mixture of oil on canvas that represents vegetable matter in a three-dimensional space in Paris in the late 1880s, so 500 numbers arranged in a vector can represent a word or group of words.” --DL4J

Word2Vec can guess a word’s association with other words, or cluster documents and define them by topic. It makes qualities into quantities: similar words and ideas end up “close” to one another in its 500-dimensional vector space.

Word2Vec is not classified as "deep learning" because it is only a 2-layer neural net; "deep" means more than one hidden layer.

Input -> text corpus
Output -> set of vectors (neural word embeddings)
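A minimal sketch of this corpus-in, vectors-out pipeline. The gensim library, the toy corpus, and the parameter values are illustrative assumptions (the page itself only names DL4J and TensorFlow); parameter names follow gensim 4.x.

<code python>
# Hedged sketch: train word vectors on a tiny tokenized corpus with gensim.
from gensim.models import Word2Vec

corpus = [                                    # Input: a (toy) text corpus
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["the", "man", "walks", "to", "the", "castle"],
    ["the", "woman", "walks", "to", "the", "castle"],
]

# vector_size=500 would match the 500-dimensional example above; 100 keeps the
# toy model small. sg=1 selects the skip-gram model (see the Models section).
model = Word2Vec(corpus, vector_size=100, window=5, min_count=1, sg=1)

print(model.wv["king"])                       # Output: one embedding vector per word
print(model.wv.most_similar("king", topn=3))  # nearest words in the vector space
</code>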
===Examples===

Rome - Italy = Beijing - China, so Rome - Italy + China = Beijing

king : queen :: man : woman

house : roof :: castle : [dome, bell_tower, spire, crenellations, turrets]

China : Taiwan :: Russia : [Ukraine, Moscow, Moldova, Armenia]

knee : leg :: elbow : arm
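Such analogies are answered with vector arithmetic followed by a nearest-neighbour search. A hedged sketch using the toy gensim model trained above (on such a tiny corpus the answers are noise; meaningful results need a large corpus):

<code python>
# king - man + woman ≈ queen: add/subtract word vectors, then find the nearest word.
print(model.wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1))

# Rome - Italy + China ≈ Beijing would be queried the same way, provided those
# words occur in the training vocabulary (they do not in the toy corpus above):
# model.wv.most_similar(positive=["Rome", "China"], negative=["Italy"], topn=1)
</code>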
  
==Models==
===Continuous bag of words (CBOW) model===
*Uses the surrounding context to predict a target word.
*Several times faster to train than the skip-gram model, with slightly better accuracy for frequent words.
===Skip-gram model===
*Uses a word to predict a target context.
*Works well with a small amount of training data and represents even rare words or phrases well.
*Produces more accurate results on large datasets.
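In most implementations the choice between the two models is a single training switch. A hedged sketch, again assuming gensim and the toy corpus from the intro:

<code python>
# gensim convention: sg=0 trains CBOW (the default), sg=1 trains skip-gram.
from gensim.models import Word2Vec

cbow_model     = Word2Vec(corpus, vector_size=100, window=5, min_count=1, sg=0)
skipgram_model = Word2Vec(corpus, vector_size=100, window=5, min_count=1, sg=1)
</code>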
  
==Implementation==
Word2Vec can be implemented in DL4J and TensorFlow.
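Besides training a model from scratch in DL4J or TensorFlow, pretrained vectors can be loaded for quick experiments. The snippet below assumes the gensim downloader and the published Google News word2vec vectors, neither of which is mentioned on this page.

<code python>
# Hedged sketch: load Google's pretrained 300-dimensional word2vec vectors
# (downloads a large file on first use; the name is the gensim-data identifier).
import gensim.downloader as api

wv = api.load("word2vec-google-news-300")     # returns KeyedVectors
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
</code>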
  
==To research==
*Implementation
*Cosine similarity and how the dot product equation is used to compare word vectors
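For reference while researching: similarity between word vectors is usually scored with cosine similarity, i.e. the dot product of the two vectors divided by the product of their lengths. A self-contained numpy illustration (not specific to any library named on this page):

<code python>
# cos(theta) = (a . b) / (|a| * |b|); values near 1 mean the words are "close".
import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([0.2, -0.1, 0.7])
b = np.array([0.25, 0.0, 0.65])
print(cosine_similarity(a, b))

# With the toy gensim model above:
# cosine_similarity(model.wv["king"], model.wv["queen"])
</code>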
  
==Links==
*http://deeplearning4j.org/word2vec
  