=Word2Vec=
  
==Intro==
“Just as Van Gogh’s painting of sunflowers is a two-dimensional mixture of oil on canvas that represents vegetable matter in a three-dimensional space in Paris in the late 1880s, so 500 numbers arranged in a vector can represent a word or group of words.” --DL4J
  
Word2Vec can guess a word’s association with other words, or cluster documents and define them by topic. It turns qualities into quantities, and similar things and ideas end up “close” together in its 500-dimensional vector space.
  
Word2Vec is not classified as "deep learning" because it is only a 2-layer neural net.
  
Input -> text corpus
Output -> set of vectors, or neural word embeddings
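
A minimal sketch of that corpus-in, vectors-out pipeline, assuming the Python gensim library (one implementation among several, not mentioned above; the toy corpus and parameter values are illustrative only):

<code python>
from gensim.models import Word2Vec

# A made-up toy corpus: a list of tokenized sentences.
corpus = [
    ["king", "rules", "the", "kingdom"],
    ["queen", "rules", "the", "kingdom"],
    ["man", "walks", "the", "dog"],
    ["woman", "walks", "the", "dog"],
]

# Input: text corpus. Output: one vector per word.
model = Word2Vec(
    sentences=corpus,
    vector_size=100,  # dimensionality of the embeddings (500 in the quote above)
    window=5,         # context window size
    min_count=1,      # keep every word, even singletons, in this tiny corpus
)

print(model.wv["king"].shape)  # (100,) -- the neural word embedding for "king"
</code>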
  
===Examples===
  
  Rome - Italy = Beijing - China, so Rome - Italy + China = Beijing

  king : queen :: man : woman

  house : roof :: castle : [dome, bell_tower, spire, crenellations, turrets]

  China : Taiwan :: Russia : [Ukraine, Moscow, Moldova, Armenia]
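
Queries like these are answered by vector arithmetic followed by a nearest-neighbor search over the embeddings. A hedged sketch continuing the gensim example above (results on a corpus this small are not meaningful):

<code python>
from gensim.models import Word2Vec

corpus = [
    ["king", "rules", "the", "kingdom"],
    ["queen", "rules", "the", "kingdom"],
    ["man", "walks", "the", "dog"],
    ["woman", "walks", "the", "dog"],
]
model = Word2Vec(sentences=corpus, vector_size=100, window=5, min_count=1)

# king - man + woman ~= queen: most_similar adds the "positive" vectors,
# subtracts the "negative" ones, and ranks words by cosine similarity.
print(model.wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
</code>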

==Notation==
===Algebraic notation===
-Using context to predict a target word. Faster. +
-===Skip-gram model===  +
-Using a word to predict a target context. Produces more accurate results on large datasets.+
  
  knee - leg = elbow - arm
  
===English logic===
  
  knee is to leg as elbow is to arm
  
===Logical analogy notation===
  
  knee : leg :: elbow : arm
  
==Models==
===Continuous bag of words (CBOW) model===
*Uses the context to predict a target word.
*Several times faster to train than skip-gram, with slightly better accuracy for frequent words.
  
===Skip-gram model===
*Uses a word to predict a target context.
*Works well with small amounts of training data and represents even rare words or phrases well.
*Produces more accurate results on large datasets (see the model-selection sketch below).
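
A sketch of choosing between the two models, again assuming the gensim library (DL4J and TensorFlow expose the same choice through their own options):

<code python>
from gensim.models import Word2Vec

corpus = [["king", "rules"], ["queen", "rules"], ["man", "walks"], ["woman", "walks"]]

# sg=0 selects CBOW (gensim's default); sg=1 selects skip-gram.
cbow = Word2Vec(sentences=corpus, vector_size=100, window=2, min_count=1, sg=0)
skipgram = Word2Vec(sentences=corpus, vector_size=100, window=2, min_count=1, sg=1)
</code>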
  
==Implementation==
Word2Vec can be implemented in DL4J or TensorFlow.
  
==To research==
*Implementation
*Cosine similarity, dot product equation usage (see the sketch below)
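
For the cosine-similarity item: cosine similarity is the dot product of two vectors divided by the product of their lengths, so it measures the angle between word vectors rather than their magnitudes. A minimal sketch in plain numpy:

<code python>
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # cos(theta) = (a . b) / (|a| * |b|), ranging from -1 to 1;
    # 1 means the vectors point the same way, i.e. the words are "close".
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(np.array([1.0, 0.0]), np.array([1.0, 1.0])))  # ~0.7071
</code>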
  
==Links==
*http://deeplearning4j.org/word2vec
  