Differences

This shows you the differences between two versions of the page.

mind:down-by-the-bay [2017/06/29 22:31]
norkish
mind:down-by-the-bay [2017/07/21 07:17] (current)
norkish
Line 57:
  
 ===To Do===
-* Finish draft of paper - Paul
+* Figure out whether parallelization can be improved somehow (is there synchronization hiding there somewhere?)
+* Figure out a good threshold for the memory cap for FSL
+* Try running on a smaller node on FSL
+* Improve rhyme function
+* Add grammar constraints
+* Weaken stress constraints?
 * Genetic algorithm to find optimal weights for rhyme function using linguistic attributes - Ben
-** Get RhymeZone API working 
 * Get some good examples from our system
-* Try haiku, limerick (either abstract or specify rhymes ahead of time), kids book 
-** Roses are red, Violets are blue (use half-generated examples) 
-* Try Hirjee with alignment to improve rhyming - Paul 
 * Don't split on all punctuation; make sure that syllables are generalizable beyond punctuation
 * Adjust constraints:
Line 71 → 72:
 ** constraints typical of noun phrases
 * Try the NOW corpus
-* Maybe aim for a simple constraint set initially for proof of concept, then discuss how to allow multiple possibilities without actually implementing them
-* Set up twitterbot - Ben 
-* Filter language - Paul 
 ===Future To Do===
 * Find a way to assign stress for multi-syllable G2P pronunciations
 * Allow syllable tokens to have multiple POSes for contraction words and combine Stanford POS tokens for contraction words
 ===Done===
 +* Finish draft of paper - Paul
 +** Get RhymeZone API working - Ben
 +* Try haiku, limerick (either abstract or specify rhymes ahead of time), kids book
 +** Roses are red, Violets are blue (use half-generated examples)
 +* Try Hirjee with alignment to improve rhyming - Paul
 +* Maybe aim for a simple constraint set initially for proof of concept, then discuss how to allow multiple possibilities without actually implementing them
 +* Set up twitterbot - Ben
 * Normalize probabilities for sentences with multiple pronunciations
 * Discovered that I couldn't fully normalize probabilities correctly without framing the sequence in terms of priors and transitions (vs. just transitions), because how do you normalize initial probabilities without a prior? - Paul
Line 100 → 105:
 * Add constraints in main class - Paul
  
-==Examples==
-Training on 5000 sentences, without including multiple pronunciations per sentence, we found one satisfying solution for one rhythmic template:
-<pre>
-Now training on /Users/norkish/Archive/2017_BYU/ComputationalCreativity/data/COCA Text DB/text_fiction_awq/w_fic_1990.txt
-Trained on 5000 sentences
-For Rhythmic Template: [0, 1, -1, -1, -1, 1, 0, -1, 0, 1, -1]
-Creating 4-order NHMM of length 6 with constraints:
- At position 0:
- constraintStress:0
- POSes:[DT, JJ, NN]
- Must be first syllable in a word
- At position 1:
- constraintStress:1
- POSes:[NNS, NNPS, NNP, NN]
- Must be last syllable in a word
- At position 2:
- constraintStress:1
- At position 3:
- constraintStress:0
- Must be last syllable in a word
- At position 4:
- constraintStress:0
- Current or one of previous 2 syllables must be in [IN, VBG, NN]
- At position 5:
- constraintStress:1
- POSes:[RB, NNS, NNPS, NNP, JJ, NN, VBG]
- Rhyme with syllable 4 previous
- Must be last syllable in a word
-.....
-Have you ever seen iced cakes inside The Bake down by the bay?
-</pre>
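The per-position constraints in the log above amount to a stress requirement, an optional POS whitelist, and word-boundary flags. A minimal sketch of such a constraint check in Python (the `Syllable` and `PositionConstraint` names and fields are my own for illustration, not from the actual system):

```python
from dataclasses import dataclass
from typing import Optional, Set

@dataclass
class Syllable:
    stress: int        # 0 = unstressed, 1 = stressed
    pos: str           # Penn Treebank POS tag of the parent word
    word_first: bool   # first syllable in its word
    word_last: bool    # last syllable in its word

@dataclass
class PositionConstraint:
    stress: Optional[int] = None        # None = unconstrained
    poses: Optional[Set[str]] = None    # None = any POS allowed
    must_be_word_first: bool = False
    must_be_word_last: bool = False

    def satisfied_by(self, syl: Syllable) -> bool:
        if self.stress is not None and syl.stress != self.stress:
            return False
        if self.poses is not None and syl.pos not in self.poses:
            return False
        if self.must_be_word_first and not syl.word_first:
            return False
        if self.must_be_word_last and not syl.word_last:
            return False
        return True

# Position 0 from the log: unstressed, POS in [DT, JJ, NN], word-initial
c0 = PositionConstraint(stress=0, poses={"DT", "JJ", "NN"}, must_be_word_first=True)
```

Position 2 of the log, which only constrains stress, would just be `PositionConstraint(stress=1)`.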
- 
-The reason I declared this the first real example is that there were several bugs in which positions we were assigning constraints to; this is the first time that the solution accurately reflected all of the constraints we had intuited needed satisfying.
- 
-I've just now implemented training on multiple pronunciations of a given training sentence. This blows up training considerably: to give you an idea, for 5 sentences there were 43 sentence pronunciations; for 5000 sentences, 678400 sentence pronunciations. But it does find a few more possibilities.
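The blow-up is multiplicative: a sentence's pronunciation count is the product of its words' pronunciation-variant counts. A small illustration in Python (the words and variant counts below are invented, not taken from the training data):

```python
# Hypothetical number of dictionary pronunciations per word,
# e.g. from a G2P dictionary with stress alternants
variant_counts = {"ever": 2, "seen": 1, "record": 2}

def count_sentence_pronunciations(words, variant_counts):
    """Each word's variants combine independently, so counts multiply."""
    total = 1
    for w in words:
        total *= variant_counts[w]
    return total

n = count_sentence_pronunciations(["ever", "seen", "record"], variant_counts)
# 2 * 1 * 2 = 4 sentence pronunciations
```

This is why 5000 sentences can yield hundreds of thousands of sentence pronunciations even when most words have only one or two variants.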
- 
-<pre>
-Now training on /Users/norkish/Archive/2017_BYU/ComputationalCreativity/data/COCA Text DB/text_fiction_awq/w_fic_1990.txt
-Trained on 5000 sentences, 678400 sentence pronunciations
-For Rhythmic Template: [0, 1, -1, -1, -1, 1, 0, -1, 0, 1, -1]
-Creating 4-order NHMM of length 6 with constraints:
- At position 0:
- constraintStress:0
- POSes:[DT, JJ, NN]
- Must be first syllable in a word
- At position 1:
- constraintStress:1
- POSes:[NNS, NNPS, NNP, NN]
- Must be last syllable in a word
- At position 2:
- constraintStress:1
- At position 3:
- constraintStress:0
- Must be last syllable in a word
- At position 4:
- constraintStress:0
- Current or one of previous 2 syllables must be in [IN, VBG, NN]
- At position 5:
- constraintStress:1
- POSes:[RB, NNS, NNPS, NNP, JJ, NN, VBG]
- Rhyme with syllable 4 previous
- Must be last syllable in a word
-.....
- Have you ever seen The top was the lookout down by the bay?
- Have you ever seen The mote in her gemstone down by the bay?
- Have you ever seen iced cakes inside The Bake down by the bay?
-</pre>
- 
-Many of these variations are on syllable stresses, which adds a lot of states and, of course, makes the model take much longer to build. I eliminated a pronunciation if it had the exact same phonemes as another pronunciation with all stresses greater than or equal to the other's. This cut down the number of pronunciations per sentence, such that for 5 sentences there are now only 18 sentence pronunciations. It saves a lot of time without hurting our model.
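The pruning described above is a pointwise-dominance filter: among pronunciations with identical phoneme sequences, drop any whose stress vector is everywhere greater than or equal to another's. A hedged sketch (the function name and data layout are my own; each pronunciation is a list of (phoneme, stress) pairs):

```python
def prune_dominated(prons):
    """Drop a pronunciation if another one has the same phoneme sequence
    and pointwise lower-or-equal stresses. Ties are broken by list order,
    so exact duplicates keep exactly one copy."""
    kept = []
    for i, p in enumerate(prons):
        dominated = False
        for j, q in enumerate(prons):
            if i == j:
                continue
            if [ph for ph, _ in q] != [ph for ph, _ in p]:
                continue  # different phonemes: not comparable
            if all(sq <= sp for (_, sp), (_, sq) in zip(p, q)):
                # q dominates p, unless stresses are identical and p comes first
                if [s for _, s in q] != [s for _, s in p] or j < i:
                    dominated = True
                    break
        if not dominated:
            kept.append(p)
    return kept

# "bay" with an all-stressed and a lighter variant: the all-stressed one is dominated
pruned = prune_dominated([[("B", 1), ("EY", 1)], [("B", 0), ("EY", 1)]])
```

Keeping the lower-stress variant matches the intent above: the eliminated pronunciation is the one whose stresses are all at least as high as a surviving one's.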
- 
-As anticipated, our model can pretty much only regurgitate exact matches from the training data to satisfy the constraints, and there are relatively few, so I think doing the abstract model may be a must for this paper.
mind/down-by-the-bay.txt · Last modified: 2017/07/21 07:17 by norkish