mind:lyristtodo [2016/12/19 07:57] (current)
bayb2 created
 +=Lyrist To-Dos=
 +==Research==
 +> Look into this: http://wordnet.princeton.edu/
 +> Look for existing rhyming APIs, like RhymeZone and datamuse.com/api
 +> Do more research on Stanford CoreNLP, nltk, and Parsey McParseface
 +> Learn how to do vector arithmetic, learn what cosine distances are. Khan Academy.
 +> Research all my paper’s sources more in-depth, find further applications
 +> Stay up-to-date with NLP and NLG news, browse Google Scholar for ideas
 +> Hierarchical Neural Network Model? https://pdfs.semanticscholar.org/17f5/c7411eeeeedf25b0db99a9130aa353aee4ba.pdf
 +> Read: word2vec parameter learning explained https://arxiv.org/abs/1411.2738
 +> Read http://multithreaded.stitchfix.com/blog/2015/03/11/word-is-worth-a-thousand-vectors/
 +> Read http://hen-drik.de/msc_thesis/sci_2015_heuer_hendrik.pdf
 +> Read https://districtdatalabs.silvrback.com/modern-methods-for-sentiment-analysis
 +> LSTM (long short term memory)?
 +> Eventually learn about matrices, they seem useful too
 +> Chinese poetry generation https://arxiv.org/pdf/1604.01537v1.pdf
 +> Come up with second study idea for evaluating poem scores and people’s preferences
 +> Wevi: https://ronxin.github.io/wevi/ https://github.com/ronxin/wevi
 +> Frank Liang -> syllabification w/out phonemes
 +> Go back and study my lab from CS 236, including rules.
 +> Study and understand casting, superclasses, subclasses, and inheritance better
 +> Understand Cloneable in Java, find out whether it’s okay or deprecated
 +> Study Stanford classes I use http://www-nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/tagger/maxent/MaxentTagger.html
 +> Look up RhymeZone’s API http://www.rhymezone.com/
 +> Look at generators on this list http://www.yoyogames.com/blog/119
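
For the vector arithmetic and cosine distance item above, here is a minimal sketch in plain Java. The class and method names (`VectorMath`, `cosineSimilarity`, `cosineDistance`) are hypothetical and not part of Lyrist; word vectors are assumed to be plain `double[]` arrays of equal length.

```java
/** Hypothetical helper for basic word-vector math; not an existing Lyrist class. */
final class VectorMath {
    private VectorMath() {}

    /** Dot product: sum of the pairwise products of the components. */
    static double dot(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** Euclidean length of a vector. */
    static double norm(double[] a) {
        return Math.sqrt(dot(a, a));
    }

    /** Cosine of the angle between a and b: 1 = same direction, 0 = orthogonal. */
    static double cosineSimilarity(double[] a, double[] b) {
        double denominator = norm(a) * norm(b);
        if (denominator == 0.0) {
            throw new IllegalArgumentException("Cosine is undefined for a zero vector");
        }
        return dot(a, b) / denominator;
    }

    /** Cosine distance, defined here as 1 minus the cosine similarity. */
    static double cosineDistance(double[] a, double[] b) {
        return 1.0 - cosineSimilarity(a, b);
    }
}
```

Because the cosine only depends on direction, two vectors that point the same way have distance 0 regardless of their lengths.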
  
 +==Design==
 +> Decide on really good naming conventions,​ programming-wise,​ linguistics-wise,​ and music-wise.
 + Frequency means occurrences / total # of words
 + Should methods return result, or something more specific
 +> Decide the relationship between filters and jobs, look back at CS 340 and decide which design patterns to use
 + State pattern for operation-filter process on songs? Draw out possible diagrams.
 + Command pattern for filters and jobs?
 +> Use Paul’s classes for inspiration
 +> Decide how to syllabify
 +> Figure out exceptions, errors. ​
 + > What to do in a case where filters block all replacement words? NoReplacementException?​
 +
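
For the case above where filters block every replacement word, one possible shape is a checked exception thrown by whatever picks the replacement. This is only a sketch: `ReplacementPicker`, its `pick` method, and the use of `Predicate<String>` as a stand-in for a filter are all assumptions, not existing Lyrist code.

```java
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical checked exception: no candidate survived the filters. */
class NoReplacementException extends Exception {
    NoReplacementException(String message) {
        super(message);
    }
}

/** Hypothetical picker; a filter is modeled as a Predicate that accepts or rejects a word. */
final class ReplacementPicker {
    private ReplacementPicker() {}

    /** Returns the first candidate (in list order) that every filter accepts. */
    static String pick(String original, List<String> candidates, List<Predicate<String>> filters)
            throws NoReplacementException {
        for (String candidate : candidates) {
            boolean accepted = true;
            for (Predicate<String> filter : filters) {
                if (!filter.test(candidate)) {
                    accepted = false;
                    break;
                }
            }
            if (accepted) {
                return candidate;
            }
        }
        // Every candidate was rejected, so the caller has to decide what to do.
        throw new NoReplacementException(
                "All " + candidates.size() + " candidates for \"" + original + "\" were filtered out");
    }
}
```

A caller could catch the exception and keep the original word, which leaves the "what to do" decision at the job level rather than inside each filter.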
 +> Keep punctuation somehow
 +> Retain grammar across lines
 +> Use priority queue for Word2Vec operations?
 +>> Eventually think of a way to make currently relevant lyrics by using day-of data from the web. Bot compositions are great, but the day after election day they seemed out of it and unconnected to current human emotion.
 +> Decide whether to make an optional recursive option sort of like a bigger stanza that can hold lists of stanzas
 +> Decide whether protected or private is better when dealing with big class hierarchies
 +> Decide good enum practice, look at Paul’s examples
 +> Think of a way for Lyrist to theme certain lines and song sections differently
 +& While designing, imagine the project at a much larger scale. Does the current design make sense?
 +> Design time tracker by method
 +> Decide how to accept user parameters w/ command line args or an outside file w/ chosen settings.
 +> Decide if I should use Doc2Vec or Phrase2Vec or my own Java methods.
 +> Decide how to deal with punctuation,​ capitalization. Have lower-case mode and normal mode. Have no punctuation mode and normal punctuation mode.
 +> Decide scope and access rules, like should a PosTagger accept a song, or an arraylist of arraylists of strings?
 +> Look into Pop* lyric stuff, appropriate wherever possible
 +> Come up with a good plan for proper nouns (named entity recognition)
 +> Decide which POS tagger is best: Stanford CoreNLP, nltk, or Parsey McParseface
 +> Come up with lots of new W2v job ideas
 + Opposite finder: spatially? By analogy (White is to black as word is to opposite)?
 +> Come up with word data usage ideas
 + Occurrences / total words = frequency
 +> Come up with lots of new filter ideas
 +> Figure out whether it’s best in practice to use marked words, and if so, whether there should be multiple types of word markings.
 +> Figure out how to integrate n-grams to build actual new sentences (modify templates before replacements?​).
 +> Figure out how to use rules like my rap research to score new sentences.
 +> Figure out measures of song aesthetic like the 4 in my poem research: appropriateness,​ flamboyance,​ lyricism, relevancy.
 +> Figure out how to use comparison, analogy, metaphor, and simile like in my poem research.
 +> Figure out what interesting features could come about by Lyrist drawing content from the Internet.
 +> Figure out how to deal with word stresses and meter
 +> Think of ways to generate new themes
 +> Decide how the high-level lyric replacement jobs are decided (all mark then replace by analogy?)
 +> Ponder ways to make the program faster. Find a simple way to time every method in the program, then look at the proportions.
 +> Decide whether and where to use serialization
 +> Find out how to add a corpus to an already-existing model (Gensim or DL4J lets you do that), also how to weight it.
 +> Figure out how to optimally categorize data for filter use.
 +> Brainstorm in advance parameter sets for model training on my corpora (see above, word2vec parameter learning explained). Try a few different sets of parameters and compare the results.
 +> Decide if it’s okay to use only 1 model or if I should have multiple
 +> If I ever release any source code, come up with ideas for crazy cool comments I can put in it. Poetry, inspirational quotes, generated pieces, art w/ characters, scriptures, images, codes to decrypt, links to interesting external points. Don’t make it creepy; make it artistic and inspirational.
 +> Design a brain
 + > Design my intention base. Based on a general song sentiment, learn from a class of emotional progressions and create my own 500-dimensional new emotional progression to follow.
 + > Design a way for Lyrist to choose its own filters, its own templates, etc
 +
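
The opposite-finder-by-analogy idea above (white is to black as word is to opposite) is usually done with the common word2vec analogy arithmetic: take `black - white + word` and look for the nearest remaining word by cosine similarity. The sketch below assumes an in-memory `Map<String, double[]>` vocabulary; `AnalogyFinder` and its methods are hypothetical names, and whether the result is really an "opposite" depends entirely on the trained model.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical analogy helper over a small in-memory vocabulary. */
final class AnalogyFinder {
    private AnalogyFinder() {}

    /** "a is to b as c is to ?" gives the target vector b - a + c. */
    static double[] analogyTarget(double[] a, double[] b, double[] c) {
        if (a.length != b.length || b.length != c.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double[] target = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            target[i] = b[i] - a[i] + c[i];
        }
        return target;
    }

    /** Cosine similarity; yields NaN if either vector is all zeros. */
    static double cosine(double[] x, double[] y) {
        double dot = 0.0, normX = 0.0, normY = 0.0;
        for (int i = 0; i < x.length; i++) {
            dot += x[i] * y[i];
            normX += x[i] * x[i];
            normY += y[i] * y[i];
        }
        return dot / (Math.sqrt(normX) * Math.sqrt(normY));
    }

    /** Word whose vector is most similar to target, skipping the excluded input words. */
    static String nearest(Map<String, double[]> vocab, double[] target, String... exclude) {
        Set<String> skip = new HashSet<>(Arrays.asList(exclude));
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, double[]> entry : vocab.entrySet()) {
            if (skip.contains(entry.getKey())) {
                continue;
            }
            double score = cosine(entry.getValue(), target);
            // NaN scores never compare greater, so zero vectors are ignored.
            if (score > bestScore) {
                bestScore = score;
                best = entry.getKey();
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("No candidate words left after exclusions");
        }
        return best;
    }
}
```

Excluding the three input words matters, since otherwise one of them is often the nearest neighbor of the target vector.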
 +==Implementation==
 +> Change Sentiment to Theme. Sentiment now means something more specific.
 +> Add Google n-gram frequency filter, allowing year input and threshold input
 +> Stop variations of “to be” from being replaced
 +> Look up verbs in dictionary, then tag them with transitive, intransitive,​ or ambitransitive
 +> Make complete Stanford pipeline parse complete song before any alterations. Use these part of speech tags and named entity tags. Have templateReader read the same text and put its tagged Word objects into its own objects (Song, Stanzas, Lines).
 +>> Get Java W2v operations to read in bytes correctly
 + > Then train newer, better, word2vec models
 + >> Then Ensure my word2vec operations for analogy and sentiment are correct
 + > Then experiment with multiplication and division
 +>> use a serialized Stanford NLP pipeline object
 +>> Figure out how the C word2vec script avoids a triple nested for loop.
 +>>> Outline a complete contract for Lyrist: Where errors occur, required input, guaranteed output, etc.
 +> Simplify any object that there are thousands of instances of (W2vSuggestion,​ W2vWordSuggestion,​ Word, the Stanford tagging process)
 +>> Make an enum for each filter type, use this for FiltrationCommander instead.
 +> Change multiple spaces to tabs in IntelliJ
 +> Have list of filter enums that are the currently functional filters.
 +> Make a W2V job only return the resulting song, then have filters and other jobs run on it elsewhere
 +
 +> Recognize compound nouns, replace them with nouns or compound nouns
 +> Use coreferences to get gender right
 +>> Build point system
 + Give points to words / sentences for: correct POS, similar POS, correct NER, a good rhyme, a mediocre rhyme, sticking to a meter, sticking to a grammatical structure
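
The point system above could start as an enum of criteria with a weight each. This is a sketch only: the criterion names mirror the list above, but the point values, `ScoreCriterion`, and `PointSystem` are invented placeholders to be tuned later.

```java
import java.util.Set;

/** Hypothetical scoring criteria; the point values are placeholders, not tuned weights. */
enum ScoreCriterion {
    CORRECT_POS(3),
    SIMILAR_POS(1),
    CORRECT_NER(2),
    GOOD_RHYME(3),
    MEDIOCRE_RHYME(1),
    KEEPS_METER(2),
    KEEPS_GRAMMAR(2);

    private final int points;

    ScoreCriterion(int points) {
        this.points = points;
    }

    int points() {
        return points;
    }
}

/** Hypothetical scorer: sums the points of every criterion a word or sentence satisfies. */
final class PointSystem {
    private PointSystem() {}

    static int score(Set<ScoreCriterion> satisfied) {
        int total = 0;
        for (ScoreCriterion criterion : satisfied) {
            total += criterion.points();
        }
        return total;
    }
}
```

Passing an `EnumSet` of the satisfied criteria keeps each criterion from being counted twice for the same word.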

 +> Add ability to preserve punctuation or not
 +> Scan POS as best as possible
 + > Scan w2v suggestions within their context
 +> Use more advanced POS, all the categories Mom taught me
 +> Recognize named entities
 +> Add a good dictionary filter
 +> Scan for phonemes accurately
 +> Scan for syllables accurately
 +> Scan for stresses accurately
 +> Add a good Rhyme filter
 +> Turn off blinking cursor
 +> Separate Rhyme-complete from Lyrist’s RhymeFilter
 +
 +> Build a class or software that interfaces w/ Twitter API
 +> Build my own scraper for a giant lyrical database. Store lyrics by artist / group, genre, date written / published, structure
 +> Implement structures using superclass to make an object of a subclass like this:
 + List<String> names = new ArrayList<String>();
 +> Make a bunch of useful exceptions and organize them
 +@> Allow W2vModel to be Serializable. This may speed up the model loading time.
 +> Get rid of the Stanford logging, I think it slows me down a bit when there’s one print for 
 +> Eventually set up test classes mirroring all my current classes.
 +> Eventually make my git repositories private, especially before any publication.
 +> Change the annoying default comment when I make a new class
 +> Implement time tracker that tells the name of the method and the time it took
 +> Build functionality for the mean of a Job, and the hierarchy:
 + 1 job = x input words, 1 output word
 +> By trial and error get my software to handle large datasets like Google News.
 +> Try building huge models on the supercomputer.
 +> Test effects of different arithmetic operations on word vectors.
 + word * pi
 + word * e
 + word * constants > 1 (intensifies word? simply ruins its meaning?)
 + word * constants < 1 (weakens word? simply ruins its meaning?)
 + word * word (finds word related to both? gives unrelated/​useless result?)
 + sqrt(word)
 + log(word)
 + standard deviation(words) = range of ideas??
 + median(words)
 + average(words) = sentiment?
 +> Eventually do test classes. Check my line coverage and eliminate functions I don’t need.
 +> Eventually start writing my crazy cool text to be inserted into Lyrist as comments.
 +> Eventually do documentation for Lyrist (Javadoc? Comments?).
 +> Eventually change i++ to ++i everywhere, Dr. Rodham once said it’s always the same speed or faster.
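
A few of the word-vector arithmetic experiments listed above can be sketched as below. The class name `VectorExperiments` is hypothetical, and reading `word * word` as an element-wise product is only one possible interpretation. One thing worth knowing before testing `word * constant`: multiplying a vector by a positive constant changes its length but not its direction, so nearest neighbors found by cosine similarity stay the same.

```java
/** Hypothetical helpers for trying arithmetic on word vectors stored as double[]. */
final class VectorExperiments {
    private VectorExperiments() {}

    /** word * constant: multiplies every component by k. */
    static double[] scale(double[] v, double k) {
        double[] result = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            result[i] = v[i] * k;
        }
        return result;
    }

    /** word * word, interpreted here as the element-wise product. */
    static double[] multiply(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double[] result = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            result[i] = a[i] * b[i];
        }
        return result;
    }

    /** average(words): component-wise mean of one or more equal-length vectors. */
    static double[] average(double[]... vectors) {
        if (vectors.length == 0) {
            throw new IllegalArgumentException("Need at least one vector");
        }
        double[] result = new double[vectors[0].length];
        for (double[] v : vectors) {
            if (v.length != result.length) {
                throw new IllegalArgumentException("Vectors must have the same length");
            }
            for (int i = 0; i < v.length; i++) {
                result[i] += v[i];
            }
        }
        for (int i = 0; i < result.length; i++) {
            result[i] /= vectors.length;
        }
        return result;
    }
}
```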
mind/lyristtodo.txt · Last modified: 2016/12/19 07:57 by bayb2