Differences

This shows you the differences between two versions of the page.

mind:alignment [2016/05/13 12:58]
norkish
mind:alignment [2016/05/13 13:26] (current)
norkish
Line 26: Line 26:
 Interesting that the band_noband_identity went up slightly, but only for smaller bands. This means there are some alignments where case makes a difference, but with sufficiently large context the aligner figures it out anyway. Also keep in mind that the listed times are for all 225 alignments, so the per-alignment time is on the order of 0.012 seconds. The times are averages of 10 iterations.
  
 +[{{ mind:comparebandedunbandedresults.png?1000 }}]
 +
 +The graph on the left shows how often the banded and unbanded alignment algorithms produce identical alignments. Even with a very small bandwidth (i.e., .03% of the sequence length), the alignments are identical nearly 90% of the time, and that number reaches 100% when the bandwidth is 50% of the sequence length. To me this says that what we are aligning has an obvious, almost-no-indels answer most of the time. About 10% of the lyrics contain metadata and need a somewhat larger bandwidth to reach the optimal alignment.
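
To make the comparison concrete, here is a minimal Python sketch of a banded global (Needleman-Wunsch style) aligner and of the "how often are the two alignments identical" measurement. It is not the aligner used for the graph: the names `align` and `identical_fraction`, the scoring values (match 1, mismatch -1, gap -1), and the use of an absolute band width rather than a ratio of the sequence length are all assumptions made for illustration.

```python
def align(a, b, band=None, match=1, mismatch=-1, gap=-1):
    """Global alignment restricted to cells with |i - j| <= band.

    band=None means unbanded. Returns (score, aligned_a, aligned_b).
    Scoring values are illustrative assumptions, not the wiki author's.
    """
    n, m = len(a), len(b)
    if band is None:
        band = max(n, m)
    # The band must cover the length difference, or cell (n, m) is unreachable.
    band = max(band, abs(n - m))
    neg = float("-inf")
    # Full matrix for simplicity; cells outside the band stay at -inf.
    score = [[neg] * (m + 1) for _ in range(n + 1)]
    score[0][0] = 0
    for i in range(n + 1):
        for j in range(max(0, i - band), min(m, i + band) + 1):
            if i == 0 and j == 0:
                continue
            best = neg
            if i > 0 and j > 0:
                s = match if a[i - 1] == b[j - 1] else mismatch
                best = max(best, score[i - 1][j - 1] + s)
            if i > 0:
                best = max(best, score[i - 1][j] + gap)
            if j > 0:
                best = max(best, score[i][j - 1] + gap)
            score[i][j] = best
    # Traceback, preferring diagonal, then a gap in b, then a gap in a.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + s:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
            continue
        out_a.append("-")
        out_b.append(b[j - 1])
        j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


def identical_fraction(pairs, band):
    """Fraction of (a, b) pairs whose banded and unbanded alignments match."""
    same = sum(align(a, b, band)[1:] == align(a, b)[1:] for a, b in pairs)
    return same / len(pairs)
```

Calling `identical_fraction(pairs, band)` over a list of lyric pairs for a range of band widths would give one point per width on a curve like the one described above. Because this sketch allocates the whole matrix, it only saves time, not memory.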
 +
 +In terms of time, the smaller the bandwidth, the faster the banded alignment. The unbanded alignment is, of course, unaffected by the bandwidth. The difference between the banded and the optimized banded aligners is that the banded one uses linear time but polynomial space, while the optimized one uses both linear time and linear space (p.s., this was a lot harder to implement than I expected :)). The reason the optimized aligner runs a little faster than the banded one (or even the unbanded one when the bandwidth ratio is greater than .6) has less to do with the time savings of linear space usage and more to do with the fact that I tuned the optimized aligner to frontload all possible repetitive calculations. Thus the red (and blue) lines could also be tuned to match the green line.
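
As a rough illustration of where the space savings come from, the sketch below computes only the banded alignment score while keeping two rows of band cells in memory instead of a full matrix. This is an assumption about the general idea, not the optimized aligner described above: the name `banded_score` and the scoring values are hypothetical, and the function does not recover the alignment itself, which in linear space needs extra machinery such as storing only the band cells per row or a divide-and-conquer traceback.

```python
def banded_score(a, b, band, match=1, mismatch=-1, gap=-1):
    """Score of the best global alignment within |i - j| <= band.

    Keeps two rows of 2 * band + 1 cells. Cell (i, j) lives at
    offset d = j - i + band within its row.
    """
    n, m = len(a), len(b)
    # The band must cover the length difference to reach cell (n, m).
    band = max(band, abs(n - m))
    neg = float("-inf")
    width = 2 * band + 1

    # Row 0: only j = 0..band are inside the band (offsets band..2*band).
    prev = [neg] * width
    for d in range(band, width):
        j = d - band
        if j <= m:
            prev[d] = j * gap

    for i in range(1, n + 1):
        cur = [neg] * width
        for d in range(width):
            j = i + d - band
            if j < 0 or j > m:
                continue
            best = neg
            if j > 0:
                s = match if a[i - 1] == b[j - 1] else mismatch
                # Diagonal neighbour (i-1, j-1) has the same offset d.
                best = prev[d] + s
                # Left neighbour (i, j-1) is at offset d - 1 in this row.
                if d > 0:
                    best = max(best, cur[d - 1] + gap)
            # Upper neighbour (i-1, j) is at offset d + 1 in the previous row.
            if d + 1 < width:
                best = max(best, prev[d + 1] + gap)
            cur[d] = best
        prev = cur

    return prev[m - n + band]
```

For a fixed band width, the work per row is proportional to the band rather than to the full sequence length, which is the source of the linear-time behaviour discussed above.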
 +
 +Results are averages of 10 iterations on 225 alignments of Billy Joel lyrics (no tabs).
 +
 +So my plan is to do a crude multiple sequence alignment (MSA) of all lyrics to get a "gold standard" lyric for each song (presumably with no metadata), then align the gold standard lyric to the tabs to distinguish the actual song content in the tab from the metadata (and simultaneously check how completely the tablature covers the song).
 Graphs:
-{{ :mind:comparebandedunbandedresults.png?nolink&300 |}}+
mind/alignment.1463165895.txt.gz · Last modified: 2016/05/13 12:58 by norkish
CC Attribution-Share Alike 4.0 International