# Homework Assignment #17

## Objective

To employ dynamic programming to solve the gene sequence alignment problem.

## Exercises

Show all work neatly.

### Question 1

Find the optimal alignment using dynamic programming (by hand only) of $x=$ ATGCC and $y=$ TACGCA . Use the same cost function we used in the lecture, namely: $c_{indel} = 5$; $c_{sub} = 1$; and $c_{match} = -3$.

• (a) Show your DP table.
• (b) Circle the optimal alignment cost.
• (c ) Extract an optimal alignment from the table in the form of a sequence of edit operators. Break ties among optimal alternatives by preferring a backpointer leading to the left over a packpointer leading diagonally over a packpointer leading downward. Write the edit operators (not their coordinates) in forward order from start $(0,0)$ to finish $(m,n)$. Use the following notation:
• to represent a match of an A with an A, for example, use Match(A)
• to represent an indel involving a gap on the $x$ side and an A on the $y$ side, use Indel(-, A)
• to represent a substitution of an A on the $x$ side for C on the $y$ side, use Sub(A, C)
• (d) Translate the optimal alignment from part (c ) into the two-row, side-by-side form, where each letter in the first sequence (or a “-” for a gap) is adjacent to a letter (or a “-”) in the second sequence.

### Question 2

Now find the optimal alignment using dynamic programming (by hand or in a spreadsheet) of the same sequences, ATGCC and TACGCA . Use a new cost function, namely: $c_{indel} = 1$; $c_{sub} = 2$; and $c_{match} = 0$.

• (a) Show your DP table.
• (b) Circle the optimal alignment cost.
• (c ) Extract an optimal alignment from the table in the form of a sequence of edit operators by following the same instructions specified for question 1, part (c ).
• (d) Translate the optimal alignment from part (c ) into the two-row, side-by-side form by following the same instructions specified for question 1, part (d). 