Automatic Sequence Annotation on Genome Compiler
Sequence annotation is one of the most important steps of a genetic project. As a project progresses, it is crucial to be able to identify regions of interest in a sequence in order to fully understand the function of the sequence. Thus, molecular biologists must create quality annotations when working with a DNA sequence.
Originally, this process was performed by hand on paper. Even with DNA design going digital, many people still annotate the traditional way, manually adding notes and markers to a sequence. This requires you to search through the sequence yourself to find the specific region you are looking for and then add the annotation. While manual annotation is very thorough, it is also laborious and tedious, especially for large projects. It doesn't help that most molecular biology software tools have poor user interfaces, which further complicates the process.
You can avoid this hassle by using the auto annotation feature on Genome Compiler. Genome Compiler uses the BLASTN (nucleotide BLAST) algorithm to search repositories and databases of parts for pre-existing annotations that match regions of the current sequence. In this way it automatically identifies coding sequences, RBS sequences, promoters, restriction sites, and other useful parts in your sequence. The databases searched include the auto annotation library of coding sequences, promoters, and other parts that comes pre-loaded on Genome Compiler, as well as the user’s individual annotation library of uploaded parts. Before running the blast, users can adjust the stringency of the search to pick up annotations that are less than a 100% match. The auto annotation process still leaves room for customization, as users can choose which annotations to designate as main layer and which as sub-layer.
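To make the idea of search stringency concrete, here is a minimal Python sketch. It is not Genome Compiler's implementation and it is not BLASTN: it swaps in a naive, ungapped sliding-window comparison that checks each library part against both strands and keeps any window whose percent identity meets a threshold. The names `auto_annotate`, `Hit`, and `min_identity`, and the toy part in the usage note, are assumptions made for illustration only.

```python
from dataclasses import dataclass

# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two equal-length strings agree."""
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


@dataclass
class Hit:
    name: str        # name of the library part that matched
    start: int       # 1-based start coordinate in the sequence
    end: int         # 1-based inclusive end coordinate
    strand: str      # "+" or "-"
    identity: float  # percent identity between part and sequence window


def auto_annotate(sequence: str, library: dict, min_identity: float = 100.0) -> list:
    """Naive stand-in for a BLAST search: ungapped scan of a linear sequence.

    `library` maps hypothetical part names to their DNA sequences.
    Lowering `min_identity` relaxes the stringency of the search.
    """
    sequence = sequence.upper()
    hits = []
    for name, part in library.items():
        part = part.upper()
        if not part:
            continue  # skip empty parts to avoid dividing by zero
        for strand, query in (("+", part), ("-", reverse_complement(part))):
            for i in range(len(sequence) - len(query) + 1):
                window = sequence[i:i + len(query)]
                identity = percent_identity(window, query)
                if identity >= min_identity:
                    hits.append(Hit(name, i + 1, i + len(query), strand, identity))
    return hits
```

As a usage example, calling `auto_annotate("TTTTGATTAGCATTTT", {"toy_part": "GATTACCA"})` returns no hits at the default 100% threshold, while passing `min_identity=85.0` picks up the single-mismatch copy at positions 5 to 12. A real BLAST search also handles gaps and scores alignments statistically, and a circular vector would need matches that span the origin, none of which this sketch attempts.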
Here is an example of auto annotation of vector pET28a on Genome Compiler:
- When you open a project, a prompt appears asking if you would like Genome Compiler to auto annotate the vector. Select “yes,” and the prompt will ask you which libraries you would like your sequence to be blasted against. You can also adjust the stringency of the search at this step.
- After the blast is complete, you can see the annotations detected in your project in the annotations summary table. The table lists each annotation, its location in the sequence, the length of the feature, the feature key, the strand from which the feature is transcribed, and the percentage match between the feature and the actual sequence. You can use this table to designate sub annotations and main annotations (all auto annotations initially appear as sub annotations).
- You can also go to a specific part in the sequence view by clicking on its location coordinates or its length, or filter your view to see just one type of feature (e.g., promoters). In addition, by clicking on the “details” for a part (far right) you can see how the part aligns with the sequence it was annotated against. If the part is not a 100% match, you can see exactly where it differs from the original sequence.
- This is how the auto annotated example sequence appears in Genome Compiler:
- The circular view displays all of your annotations, indicated by the different colours and labels. From the circular view, you can go to a certain part in your project by just clicking the marked area on the circle. You can also drag and drop parts this way, both in a single project and between different projects.
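The summary table and the “details” view described in the steps above can be pictured as a small data structure. The Python sketch below is only an illustration of that idea, not Genome Compiler's data model: the class `AnnotationRow`, its field names, and the helper functions are all hypothetical, and the values in the usage note are toy data rather than real pET28a coordinates.

```python
from dataclasses import dataclass


@dataclass
class AnnotationRow:
    """One row of a hypothetical annotations summary table."""
    name: str             # annotation label
    start: int            # 1-based start coordinate
    end: int              # 1-based inclusive end coordinate
    feature_key: str      # feature type, e.g. "promoter" or "CDS"
    strand: str           # "+" or "-"
    percent_match: float  # identity between the part and the sequence
    layer: str = "sub"    # auto annotations start out as sub annotations

    @property
    def length(self) -> int:
        """Length of the feature in base pairs."""
        return self.end - self.start + 1


def filter_by_key(rows: list, feature_key: str) -> list:
    """Keep only rows of one feature type, like filtering the view."""
    return [row for row in rows if row.feature_key == feature_key]


def promote_to_main(row: AnnotationRow) -> None:
    """Designate an annotation as a main-layer annotation."""
    row.layer = "main"


def mismatch_positions(reference: str, matched: str) -> list:
    """List (1-based position, reference base, matched base) for each
    difference between two equal-length, already aligned strings."""
    return [
        (i + 1, ref, obs)
        for i, (ref, obs) in enumerate(zip(reference, matched))
        if ref != obs
    ]
```

For example, with a toy row such as `AnnotationRow("toy promoter", 5, 12, "promoter", "+", 87.5)`, the `length` property gives 8, and `mismatch_positions("GATTACCA", "GATTAGCA")` reports the single difference at position 6, which is the kind of information the “details” view exposes for a part that is not a 100% match.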
If you still prefer the precision of manual annotation, you can do that in Genome Compiler as well. But as you can see, the auto annotation feature significantly reduces the time you need to dedicate to annotating your sequence. Check out this webinar for a more detailed explanation of both of these processes on Genome Compiler.