TIDE

TIDE: Tracking of Indels by DEcomposition



Created by Bas van Steensel lab
Hosted and developed by Desktop Genetics
TIDE version 2.0.1
TIDE supports Firefox6, Chrome4, Safari6, IE10 or higher


Welcome

Welcome to the TIDE webtool, now hosted, maintained and supported by Desktop Genetics.

Desktop Genetics is committed to developing the best design and analysis tools for genome editing. As part of this commitment, Desktop Genetics has acquired exclusive rights to TIDE from NKI, and will now be hosting the webtool as part of its DESKGEN platform.

You can continue accessing the webtool exactly as you did before, and log into the DESKGEN platform to access new features that enhance your analysis and user experience.

Remember, the link has now changed to tide.deskgen.com


Purpose

TIDE is a web tool which rapidly assesses genome editing of a target locus by CRISPR-Cas9. Based on the quantitative sequence trace data from two standard capillary (Sanger) sequencing reactions, the TIDE software quantifies editing efficacy and identifies the predominant types of insertions and deletions (indels) in the DNA of a targeted cell pool. See Brinkman et al. 2014 Nucl. Acids Res. for a detailed explanation and examples.


Instructions

1. Upload Data:

Enter a DNA character string (5'-3') representing the guide RNA sequence used. The guide lies immediately upstream of the PAM sequence for Cas9 nucleases and immediately downstream of it for Cpf1 nucleases (do not include the PAM). Numbers and other invalid (non-IUPAC) DNA characters are removed automatically. The guide sequence should be 20 nt for all nucleases except Staphylococcus aureus Cas9 (SaCas9) and Francisella novicida Cpf1 (FnCpf1), for which it should be 22 nt and 23 nt, respectively.
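
The guide-sequence rules above can be sketched in Python as follows. This is an illustration only, not TIDE's actual implementation: the function names, the nuclease labels used as dictionary keys, and the decision to upper-case the input before filtering are all assumptions.

```python
import re

# IUPAC DNA codes: the four bases plus the ambiguity codes.
IUPAC_DNA = "ACGTRYSWKMBDHVN"

# Guide lengths as stated in the instructions; the keys are hypothetical labels.
EXPECTED_LENGTH = {"SaCas9": 22, "FnCpf1": 23}
DEFAULT_LENGTH = 20


def clean_guide(raw: str) -> str:
    """Upper-case the input and drop every character that is not an IUPAC DNA code."""
    return re.sub(f"[^{IUPAC_DNA}]", "", raw.upper())


def has_expected_length(guide: str, nuclease: str) -> bool:
    """Compare the cleaned guide with the length expected for the given nuclease."""
    return len(guide) == EXPECTED_LENGTH.get(nuclease, DEFAULT_LENGTH)


# Digits and spaces (e.g. pasted from a numbered sequence view) are stripped.
guide = clean_guide("1 gacc ccct ccac cccg cctc 20")
print(guide, has_expected_length(guide, "SpCas9"))
```

Running a check like this before pasting the guide can catch a stray PAM or a truncated sequence early.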

Next, upload the chromatogram sequence files of the control sample (i.e. transfected without Cas9 or without the guide RNA) and of the test sample (i.e. treated with both Cas9 and the guide RNA).

We advise that you sequence a ~700 bp stretch of DNA inclusive of the target locus. The projected cut site should ideally be located ~200 bp downstream from the sequencing start site. The region upstream of the cut site is used to align the sequencing data of the test sample with that of the control sample.

Currently, ABIF (.ab1) files are supported. If you are using another file format, please contact us.

2. Enter Parameters for Analysis:

The following parameters have default settings but can be adjusted if necessary in the panel to the left by checking the 'Advanced Settings' box.

Alignment window

These settings determine the window in which the control and test sequences are aligned to determine any offset between the two reads. There is usually no need to deviate from the default settings except when sequencing long repetitive sequences.

left boundary: By default, this is set to 100. This is because base-calling at the start of a Sanger sequence read is often of poor quality.
right boundary: This is automatically set to 10 bp upstream of the cut site (cut site - 10 bp).

Decomposition window

These settings determine the sequence segment used for decomposition. The default setting is the largest window possible for uploaded sequences:

left boundary: (maximum indel size + 5) bp downstream of the cut site.
right boundary: (maximum indel size + 5) bp before the end of the shortest sequence read.

The decomposition window can be adjusted if part of the sequence read is of low quality or contains repetitive sequences.
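
As a rough sketch, the default alignment and decomposition windows described above can be computed like this. The code is not taken from TIDE. It assumes all positions are counted in bp from the start of the sequence read, that the cut site position is already known, and that a window is unusable when its left boundary is not below its right boundary. The names `Windows` and `default_windows` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Windows:
    align_left: int
    align_right: int
    decomp_left: int
    decomp_right: int


def default_windows(cut_site: int, control_len: int, test_len: int,
                    max_indel: int = 10, align_left: int = 100) -> Windows:
    """Default windows, as positions in bp from the start of the read."""
    margin = max_indel + 5
    windows = Windows(
        align_left=align_left,                 # default 100: early base calls are poor
        align_right=cut_site - 10,             # 10 bp upstream of the cut site
        decomp_left=cut_site + margin,         # downstream of the cut site
        decomp_right=min(control_len, test_len) - margin,  # before end of shortest read
    )
    if windows.align_left >= windows.align_right:
        raise ValueError("cut site is too close to the start of the read")
    if windows.decomp_left >= windows.decomp_right:
        raise ValueError("read is too short downstream of the cut site")
    return windows


print(default_windows(cut_site=200, control_len=700, test_len=680))
```

With a cut site at 200 bp and reads of 700 and 680 bp, this gives an alignment window of 100 to 190 and a decomposition window of 215 to 665.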

Indel size range

Set the maximum size of deletions and insertions to be modeled. The default value is 10.

P-value threshold

Significance cutoff. Any value between 0 and 1 is accepted. Default is p<0.001.

3. Results:

Once the data are uploaded and the parameters are set, submit the data by clicking the "Update View" button. The plots will appear in two tabs: "Decomposition" and "+1 insertion". If the settings are incorrect or too stringent, warnings will be displayed in the "Decomposition" tab.

Quality measures: Results depend on the quality of the sequence reads. As a rule of thumb, we recommend aiming for an average aberrant sequence signal strength before the cut site of <10% (in both the control and the test sample) and an Rsquared value of >0.9 for the decomposition result. Sequencing the opposite strand is recommended to confirm results.
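
If you record these values for many samples, the rule of thumb above can be applied with a small helper like the one below. This is only a sketch: the function name and the warning texts are invented for illustration, and the inputs are the numbers you read off the TIDE result page.

```python
def quality_warnings(control_aberrant_pct: float,
                     test_aberrant_pct: float,
                     r_squared: float) -> list:
    """Return a list of warnings; an empty list means both rules of thumb are met."""
    warnings = []
    if control_aberrant_pct >= 10:
        warnings.append("control: aberrant signal before the cut site is 10% or more")
    if test_aberrant_pct >= 10:
        warnings.append("test: aberrant signal before the cut site is 10% or more")
    if r_squared <= 0.9:
        warnings.append("decomposition: Rsquared is 0.9 or lower")
    return warnings


print(quality_warnings(3.2, 4.1, 0.96))   # passes both rules of thumb
print(quality_warnings(12.0, 4.1, 0.85))  # noisy control and poor fit
```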


Citation and Privacy

If you use this software for data analysis in a publication, please cite Brinkman et al., Nucl. Acids Res. (2014). Your uploaded data are only used for the duration of the analysis session and are not stored or used for any other purpose. Any data you share, upload or save (e.g. via "Save Experiment") will never be shared or sold without your explicit written consent. Desktop Genetics will not use your data for any commercial purpose without your explicit written consent. Anonymous usage analytics are collected via third-party tools in order to improve user experience. For more information on what data is collected, please click here.


Contact

This web tool was developed by Desktop Genetics, in collaboration with the Bas van Steensel lab. Special thanks to Eva Brinkman, Bas van Steensel from NKI, and Mira Davidson from Desktop Genetics. For more information and to report bugs, please contact Desktop Genetics.




FAQ


What is the minimum sequence length you need?

The sequence length requirements are flexible. The region upstream of the cut site is used to align the sequencing data of the test sample with that of the control sample. The region downstream of the cut site is used for the decomposition that determines the various indels in the pool of cells. In general, the longer the stretch of sequence trace, the better the estimate TIDE can make. We advise that users sequence a ~700 bp stretch of DNA that includes the designed editing site. The predicted cut site should ideally be located ~200 bp downstream of the sequencing start site.

What happens if the sequence is shorter than 700 bp?

TIDE should work with shorter sequences if the quality of the sequence reads is high. With shorter sequences, the cut site is often too close to the start of the sequence read for the default settings (see figure). In that case, the start of the alignment window may have to be set lower than the default of 100. The alignment window can be changed in Advanced Settings.

Does the cut site really need to be 200 bp away from the primers?

TIDE should work when the cut site is closer to or further than 200 bp from the start of the sequence trace, provided the quality of the sequence reads is high. If the cut site is closer to the beginning, it may be too close to the start of the sequence read to perform an alignment with the default settings (see figure). In that case, set the start of the alignment window lower than the default of 100. The alignment window can be changed under Advanced Settings.
If the cut site is further from the beginning, the decomposition window becomes smaller. Note that a larger stretch of nucleotides often makes the estimation of the indels more reliable. The decomposition window must span at least two times the maximum indel size and must lie at least 5 bp from both the guide target site and the end of the sequence.

How is overall cutting efficiency calculated?

The overall efficiency is calculated as the Rsquared value (expressed as a percentage) minus the percentage of wild-type sequence (zero indels). The percentages do not need to add up to 100% because there is also noise in the data. For example, if the Rsquared value is 0.95, 95% of the variance can be explained by the model; the remaining 5% is noise or large indels.
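
The calculation described above can be written out as a short worked example. The function names are hypothetical, and the wild-type percentage of 40 is a made-up value chosen only for illustration.

```python
def overall_efficiency(r_squared: float, wild_type_pct: float) -> float:
    """Overall efficiency (%) = Rsquared as a percentage minus the wild-type percentage."""
    return r_squared * 100 - wild_type_pct


def unexplained_pct(r_squared: float) -> float:
    """Share of the signal (%) not explained by the model: noise or large indels."""
    return (1 - r_squared) * 100


r2, wild_type = 0.95, 40.0  # wild_type is an invented example value
print(f"overall efficiency: {overall_efficiency(r2, wild_type):.1f}%")
print(f"unexplained:        {unexplained_pct(r2):.1f}%")
```

With these inputs, 40% wild type, 55% edited and 5% unexplained together account for the whole signal, which is why the reported indel percentages alone fall short of 100%.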

What do the different indel bars indicate when the cell pool is sequenced?

The different indel bars represent the insertions and deletions in the population. You can't tell which specific indels are present on each allele of an individual cell. To determine allele-specific information, you have to isolate a cell clone and then perform TIDE analysis.

What do the different indel bars indicate when the cell clone is sequenced?

The different indel bars represent the insertions and deletions detected in the alleles of a cell clone. With a diploid cell, you should get a percentage of ~50% per indel.

Can we get the sequence of different indels?

The sequences of specific indels cannot be deduced from the sequence trace. We tested this option, but Sanger sequencing is not sensitive enough to give reliable results on this. To know the precise sequence of these mutations, you can use next-generation sequencing or sequence the DNA of individual mutated clones.

Troubleshooting

Incorrectly annotated nucleotides

Sometimes the peaks in the chromatogram look fine, but the file contains some unannotated or incorrectly annotated nucleotides. These interfere with the indel spectra (see figure). TIDE generates a warning when the spacing between the nucleotides in the chromatogram is inconsistent, which is often an indication of unannotated or incorrectly annotated nucleotides. In this case, the sequence file cannot be used for a reliable TIDE analysis. If possible, try setting the right boundary of the decomposition window lower. If the warning remains, carefully inspect your chromatogram.

Low R-squared value

A low Rsquared can be the result of unoptimized settings or poor sequence quality.

Settings
By default, the decomposition window is set to its maximum size and the indel size range is set to 10. These settings can be adjusted in Advanced Settings.

  • Large indels are present in the sample. By default, the decomposition is calculated with a maximum indel size of 10. Larger indels cannot be modeled, which results in a low Rsquared. Try increasing the indel size range to test whether this improves the fit (see figure Indel size range).
  • Poor local quality of the sequence trace. Often the end of the sequence is of low quality. This can be observed in the quality plot that shows a high aberrant sequence signal at the end of the sequence trace (see figure Poor quality sequence end). This compromises the decomposition of the sequence trace. Adjust the boundaries of the decomposition window in such a way that it will not overlap with the region that is of low quality.
  • Repetitive regions in the sequence trace. These regions can be observed in the quality plot as a sudden stretch without aberrant nucleotides (see figure Repetitive region). This region might interfere with the decomposition of the sequence trace. Adjust the boundaries of the decomposition window in such a way that it will not overlap with the repetitive sequence part.
Poor sequence quality
Poor sequence quality can be observed in the chromatogram (see figure Poor sequence quality). More noise is present in the data, which results in a lower Rsquared.

Figure: Indel size range
Figure: Poor quality sequence end
Figure: Repetitive region
Figure: Poor sequence quality

Forward and reverse indel spectra are not identical

If the forward and reverse indel spectra don't match, one of the results is not reliable. Often there is a misannotation in one of the sequence files (see “Incorrectly annotated nucleotides”).

No guide RNA match because there is a mismatch in the control sequence

Sometimes the control sequence contains a mismatch at the location of the guide RNA. This stops the TIDE analysis. In this case, change the guide sequence so that it contains the same IUPAC nucleotides as the control sequence at that location (see figure).
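
One way to find out what to enter as the guide in this situation is to locate the stretch of the control base calls that is most similar to the intended guide. The sketch below is not part of TIDE. It assumes you have the control base calls as a plain string, searches the forward strand only, and uses a simple mismatch count. The function name and the example sequences are invented.

```python
def closest_guide_site(control_seq: str, guide: str):
    """Return (start, control substring, mismatch count) for the window most similar to the guide."""
    best = None
    for start in range(len(control_seq) - len(guide) + 1):
        window = control_seq[start:start + len(guide)]
        mismatches = sum(a != b for a, b in zip(window, guide))
        if best is None or mismatches < best[2]:
            best = (start, window, mismatches)
    return best


# Invented example: the control read has an ambiguous base call (N) inside the target site.
control = "TTGACGGACCCCCTCCACNCCGCCTCAGGTT"
guide = "GACCCCCTCCACCCCGCCTC"

start, site, mismatches = closest_guide_site(control, guide)
print(start, site, mismatches)
```

In this example the closest window differs from the guide at one position, and the returned substring (including the N) is what would be entered as the guide so that it matches the control sequence.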

Error boundaries of decomposed region are not acceptable

This error means that the decomposition cannot be performed with the current settings. This can result from unoptimized settings or when the cut site is too close to the beginning or end of the sequence. If possible, try to set the decomposition boundaries further apart, use a smaller indel size or use a smaller alignment window.
If that doesn't help, you might have to resequence your sample to perform TIDE analysis. It can also help to sequence the opposite strand for confirmation of results. We advise that you sequence a stretch of DNA ~700 bp inclusive of the target locus. The projected cut site should ideally be located ~200 bp downstream of the sequencing start site.

Poor alignment

When the beginning of the sequence is of poor quality, the alignment function may not work as intended. This can be observed in the quality plot as a highly aberrant sequence signal that stretches over the whole length of the sequence trace (see figure). Normally, the aberrant sequence signal should only increase around the expected cut site (blue dotted line). If you observe poor alignment, try shifting the start of the alignment window higher or lower. The alignment window can be changed in Advanced Settings.