Top tips for spike-in normalization
See a spike in your DNA–protein interaction quantification results with these guidelines for spike-in normalization.
A team of researchers at the University of California San Diego (CA, USA) has identified and troubleshot common pitfalls associated with a technique known as spike-in normalization. The technique is used alongside chromatin immunoprecipitation with sequencing (ChIP-seq), which produces genomic mapping data of DNA-associated proteins, to quantify DNA–protein interactions. By improving how spike-in normalization is applied, the team hopes to standardize a commonly used technique that is vital to our understanding of numerous aspects of biology.
ChIP-seq is used to create maps of the binding sites of DNA-associated proteins in a sample. Identifying and locating these proteins is valuable in itself, but to increase the utility of these data, researchers developed spike-in normalization, which enables users to quantify DNA–protein interactions when the concentration of the target DNA-associated proteins varies significantly between samples.
The process involves ‘spiking in’ a known amount of exogenous chromatin to the sample to act as an internal control, reducing variability between replicates and aiding the capture of changes in signal intensity that may be missed by other normalization techniques. This makes the technique highly useful for comparing the variation of DNA–protein interactions in a sample source under two different conditions, for instance comparing the impact of a drug or the presence of a mutation.
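As a rough illustration of how an internal control can be turned into a normalization factor, the sketch below rescales each sample so that its spike-in read count matches the sample with the fewest spike-in reads. This is one common convention, not necessarily the calculation used by the team; the function name and the read counts are hypothetical, and the approach assumes that the spike-in-to-target chromatin ratio was the same in every sample before immunoprecipitation.

```python
def spike_in_scale_factors(spike_in_reads):
    """Return a per-sample factor that equalizes spike-in read counts.

    spike_in_reads maps a sample name to the number of reads aligned to the
    exogenous (spike-in) genome. Each factor is meant to be multiplied into
    that sample's target-genome signal.
    Assumption: equal spike-in-to-target ratios at the input stage.
    """
    if not spike_in_reads:
        raise ValueError("at least one sample is required")
    if any(count <= 0 for count in spike_in_reads.values()):
        raise ValueError("every sample needs a positive spike-in read count")

    # Use the sample with the fewest spike-in reads as the reference,
    # so every factor is at most 1 and no signal is scaled upward.
    reference = min(spike_in_reads.values())
    return {sample: reference / count for sample, count in spike_in_reads.items()}


# Hypothetical counts: the control has twice as many spike-in reads as the
# treated sample, so its target signal is scaled down by half.
factors = spike_in_scale_factors({"control": 2_000_000, "treated": 1_000_000})
print(factors)
```

A sample with more spike-in reads receives a smaller factor, reflecting that proportionally less of its sequencing depth came from the target chromatin.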
However, the team found that spike-in normalization is widely misused and, using open-access datasets, outlined the common errors made when the technique is deployed. One particularly common error was the lack of effective quality control to validate the assumption that the ratio of spike-in chromatin to sample chromatin is identical in the two conditions being compared.
By reviewing and analyzing these errors, examining the initial papers outlining the proper use of spike-in normalization and conducting investigational experiments of their own, the team developed a list of key tips to improve the use of spike-in normalization, summarized briefly below:
- Conduct thorough quality control of the spike-in. Measure the spike-in-to-target ratio for each sample by isolating and sequencing the unenriched input sample. Visually interrogate the ChIP-seq signal for the spike-in using a genome browser, alongside metagene analysis and peak calling.
- Use spike-in material from a model species with an annotated, complete genome assembly.
- During experimental design, consider: the raw spike-in material required to verify a successful immunoprecipitation during data analysis; whether the quantity of target chromatin, relative to spike-in chromatin, is sufficient to sequence mixed species while staying within practical sequencing depths; and the correct depth of sequencing needed for mixed sequencing that accounts for the additional genome of the spike-in, while following the ENCODE guidelines.
- Quantify DNA before combining chromatin from each species to decrease variation in spike-in-to-target ratios.
- Include 3–4 replicates to ensure reproducibility.
- Apply the irreproducible discovery rate (IDR) calculation included in the ENCODE guidelines to identify an acceptable level of variation between conditions for the ChIP signal of the exogenous chromatin.
- Use stringent filtering when aligning to a merged spike-in/target genome, retaining only primary alignments with a mapping quality score of ten or higher.
- Validate experimental conclusions using an orthogonal assay such as mass spectrometry or an immunofluorescence assay.
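Two of the tips above, the alignment filter and the input-based ratio check, lend themselves to a short sketch. The code below is illustrative only: the function names and example counts are hypothetical, and the article does not state how large a difference in ratio between samples is acceptable, so the sketch only reports the fold difference and leaves the threshold to the user. The flag values are the standard SAM bit flags for unmapped, secondary and supplementary alignments.

```python
# Standard SAM bit flags.
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def keep_alignment(flag, mapq, min_mapq=10):
    """Keep only mapped, primary alignments with MAPQ >= min_mapq."""
    if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY):
        return False
    return mapq >= min_mapq


def spike_in_fraction(spike_in_reads, target_reads):
    """Fraction of filtered input reads that align to the spike-in genome."""
    total = spike_in_reads + target_reads
    if total <= 0:
        raise ValueError("no aligned reads to compute a fraction from")
    return spike_in_reads / total


def max_fold_difference(fractions):
    """Largest fold difference in spike-in fraction across samples."""
    values = list(fractions.values())
    if not values or min(values) <= 0:
        raise ValueError("every sample needs a positive spike-in fraction")
    return max(values) / min(values)


# Hypothetical input-sample counts after filtering.
fractions = {
    "control_input": spike_in_fraction(100_000, 900_000),
    "treated_input": spike_in_fraction(200_000, 800_000),
}
print(fractions, max_fold_difference(fractions))
```

A fold difference well above 1 between the input samples would suggest that the assumption of an identical spike-in-to-target ratio does not hold for that comparison.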
Commenting on the implications of this paper and its advice to the research community, senior author Alon Goren noted that, “many studies utilize spike-in normalization, and our results call the biological conclusions drawn from this approach into question. Our recommendations can help account for some of the pitfalls of spike-in normalization so we can still reap the benefits of this valuable technique.”