Quality Metrics for eccDNA Sequencing: Enrichment Efficiency, Background, and Reproducibility

Defining success in an eccDNA (Circle-Seq and related) project starts with what you measure and how consistently you measure it. In research-use-only workflows, the core of "good data" is straightforward: strong removal of linear DNA, demonstrable retention of circular DNA, low background, and reproducible results across replicates and batches—without making external performance promises. This practical guide summarizes measurable checkpoints and reporting norms for CRO QA teams and core facility leads using hybrid strategies (Illumina short-read plus ONT/PacBio long-read) to meet the demands of mechanism, discovery, and application studies.
Pre-Sequencing QC: HMW DNA Integrity and Input Readiness
High molecular weight (HMW) genomic DNA integrity is the first gate. Fragmented input elevates digestion artifacts and complicates enrichment. Automated electrophoresis systems (Agilent TapeStation or Bioanalyzer) help you visualize whether you have a dominant HMW band with minimal smear—an acceptance cue for proceeding.

Figure 1. Capillary electrophoresis (TapeStation/Bioanalyzer) trace showing high‑molecular‑weight (HMW) genomic DNA (top) compared with a degraded sample (bottom); peak position indicates fragment size and peak height indicates relative quantity.
What to document before enrichment:
- DNA integrity trace: a dominant HMW peak and clean baseline; avoid pronounced low-molecular-weight smear.
- Concentration and volume: sufficient input for the chosen enrichment method (exonuclease-only, exonuclease+rolling circle amplification, or capture).
- Contamination controls: mock extraction (process) controls and no-template controls to detect carry-over.
- Chain-of-custody and metadata: sample source, extraction protocol, reagent lots, operator, and planned enrichment approach.
SOP-style acceptance checklist (illustrative):
- Receive and log samples with batch/run IDs and storage conditions.
- Inspect TapeStation/Bioanalyzer trace; proceed if HMW band dominates and smear is minimal.
- Quantify DNA; confirm input meets the minimum for selected protocol.
- Prepare contamination controls; include mock extraction and no-template.
- Record metadata in LIMS: source, lot numbers, operator ID, planned enrichment method.
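The "HMW band dominates and smear is minimal" cue can be made numeric if your instrument exports a table of fragment sizes and signal intensities. The sketch below is illustrative only: the function name `hmw_fraction`, the 10 kb cutoff, and the example trace are assumptions, not vendor-defined metrics, and any acceptance value should come from your own SOP.

```python
def hmw_fraction(trace, size_cutoff_bp=10_000):
    """Fraction of total signal at or above a size cutoff.

    trace: iterable of (size_bp, intensity) pairs from an exported electropherogram.
    size_cutoff_bp: hypothetical boundary between HMW material and smear.
    """
    trace = list(trace)
    total = sum(intensity for _, intensity in trace)
    if total <= 0:
        raise ValueError("trace has no signal")
    hmw = sum(intensity for size, intensity in trace if size >= size_cutoff_bp)
    return hmw / total


# Hypothetical trace: most signal sits in the 15 kb and 48 kb regions.
example_trace = [(500, 5), (2_000, 10), (15_000, 60), (48_000, 25)]
fraction = hmw_fraction(example_trace)
print(f"HMW fraction: {fraction:.2f}")
```

Logging this fraction next to the trace image gives reviewers a number to compare across batches, while the visual inspection remains the primary check.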
Additional pre-sequencing QA notes for hybrid strategies:
- For ONT/PacBio runs, confirm gentle extraction and minimal shearing; consider size selection only if justified by project goals (e.g., focusing on >1 kb circles).
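- For Illumina Circle-Seq, plan for conservative RCA incubation times (RCA is isothermal, so amplification is governed by time and input rather than cycle number) and post-amplification cleanup to minimize concatemeric artifacts.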
- If capture-based enrichment is used, pilot probe sets with synthetic circles to confirm target recovery and specificity before scaling.
Measuring eccDNA Enrichment Efficiency with qPCR and Spike-ins
Effective eccDNA enrichment shows two signatures: depletion of nuclear linear DNA and retention of circular DNA (including mitochondrial). qPCR of a nuclear gene versus mitochondrial DNA is a routine, fast check. Adding exogenous circular and linear plasmid spike-ins provides a calibrated measure of recovery.
Key set-up elements:
- Nuclear depletion marker: choose a single-copy nuclear gene unlikely to form eccDNA under baseline conditions (e.g., COX5B or ACTB). Track its qPCR signal before and after digestion.
- Circular retention control: use a mitochondrial gene (mtDNA is circular) to verify that circular molecules persist.
- Exogenous spike-ins: add a known circular plasmid and a linearized plasmid prior to enrichment to compute recovered circular:linear ratios post-process.

Figure 2. qPCR fold‑change before and after enrichment: depletion of a linear nuclear marker (illustrative Ct 22 → 30) versus retention of mitochondrial circular DNA (illustrative Ct 20 → 21); fold changes computed as 2^−ΔCt, where ΔCt = Ct(post) − Ct(pre) (error bars = technical replicates).
Illustrative, non-promissory calculation:
- Suppose pre-enrichment Ct values are Nuclear = 22 and mtDNA = 20; post-digestion Ct values are Nuclear = 30 and mtDNA = 21.
- Relative quantity (RQ) as 2^−ΔCt, with ΔCt = Ct(post) − Ct(pre) and assuming roughly 100% amplification efficiency: nuclear ≈ 2^(22−30) = 2^−8 = 1/256 of the starting signal remains; mtDNA ≈ 2^(20−21) = 2^−1 = 1/2 remains.
- Report a "circular:linear recovery ratio" (defined explicitly in your SOP) using mtDNA and spike-ins: the ratio of retained circular signal over remaining linear signal post-enrichment. In this example, mtDNA RQ divided by nuclear RQ is (1/2) ÷ (1/256) = 128, indicating strong retention relative to depletion.
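The illustrative calculation above can be sketched in a few lines of Python. The function names are hypothetical, and the default efficiency of 2.0 (a perfect doubling per cycle) is an assumption that should be checked per primer pair with a standard curve.

```python
def relative_quantity(ct_pre, ct_post, efficiency=2.0):
    """Fraction of template remaining after enrichment: E^(Ct_pre - Ct_post)."""
    return efficiency ** (ct_pre - ct_post)


def circular_linear_ratio(rq_circular, rq_linear):
    """Retained circular signal divided by remaining linear signal."""
    if rq_linear <= 0:
        raise ValueError("linear relative quantity must be positive")
    return rq_circular / rq_linear


# Ct values from the illustrative example in the text.
nuclear_rq = relative_quantity(ct_pre=22, ct_post=30)  # 2^-8 = 1/256
mtdna_rq = relative_quantity(ct_pre=20, ct_post=21)    # 2^-1 = 1/2
ratio = circular_linear_ratio(mtdna_rq, nuclear_rq)     # (1/2) / (1/256) = 128

print(f"nuclear RQ = {nuclear_rq:.5f}, mtDNA RQ = {mtdna_rq:.2f}, ratio = {ratio:.0f}")
```

Keeping efficiency as a parameter makes it easy to rerun the numbers once measured primer efficiencies are available.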
Considerations across platforms:
- Illumina short-read Circle-Seq with exonuclease+RCA offers high sensitivity but can inflate duplicates; keep RCA incubation time conservative and verify library complexity.
- Oxford Nanopore direct long-read sequencing of enriched circles captures junction-spanning reads and size distributions, aiding structural validation.
- PacBio HiFi capture strategies can deliver high-accuracy reads of larger circles; plan input requirements and capture probe designs accordingly.
- Hybrid strategy: pair Illumina for depth with ONT/PacBio for structural confirmation. Use consistent spike-in and qPCR checks across all paths to compare enrichment efficiency.
Worked example with spike-ins (illustrative):
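- Add 10 ng of a known circular plasmid (e.g., 2–3 kb) and 10 ng of the same plasmid linearized by restriction digest to the extracted DNA prior to exonuclease treatment.
- After enrichment and cleanup, quantify each form by qPCR. Because the two spike-ins share sequence, the assay design has to tell them apart, for example with an amplicon spanning the linearization site (templated only by the intact circle) alongside an amplicon elsewhere on the plasmid (total signal). If post-enrichment estimates return 7.5 ng circular and 0.01 ng linear, the circular:linear recovery ratio is 7.5 / 0.01 = 750, showing preferential retention of circular forms.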
- Note: mtDNA abundance varies across tissues; compare spike-in ratios across batches to monitor process stability.
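The spike-in worked example reduces to two recovery fractions and their ratio. The sketch below assumes you already have mass estimates for each form from qPCR; the class and function names are illustrative. Dividing recovery fractions rather than raw masses keeps the ratio comparable when circular and linear inputs differ.

```python
from dataclasses import dataclass


@dataclass
class SpikeInResult:
    circular_recovery: float      # fraction of circular input recovered
    linear_recovery: float        # fraction of linear input remaining
    circular_linear_ratio: float  # circular recovery / linear recovery


def spike_in_recovery(circ_in_ng, lin_in_ng, circ_out_ng, lin_out_ng):
    """Summarize spike-in behavior from input and post-enrichment masses (ng)."""
    if circ_in_ng <= 0 or lin_in_ng <= 0:
        raise ValueError("input amounts must be positive")
    if lin_out_ng <= 0:
        # Below detection: report the ratio as a lower bound instead of dividing by zero.
        raise ValueError("linear spike-in not detected; report ratio as a lower bound")
    circ_rec = circ_out_ng / circ_in_ng
    lin_rec = lin_out_ng / lin_in_ng
    return SpikeInResult(circ_rec, lin_rec, circ_rec / lin_rec)


# Values from the worked example: 10 ng of each form in, 7.5 ng and 0.01 ng out.
result = spike_in_recovery(10, 10, 7.5, 0.01)
print(f"circular {result.circular_recovery:.1%}, linear {result.linear_recovery:.2%}, "
      f"ratio {result.circular_linear_ratio:.0f}")
```

Tracking `circular_recovery` separately from the ratio helps distinguish a batch that lost circles from one that simply digested linear DNA less completely.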
Post-Sequencing QC: Mapping Support, Background, and Abundance Normalization
Once sequencing is complete, eccDNA calls must be supported by clear alignment evidence and reasonable background levels. This is where eccDNA sequencing QC practices converge across short- and long-read data.
Mapping and evidence thresholds:
- Split/soft-clipped read support: junction-spanning evidence is essential in short-read libraries (e.g., minimum split-read counts tuned to genome complexity).
- Discordant pairs: supportive signals from paired-end anomalies strengthen confidence.
- Long-read junctions: direct junction-spanning reads reduce ambiguity; inspect per-circle coverage and read-length distributions.
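As a rough illustration of the short-read evidence types above, the sketch below tallies split, soft-clipped, and discordant primary alignments from SAM text (for example, the output of `samtools view -h`). It is a coarse, library-wide proxy under stated assumptions: the function name and the 20 bp clip cutoff are hypothetical, "discordant" here simply means paired but not flagged as a proper pair, and dedicated callers apply stricter, per-circle logic.

```python
import re

SOFT_CLIP = re.compile(r"(\d+)S")  # soft-clip operations in a CIGAR string


def evidence_counts(sam_lines, min_clip=20):
    """Count primary alignments carrying each evidence type."""
    counts = {"split": 0, "soft_clipped": 0, "discordant": 0}
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        flag, cigar, tags = int(fields[1]), fields[5], fields[11:]
        # Skip unmapped (0x4), secondary (0x100), and supplementary (0x800) records
        # so each split read is counted once, via its primary alignment.
        if flag & (0x4 | 0x100 | 0x800):
            continue
        if any(tag.startswith("SA:Z:") for tag in tags):
            counts["split"] += 1
        elif any(int(n) >= min_clip for n in SOFT_CLIP.findall(cigar)):
            counts["soft_clipped"] += 1
        # Paired (0x1), not a proper pair (0x2), mate mapped (0x8 unset).
        if (flag & 0x1) and not (flag & 0x2) and not (flag & 0x8):
            counts["discordant"] += 1
    return counts


# Tiny hand-made example records (not real data).
example_sam = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t99\tchr1\t100\t60\t60M40S\t=\t300\t300\t*\t*\tSA:Z:chr1,500,+,60S40M,60,0;",
    "r1\t2147\tchr1\t500\t60\t60H40M\t=\t300\t0\t*\t*\tSA:Z:chr1,100,+,60M40S,60,0;",
    "r2\t0\tchr1\t700\t60\t30S70M\t*\t0\t0\t*\t*",
    "r3\t65\tchr1\t900\t60\t100M\tchr2\t50\t0\t*\t*",
    "r4\t0\tchr1\t950\t60\t5S95M\t*\t0\t0\t*\t*",
]
example_counts = evidence_counts(example_sam)
print(example_counts)
```

These tallies are useful as a sanity check on a library before running a dedicated detector, not as a replacement for one.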
Abundance normalization and background tracking:
- Normalize counts as eccDNAs per million mapped reads (EPM) or a similar depth-normalized metric.
- Track residual linear content: report the fraction of reads mapping to canonical linear regions versus circular junctions; include nuclear gene markers or selected genomic windows as a background proxy.
- Define a custom "Circle-Enriched Ratio" for your reports, with the formula stated clearly (for example, the number of junction-supporting reads divided by total mapped reads in the enrichment library). Treat it as a project-specific metric rather than a community standard.
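Both depth-normalized metrics above are simple ratios, and writing them down as code removes ambiguity about the denominator. This is a minimal sketch: the function names and the example counts are hypothetical, and the Circle-Enriched Ratio follows the example formula given in the text (junction-supporting reads over total mapped reads).

```python
def eccdna_per_million(n_eccdna, mapped_reads):
    """EPM: number of eccDNA calls per million mapped reads."""
    if mapped_reads <= 0:
        raise ValueError("mapped_reads must be positive")
    return n_eccdna * 1_000_000 / mapped_reads


def circle_enriched_ratio(junction_reads, mapped_reads):
    """Project-specific metric: junction-supporting reads / total mapped reads."""
    if mapped_reads <= 0:
        raise ValueError("mapped_reads must be positive")
    return junction_reads / mapped_reads


# Hypothetical library: 1,200 calls, 200,000 junction reads, 40 million mapped reads.
epm = eccdna_per_million(1_200, 40_000_000)
cer = circle_enriched_ratio(200_000, 40_000_000)
print(f"EPM = {epm:.1f}, Circle-Enriched Ratio = {cer:.4f}")
```

Whatever definitions you settle on, state them in the data dictionary so that values remain comparable across batches.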
Platform-specific QC notes:
- Illumina short-read: monitor duplication rates and over-represented sequences in FastQC; inspect insert size distributions; ensure junction-spanning read counts cross your internal acceptance thresholds.
- ONT long-read: review per-read accuracy and homopolymer performance; confirm junction calls via multiple reads; consider adaptive sampling logs if used.
- PacBio HiFi: check read accuracy and length distributions; verify that capture targets are represented with sufficient coverage; inspect consensus accuracy for junction sequences.
Hybrid pipeline reproducibility:
- Standardize read QC (FastQC/fastp) and alignment (BWA-MEM for short reads; Minimap2 for long reads). Use containerized environments and fixed parameter files.
- For detection, consider complementary tools (Circle-Map, CReSIL, ecc_finder, eccDNA-pipe), and record exact versions.
- Generate QC PDFs with thresholds and evidence counts per circle; include a data dictionary describing fields and filters.
Bioinformatics QC steps are further detailed in CD Genomics' service overview: Bioinformatics Services.
Example commands (illustrative; pin versions in your SOP):
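```bash
# Short-read alignment (-q, used in the Circle-Map docs, leaves supplementary-alignment MAPQ unmodified)
bwa mem -q -t 16 reference.fa sample_R1.fastq.gz sample_R2.fastq.gz | samtools view -b -o sample.bam -
samtools sort -@ 8 -n -o sample.qname.bam sample.bam   # read-name sorted, needed by Circle-Map
samtools sort -@ 8 -o sample.sorted.bam sample.bam     # coordinate sorted
samtools index sample.sorted.bam
# Long-read alignment
minimap2 -t 16 -ax map-ont reference.fa sample_ONT.fastq.gz | samtools sort -o sample_ONT.sorted.bam -
samtools index sample_ONT.sorted.bam
# Circle-Map detection (example; adjust parameters and confirm flags with --help for your pinned version)
samtools faidx reference.fa
Circle-Map ReadExtractor -i sample.qname.bam -o sample.candidates.bam
samtools sort -@ 8 -o sample.candidates.sorted.bam sample.candidates.bam
samtools index sample.candidates.sorted.bam
Circle-Map Realign -t 8 -i sample.candidates.sorted.bam -qbam sample.qname.bam -sbam sample.sorted.bam -fasta reference.fa -o sample_circles.bed
# ecc_finder (conceptual placeholder; take the actual subcommands and flags from the tool's README)
python ecc_finder.py --in sample.sorted.bam --ref reference.fa --out eccfinder_calls.bed
```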
Document tool versions, parameters, and thresholds alongside these commands in your QC report.
Reproducibility: Biological Replicates, Correlations, and Acceptance
Reproducibility is the anchor for confidence. Your acceptance logic should distinguish biological replicates from technical repeats and include batch metadata.

Figure 3. Replicate concordance for eccDNA abundance (EPM): scatter plot of Replicate 1 versus Replicate 2 with annotated R² to indicate reproducibility across genomic loci.
Recommended reporting, framed as practice rather than promises:
- Presence/absence overlap: compute reciprocal interval overlap (e.g., ≥50% overlap thresholds) between biological replicates.
- Correlations: report Pearson or Spearman correlations for depth-normalized eccDNA counts (EPM) across genomic bins or genes.
- Variability: summarize coefficient of variation (CV) for read support per circle and flag outliers.
- Batch notes: record operator, reagent lot numbers, instrument run IDs, and re-run conditions.
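The overlap, correlation, and variability metrics above can be computed with the standard library alone. The sketch below is illustrative: function names are hypothetical, intervals are `(chrom, start, end)` tuples in BED-style coordinates, and the 50% reciprocal threshold mirrors the example in the text. For Spearman, apply the same Pearson function to rank-transformed values.

```python
from math import sqrt
from statistics import mean, stdev


def reciprocal_overlap(a, b, min_frac=0.5):
    """True if the overlap covers at least min_frac of BOTH intervals."""
    if a[0] != b[0]:
        return False
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    if overlap <= 0:
        return False
    return overlap >= min_frac * (a[2] - a[1]) and overlap >= min_frac * (b[2] - b[1])


def shared_fraction(rep1, rep2, min_frac=0.5):
    """Fraction of rep1 circles with a reciprocal match in rep2."""
    if not rep1:
        return 0.0
    hits = sum(any(reciprocal_overlap(a, b, min_frac) for b in rep2) for a in rep1)
    return hits / len(rep1)


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / sqrt(sxx * syy)


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


# Hypothetical replicate calls and per-bin EPM values.
rep1 = [("chr1", 100, 200), ("chr1", 1000, 2000), ("chr2", 50, 150)]
rep2 = [("chr1", 120, 210), ("chr1", 1900, 3000), ("chr2", 50, 150)]
print(f"shared fraction: {shared_fraction(rep1, rep2):.2f}")
print(f"Pearson r: {pearson_r([1, 2, 3, 4], [2, 4, 6, 8]):.2f}")
print(f"CV: {coefficient_of_variation([10, 12, 8, 10]):.3f}")
```

The nested loop in `shared_fraction` is fine for a sketch; for genome-scale call sets an interval tree or a sorted sweep would be a more practical choice.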
Hybrid strategy considerations for reproducibility:
- Use Illumina to stabilize depth-driven detection and ONT/PacBio to validate structure, then reconcile calls across platforms.
- When agreement is partial, prioritize circles with multi-evidence support (split reads, discordant pairs, and long-read junctions). Document decision rules in the QC report.
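One simple way to document a multi-evidence decision rule is to tier each circle by how many independent evidence types support it. The sketch below is an assumption-laden illustration: the function name, the tier labels, and the minimum of two reads per evidence type are hypothetical and should be replaced by the rules in your QC report.

```python
def evidence_tier(split_reads, discordant_pairs, long_read_junctions, min_support=2):
    """Tier a circle by the number of evidence types meeting min_support."""
    supported = sum(
        count >= min_support
        for count in (split_reads, discordant_pairs, long_read_junctions)
    )
    return {3: "high", 2: "medium", 1: "low", 0: "unsupported"}[supported]


# Hypothetical circles with (split, discordant, long-read) counts.
for name, counts in {"circ_A": (5, 3, 2), "circ_B": (5, 1, 4), "circ_C": (5, 0, 0)}.items():
    print(name, evidence_tier(*counts))
```

Reporting the tier alongside the raw counts lets readers apply a stricter or looser rule without rerunning detection.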
Acceptance guidance (illustrative):
- Accept a batch when replicate overlap and correlations meet internally defined thresholds and contamination controls are clean.
- Flag for review when duplication rates or residual linear background exceed internal expectations.
- Rerun digestion or resequence when junction evidence is consistently below internal cutoffs despite sufficient depth.
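The acceptance guidance above can be encoded so that every batch is judged by the same logic. This is a minimal sketch, not a recommended policy: the field names are illustrative, and every threshold value is a placeholder standing in for your internally defined cutoffs.

```python
from dataclasses import dataclass


@dataclass
class BatchQC:
    replicate_overlap: float      # fraction of circles shared between replicates
    replicate_r: float            # correlation of depth-normalized counts
    controls_clean: bool          # mock extraction and no-template controls
    duplication_rate: float
    linear_fraction: float        # residual linear background
    median_junction_reads: float
    depth_sufficient: bool


@dataclass
class Thresholds:
    # Placeholder values only; substitute internally validated cutoffs.
    min_overlap: float = 0.5
    min_r: float = 0.8
    max_duplication: float = 0.5
    max_linear_fraction: float = 0.2
    min_junction_reads: float = 2.0


def batch_decision(qc, t):
    """Return 'accept', 'review', or 'rerun' following the guidance in the text."""
    if qc.depth_sufficient and qc.median_junction_reads < t.min_junction_reads:
        return "rerun"   # junction evidence low despite sufficient depth
    if qc.duplication_rate > t.max_duplication or qc.linear_fraction > t.max_linear_fraction:
        return "review"  # duplication or linear background above expectations
    if qc.controls_clean and qc.replicate_overlap >= t.min_overlap and qc.replicate_r >= t.min_r:
        return "accept"
    return "review"      # anything else goes to a human


good_batch = BatchQC(0.7, 0.92, True, 0.3, 0.05, 6, True)
print(batch_decision(good_batch, Thresholds()))
```

Defaulting to "review" for any unanticipated combination keeps the function conservative: it never accepts a batch by omission.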
Cross-institution reproducibility summary (illustrative, non‑promissory):
While formal multi‑laboratory benchmarks for eccDNA remain scarce, independent studies provide useful illustrative signals. For example, single‑cell parallel sequencing in González et al., scEC&T‑seq (2023) reported strong cross‑platform concordance (Illumina vs. Nanopore; Pearson R ≈ 0.95 for matched datasets). A separate 2024 method comparison reported higher replicate concordance for WGS-style short/long reads (>50% and ~54%, respectively) versus some Circle‑Seq variants, highlighting protocol dependence rather than universal reproducibility. Report these figures as study‑level observations (include sample type, platform, and n where available) and treat them as illustrative, not service guarantees.
Troubleshooting Low-Quality Data: Common Symptoms and Corrective Actions
When QC flags appear, move quickly with symptom-based diagnostics and targeted corrections. Below is a compact troubleshooting matrix you can adapt.
| Symptom | Likely cause | Immediate check | Corrective action |
|---|---|---|---|
| High linear background after digestion | Incomplete exonuclease activity; buffer or enzyme decay | qPCR nuclear vs mtDNA; verify enzyme lot and incubation | Repeat digestion with fresh enzyme; optimize buffer/incubation; include process controls |
| Low-complexity libraries (duplication high) | RCA over-amplification; concatemer formation | FastQC duplication; library size profile; RCA incubation time review | Titrate RCA input; shorten amplification; purify circles; consider capture-based enrichment |
| Size/sequence bias | Restriction-site dependence or capture probe bias | Coverage vs flanks; repeat content analysis | Use tagmentation or enzyme mixes; redesign capture probes; note expected bias in report |
| Contamination in controls | Carry-over or reagent contamination | No-template and mock extraction control signals | Audit workspace; replace reagents; reprocess with stricter contamination controls |
| Insufficient junction support | Depth too low; aggressive filters | Per-circle read support tables; filter review | Increase sequencing depth; relax filters conservatively; aggregate replicates |
Decision guidance (illustrative):
- If digestion fails (qPCR nuclear signal remains high), halt downstream prep and repeat digestion.
- If duplication exceeds expectations and structural artifacts appear, adjust RCA or pivot to capture-based enrichment and resequence.
- If contamination is detected in controls, abort batch, decontaminate, and rerun with clean reagents.
- If junction support is marginal, increase depth or combine replicates; note caveats in deliverables.
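Keeping the decision guidance as a small rule table makes it easy to version alongside the SOP. In the sketch below, the flag names are hypothetical labels for the four conditions listed above, and the actions are taken directly from that list; all actions that apply are returned in the order given in the text.

```python
# (flag name, corrective action) pairs, in the order listed in the guidance.
RULES = [
    ("digestion_failed", "Halt downstream prep and repeat digestion."),
    ("duplication_with_artifacts",
     "Adjust RCA or pivot to capture-based enrichment and resequence."),
    ("control_contamination",
     "Abort batch, decontaminate, and rerun with clean reagents."),
    ("marginal_junction_support",
     "Increase depth or combine replicates; note caveats in deliverables."),
]


def triage(flags):
    """Return every corrective action whose flag is set to True."""
    return [action for name, action in RULES if flags.get(name, False)]


actions = triage({"digestion_failed": True, "marginal_junction_support": True})
for action in actions:
    print("-", action)
```

Because the rules are plain data, a change to the guidance shows up as a one-line diff in version control.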
Validation and Audit: Documentation, LIMS, and Reproducible Pipelines
A QC program is only as strong as its documentation. Prepare auditable records that capture methods, parameters, and evidence.
Recommended metadata fields (example table):
| Field | Description |
|---|---|
| Sample ID, batch/run ID | Unique identifiers for traceability |
| Source and storage | Tissue/cell line, preservation method |
| Extraction protocol | Kit or reagents; version; operator |
| Enrichment method | Exonuclease-only, exonuclease+RCA, capture; rationale |
| Spike-in design | Circular and linear controls; input amounts |
| qPCR targets | Nuclear gene, mtDNA, primer sets |
| Library prep parameters | Input mass, cycles, kit version |
| Sequencer/run details | Platform, flow cell/chip, chemistry version |
| Bioinformatics versions | Aligners and detectors; container tags; parameters |
| Acceptance notes | Overlap/correlation metrics; decision flags |
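The metadata table maps naturally onto a typed record that can be validated before a batch is released and exported to JSON for a LIMS. This is a sketch under assumptions: the class name, field names, and example values are illustrative, and treating only the acceptance notes as optional is a design choice, not a requirement from the text.

```python
import json
from dataclasses import asdict, dataclass, fields


@dataclass
class QCRecord:
    sample_id: str
    batch_id: str
    source_and_storage: str
    extraction_protocol: str
    enrichment_method: str
    spike_in_design: str
    qpcr_targets: str
    library_prep: str
    sequencer_run: str
    bioinformatics_versions: str
    acceptance_notes: str = ""  # filled in after review


def missing_fields(record):
    """Names of required fields that are empty or whitespace-only."""
    optional = {"acceptance_notes"}
    return [
        f.name for f in fields(record)
        if f.name not in optional and not getattr(record, f.name).strip()
    ]


# Hypothetical record with the spike-in design not yet entered.
record = QCRecord(
    sample_id="S001", batch_id="B2025-01",
    source_and_storage="cell line; frozen pellet",
    extraction_protocol="kit X v2; operator 07",
    enrichment_method="exonuclease+RCA",
    spike_in_design="",
    qpcr_targets="nuclear gene; mtDNA",
    library_prep="input 50 ng",
    sequencer_run="Illumina; run 123",
    bioinformatics_versions="container tag abc123",
)
print("missing:", missing_fields(record))
print(json.dumps(asdict(record), indent=2)[:60], "...")
```

Running `missing_fields` as a gate before report generation catches incomplete records early, when they are still cheap to fix.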
Pipeline reproducibility:
- Containerize all software (Docker/Singularity); version pin every tool.
- Keep parameter files under Git; export a human-readable "methods" section.
- Produce standardized deliverables: FASTQ, aligned BAM, eccDNA call tables (BED-like), a QC report PDF, and a concise data dictionary.
Template QC summary page (illustrative):
- Overview: project name, batch IDs, platforms used (Illumina + ONT/PacBio), sample list.
- Pre-QC: TapeStation/Bioanalyzer acceptance notes, input quantities, contamination controls.
- Enrichment QC: qPCR nuclear vs mtDNA values, spike-in recovery ratios, digestion parameters.
- Library QC: concentration, fragment distribution, duplication metrics.
- Sequencing QC: depth, mapping rates, junction-support counts.
- Detection summary: number of eccDNAs, EPM, Circle-Enriched Ratio definition and values.
- Reproducibility: replicate overlap, correlation (R, R^2), CV distributions.
- Decisions: accept/reprocess/resequence flags and rationales.
Third-party protocol references: For wet‑lab SOP structure and steps, see JoVE's 2016 Genome‑wide purification of extrachromosomal circular DNA protocol. For training on circularization‑based sequencing workflows, refer to JoVE's 2024 CIRCLE‑seq off‑target assay protocol, keeping in mind that CIRCLE‑seq is a CRISPR off‑target assay and is distinct from eccDNA Circle‑Seq despite the similar name; cite these as illustrative third‑party materials rather than service guarantees.
Conclusion: Turn QC into SOPs You Can Trust
Quality is a process, not a promise. Define explicit, measurable steps: HMW DNA integrity, qPCR-based enrichment checks, evidence-backed post-sequencing metrics, and replicate-level reproducibility—all captured in an auditable report. If you need a practical frame for designing and documenting eccDNA study cohorts, controls, and deliverables, the blueprint-style guidance in eccDNA study design considerations can help translate these QC standards into project plans.
Author
Yang H. — Senior Scientist, CD Genomics; University of Florida.
Yang is a genomics researcher with over 10 years of research experience in genetics, molecular and cellular biology, sequencing workflows, and bioinformatic analysis. Skilled in both laboratory techniques and data interpretation, Yang supports RUO study design and NGS-based projects.
References:
- Quantitative assessment reveals the dominance of endogenous eccDNAs in the human genome — Mouakkad-Montoya et al., PNAS (2021). DOI: 10.1073/pnas.2102842118.
- eccDNA-pipe: an integrated pipeline for identification, analysis and annotation of extrachromosomal circular DNA — Fang et al., Briefings in Bioinformatics (2024). DOI: 10.1093/bib/bbae034.
- ecc_finder: A Robust and Accurate Tool for Detecting Extrachromosomal Circular DNA — Zhang et al., Frontiers in Plant Science (2021). DOI: 10.3389/fpls.2021.743742.
- Extrachromosomal circular DNA: Current status and future perspectives — Zhao et al., eLife (2022). DOI: 10.7554/eLife.81412.
- Parallel sequencing of extrachromosomal circular DNAs and full-length transcriptomes at single-cell resolution (scEC&T-seq) — González et al., Nature Communications/PMC (2023).
- Extrachromosomal circular DNA as a novel biomarker for the progression of colorectal cancer — Qiu et al., Molecular Medicine (2025). DOI: 10.1186/s10020-025-01164-y.
- EccDNA atlas in male mice reveals features protecting genes against transcription-induced eccDNA formation — Liang et al., Nature Communications (2025). DOI: 10.1038/s41467-025-57042-y.
- Microhomology-Mediated Circular DNA Formation from Genomic DNA — Hu et al., eLife reviewed preprint (2023).
- A distinct circular DNA profile intersects with proteome changes in disease contexts — Gerovska et al., 2023, PMC10498603.
- Quantitative Analysis with the Agilent TapeStation Systems — Agilent Technologies, Technical Overview 5994-8050EN (2025).
- Sample Quality Control in Agilent NGS Solutions (TapeStation/Bioanalyzer) — Agilent Technologies, Application Note 5994-0127EN (2018).
- Circle-Map repository — Prada-Luengo et al., GitHub (2020–2025).