TL;DR – What an eccDNA bioinformatics pipeline actually does
An eccDNA bioinformatics pipeline is the end-to-end analysis workflow that converts raw Circle-seq or WGS reads into high-confidence circular DNA calls, annotations, and visual reports that teams can trust. In this article, we walk through each step, show how Circle-Map and CReSIL fit into eccDNA analysis, and explain how a standardized pipeline reduces noise, accelerates projects, and connects directly to validation and decision-making.
Figure 1. Overview of an eccDNA bioinformatics pipeline, from Circle-seq/WGS reads through QC, Circle-Map and CReSIL calling, filtering, annotation, and validation to support data-driven decisions.
An eccDNA bioinformatics pipeline is a structured sequence of QC, mapping, calling, filtering, and annotation steps tailored to circular DNA detection. Unlike many standard variant-calling workflows, eccDNA analysis must deal with chimeric reads, rolling circle amplification (RCA) bias, and uneven coverage that can easily generate false circles if not handled carefully.
Bioinformatics cores and CRO data teams often encounter eccDNA for the first time in late-stage projects. The sequencing data have already been generated, but the analysis scripts are still experimental. Without a clear eccDNA bioinformatics pipeline, results become hard to reproduce, and reviewers quickly question the robustness of the findings.
Typical pain points include:
These problems are rarely fixed by "one more run" of Circle-Map or CReSIL alone. They are solved by designing an eccDNA analysis workflow that defines every step, from raw FASTQ to final report.
Typical failure modes in eccDNA analysis include unstable call sets, inflated artifact counts, and poor cross-sample comparability. They usually reflect pipeline design choices rather than flaws in the eccDNA callers themselves.
Common issues we see in Circle-seq and WGS eccDNA projects are:
Recognizing these patterns early is critical. They are strong signals that the eccDNA pipeline needs more structure, more QC checkpoints, and clearer filtering rules.
A mature eccDNA bioinformatics pipeline provides transparency, reproducibility, and clear decision points. It does not promise "perfect" circle detection, but it makes the limitations visible and manageable.
In practice, a robust eccDNA pipeline should provide:
This is the foundation on which Circle-Map eccDNA analysis, CReSIL eccDNA calling, and downstream statistics can be safely layered.
An eccDNA bioinformatics pipeline is best understood as a playbook of ordered steps that turn raw reads into interpretable eccDNA profiles. The core stages are: QC → Mapping → eccDNA calling → Filtering → Annotation → Downstream statistics and visualization.
A useful way to present this to project stakeholders is as a simple flow:
Raw reads → QC & trimming → Genome mapping → eccDNA caller (Circle-Map / CReSIL) → Filtering → Annotation → Multi-sample analysis & plots
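As a minimal sketch of this flow, each stage can be written as a named function applied in a fixed order, with a log of what ran. The stage bodies here are hypothetical placeholders (simple lambdas on toy data), not wrappers around real QC tools, aligners, or callers.

```python
from typing import Callable, Dict, List, Tuple

# Each stage takes the running state and returns an updated copy.
Stage = Tuple[str, Callable[[Dict], Dict]]

def run_pipeline(stages: List[Stage], state: Dict) -> Tuple[Dict, List[str]]:
    """Run stages in order and return the final state plus an audit log."""
    log: List[str] = []
    for name, step in stages:
        state = step(dict(state))
        log.append(name)
    return state, log

# Hypothetical placeholder stages on toy data.
STAGES: List[Stage] = [
    ("qc_trim", lambda s: {**s, "reads": [r for r in s["reads"] if len(r) >= 5]}),
    ("mapping", lambda s: {**s, "mapped": len(s["reads"])}),
    ("calling", lambda s: {**s, "calls": [("chr1", 100, 900)] if s["mapped"] else []}),
    ("filtering", lambda s: {**s, "calls": [c for c in s["calls"] if c[2] - c[1] >= 100]}),
]

final_state, audit_log = run_pipeline(STAGES, {"reads": ["ACGTACGT", "ACG"]})
```

Keeping the stage order in one list makes it easy to show stakeholders exactly which steps ran for a given report.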
Figure 2. Schematic overview of the ECCsplorer pipeline, highlighting preparation, mapping, clustering, and comparative modules and their typical outputs for eccDNA candidates (Mann L. et al. (2022) BMC Bioinformatics).
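Raw data QC for eccDNA sequencing checks whether Circle-seq or WGS libraries have the quality needed for reliable eccDNA detection. Standard NGS metrics still apply, but some are read differently in the eccDNA context: for example, the uneven coverage that rolling circle amplification produces would be a warning sign in a standard WGS library.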
Key checks include:
From experience, projects that skip rigorous QC often spend more time explaining odd results than interpreting biology. Building QC reports into the eccDNA analysis workflow from day one saves time later.
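To illustrate what an early QC checkpoint might compute, the sketch below reads a FASTQ stream and reports read count, mean per-read Phred quality, and the fraction of exact-duplicate sequences. It assumes a plain four-line-per-record FASTQ with Phred+33 encoding; the function names are ours, and dedicated QC tools report far more than this.

```python
import io
from collections import Counter

def fastq_records(handle):
    """Yield (sequence, quality_string) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line
        qual = handle.readline().strip()
        yield seq, qual

def qc_summary(handle, phred_offset=33):
    """Summarize read count, mean read quality, and exact-duplicate fraction."""
    n = 0
    total_quality = 0.0
    sequences = Counter()
    for seq, qual in fastq_records(handle):
        n += 1
        total_quality += sum(ord(c) - phred_offset for c in qual) / max(len(qual), 1)
        sequences[seq] += 1
    return {
        "reads": n,
        "mean_read_quality": total_quality / n if n else 0.0,
        "duplicate_fraction": 1 - len(sequences) / n if n else 0.0,
    }

# Toy input: two identical high-quality reads and one low-quality read.
example = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\nIIII\n@r3\nTTTT\n+\n!!!!\n")
summary = qc_summary(example)
```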
Mapping strategies for circular DNA determine how well eccDNA junction reads are captured. eccDNA callers depend on accurate alignment of junction-supporting reads; for short-read callers such as Circle-Map, that means soft-clipped reads and discordant pairs.
Important considerations include:
For Circle-seq eccDNA data analysis, we often recommend a stable set of mapping parameters that have been validated across multiple projects. Changing aligners midway through a study can introduce artificial differences in eccDNA profiles.
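Soft-clipped reads can be recognized from the CIGAR string of each alignment. The sketch below parses a CIGAR string with the standard SAM operation letters and reports clipped lengths at each end; the `min_clip` threshold of 20 bases is an illustrative assumption, not a recommended setting.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def soft_clips(cigar):
    """Return (left_clip, right_clip) soft-clipped lengths of a CIGAR string."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips (H) may sit outside soft clips, so strip them first.
    while ops and ops[0][1] == "H":
        ops.pop(0)
    while ops and ops[-1][1] == "H":
        ops.pop()
    left = ops[0][0] if ops and ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return left, right

def is_junction_candidate(cigar, min_clip=20):
    """Flag reads whose longest soft clip reaches the assumed threshold."""
    left, right = soft_clips(cigar)
    return max(left, right) >= min_clip
```

For example, `soft_clips("30S70M")` reports a 30-base clip on the left, which is the kind of read a caller would inspect as possible junction evidence.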
An eccDNA caller is a specialized algorithm that inspects mapped reads to identify circular DNA junctions. Circle-Map and CReSIL are two widely used tools that implement different strategies for eccDNA detection.
Selecting which eccDNA caller to run, and in what order, is a key design choice in any eccDNA bioinformatics pipeline.
Filtering and prioritizing eccDNA calls removes noise and highlights biologically plausible circles. The filtering strategy should be explicit and defensible.
Typical filters include:
Instead of chasing every possible circle, the goal is to define a "high-confidence" eccDNA set that can support downstream hypotheses and validation.
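One way to keep filtering explicit and defensible is to record why each call was rejected. The sketch below is a minimal example under assumed field names and thresholds (`min_split`, `min_score`, `min_len`, `max_len` are all hypothetical defaults); real values should come from your own benchmarking.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CircleCall:
    chrom: str
    start: int
    end: int
    split_reads: int
    discordant_pairs: int
    score: float

    @property
    def length(self):
        return self.end - self.start

def filter_calls(calls, min_split=2, min_score=50.0, min_len=100, max_len=1_000_000):
    """Split calls into kept calls and (call, reasons) pairs for rejected ones."""
    kept, rejected = [], []
    for call in calls:
        reasons = []
        if call.split_reads < min_split:
            reasons.append("low_split_read_support")
        if call.score < min_score:
            reasons.append("low_score")
        if not (min_len <= call.length <= max_len):
            reasons.append("length_out_of_range")
        if reasons:
            rejected.append((call, reasons))
        else:
            kept.append(call)
    return kept, rejected

# Toy calls with made-up values.
CALLS = [
    CircleCall("chr1", 1000, 6000, split_reads=5, discordant_pairs=8, score=180.0),
    CircleCall("chr2", 500, 540, split_reads=4, discordant_pairs=2, score=90.0),
    CircleCall("chr3", 2000, 9000, split_reads=1, discordant_pairs=0, score=12.0),
]
kept, rejected = filter_calls(CALLS)
```

Returning the rejection reasons alongside the kept set means the final report can state how many candidates were lost at each filter.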
Annotation and visualization translate raw eccDNA calls into functional insights. This stage often determines whether collaborators and non-bioinformatics stakeholders understand the results.
Common annotation and visualization outputs include:
At this point, an eccDNA bioinformatics pipeline connects naturally to broader eccDNA Research Solutions, where multi-omics integration, pathway analysis, and biological interpretation can be added.
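A basic annotation step can be sketched as an interval overlap between each circle and a gene table. The example assumes half-open coordinates and uses hypothetical gene names; it distinguishes circles that carry a whole gene from those that carry only part of one.

```python
def annotate(circle, genes):
    """Return (gene_name, kind, overlap_bp) for every gene a circle overlaps.

    circle: (chrom, start, end); genes: iterable of (chrom, start, end, name).
    Coordinates are assumed to be half-open.
    """
    chrom, start, end = circle
    hits = []
    for g_chrom, g_start, g_end, name in genes:
        if g_chrom != chrom:
            continue
        overlap = min(end, g_end) - max(start, g_start)
        if overlap <= 0:
            continue
        kind = "full_gene" if start <= g_start and g_end <= end else "partial_gene"
        hits.append((name, kind, overlap))
    return hits

# Hypothetical gene table for illustration only.
GENES = [
    ("chr1", 1200, 1800, "GENE_A"),
    ("chr1", 4500, 7000, "GENE_B"),
    ("chr2", 1000, 2000, "GENE_C"),
]
hits = annotate(("chr1", 1000, 5000), GENES)
```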
Figure 3. Workflow of the TeCD eccDNA Collection Database, showing how eccDNA loci and annotations from many studies are standardized, stored, and exposed through a searchable web interface (Guo J. et al. (2023) BMC Genomics).
Circle-Map is an eccDNA caller that detects circular junctions by re-mapping discordant and soft-clipped reads. For many teams, Circle-Map eccDNA analysis is the first step into dedicated circular DNA bioinformatics.
In practice, Circle-Map performs best when the upstream mapping and QC have been tailored for eccDNA detection. Parameter choices also have a strong impact on sensitivity and specificity.
Circle-Map detects eccDNA by reconstructing junctions where read pairs or soft-clipped segments suggest a circular connection. It evaluates these candidates and assigns a confidence score that reflects read support and mapping quality.
For readers new to Circle-Map, it helps to think of the tool as:
Understanding this logic makes it easier to explain Circle-Map outputs to experimental collaborators and reviewers.
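The discordant-pair part of this logic can be illustrated with a toy example. In a standard paired-end library, the forward-strand read normally maps to the left of its reverse-strand mate; a pair that spans a circle junction appears the other way round, with the reverse-strand read to the left. The sketch below is only an illustration of that idea and is not Circle-Map's implementation; the `Pair` type, the `circle_hint` function, and the fixed `read_len` are assumptions.

```python
from typing import NamedTuple, Optional, Tuple

class Pair(NamedTuple):
    chrom: str   # both mates assumed to map to this chromosome
    pos1: int    # leftmost mapped position of read 1
    rev1: bool   # True if read 1 maps to the reverse strand
    pos2: int    # leftmost mapped position of read 2
    rev2: bool   # True if read 2 maps to the reverse strand

def circle_hint(pair: Pair, read_len: int = 150) -> Optional[Tuple[str, int, int]]:
    """Return a rough candidate interval if the pair faces outward, else None."""
    if pair.rev1 == pair.rev2:
        return None  # same-strand pairs are a different kind of discordance
    fwd_pos = pair.pos2 if pair.rev1 else pair.pos1
    rev_pos = pair.pos1 if pair.rev1 else pair.pos2
    if rev_pos < fwd_pos:
        # Reverse-strand read lies left of the forward-strand read.
        return (pair.chrom, rev_pos, fwd_pos + read_len)
    return None
```

A real caller would combine many such hints with split-read evidence before scoring a candidate; a single outward-facing pair is weak evidence on its own.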
Recommended Circle-Map settings for eccDNA projects depend on library type, read length, and sequencing depth. There is no single "best" configuration, but experience shows that starting from a documented baseline is far safer than tuning everything manually.
Practical suggestions include:
In our eccDNA Sequencing (Circle-seq) projects, we usually integrate Circle-Map into a stable Snakemake or Nextflow pipeline so that re-runs remain consistent over time.
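A documented baseline is easier to enforce when each run records a fingerprint of the parameters it used. The sketch below hashes a parameter dictionary in a key-order-independent way; the parameter names and values are hypothetical placeholders, not recommended Circle-Map settings.

```python
import hashlib
import json

def parameter_fingerprint(params):
    """Return a short, order-independent hash of a parameter dictionary."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]

# Hypothetical baseline; keys and values are placeholders for illustration.
BASELINE = {"aligner": "bwa-mem", "min_mapq": 20, "min_split_reads": 2}
baseline_id = parameter_fingerprint(BASELINE)
```

Storing `baseline_id` next to each call set makes it straightforward to confirm that two runs being compared used the same configuration.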
Debugging Circle-Map outputs means distinguishing real biology from technical artifacts. When results look counterintuitive, we recommend checking the following:
Walking through this checklist with BAM files and QC reports often reveals whether the issue lies in library prep, mapping, or Circle-Map configuration.
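CReSIL is an eccDNA caller designed to improve detection in RCA-based libraries such as Circle-seq. It was originally published for long-read sequencing data, so confirm that your read type matches before adding it to a pipeline. It aims to reduce specific artifacts while retaining sensitivity to real eccDNA junctions.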
In many bioinformatics cores, CReSIL and similar tools are deployed alongside Circle-Map rather than as strict replacements. This complementary approach gives a richer, more conservative view of eccDNA profiles.
CReSIL changes the eccDNA detection workflow by modeling the sequencing patterns created by rolling circle amplification. It looks for junction signals consistent with circular templates while down-weighting patterns likely to be noise.
For Circle-seq eccDNA data analysis, this often leads to:
Positioning CReSIL correctly in your eccDNA bioinformatics pipeline means clarifying when it runs, how its outputs are combined with Circle-Map, and how results are reported.
Circle-Map vs CReSIL comparisons are most useful when framed as complementary strengths rather than a winner-takes-all benchmark. In practical datasets, teams often see:
A simple comparison table in your documentation, with columns for sensitivity, artifact resistance, runtime, and integration effort, supports internal discussions and AI summarization alike.
Using multiple eccDNA callers in one pipeline allows teams to define confidence tiers based on call concordance. This is particularly helpful when designing follow-up experiments or eccDNA Validation Service panels.
Figure 4. Comparison of eccDNA candidates detected by ECCsplorer, Circle-Map, and originally published call sets in Arabidopsis and human circSeq data, including skyline plots, length distributions, and Venn overlaps (Mann L. et al. (2022) BMC Bioinformatics).
Typical strategies include:
Documenting this logic helps collaborators understand why some circles are prioritized for validation while others remain exploratory.
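Concordance between two call sets can be sketched with a reciprocal-overlap rule: two calls match when their shared interval covers at least a given fraction of both. The 0.9 threshold and the tier names below are assumptions for illustration, and the first-match strategy is a simplification.

```python
def reciprocal_overlap(a, b):
    """Smallest fraction of either interval covered by their intersection."""
    if a[0] != b[0]:
        return 0.0
    inter = min(a[2], b[2]) - max(a[1], b[1])
    if inter <= 0:
        return 0.0
    return min(inter / (a[2] - a[1]), inter / (b[2] - b[1]))

def tier_calls(calls_a, calls_b, min_ro=0.9):
    """Assign calls to 'concordant', 'a_only', or 'b_only' tiers."""
    tiers = {"concordant": [], "a_only": [], "b_only": []}
    matched_b = set()
    for a in calls_a:
        hit = next(
            (i for i, b in enumerate(calls_b) if reciprocal_overlap(a, b) >= min_ro),
            None,
        )
        if hit is None:
            tiers["a_only"].append(a)
        else:
            tiers["concordant"].append(a)
            matched_b.add(hit)
    tiers["b_only"] = [b for i, b in enumerate(calls_b) if i not in matched_b]
    return tiers

# Toy call sets: (chrom, start, end), half-open coordinates assumed.
caller_a = [("chr1", 100, 1100), ("chr2", 500, 900)]
caller_b = [("chr1", 120, 1090), ("chr3", 0, 400)]
tiers = tier_calls(caller_a, caller_b)
```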
Designing a robust eccDNA pipeline for Circle-seq projects means thinking about reference selection, library bias, and multi-sample comparison from day one. It turns a collection of tools into a reusable framework that supports many experiments.
A well-designed pipeline also makes it easier to plug in external partners such as eccDNA Research Solutions, where additional analyses or custom modules may be added.
Reference genomes, blacklists, and mappability tracks are foundational to any eccDNA bioinformatics pipeline. They define which regions of the genome can be trusted for mapping and which require caution.
Practical recommendations:
These practices reduce the risk of misinterpreting technical artifacts as biological hotspots.
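A simple way to apply a blacklist is to compute what fraction of each circle falls inside blacklisted regions and flag calls above a cutoff. The sketch below merges overlapping blacklist intervals before measuring coverage; the coordinates are made up, and any cutoff applied to the result would be a project-specific choice.

```python
def blacklisted_fraction(circle, blacklist):
    """Fraction of a circle (chrom, start, end) covered by blacklist intervals."""
    chrom, start, end = circle
    if end <= start:
        raise ValueError("circle must have positive length")
    # Clip each overlapping blacklist interval to the circle, then sort.
    clipped = sorted(
        (max(start, s), min(end, e))
        for c, s, e in blacklist
        if c == chrom and s < end and e > start
    )
    covered = 0
    cur_start = cur_end = None
    for s, e in clipped:
        if cur_end is None or s > cur_end:
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            cur_end = max(cur_end, e)  # extend the current merged interval
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / (end - start)

# Toy blacklist with overlapping and off-target intervals.
BLACKLIST = [("chr1", 100, 300), ("chr1", 200, 400), ("chr2", 0, 1000), ("chr1", 900, 1200)]
fraction = blacklisted_fraction(("chr1", 0, 1000), BLACKLIST)
```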
Handling rolling circle amplification bias and PCR duplicates is critical for Circle-seq data. RCA can generate highly uneven coverage across circles, and PCR can amplify these patterns further.
To control for these effects:
From experience, teams that track RCA and PCR effects explicitly are better positioned to defend their eccDNA findings in publications and high-stakes scientific reviews.
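Two simple numbers help track these effects per sample: the duplicate rate among alignments and the coefficient of variation of coverage across a circle. The sketch below assumes duplicates are defined by identical (chromosome, position, strand, mate position) keys, which is a common simplification rather than the exact rule any particular tool uses.

```python
from statistics import mean, pstdev

def duplicate_rate(alignments):
    """Fraction of alignments whose key has already been seen."""
    seen = set()
    duplicates = 0
    for key in alignments:
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return duplicates / len(alignments) if alignments else 0.0

def coverage_cv(depths):
    """Coefficient of variation of per-base depth; higher means less even."""
    if not depths:
        raise ValueError("depths must not be empty")
    average = mean(depths)
    return pstdev(depths) / average if average else float("inf")

# Toy data: three copies of one alignment key plus one unique alignment.
ALIGNMENTS = [("chr1", 100, "+", 350)] * 3 + [("chr1", 200, "-", 20)]
dup_rate = duplicate_rate(ALIGNMENTS)
```

Reporting both values for every sample makes it easier to see when a striking circle is supported mostly by amplified copies of the same fragment.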
Multi-sample comparison and differential eccDNA analysis turn single-sample call sets into project-level insights. This can include case–control contrasts, treatment responses, or time-course designs.
Figure 5. Multi-sample eccDNA/ecDNA patterns in osteoporosis versus normal bone, including PCA, sample correlations, chromosomal eccDNA load, Venn overlaps, and length distributions (Zhu Q. et al. (2023) Aging).
Best practices include:
In many of our Circle-seq collaborations, bioinformatics cores ultimately deliver web-based dashboards where experimental teams can explore eccDNA patterns without touching raw files.
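Two small calculations often underpin multi-sample plots: normalizing eccDNA counts by sequencing depth and measuring how much two samples' call sets overlap. The sketch below uses circles per million mapped reads and a Jaccard index on exact coordinates; it assumes calls have already been harmonized so that the same circle has the same coordinates in every sample.

```python
def eccdna_per_million(n_circles, mapped_reads):
    """Number of eccDNA calls per million mapped reads."""
    if mapped_reads <= 0:
        raise ValueError("mapped_reads must be positive")
    return n_circles * 1_000_000 / mapped_reads

def jaccard(calls_a, calls_b):
    """Shared calls divided by all distinct calls across two samples."""
    a, b = set(calls_a), set(calls_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Toy call sets with made-up coordinates.
sample_1 = [("chr1", 100, 900), ("chr2", 50, 400), ("chr5", 10, 700)]
sample_2 = [("chr1", 100, 900), ("chr7", 300, 800)]
similarity = jaccard(sample_1, sample_2)
```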
Quality control, validation, and reporting standards are what turn an eccDNA bioinformatics pipeline into a trusted decision tool. They also make it easier for external reviewers to evaluate your work.
Instead of adding QC at the end, we recommend designing QC and validation as first-class modules in the eccDNA analysis workflow.
Key QC metrics in an eccDNA bioinformatics pipeline span from raw reads to final call sets. Monitoring them across runs reveals trends and outliers early.
Useful metrics include:
Summarizing these metrics in standardized tables and plots not only builds authority but also simplifies cross-study comparisons.
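Outliers across runs can be flagged with a robust rule that does not depend on the mean. The sketch below uses a modified z-score based on the median and median absolute deviation, with the conventional 0.6745 constant and 3.5 cutoff; the metric values are invented for illustration.

```python
from statistics import median

def mad_outliers(values, threshold=3.5):
    """Return sample names whose modified z-score exceeds the threshold.

    values: dict mapping sample name to a single QC metric.
    """
    med = median(values.values())
    mad = median(abs(v - med) for v in values.values())
    if mad == 0:
        return []  # no spread, so nothing can be called an outlier
    return sorted(
        name
        for name, v in values.items()
        if 0.6745 * abs(v - med) / mad > threshold
    )

# Invented mapping rates for five samples.
MAPPING_RATE = {"s1": 0.92, "s2": 0.94, "s3": 0.93, "s4": 0.91, "s5": 0.40}
flagged = mad_outliers(MAPPING_RATE)
```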
Figure 6. Length distributions, chromosomal locations, and genomic-region classes of eccDNA/ecDNA detected in osteoporotic versus normal human bone tissue (Zhu Q. et al. (2023) Aging).
Experimental validation for eccDNA typically involves targeted assays that confirm predicted junctions. eccDNA qPCR is a common approach and fits naturally with a bioinformatics-driven shortlisting process.
A practical validation workflow is:
Our eccDNA Validation Service is often used at this stage to translate computational findings into wet-lab confirmation, especially for high-value therapeutic or biomarker candidates.
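Junction-spanning assays start from a template that joins the end of the linearized circle sequence to its start. The sketch below builds that template and checks whether a planned amplicon crosses the junction; the function names and the 100-base flank are assumptions, and actual primer design would still need a dedicated tool.

```python
def junction_template(circle_seq, flank=100):
    """Return (template, junction_index) for a linearized circle sequence.

    The template is the last `flank` bases followed by the first `flank`
    bases, so the junction sits at `junction_index` within the template.
    """
    if len(circle_seq) < 2:
        raise ValueError("circle sequence is too short")
    flank = min(flank, len(circle_seq) // 2)
    return circle_seq[-flank:] + circle_seq[:flank], flank

def spans_junction(fwd_start, rev_end, junction_index):
    """True if an amplicon from fwd_start to rev_end crosses the junction."""
    return fwd_start < junction_index < rev_end

# Toy sequence for illustration.
template, junction = junction_template("AAAACCCCGGGGTTTT", flank=4)
```

An amplicon designed on this template only yields product if the two ends of the linear sequence are physically joined, which is what distinguishes a circle from its linear genomic locus.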
Reporting eccDNA results for collaborators, reviewers, and regulators requires a structure that separates methods, QC, and biological interpretation. A clear template also helps generative AI engines extract key facts reliably.
A typical report might include:
This level of transparency shows that the pipeline is not a black box but a documented, auditable process.
Partnering with CD Genomics for eccDNA analysis connects your raw data to a validated eccDNA bioinformatics pipeline, sequencing expertise, and orthogonal validation options. For bioinformatics cores, CRO data teams, and advanced eccDNA researchers, this can accelerate projects and standardize workflows.
End-to-end eccDNA sequencing and analysis packages combine library preparation, eccDNA Sequencing (Circle-seq), and a standardized eccDNA bioinformatics pipeline based on Circle-Map, CReSIL, or both. This reduces integration overhead for teams that prefer a single partner from bench to report.
Typical deliverables include:
All services are for research use only and are tailored to each project's study design.
Custom eccDNA Research Solutions support teams that already have data but need help with pipeline design, optimization, or multi-omics integration. This can include:
For projects prioritizing experimental confirmation, our eccDNA Validation Service provides targeted validation assays based on the bioinformatics shortlist.
Getting started with an eccDNA bioinformatics pipeline does not require a full-scale commitment. Many teams begin with a pilot analysis of a small dataset to evaluate fit and reporting.
A typical engagement path is:
From there, the pipeline can be scaled to full cohorts and integrated into your standard operating procedures.
An eccDNA bioinformatics pipeline is a step-by-step analysis workflow that takes raw Circle-seq or WGS reads and produces a curated list of circular DNA molecules with annotations and QC metrics. It combines standard NGS steps (QC, mapping) with specialized eccDNA callers such as Circle-Map and CReSIL plus downstream visualization and reporting.
Circle-Map and CReSIL address slightly different needs. Circle-Map is a flexible eccDNA caller that works well across many datasets, while CReSIL focuses on improving robustness in RCA-based Circle-seq libraries. Many teams run both callers in the same eccDNA analysis workflow, treating overlapping circles as high confidence and using tool-specific calls for exploratory work.
You can usually send either, but raw FASTQ files give more flexibility for mapping and QC, especially if the original alignment was not optimized for circular DNA. When only BAM files are available, the eccDNA bioinformatics pipeline will focus on caller choice, filtering, and annotation using the existing alignments.
There is no strict minimum, but eccDNA projects benefit from biological replicates and well-matched controls. Even small pilot sets with 3–5 samples per group can reveal clear patterns if the study design and eccDNA pipeline are sound. Larger cohorts allow more powerful statistics and more confident interpretation of subtle differences.
eccDNA calls are often validated using targeted assays such as eccDNA qPCR with junction-spanning primers or Sanger sequencing of specific junctions. A common strategy is to select candidates from different confidence tiers defined by Circle-Map and CReSIL outputs, then use an eccDNA Validation Service to run orthogonal experiments that confirm or refine the computational results.
References