Bioinformatics Pipelines for eccDNA Detection: From Circle-Map to CReSIL

TL;DR – What an eccDNA bioinformatics pipeline actually does

An eccDNA bioinformatics pipeline is the end-to-end analysis workflow that converts raw Circle-seq or WGS reads into high-confidence circular DNA calls, annotations, and visual reports that teams can trust. In this article, we walk through each step, show how Circle-Map and CReSIL fit into eccDNA analysis, and explain how a standardized pipeline reduces noise, accelerates projects, and connects directly to validation and decision-making.

Figure 1. Overview of an eccDNA bioinformatics pipeline, from Circle-seq/WGS reads through QC, Circle-Map and CReSIL calling, filtering, annotation, and validation to support data-driven decisions.

eccDNA Bioinformatics Pipelines: From Pain Points to Reliable Results

An eccDNA bioinformatics pipeline is a structured sequence of QC, mapping, calling, filtering, and annotation steps tailored to circular DNA detection. Unlike many standard variant-calling workflows, eccDNA analysis must deal with chimeric reads, rolling circle amplification (RCA) bias, and uneven coverage that can easily generate false circles if not handled carefully.

Bioinformatics cores and CRO data teams often meet eccDNA for the first time in late-stage projects. The sequencing data have already been generated, but the analysis scripts are still experimental. Without a clear eccDNA bioinformatics pipeline, results become hard to reproduce, and reviewers quickly question the robustness of the findings.

Typical pain points include:

  • Call sets that collapse when thresholds change slightly
  • Strong "hotspots" that turn out to be mapping artifacts
  • Circle-seq replicates that show little overlap in eccDNA calls

These problems are rarely fixed by "one more run" of Circle-Map or CReSIL alone. They are solved by designing an eccDNA analysis workflow that defines every step, from raw FASTQ to final report.

Typical failure modes in eccDNA analysis projects

Typical failure modes in eccDNA analysis include unstable call sets, inflated artifact counts, and poor cross-sample comparability. They usually reflect pipeline design choices rather than flaws in the eccDNA callers themselves.

Common issues we see in Circle-seq and WGS eccDNA projects are:

  • Unreproducible eccDNA calls – Small parameter changes in Circle-Map or CReSIL produce very different call lists, suggesting borderline read support or poor QC.
  • Inflated artifacts in low-mappability regions – Circles cluster in centromeres, telomeres, or repeats, where mapping is unreliable and chimeric reads accumulate.
  • Batch-driven differences – Samples processed on different days or sequencers show large shifts in circle size distributions and counts, masking true biology.

Recognizing these patterns early is critical. They are strong signals that the eccDNA pipeline needs more structure, more QC checkpoints, and clearer filtering rules.

What a mature eccDNA bioinformatics pipeline should guarantee

A mature eccDNA bioinformatics pipeline guarantees transparency, reproducibility, and clear decision points. It does not promise "perfect" circle detection, but it makes the limitations visible and manageable.

In practice, a robust eccDNA pipeline should provide:

  • Documented steps from QC to reporting – Each stage has defined tools, parameters, and expected outputs.
  • Explicit thresholds and filters – Read support, mapping quality, blacklist usage, and size cutoffs are predefined and justified.
  • Reusable configs and workflows – Pipelines can be applied to new batches with minimal changes, supporting long-term eccDNA research programs.

This is the foundation on which Circle-Map eccDNA analysis, CReSIL eccDNA calling, and downstream statistics can be safely layered.
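As a minimal sketch of what "explicit thresholds" and "reusable configs" can look like in practice, the snippet below stores filter settings in one immutable object and serializes them for versioning. The class name, field names, and every default value are hypothetical illustrations, not recommended settings or options of any specific tool.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EccDnaFilterConfig:
    """Hypothetical filter settings; values are placeholders, not recommendations."""
    reference_build: str = "hg38"
    min_split_reads: int = 2
    min_mapq: int = 20
    min_circle_bp: int = 100
    max_circle_bp: int = 10_000_000
    blacklist_bed: str = "blacklist.bed"


def dump_config(config: EccDnaFilterConfig) -> str:
    """Serialize deterministically so the config can be stored with each run."""
    return json.dumps(asdict(config), indent=2, sort_keys=True)


# Override only what differs for this project; everything else stays documented.
project_config = EccDnaFilterConfig(min_split_reads=3)
print(dump_config(project_config))
```

Keeping the config frozen means a threshold cannot drift silently mid-project; a change requires a new, separately recorded object.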

Stepwise Overview of an eccDNA Bioinformatics Pipeline

An eccDNA bioinformatics pipeline is best understood as a playbook of ordered steps that turn raw reads into interpretable eccDNA profiles. The core stages are: QC → Mapping → eccDNA calling → Filtering → Annotation → Downstream statistics and visualization.

A useful way to present this to project stakeholders is as a simple flow:

Raw reads → QC & trimming → Genome mapping → eccDNA caller (Circle-Map / CReSIL) → Filtering → Annotation → Multi-sample analysis & plots

Figure 2. Schematic overview of the ECCsplorer pipeline, highlighting preparation, mapping, clustering, and comparative modules and their typical outputs for eccDNA candidates (Mann L. et al., 2022, BMC Bioinformatics).

Step 1 – Raw data QC and pre-processing

Raw data QC for eccDNA sequencing checks whether Circle-seq or WGS libraries have the quality needed for reliable eccDNA detection. Standard NGS metrics still apply, but their interpretation changes slightly in the eccDNA context.

Key checks include:

  • Read quality profiles and adapter contamination
  • Insert size distributions, especially for Circle-seq libraries
  • Levels of PCR duplicates and library complexity

From experience, projects that skip rigorous QC often spend more time explaining odd results than interpreting biology. Building QC reports into the eccDNA analysis workflow from day one saves time later.
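To make the QC checks above concrete, here is a small, self-contained sketch that summarizes read quality and exact-sequence duplication from FASTQ text. It assumes Phred+33 encoding, four-line records, and non-empty input; the function names and the mean-quality cutoff of 20 are illustrative assumptions. Production pipelines would normally rely on dedicated QC tools rather than a script like this.

```python
from statistics import mean


def phred_scores(quality_line: str, offset: int = 33) -> list[int]:
    """Decode a FASTQ quality string (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_line]


def summarize_fastq(fastq_text: str, min_mean_q: float = 20.0) -> dict:
    """Count reads, reads passing a mean-quality cutoff, and exact duplicates."""
    lines = [ln.strip() for ln in fastq_text.strip().splitlines()]
    records = [lines[i:i + 4] for i in range(0, len(lines), 4)]
    seen, duplicates, passing = set(), 0, 0
    for _header, seq, _plus, qual in records:
        if mean(phred_scores(qual)) >= min_mean_q:
            passing += 1
        if seq in seen:  # same sequence as an earlier read
            duplicates += 1
        seen.add(seq)
    total = len(records)
    return {
        "reads": total,
        "pass_fraction": passing / total,
        "duplicate_fraction": duplicates / total,
    }


example = "@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\nIIII\n@r3\nTTTT\n+\n!!!!\n"
print(summarize_fastq(example))
```

Exact-sequence matching is only a rough proxy for library complexity; position-based duplicate marking after alignment is the more common approach.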

Step 2 – Mapping strategies for circular DNA

Mapping strategies for circular DNA determine how well eccDNA junction reads are captured. eccDNA callers such as Circle-Map and CReSIL depend on accurate alignment of soft-clipped reads and discordant pairs.

Important considerations include:

  • Choice of aligner and parameters that preserve soft-clipped segments
  • Use of the correct reference genome build and annotation version
  • Consistent handling of multi-mapped reads and low-complexity regions

For Circle-seq eccDNA data analysis, we often recommend a stable set of mapping parameters that have been validated across multiple projects. Changing aligners midway through a study can introduce artificial differences in eccDNA profiles.
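Because junction evidence often sits in soft-clipped read ends, it can help to check how many alignments actually carry clips after mapping. The sketch below parses SAM CIGAR strings with the standard library only; the helper names and the 20 bp minimum clip length are assumptions chosen for illustration.

```python
import re

# SAM CIGAR operations: length followed by one of MIDNSHP=X.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clip_lengths(cigar: str) -> tuple[int, int]:
    """Return (left, right) soft-clipped lengths from a SAM CIGAR string."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips (H) may sit outside soft clips; ignore them at the ends.
    core = [(n, op) for n, op in ops if op != "H"]
    if not core:
        return (0, 0)
    left = core[0][0] if core[0][1] == "S" else 0
    right = core[-1][0] if len(core) > 1 and core[-1][1] == "S" else 0
    return (left, right)


def is_junction_candidate(cigar: str, min_clip: int = 20) -> bool:
    """Flag reads whose longest soft clip reaches an illustrative minimum length."""
    return max(soft_clip_lengths(cigar)) >= min_clip


for cigar in ["100M", "30S70M", "70M30S", "5H10S85M"]:
    print(cigar, soft_clip_lengths(cigar), is_junction_candidate(cigar))
```

If almost no reads show clips of useful length, the aligner settings may be discarding the very signal the caller needs.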

Step 3 – eccDNA calling with Circle-Map, CReSIL, and other tools

An eccDNA caller is a specialized algorithm that inspects mapped reads to identify circular DNA junctions. Circle-Map and CReSIL are two widely used tools that implement different strategies for eccDNA detection.

  • Circle-Map re-analyzes discordant and soft-clipped reads to locate circular junctions and assigns confidence scores to each candidate circle.
  • CReSIL was developed for long-read sequencing data (see reference 5) and is designed to handle eccDNA libraries prepared with rolling circle amplification, focusing on reducing characteristic artifacts and improving detection in noisy datasets.

Selecting which eccDNA caller to run, and in what order, is a key design choice in any eccDNA bioinformatics pipeline.

Step 4 – Filtering and prioritizing high-confidence eccDNA

Filtering and prioritizing eccDNA calls removes noise and highlights biologically plausible circles. The filtering strategy should be explicit and defensible.

Typical filters include:

  • Minimum numbers of supporting reads per junction
  • Mapping quality thresholds and exclusion of low-mappability regions
  • Blacklists for known artifact-prone loci or technical sequences
  • Consistency filters across biological replicates or technical duplicates

Instead of chasing every possible circle, the goal is to define a "high-confidence" eccDNA set that can support downstream hypotheses and validation.
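The filters listed above can be sketched as a single predicate applied to each candidate. This is a simplified illustration under stated assumptions: the `CircleCall` fields, the half-open coordinate convention, and all threshold defaults are hypothetical and do not mirror the output format of Circle-Map or CReSIL.

```python
from dataclasses import dataclass


@dataclass
class CircleCall:
    chrom: str
    start: int        # 0-based, half-open interval [start, end)
    end: int
    split_reads: int  # junction-supporting reads
    mean_mapq: float


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def passes_filters(call, blacklist, min_split_reads=2, min_mapq=20.0,
                   min_bp=100, max_bp=1_000_000) -> bool:
    """Apply read-support, mapping-quality, size, and blacklist filters."""
    size = call.end - call.start
    if call.split_reads < min_split_reads or call.mean_mapq < min_mapq:
        return False
    if not (min_bp <= size <= max_bp):
        return False
    return not any(
        chrom == call.chrom and overlaps(call.start, call.end, s, e)
        for chrom, s, e in blacklist
    )


blacklist = [("chr3", 0, 200)]  # hypothetical artifact-prone region
calls = [
    CircleCall("chr1", 1_000, 3_000, 5, 40.0),
    CircleCall("chr2", 500, 2_500, 1, 60.0),   # too few supporting reads
    CircleCall("chr3", 100, 900, 6, 35.0),     # overlaps the blacklist
]
high_confidence = [c for c in calls if passes_filters(c, blacklist)]
print(high_confidence)
```

Writing the rules as one function makes it straightforward to report how many calls each filter removes, which supports the "explicit and defensible" requirement.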

Step 5 – Annotation, visualization, and downstream eccDNA analysis

Annotation and visualization translate raw eccDNA calls into functional insights. This stage often determines whether collaborators and non-bioinformatics stakeholders understand the results.

Common annotation and visualization outputs include:

  • Overlaps with genes, promoters, enhancers, and repeats
  • Circle size distributions and genomic feature enrichment plots
  • Genome browser tracks and circos-style displays of eccDNA hotspots
  • Summary tables for differential eccDNA between conditions
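A feature-overlap annotation like the first item above reduces to interval arithmetic. The following sketch reports which features a circle overlaps and what fraction of the circle each one covers. The tuple layouts and feature names are hypothetical; real pipelines typically use an interval library or BED tooling and an indexed annotation instead of a linear scan.

```python
def overlap_bp(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Length of the overlap between two half-open intervals (0 if disjoint)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def annotate_circle(circle, features):
    """Return (name, feature_type, fraction_of_circle_covered) per overlapping feature."""
    chrom, start, end = circle
    hits = []
    for f_chrom, f_start, f_end, name, kind in features:
        if f_chrom != chrom:
            continue
        shared = overlap_bp(start, end, f_start, f_end)
        if shared > 0:
            hits.append((name, kind, shared / (end - start)))
    return hits


# Hypothetical annotation rows: (chrom, start, end, name, feature_type)
features = [
    ("chr1", 1_000, 5_000, "GENE_A", "gene"),
    ("chr1", 800, 1_000, "GENE_A_promoter", "promoter"),
    ("chr2", 0, 10_000, "GENE_B", "gene"),
]
print(annotate_circle(("chr1", 900, 1_900), features))
```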

At this point, an eccDNA bioinformatics pipeline connects naturally to broader eccDNA Research Solutions, where multi-omics integration, pathway analysis, and biological interpretation can be added.

Figure 3. Workflow of the TeCD eccDNA Collection Database, showing how eccDNA loci and annotations from many studies are standardized, stored, and exposed through a searchable web interface (Guo J. et al., 2023, BMC Genomics).

Circle-Map in Practice: Tuning an eccDNA Caller for Circle-Seq

Circle-Map is an eccDNA caller that detects circular junctions by re-mapping discordant and soft-clipped reads. For many teams, Circle-Map eccDNA analysis is the first step into dedicated circular DNA bioinformatics.

In practice, Circle-Map performs best when the upstream mapping and QC have been tailored for eccDNA detection. Parameter choices also have a strong impact on sensitivity and specificity.

How Circle-Map detects eccDNA junctions

Circle-Map detects eccDNA by reconstructing junctions where read pairs or soft-clipped segments suggest a circular connection. It evaluates these candidates and assigns a confidence score that reflects read support and mapping quality.

For readers new to Circle-Map, it helps to think of the tool as:

  • A targeted search for circular junction evidence
  • A scoring system that ranks candidate eccDNA events
  • A filter that converts scores into a final call set

Understanding this logic makes it easier to explain Circle-Map outputs to experimental collaborators and reviewers.
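One piece of junction evidence is easy to illustrate: a paired-end fragment that spans a circle junction maps back to the linear reference with its mates facing outward rather than inward. The sketch below classifies a pair on that basis. It is a conceptual illustration, not Circle-Map's implementation; the `ReadPair` structure is hypothetical and both mates are assumed to map to the same chromosome.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    chrom: str
    pos1: int       # leftmost aligned position of mate 1
    reverse1: bool  # True if mate 1 maps to the reverse strand
    pos2: int       # leftmost aligned position of mate 2
    reverse2: bool  # True if mate 2 maps to the reverse strand


def is_outward_facing(pair: ReadPair) -> bool:
    """True when the leftmost mate is on the reverse strand and the other is forward.

    Ordinary inward-facing pairs have the leftmost mate on the forward strand.
    """
    if pair.reverse1 == pair.reverse2:
        return False  # same-strand pairs are a different class of discordance
    left_is_reverse = pair.reverse1 if pair.pos1 <= pair.pos2 else pair.reverse2
    return left_is_reverse


normal = ReadPair("chr1", 100, False, 300, True)
junction_like = ReadPair("chr1", 5_000, False, 1_000, True)
print(is_outward_facing(normal), is_outward_facing(junction_like))
```

A caller combines this kind of evidence with soft-clipped reads before scoring a candidate, so a single outward-facing pair should not be read as a call on its own.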

Recommended Circle-Map settings for common eccDNA projects

Recommended Circle-Map settings for eccDNA projects depend on library type, read length, and sequencing depth. There is no single "best" configuration, but experience shows that starting from a documented baseline is far safer than tuning everything manually.

Practical suggestions include:

  • Fix a standard minimum read support threshold per circle before the project starts.
  • Align Circle-seq and WGS projects to the same reference build when comparing across datasets.
  • Record Circle-Map versions and configuration files with each run for traceability.

In our eccDNA Sequencing (Circle-seq) projects, we usually integrate Circle-Map into a stable Snakemake or Nextflow pipeline so that re-runs remain consistent over time.
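Recording versions and configuration with each run can be as simple as writing a small manifest next to the outputs. The sketch below is one possible shape for such a record; the function name, manifest keys, version string, and config text are placeholders for illustration, not values taken from a real run.

```python
import hashlib
import json


def build_run_manifest(tool: str, version: str, config_text: str,
                       reference_build: str) -> dict:
    """Describe one caller run; the config is fingerprinted with SHA-256."""
    digest = hashlib.sha256(config_text.encode("utf-8")).hexdigest()
    return {
        "tool": tool,
        "version": version,
        "reference_build": reference_build,
        "config_sha256": digest,
    }


# Placeholder values for illustration only.
config_text = "min_split_reads: 3\nmin_mapq: 20\n"
manifest = build_run_manifest("Circle-Map", "x.y.z", config_text, "hg38")
print(json.dumps(manifest, indent=2))
```

Comparing the `config_sha256` values of two runs is a quick way to confirm they used identical settings before comparing their call sets.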

Debugging Circle-Map: what to check when outputs look wrong

Debugging Circle-Map outputs means distinguishing real biology from technical artifacts. When results look counterintuitive, we recommend checking the following:

  • Do samples with unusually low circle counts also show poor QC metrics or low coverage?
  • Are apparent hotspots concentrated in low-mappability or blacklisted regions?
  • Do circle size distributions and genomic feature enrichments match expectations for the model system?

Walking through this checklist with BAM files and QC reports often reveals whether the issue lies in library prep, mapping, or Circle-Map configuration.

CReSIL and New eccDNA Callers: When to Go Beyond Circle-Map

CReSIL is an eccDNA caller developed for long-read sequencing data (see reference 5) and designed to improve detection in RCA-based libraries such as Circle-seq. It aims to reduce specific artifacts while retaining sensitivity to real eccDNA junctions.

In many bioinformatics cores, CReSIL and similar tools are deployed alongside Circle-Map rather than as strict replacements. This complementary approach gives a richer, more conservative view of eccDNA profiles.

What CReSIL changes in the eccDNA detection workflow

CReSIL changes the eccDNA detection workflow by modeling the sequencing patterns created by rolling circle amplification. It looks for junction signals consistent with circular templates while down-weighting patterns likely to be noise.

For Circle-seq eccDNA data analysis, this often leads to:

  • Fewer low-confidence circles in problematic regions
  • Improved robustness in libraries with variable enrichment efficiency
  • More stable call sets across replicates

Positioning CReSIL correctly in your eccDNA bioinformatics pipeline means clarifying when it runs, how its outputs are combined with Circle-Map, and how results are reported.

Circle-Map vs CReSIL: complementary strengths in real datasets

Circle-Map vs CReSIL comparisons are most useful when framed as complementary strengths rather than a winner-takes-all benchmark. In practical datasets, teams often see:

  • Circle-Map providing broad coverage and sensitivity
  • CReSIL offering additional robustness in noisy or challenging libraries
  • Overlaps between tools forming a "core" high-confidence eccDNA set

A simple comparison table in your documentation, with columns for sensitivity, artifact resistance, runtime, and integration effort, supports internal discussions and makes the trade-offs easy to summarize.

Using multiple eccDNA callers in one pipeline

Using multiple eccDNA callers in one pipeline allows teams to define confidence tiers based on call concordance. This is particularly helpful when designing follow-up experiments or eccDNA Validation Service panels.

Figure 4. Comparison of eccDNA candidates detected by ECCsplorer, Circle-Map, and originally published call sets in Arabidopsis and human circSeq data, including skyline plots, length distributions, and Venn overlaps (Mann L. et al., 2022, BMC Bioinformatics).

Typical strategies include:

  • Intersection set – Circles detected by both Circle-Map and CReSIL, treated as high confidence.
  • Union set with tiers – Calls unique to one tool are labeled as lower confidence but kept for exploratory analysis.
  • Caller-specific reports – Separate tables for each tool plus a combined summary, making it easier to track decisions.

Documenting this logic helps collaborators understand why some circles are prioritized for validation while others remain exploratory.
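The tiering strategies above depend on deciding when two callers report "the same" circle. One common convention, assumed here, is a reciprocal-overlap threshold. The sketch below assigns calls to "both", "a_only", and "b_only" tiers; the 0.9 cutoff and the `(chrom, start, end)` tuple layout are illustrative assumptions.

```python
def reciprocal_overlap(a, b) -> float:
    """Smaller of the two fractions of each interval covered by the intersection."""
    (chrom_a, s_a, e_a), (chrom_b, s_b, e_b) = a, b
    if chrom_a != chrom_b:
        return 0.0
    shared = min(e_a, e_b) - max(s_a, s_b)
    if shared <= 0:
        return 0.0
    return min(shared / (e_a - s_a), shared / (e_b - s_b))


def assign_tiers(calls_a, calls_b, min_ro: float = 0.9) -> dict:
    """Split two call sets into concordant and caller-specific tiers."""
    tiers = {"both": [], "a_only": [], "b_only": []}
    matched_b = set()
    for a in calls_a:
        hits = [i for i, b in enumerate(calls_b)
                if reciprocal_overlap(a, b) >= min_ro]
        if hits:
            tiers["both"].append(a)
            matched_b.update(hits)
        else:
            tiers["a_only"].append(a)
    tiers["b_only"] = [b for i, b in enumerate(calls_b) if i not in matched_b]
    return tiers


caller_a = [("chr1", 100, 1_100), ("chr2", 0, 500)]
caller_b = [("chr1", 120, 1_100), ("chr3", 0, 10)]
print(assign_tiers(caller_a, caller_b))
```

The pairwise scan is quadratic in the number of calls, which is acceptable for a sketch; large call sets would benefit from sorting or an interval index.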

Designing a Robust eccDNA Pipeline for Circle-Seq Projects

Designing a robust eccDNA pipeline for Circle-seq projects means thinking about reference selection, library bias, and multi-sample comparison from day one. It turns a collection of tools into a reusable framework that supports many experiments.

A well-designed pipeline also makes it easier to plug in external partners such as eccDNA Research Solutions, where additional analyses or custom modules may be added.

Reference genomes, blacklists, and mappability considerations

Reference genomes, blacklists, and mappability tracks are foundational to any eccDNA bioinformatics pipeline. They define which regions of the genome can be trusted for mapping and which require caution.

Practical recommendations:

  • Use a single reference build for all samples in a project, and clearly document it.
  • Apply community-curated blacklists to remove known artifact-prone regions.
  • Inspect mappability tracks to understand why certain genomic areas show dense eccDNA calls.

These practices reduce the risk of misinterpreting technical artifacts as biological hotspots.

Handling rolling circle amplification bias and PCR duplicates

Handling rolling circle amplification bias and PCR duplicates is critical for Circle-seq data. RCA can generate highly uneven coverage across circles, and PCR can amplify these patterns further.

To control for these effects:

  • Apply duplicate marking carefully and report the fraction of duplicates per sample.
  • Use coverage normalization strategies when comparing circle abundances across conditions.
  • Interpret extreme hotspots in the context of both biology and library preparation notes.

From experience, teams that track RCA and PCR effects explicitly are better positioned to defend their eccDNA findings in publications and high-stakes scientific reviews.
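Two of the quantities discussed here are simple to compute once alignments are summarized: the duplicate fraction per sample and how uneven coverage is along a circle. The sketch below is illustrative only; the fragment key `(chrom, start, end, strand)` and the use of a coefficient of variation over fixed bins are assumptions, not a prescribed method.

```python
from statistics import mean, pstdev


def duplicate_fraction(fragments: list[tuple]) -> float:
    """Fraction of fragments sharing (chrom, start, end, strand) with another one."""
    return 1 - len(set(fragments)) / len(fragments)


def coverage_cv(bin_depths: list[float]) -> float:
    """Coefficient of variation of per-bin depth along one circle (0 = even)."""
    avg = mean(bin_depths)
    return pstdev(bin_depths) / avg if avg > 0 else float("inf")


fragments = [
    ("chr1", 100, 350, "+"),
    ("chr1", 100, 350, "+"),  # positional duplicate of the first fragment
    ("chr1", 400, 650, "-"),
    ("chr2", 10, 260, "+"),
]
print(duplicate_fraction(fragments))
print(coverage_cv([10, 10, 10]), coverage_cv([0, 20]))
```

Reporting both numbers per sample makes it easier to see whether an extreme hotspot coincides with a heavily duplicated or very unevenly amplified library.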

Multi-sample comparison, differential eccDNA, and dashboards

Multi-sample comparison and differential eccDNA analysis turn single-sample call sets into project-level insights. This can include case–control contrasts, treatment responses, or time-course designs.

Figure 5. Multi-sample eccDNA/ecDNA patterns in osteoporosis versus normal bone, including PCA, sample correlations, chromosomal eccDNA load, Venn overlaps, and length distributions (Zhu Q. et al., 2023, Aging).

Best practices include:

  • Normalizing circle counts across samples with consistent methods
  • Using statistical frameworks that account for library size and dispersion
  • Providing interactive dashboards or reports to non-bioinformatics stakeholders

In many of our Circle-seq collaborations, bioinformatics cores ultimately deliver web-based dashboards where experimental teams can explore eccDNA patterns without touching raw files.
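As a minimal illustration of consistent normalization and replicate comparison, the sketch below scales circle counts to mapped-read depth and measures replicate concordance with a Jaccard index. Both choices are assumptions made for the example: exact-coordinate matching is the simplest possible concordance rule, and per-million scaling does not replace a statistical model that accounts for dispersion.

```python
def circles_per_million(circle_count: int, mapped_reads: int) -> float:
    """Circle count scaled to one million mapped reads."""
    return circle_count * 1_000_000 / mapped_reads


def jaccard(calls_a: set, calls_b: set) -> float:
    """Shared calls divided by all calls seen in either replicate."""
    union = calls_a | calls_b
    return len(calls_a & calls_b) / len(union) if union else 0.0


rep1 = {("chr1", 100, 1_100), ("chr2", 0, 500), ("chr4", 50, 950)}
rep2 = {("chr2", 0, 500), ("chr4", 50, 950), ("chr7", 10, 410)}
print(circles_per_million(500, 20_000_000))
print(jaccard(rep1, rep2))
```

A low Jaccard index between replicates is a prompt to revisit QC and filtering before running any differential analysis.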

Quality Control, Validation, and Reporting Standards for eccDNA Analysis

Quality control, validation, and reporting standards are what turn an eccDNA bioinformatics pipeline into a trusted decision tool. They also make it easier for external reviewers to evaluate your work.

Instead of adding QC at the end, we recommend designing QC and validation as first-class modules in the eccDNA analysis workflow.

Key QC metrics across the eccDNA pipeline

Key QC metrics in an eccDNA bioinformatics pipeline span from raw reads to final call sets. Monitoring them across runs reveals trends and outliers early.

Useful metrics include:

  • Total reads, usable reads after trimming, and mapping rate
  • Library complexity and duplication rate
  • Fraction of reads contributing to eccDNA calls
  • Distribution of circle sizes and genomic feature categories
  • Caller-specific confidence scores and their distributions

Summarizing these metrics in standardized tables and plots makes each run easier to audit and simplifies cross-study comparisons.
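Monitoring metrics across runs is easier when acceptance ranges are written down and checked automatically. The sketch below flags metrics that are missing or out of range. The metric names and every range shown are hypothetical placeholders; appropriate ranges depend on library type and should be set per project.

```python
# Hypothetical acceptance ranges: (lowest acceptable, highest acceptable).
ACCEPTANCE_RANGES = {
    "mapping_rate": (0.80, 1.00),
    "duplication_rate": (0.00, 0.50),
    "usable_read_fraction": (0.70, 1.00),
}


def qc_failures(metrics: dict, ranges: dict = ACCEPTANCE_RANGES) -> dict:
    """Return metrics that are missing (None) or outside their acceptance range."""
    failures = {}
    for name, (low, high) in ranges.items():
        value = metrics.get(name)
        if value is None or not (low <= value <= high):
            failures[name] = value
    return failures


sample_metrics = {"mapping_rate": 0.93, "duplication_rate": 0.62}
print(qc_failures(sample_metrics))
```

Treating a missing metric as a failure keeps incomplete QC reports from passing unnoticed.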

Figure 6. Length distributions, chromosomal locations, and genomic-region classes of eccDNA/ecDNA detected in osteoporotic versus normal human bone tissue (Zhu Q. et al., 2023, Aging).

Experimental validation: eccDNA qPCR and orthogonal assays

Experimental validation for eccDNA typically involves targeted assays that confirm predicted junctions. eccDNA qPCR is a common approach and fits naturally with a bioinformatics-driven shortlisting process.

A practical validation workflow is:

  1. Use the eccDNA bioinformatics pipeline to generate a ranked list of candidate circles.
  2. Select candidates from different confidence tiers (e.g., detected by both Circle-Map and CReSIL vs single-caller hits).
  3. Design junction-spanning primers and run targeted eccDNA qPCR or complementary assays.
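For step 3, the sequence that junction-spanning primers must amplify across can be built by joining the end of the circle's reference interval to its start. The sketch below constructs that template; the function name and flank length are illustrative assumptions, and actual primer design (melting temperature, specificity checks) is outside its scope.

```python
def junction_template(circle_seq: str, flank: int = 100) -> str:
    """Join the last `flank` bases of the linear interval to its first `flank` bases.

    On the circle these two ends are adjacent, so an amplicon across this
    template exists only if the predicted junction is real.
    """
    if flank <= 0 or 2 * flank > len(circle_seq):
        raise ValueError("flank must be positive and at most half the circle length")
    return circle_seq[-flank:] + circle_seq[:flank]


# Toy sequence standing in for the reference interval of a predicted circle.
toy_circle = "AAAACCCCGGGGTTTT"
print(junction_template(toy_circle, flank=4))
```

Primers placed on either side of the midpoint of this template face outward on the linear reference, which is why they should not yield a product from linear genomic DNA at that locus.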

Our eccDNA Validation Service is often used at this stage to translate computational findings into wet-lab confirmation, especially for high-value therapeutic or biomarker candidates.

Reporting eccDNA results for collaborators, reviewers, and regulators

Reporting eccDNA results for collaborators, reviewers, and regulators requires a structure that separates methods, QC, and biological interpretation. A clear template also helps generative AI engines extract key facts reliably.

A typical report might include:

  • A concise overview of the eccDNA bioinformatics pipeline used
  • QC summaries with predefined acceptance ranges
  • Tables of high-confidence eccDNA, including Circle-Map and CReSIL concordance
  • Example genome browser views and derived statistics (e.g., enrichment near promoters)
  • Explicit criteria for calling an eccDNA "high confidence"

This level of transparency shows that the pipeline is not a black box but a documented, auditable process.

From Data to Decisions: Partnering with CD Genomics for eccDNA Pipelines

Partnering with CD Genomics for eccDNA analysis connects your raw data to a validated eccDNA bioinformatics pipeline, sequencing expertise, and orthogonal validation options. For bioinformatics cores, CRO data teams, and advanced eccDNA researchers, this can accelerate projects and standardize workflows.

End-to-end eccDNA sequencing and analysis

End-to-end eccDNA sequencing and analysis packages combine library preparation, eccDNA Sequencing (Circle-seq), and a standardized eccDNA bioinformatics pipeline based on Circle-Map, CReSIL, or both. This reduces integration overhead for teams that prefer a single partner from bench to report.

Typical deliverables include:

  • Raw and processed sequencing data
  • Documented pipeline configurations and versioning
  • Annotated eccDNA call sets and visual summary reports

All services are for research use only and are tailored to each project's study design.

Custom eccDNA research and validation services

Custom eccDNA Research Solutions support teams that already have data but need help with pipeline design, optimization, or multi-omics integration. This can include:

  • Reviewing existing Circle-Map or CReSIL workflows
  • Building modular pipelines that plug into your infrastructure
  • Adding downstream analyses such as pathway enrichment or integration with epigenomic marks

For projects prioritizing experimental confirmation, our eccDNA Validation Service provides targeted validation assays based on the bioinformatics shortlist.

How to get started: data sharing, pilot runs, and next steps

Getting started with an eccDNA bioinformatics pipeline does not require a full-scale commitment. Many teams begin with a pilot analysis of a small dataset to evaluate fit and reporting.

A typical engagement path is:

  1. Share representative FASTQ or BAM files under a secure agreement.
  2. Run a pilot eccDNA analysis using Circle-Map, CReSIL, or both.
  3. Review the pipeline, QC, and reports with your team and refine requirements.

From there, the pipeline can be scaled to full cohorts and integrated into your standard operating procedures.

FAQs: eccDNA Bioinformatics Pipelines, Circle-Map, and CReSIL

Q1. What is an eccDNA bioinformatics pipeline, in simple terms?

An eccDNA bioinformatics pipeline is a step-by-step analysis workflow that takes raw Circle-seq or WGS reads and produces a curated list of circular DNA molecules with annotations and QC metrics. It combines standard NGS steps (QC, mapping) with specialized eccDNA callers such as Circle-Map and CReSIL plus downstream visualization and reporting.

Q2. How do I choose between Circle-Map and CReSIL for my project?

Circle-Map and CReSIL address slightly different needs. Circle-Map is a flexible eccDNA caller that works well across many datasets, while CReSIL was developed for long-read data and focuses on improving robustness in RCA-based Circle-seq libraries. Many teams run both callers in the same eccDNA analysis workflow, treating overlapping circles as high confidence and using tool-specific calls for exploratory work.

Q3. Can I send existing BAM files for eccDNA analysis, or do I need raw FASTQ?

You can usually send either, but raw FASTQ files give more flexibility for mapping and QC, especially if the original alignment was not optimized for circular DNA. When only BAM files are available, the eccDNA bioinformatics pipeline will focus on caller choice, filtering, and annotation using the existing alignments.

Q4. How many samples do I need to see meaningful eccDNA patterns?

There is no strict minimum, but eccDNA projects benefit from biological replicates and well-matched controls. Even small pilot sets with 3–5 samples per group can reveal clear patterns if the study design and eccDNA pipeline are sound. Larger cohorts allow more powerful statistics and more confident interpretation of subtle differences.

Q5. How are eccDNA calls typically validated in the lab?

eccDNA calls are often validated using targeted assays such as eccDNA qPCR with junction-spanning primers or Sanger sequencing of specific junctions. A common strategy is to select candidates from different confidence tiers defined by Circle-Map and CReSIL outputs, then use an eccDNA Validation Service to run orthogonal experiments that confirm or refine the computational results.

References

  1. Mann, L., Seibt, K.M., Weber, B. et al. ECCsplorer: a pipeline to detect extrachromosomal circular DNA (eccDNA) from next-generation sequencing data. BMC Bioinformatics 23, 40 (2022).
  2. Zhu, Z. et al. Extrachromosomal circular DNA causes type H vessel loss and bone loss in age-related osteoporosis. Aging (Albany NY) 15, 205388 (2023).
  3. Guo, J., Zhang, Z., Li, Q. et al. TeCD: The eccDNA Collection Database for extrachromosomal circular DNA. BMC Genomics 24, 47 (2023).
  4. Prada-Luengo, I., Krogh, A., Maretty, L. et al. Sensitive detection of circular DNAs at single-nucleotide resolution using guided realignment of partially aligned reads. BMC Bioinformatics 20, 663 (2019).
  5. Wanchai, V., Jenjaroenpun, P., Leangapichat, T. et al. CReSIL: accurate identification of extrachromosomal circular DNA from long-read sequences. Briefings in Bioinformatics 23, bbac422 (2022).

For research purposes only, not intended for clinical diagnosis, treatment, or individual health assessments.

