Hi-C Budget Planning: How Much Hi-C Sequencing Depth Do You Really Need?
Summary
Hi-C depth should be planned from the biological question and expected output, not from resolution ambition alone.
There is no single "correct" Hi-C sequencing depth for every project. The right depth depends on the biological question and the output you need to review. Overshooting depth can burn budget without changing the next decision, while undershooting can leave you with a map that can't support interpretation.
In practice, Hi-C budget planning works best when you plan from the biological decision the dataset must support—not from a vanity resolution label. Whole-genome architecture questions, domain-scale questions, and loop-focused questions do not place the same demand on depth, replication, or data quality.
The more useful question is not "How deep can we sequence?" but "What level of Hi-C evidence does this project actually need to make the next decision?"
Research use only. This is study design and project planning—not clinical interpretation.
Why Hi-C sequencing depth should be planned from the question, not the number
Teams often start with a target depth or a target resolution because it sounds concrete: X read pairs, Y "valid pairs", Z kb bins. The problem is that depth is not an isolated quality metric. It only matters in relation to the feature you want to recover and the comparison you need to make.
Sequencing depth only becomes meaningful when it is linked to the biological feature the team wants to recover with confidence.
A dataset that is "good enough" for compartments can still be frustrating for loop calling. A dataset that produces a plausible TAD narrative can still be underpowered for differential loops across donors. And the same raw sequencing volume can yield very different usable contacts depending on duplicates, mapping, and library complexity.
In practice, teams often ask for the deepest design they can afford before they define what outcome would actually count as success. That usually leads to vague expectations rather than a study design that can be reviewed and defended.
A more workable starting point is to define, in plain language:
- What you need to conclude (e.g., broad architecture shift vs specific looping changes)
- What output proves that conclusion (matrices, compartment tracks, domain calls, loop calls, differential comparisons)
- What would make the result reviewable by your PI and bioinformatics lead
Once those are written down, the "how much depth?" question becomes budgetable.
What Hi-C depth really affects: resolution, confidence, and interpretability
Hi-C depth changes three practical things: the resolution you can work at, the confidence you can assign to features, and the interpretability of the final outputs.
Resolution is a usable bin size, not a promised label
Hi-C resolution is often discussed as if it were guaranteed ("we'll get 5 kb"). In practice, it's the bin size at which the contact matrix has enough data to be stable and reproducible.
That matters because the main feature classes sit at different scales:
- Compartments: broad genome-wide patterns; typically visible at coarser binning
- Domains/TADs: mid-scale organization; stability depends on coverage and noise
- Loops: local enrichments; the bar for evidence is much higher
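The idea that resolution is a usable bin size, not a label, can be made concrete. The sketch below applies a commonly used heuristic (popularized by Rao et al., 2014): call a bin size usable when at least some fraction of bins (often 80%) hold at least ~1,000 contacts. The `usable_resolution` helper, the thresholds, and the toy counts are all illustrative assumptions, not a standard your pipeline must adopt.

```python
# Illustrative sketch: decide the finest "usable" bin size from binned
# contact counts. Threshold values follow a commonly cited heuristic
# (>=80% of bins with >=1000 contacts); tune them to your own standards.

def usable_resolution(bin_counts_by_size, min_contacts=1000, frac=0.8):
    """bin_counts_by_size: {bin_size_bp: list of per-bin contact counts}.
    Returns the smallest bin size meeting the threshold, or None."""
    usable = []
    for bin_size, counts in bin_counts_by_size.items():
        covered = sum(1 for c in counts if c >= min_contacts) / len(counts)
        if covered >= frac:
            usable.append(bin_size)
    return min(usable) if usable else None

# Toy example: 5 kb bins are too sparse, 25 kb bins pass.
example = {
    5_000:  [1200, 400, 300, 250, 900],     # only 20% of bins pass
    25_000: [5400, 3100, 2800, 1900, 1500]  # 100% of bins pass
}
print(usable_resolution(example))  # -> 25000
```

The same dataset can therefore honestly be "25 kb resolution" for domain work while being unusable at the 5 kb scale loops demand.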
Confidence depends on usable contacts and replication
More depth does not just increase resolution; it changes how confidently the team can interpret the kinds of features they care about.
A key planning point is that "depth" in discussions is often raw reads, while analysis power is more closely tied to usable contacts after filtering. Parker, Davis, and Phanstiel emphasize that their recommendations are stated in "Hi-C contacts" that pass filtering—and that raw sequencing must be higher because the contact fraction varies with library quality (Parker, Davis & Phanstiel, 2023).
One practical implication for Hi-C data quality is that "more reads" is only helpful when the library can convert them into new unique contacts; otherwise, you mainly buy duplicates and unstable fine-bin maps.
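To see why "more reads" saturates, it helps to sketch the standard Lander-Waterman model of library complexity: a library holds a finite number of distinct contact-bearing molecules, and as sequencing approaches that number, new reads increasingly return duplicates. The complexity value `C` below is a made-up example; real estimates come from pilot duplicate rates or complexity-extrapolation tools such as Preseq.

```python
import math

# Illustrative sketch (not a vendor calculator): expected unique contacts
# from a library of finite complexity C, using the Lander-Waterman
# saturation model. Once sequencing approaches C, additional reads
# mostly buy duplicates rather than new unique contacts.

def expected_unique(reads_sequenced, library_complexity):
    C = library_complexity
    return C * (1.0 - math.exp(-reads_sequenced / C))

C = 300e6  # assumed complexity: 300M distinct contact-bearing molecules
for reads in (100e6, 300e6, 600e6, 1200e6):
    uniq = expected_unique(reads, C)
    print(f"{reads/1e6:>5.0f}M reads -> {uniq/1e6:5.0f}M unique "
          f"({uniq/reads:.0%} of reads yield new contacts)")
```

With these assumed numbers, doubling sequencing from 600M to 1200M reads adds far fewer new unique contacts than the first 300M did, which is exactly the point where extra budget stops changing the map.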
Interpretability depends on what you promise the dataset will do
The same depth can be "enough" for one goal and misleading for another. This is why Hi-C study design should be written as a deliverable statement, not a read-count statement.
If your intent is loop-level inference, you need to be explicit about what you will and will not claim from the data, and how you will judge reproducibility. Yardımcı et al. benchmarked reproducibility and quality measures across depths and resolutions, showing that "quality" is multi-dimensional and depends on how you plan to use the data (Yardımcı et al.'s 2019 benchmark of Hi-C reproducibility and quality metrics).
When a modest Hi-C design is enough
Many teams assume they need the maximum depth they can afford. Often, a modest design is the more rational first step—if it is aligned to the right outcome.
A modest Hi-C design may be enough when…
- the question is broad rather than ultra-local
- the dataset is meant to guide next-step assay selection
- the project is in a pilot phase, or sample type is uncertain
- the expected deliverable is architectural context rather than robust loop recovery
Examples where modest depth is often defensible
Condition-to-condition architecture comparisons
If the aim is to compare overall genome organization or compartment behavior across conditions, the planning target is stable matrices and reproducibility—not recovering every loop.
Domain/TAD-scale mapping for context
If your next decision is "which regions deserve deeper follow-up," a modest design can provide the map you need without committing to a loop-level interpretation.
Phased project planning
A minimum viable Hi-C project planning approach looks like:
- Phase 1: baseline map + QC + a small set of reviewable outputs
- Decision gate: decide whether deeper sequencing will change the conclusion
- Phase 2: deepen (or switch method) only if Phase 1 supports it
This is often the cleanest way to control Hi-C sequencing cost without underpowering the work.
When underpowered Hi-C becomes a problem
Underpowering isn't just "less detail." It is a mismatch between what the project expects to claim and what the dataset can support.
The real risk of shallow Hi-C is not simply lower detail; it is building a dataset that cannot support the conclusion the project was designed around.
How underpowered designs fail in practice
1) A resolution promise that isn't usable
Teams budget for a label ("we want 5–10 kb") and only later discover the data are too sparse at that bin size to support stable loop calling or differential comparisons.
2) Sparse data invites overinterpretation
When contacts are limited, any bright pixel can look like a loop. A common outcome is a compelling-looking figure that doesn't replicate or doesn't survive reasonable QC thresholds.
3) Rework is the hidden cost
If the dataset can't support the planned figures (loop lists, differential loops, boundary changes), the budget "savings" disappear. You either accept weak interpretation or pay again for deeper sequencing and reprocessing.
A common planning mistake is to budget for Hi-C as if any contact map is automatically informative. The real issue is whether the final dataset can support the specific comparisons, figures, or mechanistic claims the team expects to make.
Underpowering is often caused by QC, not just reads
Even with a large read count, projects can underdeliver if:
- library complexity is low (extra sequencing buys duplicates)
- ligation efficiency is weak (few usable long-range contacts)
- sample quality compromises digestion/ligation
- replication is insufficient for the biology's variability
ENCODE's standards make this explicit by framing reviewable outputs (matrices, loops, compartments) alongside QC expectations like unique contacts, duplicate rates, and long-range contact fractions (ENCODE's Hi-C data standards and pipeline outputs). This is why "sample requirements" and QC planning should be part of the budget conversation.
How to match budget with a realistic Hi-C deliverable
Budget discussions go faster when everyone agrees on what the team needs to review at the end.
Budget becomes easier to justify when the team can describe what final output package would make the project scientifically usable.
Step 1: define the output package (what you will review)
For many teams, a practical deliverable set includes:
- normalized interaction matrices at defined resolutions
- compartment tracks (and compartment switching if comparing conditions)
- domain/TAD calls (and boundary comparisons if relevant)
- loop calls only if the question truly requires them
- a QC summary that makes usable contacts and replicate agreement easy to judge
This turns "how much depth?" into a scoped requirement: the depth is whatever makes these outputs stable enough to review without hand-waving.
Step 2: budget around contacts and replicates, not just reads
Parker, Davis, and Phanstiel recommend thinking in terms of contacts per condition for differential loop questions, noting that contacts are post-filtering and that replication matters—especially as biological variability increases (Parker, Davis & Phanstiel, 2023).
You don't need to treat their numbers as universal thresholds to use the lesson: loop-level goals are expensive, and power comes from a combination of usable contacts and replication.
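As a back-of-envelope illustration of budgeting around contacts rather than reads, the sketch below converts a target usable-contact count into raw read pairs. The `raw_pairs_needed` helper and every rate in it are placeholder assumptions for illustration (not recommendations from Parker, Davis & Phanstiel); replace them with your own pilot QC numbers.

```python
# Illustrative back-of-envelope: convert a target number of usable Hi-C
# contacts per condition into raw read pairs to budget, given expected
# losses at filtering. All default rates are placeholder assumptions.

def raw_pairs_needed(target_contacts,
                     valid_pair_rate=0.60,   # assumed: pairs surviving mapping/ligation filters
                     duplicate_rate=0.20):   # assumed: duplicates among valid pairs
    usable_per_raw_pair = valid_pair_rate * (1.0 - duplicate_rate)
    return target_contacts / usable_per_raw_pair

# Example: 900M usable contacts per condition, split across 3 replicates.
target_per_condition = 900e6
replicates = 3
per_rep = raw_pairs_needed(target_per_condition / replicates)
print(f"~{per_rep / 1e6:.0f}M raw read pairs per replicate")  # -> ~625M
```

The useful habit is the direction of the calculation: start from the contacts the analysis needs, then work backward to reads, so the quote covers filtering losses instead of discovering them afterward.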
Step 3: use phased planning to reduce budget risk
Phased planning is not "doing less." It is spending later, when you can justify what the additional depth will change.
- Phase 1: generate enough data to evaluate QC, complexity, and signal at the intended scale
- Gate: decide whether deeper sequencing will materially change the conclusion
- Phase 2: deepen or pivot method based on evidence
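The gate step above can even be written down as an explicit checklist, so "deepen or pivot" is decided by agreed criteria rather than enthusiasm. Everything in this sketch (the thresholds, the metric names, the `phase1_gate` helper) is hypothetical; set real pass/fail values from your own QC standards, such as ENCODE-style expectations, before relying on anything like it.

```python
# Illustrative Phase 1 decision gate. All thresholds below are placeholder
# assumptions for illustration, not published standards.

GATE = {
    "duplicate_rate_max": 0.40,        # above this, deeper sequencing mostly buys duplicates
    "cis_long_range_min": 0.15,        # minimum fraction of long-range cis contacts
    "replicate_correlation_min": 0.90  # minimum matrix-level replicate agreement
}

def phase1_gate(qc):
    """qc: dict of pilot QC metrics. Returns (go_deeper, failed checks)."""
    failures = []
    if qc["duplicate_rate"] > GATE["duplicate_rate_max"]:
        failures.append("library complexity too low to reward deeper sequencing")
    if qc["cis_long_range"] < GATE["cis_long_range_min"]:
        failures.append("too few long-range contacts; revisit sample/library prep")
    if qc["replicate_correlation"] < GATE["replicate_correlation_min"]:
        failures.append("replicates disagree; address variability before adding depth")
    return (len(failures) == 0, failures)

go, why_not = phase1_gate({"duplicate_rate": 0.25,
                           "cis_long_range": 0.22,
                           "replicate_correlation": 0.94})
print("Proceed to Phase 2" if go else f"Hold: {why_not}")
```

Writing the gate down before sequencing is what makes "spending later" defensible: Phase 2 money is released by evidence, not by the calendar.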
For teams that want a concrete example of what "reviewable deliverables" can look like in a service context, CD Genomics describes outputs such as matrices at multiple resolutions, compartment analysis, domain calls, loop calling, differential comparisons, and QC reporting in its Hi-C workflow pages.
A useful Hi-C budget balances sequencing depth, project goals, sample realities, and deliverables that support real decisions.
Why "highest possible resolution" is often the wrong budget goal
"Highest possible resolution" is a weak planning goal because it is not tied to a decision. It can also create expectations that the project design cannot actually support.
The most expensive Hi-C design is not automatically the most useful one if the project would reach the same next decision with a simpler baseline map.
Resolution inflation creates interpretation pressure
Once a project is labeled "high resolution," stakeholders often expect stable loop lists, differential loops, and mechanistic stories. Those goals can be valid, but they require depth, replication, and QC to match.
Sometimes the correct next step is a different method, not more reads
If your real question is loop-centric or locus-specific, the right move may be to choose a method that concentrates signal rather than pushing whole-genome Hi-C deeper.
In practice, teams get better outcomes when method choice happens before depth choice. If you are comparing options such as Hi-C, Micro-C, Capture Hi-C, and HiChIP, a structured decision framework like CD Genomics' method-selection guide can keep the trade-offs explicit.
Teams often ask for "highest possible resolution" when the real need is to protect the 3D genome analysis budget from rework—so it helps to decide which assay concentrates evidence where you actually need it, before you price in depth.
For example, if you are genuinely paying for finer local contact structure, you may want to evaluate whether a Micro-C design is a better fit than simply driving Hi-C deeper (see the Micro-C XL service as a reference point for what "higher-resolution" is usually trying to achieve).
Common Hi-C budgeting mistakes teams make early
This section is intentionally practical—because most budget problems start before the first lane is sequenced.
- Planning from a resolution label instead of a biological endpoint. If you can't write down what decision the map will support, the label isn't a plan.
- Treating raw reads as equivalent to usable evidence. Reads become useful contacts only after filtering. If the library cannot generate new unique contacts, deeper sequencing won't rescue it.
- Expecting one dataset to answer every structural question. Trying to maximize compartments, domains, loops, and differential calls in one first study often forces an all-or-nothing budget with no fallback.
- Underestimating the importance of sample and library QC. Hi-C sample requirements (fixation, integrity, complexity) determine what depth can buy. Quality is multi-dimensional and depends on intended use (Yardımcı et al., 2019).
- Failing to define usable outputs before budgeting. If "done" is not defined as a reviewable deliverable package, budget discussions become guesswork.
- Overdesigning the first study instead of planning a minimum viable phase. A baseline map with strong QC and a clear decision gate often teaches you more than a maximal design that produces ambiguous calls.
For a grounded overview of how artifacts, filtering, and binning choices affect what you can credibly call, Wingett and Andrews summarize practical processing and QC considerations for Hi-C and related methods (Wingett and Andrews' review of Hi-C processing and QC).
Conclusion: budget for the answer you need, not the biggest number you can request
There isn't a single right Hi-C sequencing depth. The right design is the one that supports your biological question at a confidence level you can defend, producing outputs your team can review without stretching the inference.
A modest plan can be scientifically valid when it targets architecture-level deliverables. Underpowered designs fail when they drift into loop-level expectations without the evidence to support them. And "highest resolution" is often the wrong budgeting logic if it does not change the next decision.
If your team is scoping a Hi-C sequencing service request, start by defining the deliverable package (what you will review) and the decision the data must support. Then budget depth, replication, and QC as one system—not as a single number.
For teams that want to align method choice and depth planning early, you can start with CD Genomics' Decision Guide: Hi-C vs Micro-C vs Capture Hi-C vs HiChIP and, when you're ready to specify outputs and QC expectations, review our Hi-C Sequencing Service.
FAQ
How much sequencing depth does a Hi-C project need?
There isn't one number. Depth should be planned from the feature you need to call and the comparison you need to make. A practical starting point is to define what outputs must be stable and reviewable (matrices, compartments, domains/TADs, and loops only if justified), then choose depth and replication to support those outputs.
Does higher Hi-C depth always improve resolution?
Not always. Depth helps until you hit limits set by library complexity and QC. After that point, extra sequencing can mostly buy duplicates rather than new unique contacts. That's why phased planning with an early QC gate is often more reliable than committing to maximum depth upfront.
Can a modest Hi-C design still be useful?
Yes—when the question is broad, when the dataset is meant to guide next-step assays, or when you need architectural context rather than loop-level mechanistic claims. A modest, well-QC'd baseline can be more actionable than a shallow attempt at high-resolution loop calling.
What happens if a Hi-C dataset is underpowered?
The risk is not just "lower detail." The failure mode is a dataset that cannot support the conclusion the study was built around—leading to unstable calls at the intended resolution, weak differential comparisons, and rework.
How should teams connect Hi-C budget with deliverables?
Use deliverables as the budgeting unit: define what you need to review at the end (QC summary, matrices at defined resolutions, compartment/domain outputs, and loop outputs only if required). Then decide what additional depth would change if you sequenced more—and only buy that depth when it changes a decision.

