
When an acetylome project goes sideways, it usually isn't because the dataset lacks "enough IDs." It's because the package can't survive scrutiny: no clear QC summary, thresholds that aren't disclosed, and tables that aren't reviewer-ready.
For PIs, postdocs, PMs, and platform engineers, the practical question is simple: What should I get at handoff so I can verify the result, defend it to reviewers, and avoid a second round of rework?
This guide answers that question by defining acetylome deliverables as an acceptance-ready package: a one-page QC snapshot, transparent decision thresholds, and tables that preserve evidence (not just "final lists"). In other words: reporting transparency is part of the deliverable, not a courtesy.
Key Takeaway: In acetylome studies, deliverables are not "results." Deliverables are the audit trail that makes results defensible.
Why deliverables matter more than "more IDs" in acetylome studies
In lysine acetylation proteomics, it's easy to over-index on the headline metric: how many acetylated sites were identified. The more practical question, and the one behind searches like "acetylome QC" and "lysine acetylation proteomics report," is what the delivery package should contain so that the results are defensible.
But reviewers and project stakeholders rarely reject a paper because you reported too few sites.
They reject—or send you back for major revisions—because they can't tell:
- whether the quantitative signal is stable across replicates
- whether batch drift or missing values are driving the pattern
- whether site-level claims are supported by localization evidence
- which thresholds were predefined vs chosen after looking at outcomes
That's why "deliverables" are a higher-intent acceptance criterion than "more IDs." If your handoff package makes key decisions reproducible and your QC interpretable in minutes, you reduce reviewer friction and rework.
This matters operationally: at common acetylome project sizes (often around a dozen samples), whether the deliverables are reviewer-ready frequently determines how many cycles of re-analysis (or re-running) the team goes through.
Acetylome deliverables: what should be included at minimum
"Deliverables" should be defined as table-ready, reviewer-auditable outputs—not an email summary and a single spreadsheet.
A useful minimum package separates:
- QC (can I trust the data?)
- Results tables (what changed?)
- Metadata / definitions (what exactly was compared?)
- Disclosure (what was filtered and why?)
The minimum deliverable set (table-ready)
At minimum, a good acetylome deliverables package should include:
- A one-page QC summary (single figure + small table) that covers identification, intensity, missingness, replicate agreement, batch checks, and localization confidence at a glance.
- Sample metadata table: sample IDs, group labels, replicate structure, batch/run identifiers, and key covariates that could affect interpretation.
- Contrast definition table: every comparison tested, the exact grouping logic, and how replicates were handled.
- Site-level results table (the core): one row per acetylation site (and optionally per site-form), including effect size, uncertainty/statistics, multiple-testing control, and localization confidence fields.
- Filtering / flagging table: what was excluded or down-weighted (and why), and what remains exploratory.
- Figure set: QC one-pager + main results plot + batch check + missingness summary.
To make this acceptance-ready, each file should have a clear job:
- The QC summary lets a reviewer decide "do I trust the quant layer?" without opening supplementary files.
- The metadata + contrast tables let a reader reconstruct the experimental logic (and detect hidden confounders).
- The site-level results table preserves the full evidence needed to re-filter and re-plot.
- The flagging table/fields demonstrate reporting transparency: you didn't hide borderline items; you labeled them.
The aim is not to drown readers in files—it's to ensure any downstream reviewer, collaborator, or internal auditor can reconstruct how you got to your claims.
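As a minimal sketch of how the metadata and contrast tables can be cross-checked before handoff, the Python below flags contrasts that reference undefined groups and contrasts where group and batch cannot be separated. The field names (`sample_id`, `group`, `batch`), the sample values, and the contrast layout are illustrative assumptions, not part of any specific pipeline.

```python
from collections import defaultdict

# Hypothetical sample metadata rows; field names are illustrative assumptions.
samples = [
    {"sample_id": "S1", "group": "control", "batch": "B1"},
    {"sample_id": "S2", "group": "control", "batch": "B1"},
    {"sample_id": "S3", "group": "treated", "batch": "B2"},
    {"sample_id": "S4", "group": "treated", "batch": "B2"},
]

# Hypothetical contrast definitions: name -> (numerator group, denominator group).
contrasts = {
    "treated_vs_control": ("treated", "control"),
    "rescue_vs_control": ("rescue", "control"),
}


def check_contrasts(samples, contrasts):
    """Return (contrast, issue) tuples that a reviewer should inspect."""
    batches_by_group = defaultdict(set)
    for row in samples:
        batches_by_group[row["group"]].add(row["batch"])

    issues = []
    for name, (numerator, denominator) in contrasts.items():
        missing = [g for g in (numerator, denominator) if g not in batches_by_group]
        if missing:
            issues.append((name, f"undefined group(s): {', '.join(missing)}"))
            continue
        # If the two groups share no batch, group and batch are fully confounded.
        if batches_by_group[numerator].isdisjoint(batches_by_group[denominator]):
            issues.append((name, "group is fully confounded with batch"))
    return issues


issues = check_contrasts(samples, contrasts)
for name, issue in issues:
    print(f"{name}: {issue}")
```

In this toy design every treated sample sits in its own batch, so the check reports the confounder even though the contrast itself is well defined. A real project would likely extend the check to run order and other covariates.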
What must be reproducible vs what can be exploratory
A clean way to reduce conflict is to label deliverables as reproducible (acceptance) vs exploratory (hypothesis).
Reproducible deliverables should have predefined comparisons, declared filters, and documented thresholds. They're the basis for manuscript claims.
Exploratory outputs are allowed—sometimes valuable—but must be clearly labeled as such (e.g., relaxed missingness thresholds, post hoc subgrouping, or lenient localization classes used for pathway exploration). Exploratory results should not be presented as final unless they are elevated with additional evidence.
QC Summary: what a reviewer can understand in 60 seconds
A reviewer does not want QC scattered across five figures and two supplementary files. They want one page that answers:
- Was the dataset technically stable?
- Are the replicates behaving?
- Is missingness controlled and disclosed?
- Is batch drift present or ruled out?
- Are site-localization claims honest?
A good QC summary reads like an acceptance checklist: it tells the reviewer what passed, what was borderline, and what was flagged.
QC items that must be shown
Include these items explicitly in the QC summary.
1) Identification trend (IDs over runs / batches)
- Show identification counts across runs (e.g., PSMs, peptides, proteins, acetyl-sites).
- The question: are IDs stable, or do you see run-to-run cliffs?
2) Intensity distribution and spread
- Show intensity distributions across samples (boxplots/violin plots).
- The question: are there global shifts suggesting injection/loading differences or normalization problems?
3) Missingness overview
- Show missing-value fraction per sample and per feature group.
- The question: is missingness concentrated in one batch/condition, or broadly distributed?
4) Replicate agreement
- Show replicate correlation and/or replicate clustering.
- The question: do replicates cluster together more than conditions do?
Example-only note: Many teams treat very high replicate correlation (e.g., Pearson/Spearman >0.9) as reassuring, but the right target depends on sample type, workflow, and whether missingness is high.
5) Batch drift / batch effects check
- Include an unsupervised view (e.g., PCA/UMAP) colored by batch and condition.
- The question: is the strongest separation explained by biology—or by run order/batch?
6) Localization confidence summary
- Provide a compact summary of localization confidence classes (e.g., high/medium/low confidence bins).
- The question: what fraction of reported sites meet the "high-confidence localization" definition used for claims?
For a general QC framework that emphasizes adaptable, transparent QC gates in quantitative proteomics, see "A framework for quality control in quantitative proteomics" (PMC11030400).
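Two of the QC items above, missingness per sample and replicate agreement, reduce to short calculations. The sketch below is one possible way to compute them with the standard library only; the sample names and intensity values are made up for illustration, and the pairwise-complete Pearson correlation is an assumed choice rather than a prescribed method.

```python
import math

# Hypothetical log2 site intensities per sample; None marks a missing value.
intensities = {
    "ctrl_rep1": [20.1, 18.4, None, 22.0, 19.5],
    "ctrl_rep2": [20.3, 18.1, 17.9, 21.8, 19.9],
    "trt_rep1": [21.0, None, None, 23.1, 19.4],
}


def missing_fraction(values):
    """Fraction of features with no quantified value in one sample."""
    return sum(v is None for v in values) / len(values)


def pearson_complete_pairs(x, y):
    """Pearson r over features quantified in both samples."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if len(pairs) < 3:
        return None  # too few shared features to report a correlation
    mean_x = sum(a for a, _ in pairs) / len(pairs)
    mean_y = sum(b for _, b in pairs) / len(pairs)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    var_x = sum((a - mean_x) ** 2 for a, _ in pairs)
    var_y = sum((b - mean_y) ** 2 for _, b in pairs)
    if var_x == 0 or var_y == 0:
        return None  # correlation is undefined for a constant vector
    return cov / math.sqrt(var_x * var_y)


missingness = {name: missing_fraction(v) for name, v in intensities.items()}
r_ctrl = pearson_complete_pairs(intensities["ctrl_rep1"], intensities["ctrl_rep2"])
```

Because the correlation uses only features quantified in both replicates, it can look reassuring even when missingness is high, which is why the two numbers are best reported side by side.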
How QC should be presented
The most reviewer-friendly structure is:
- One QC figure (multi-panel, one page)
- One small QC table (key counts + key QC thresholds + pass/flag)
This prevents a common failure mode: a manuscript that claims "QC was performed" but forces reviewers to hunt for what actually happened.
A good QC one-pager also makes internal project management easier: PMs can compare runs across time using the same template.
Rework triggers
QC isn't just descriptive—it should define rework triggers.
Typical "stop and rethink" signals include:
- a clear ID cliff that aligns with a specific run segment
- a strong intensity shift in one group that looks like loading, not biology
- missingness that is asymmetric by condition (suggesting dropout-driven effects)
- replicates that don't cluster together in unsupervised views
- apparent biology that disappears when you color by batch
If you can't define rework triggers, you can't define acceptance.
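One way to make rework triggers explicit is to declare each QC gate up front and emit a pass/flag row per gate for the small QC table. The following is a sketch under stated assumptions: the metric names, values, and limits are hypothetical examples, not recommended thresholds.

```python
# Hypothetical QC metrics for one project; names and values are examples only.
qc_metrics = {
    "min_replicate_pearson": 0.93,
    "max_sample_missing_fraction": 0.42,
    "max_id_drop_vs_median": 0.12,
}

# Gates declared before analysis: metric name -> (comparison, example limit).
qc_gates = {
    "min_replicate_pearson": (">=", 0.90),
    "max_sample_missing_fraction": ("<=", 0.30),
    "max_id_drop_vs_median": ("<=", 0.25),
}


def evaluate_gates(metrics, gates):
    """Build one QC table row per declared gate, with a pass/flag status."""
    rows = []
    for name, (op, limit) in gates.items():
        value = metrics[name]
        passed = value >= limit if op == ">=" else value <= limit
        rows.append(
            {
                "metric": name,
                "value": value,
                "gate": f"{op} {limit}",
                "status": "pass" if passed else "flag",
            }
        )
    return rows


qc_table = evaluate_gates(qc_metrics, qc_gates)
needs_review = any(row["status"] == "flag" for row in qc_table)
```

Here the missingness gate is flagged, so `needs_review` is true. A flag in this sketch means "stop and rethink," not automatic rejection.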
Transparent thresholds: effect size + FDR (and what to disclose)
Threshold transparency is where many acetylome packages become reviewer-hostile.
A reviewer wants to know exactly how a site moved from "detected" to "reported as regulated." That requires more than a p-value.
Why effect size matters (not just p-values)
P-values are sensitive to sample size and variance. In small cohorts, real effects can look non-significant; in large cohorts, tiny effects can look significant.
Effect size (e.g., log2 fold change) tells the reader what changed in a biologically interpretable way. A reviewer-ready report makes both visible:
- effect size (direction + magnitude)
- uncertainty / significance
- multiple testing control (FDR)
What to disclose (minimum transparency checklist)
At minimum, disclose:
- How contrasts were defined (groups, replicates, pairing if any)
- What is being quantified (site intensity vs site occupancy vs protein-normalized site signal—state the interpretation clearly)
- What normalization was applied (conceptually: within-run, across-run, batch correction if any)
- What filtering was applied before testing (and whether it was contrast-specific)
- How missing values were handled (no imputation / imputation method / partial filtering) — and where this is documented
- Which multiple testing method was used (e.g., Benjamini–Hochberg)
- Which thresholds define claim tiers: "regulated" vs "reported but not claimed"
- What was excluded (and whether exclusions were global or condition-dependent)
Pro Tip: If a threshold isn't disclosed, reviewers will assume it was chosen after seeing the volcano plot.
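For readers who want to see what the Benjamini-Hochberg adjustment does to a list of p-values, here is a small stand-alone sketch. The function name `bh_adjust` and the example p-values are illustrative; in practice most teams would rely on an established statistics package rather than a hand-written version.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four sites in one contrast.
p_values = [0.001, 0.04, 0.03, 0.20]
q_values = bh_adjust(p_values)
```

In this example the sites with p = 0.03 and p = 0.04 both pass p < 0.05, yet both receive q ≈ 0.053 and would fail a q < 0.05 gate. That difference is why the report should state which quantity the threshold was applied to.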
Fields that should appear in the results table
A reviewer-ready acetylome results table should carry enough fields that downstream readers can re-filter without re-running the pipeline.
Recommended fields (site-level table):
- Protein identifiers (stable IDs) and gene symbol
- Modified residue + position (e.g., K123), plus peptide context where possible
- Effect size: log2 fold change for each contrast
- Uncertainty: standard error / confidence interval (when available)
- Statistical significance: p-value and BH-FDR (q-value)
- Quantification completeness: counts of quantified samples per group
- Missingness pattern: flags such as "missing mostly in group A" vs "random missingness"
- Localization confidence field(s): probability/score and class bin
- Evidence counts: supporting PSMs/peptide observations per site (where available)
- Filter flags: pass/fail for each declared gate (and the reason)
Example-only thresholds (must be labeled as examples):
- BH-FDR: many manuscripts use q<0.05 or q<0.01 depending on context
- Effect size: some teams pair FDR with a minimum |log2FC| (e.g., ≥0.58, roughly a 1.5-fold change)
These are not universal standards. The reviewer-ready move is to declare what you used and why.
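The example thresholds above can be written down as declared constants and applied in one place, so the claim tier of every site is traceable to a stated rule. This is a sketch only: the cutoffs repeat the illustrative values from the text, and the tier labels, site names, and numbers are hypothetical.

```python
import math

# Example-only thresholds; a real report must declare its own values and why.
MIN_ABS_LOG2FC = math.log2(1.5)  # about 0.585, i.e. a 1.5-fold change
MAX_Q = 0.05


def claim_tier(log2fc, q_value, min_abs_log2fc=MIN_ABS_LOG2FC, max_q=MAX_Q):
    """Assign a claim tier from effect size and BH-FDR under declared cutoffs."""
    passes_fdr = q_value < max_q
    passes_effect = abs(log2fc) >= min_abs_log2fc
    if passes_fdr and passes_effect:
        return "regulated"
    return "reported_not_claimed"


# Hypothetical sites: (label, log2 fold change, q-value).
sites = [
    ("P1_K123", 1.20, 0.003),
    ("P2_K45", 0.30, 0.001),
    ("P3_K88", -0.90, 0.200),
]
tiers = {label: claim_tier(fc, q) for label, fc, q in sites}
```

The second site is highly significant but small in magnitude, and the third is large but not significant; both stay in the table as "reported_not_claimed" rather than disappearing from it.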
Avoiding threshold fishing
Threshold fishing is when teams try multiple combinations of FDR, fold-change cutoffs, missingness filters, and localization thresholds until the story looks good.
You reduce this risk—and reduce reviewer skepticism—by:
- defining contrasts before analysis
- predefining a primary threshold set for claims
- providing a "flagged/exploratory" category instead of silently changing gates
- including all fields needed for re-filtering (so reviewers can see you didn't hide alternative cuts)
Site localization confidence: how to keep claims honest
In acetylome projects, the temptation is to treat a "site list" as if every site is equally certain.
Reviewers often focus on localization confidence when the biological conclusion depends on a specific lysine (e.g., motif claims, enzyme inference, or site-specific functional interpretation).
A robust way to communicate localization is to report probability-based confidence—not just a binary "localized / not localized."
For examples of frameworks and tools that compute site localization probabilities and address false localization across PTMs, see:
- PTMProphet (probability-based localization): PMC6898736
- LuciPHOr2 (generic PTM localization scoring): PMC4382907
- Multi-evidence confidence framing for PTM reporting: PMC10243108
What to report (not just what to claim)
A reviewer-friendly approach is to treat localization as a field you report, and only secondarily as a claim you make.
Include:
- a localization probability/score for each candidate site assignment
- a clear definition of what counts as "high confidence" vs "moderate" vs "low"
- the number of supporting observations (not just the best-scoring PSM)
Example-only bins (illustrative):
- High-confidence localization: probability ≥0.75 (some methods or teams use ≥0.9)
- Moderate: from 0.5 up to the high-confidence cutoff
- Low: <0.5
(Again: these are examples; your report must state the actual method-specific mapping.)
When localization confidence is not strong enough, downgrade the wording using a simple "claim ladder":
- High-confidence site → site-level claim is acceptable (e.g., "K123 increased").
- Moderate-confidence site → qualify (e.g., "evidence supports K123/K126 region; site assignment is moderate confidence").
- Low-confidence site → do not make a site-specific claim; report peptide-level evidence and keep interpretation conservative.
A practical rule: if the biological conclusion depends on a specific lysine, require your "high-confidence" class and show how many observations support it.
That discipline—matching claim strength to localization evidence—often saves a revision cycle.
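The claim ladder can be sketched as a small mapping from localization probability to a class, and from class to permitted wording. The cutoffs below reuse the example-only bins from this section and are not method-specific; the function names and output phrasing are hypothetical.

```python
# Example-only cutoffs; the report must state the actual method-specific mapping.
HIGH_CUTOFF = 0.75
MODERATE_CUTOFF = 0.5


def localization_class(probability, high=HIGH_CUTOFF, moderate=MODERATE_CUTOFF):
    """Bin a site localization probability into high / moderate / low."""
    if probability >= high:
        return "high"
    if probability >= moderate:
        return "moderate"
    return "low"


def claim_wording(site_label, probability):
    """Return wording whose strength matches the localization class."""
    cls = localization_class(probability)
    if cls == "high":
        return f"{site_label}: site-level claim acceptable"
    if cls == "moderate":
        return f"{site_label}: qualify, site assignment is moderate confidence"
    return f"{site_label}: no site-specific claim, report peptide-level evidence"


wording_high = claim_wording("K123", 0.97)
wording_moderate = claim_wording("K126", 0.60)
wording_low = claim_wording("K310", 0.35)
```

Keeping the cutoffs as named parameters makes it straightforward to show reviewers what would change under a stricter definition such as 0.9.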
Reviewer-ready tables and figures: a recommended acetylome deliverables package
A reviewer-ready package is small, structured, and complete. It should let a reader understand what changed, verify QC, and re-filter without guessing.

Recommended figures
A compact, high-value set:
1. QC one-pager
- IDs trend
- intensity distribution
- missingness overview
- replicate agreement
- batch check
- localization confidence summary
2. Main result plot (effect size + BH-FDR)
- volcano or effect-size plot where thresholds are explicitly drawn and labeled
3. Batch check plot
- PCA/UMAP colored by batch and condition
4. Missingness summary plot
- per-sample missingness plus a feature-level completeness histogram
Recommended tables
Tables should be designed for auditability.
1. Sample metadata
- sample ID, group, replicate type, batch/run order, key covariates
2. Contrast definitions
- comparison name, group definitions, replicate aggregation rules
3. Site-level results table (core)
If you only standardize one thing, standardize the column schema. A reviewer-ready acetylome table typically benefits from a structure like:
| Column group | Example fields (illustrative) | Why it matters |
|---|---|---|
| Identity | protein ID, gene, site (K position), peptide sequence window | stable referencing in text and review |
| Quant | group means, log2FC, SE/CI | separates magnitude from significance |
| Stats | p-value, BH-FDR, test name | makes thresholds auditable |
| Completeness | n quantified per group, missingness % | prevents "significant because missing" artifacts |
| Localization | site probability/score, class (high/mod/low) | aligns claims with evidence |
| Flags | filter pass/fail, reason codes | preserves transparency and re-filterability |
4. Filtered/flagged entries with reasons
- a separate table (or a flag column) that preserves excluded entries for transparency
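To illustrate the difference between flagging and silently dropping entries, the sketch below annotates every site with pass/fail and reason codes, then writes all rows to CSV. The gate values, reason-code strings, and column names are assumptions made for the example.

```python
import csv
import io

# Hypothetical site rows; column names are illustrative.
sites = [
    {"site": "P1_K123", "n_quant_a": 3, "n_quant_b": 3, "loc_prob": 0.98},
    {"site": "P2_K45", "n_quant_a": 1, "n_quant_b": 3, "loc_prob": 0.95},
    {"site": "P3_K310", "n_quant_a": 3, "n_quant_b": 3, "loc_prob": 0.55},
]

# Example-only gates; declare the real ones in the report.
MIN_QUANT_PER_GROUP = 2
MIN_LOC_PROB = 0.75


def annotate(site):
    """Add filter_pass and reason_codes fields without removing the row."""
    reasons = []
    if min(site["n_quant_a"], site["n_quant_b"]) < MIN_QUANT_PER_GROUP:
        reasons.append("LOW_COMPLETENESS")
    if site["loc_prob"] < MIN_LOC_PROB:
        reasons.append("LOW_LOCALIZATION")
    return {**site, "filter_pass": not reasons, "reason_codes": ";".join(reasons)}


annotated = [annotate(s) for s in sites]

# Write every row, including failed ones, so readers can re-filter later.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(annotated[0].keys()))
writer.writeheader()
writer.writerows(annotated)
flag_table_csv = buffer.getvalue()
```

All three sites remain in the output; only the flag and reason columns differ. Writing to a file instead of an in-memory buffer would be the obvious change for real use.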
Red-flag packaging patterns
If you see these, expect reviewer pushback:
- No metadata: groups/replicates/batches are not documented
- Thresholds hidden: only "significant sites" are delivered
- Final-only lists: no fields that allow re-filtering
- Localization omitted: site claims without localization evidence
- Imputation undisclosed: missing values handled but not documented
Consultation-only closing
If you share (1) your group/contrast structure, (2) sample type and approximate sample count, and (3) how strong your intended claims need to be (protein-level vs site-level), we can recommend a deliverables structure and QC/threshold disclosure that is acceptance-ready and reviewer-friendly.
For research use only. Not for clinical diagnosis, treatment, or individual health assessments.
Author
CAIMEI LI — Senior Scientist at Creative Proteomics