RefereeBio

METHODOLOGY

How RefereeBio evaluates manuscripts.

RefereeBio is purpose-built for the peer-review process. It independently evaluates manuscript quality, journal fit, reviewer variability, and revision progress as distinct dimensions — then surfaces the real limiting factors and gives authors a precise, actionable path to strengthen their submission.

FROM MANUSCRIPT TO REVISION PLAN

How the assessment is assembled.

  1. Read the manuscript as an argument. Identify the central claim, proposed advance, evidence chain, experimental logic, stated limitations, and the gap the work says it fills.
  2. Interrogate the evidence. Check whether controls, validation, replicate structure, statistical analysis, figures, and methods support the strength and specificity of the conclusions.
  3. Position the contribution. Examine relevant literature, nearby claims, novelty risk, competing explanations, and whether the manuscript distinguishes its contribution precisely enough.
  4. Apply journal context separately. Compare scope, audience, selectivity, evidence depth, reporting expectations, and review culture without allowing journal prestige to masquerade as manuscript quality.
  5. Stress-test through distinct reviewer lenses. Surface major and minor concerns, including places where reasonable reviewers may disagree about significance, mechanism, rigor, or interpretation.
  6. Rank the work. Turn the analysis into high-leverage revisions, reporting fixes, literature checks, and a submission strategy the author can inspect and override.

INDIVIDUAL MANUSCRIPT SCORES

Seven dimensions, seven different questions.

The dimensions are diagnostic, not interchangeable. A manuscript can be technically strong but conceptually incremental, novel but statistically fragile, or compelling overall while poorly matched to a particular journal. The purpose is to show where the manuscript is limited and why.

01

Novelty

What it asks: How unexpected, original, or field-advancing is the central contribution relative to the manuscript's own literature context?

What strengthens it: A clearly defined gap, a contribution distinguishable from nearby work, evidence that changes understanding rather than merely extending a known observation, and appropriately bounded priority claims.

What lowers confidence: Incremental extensions presented as breakthroughs, missing adjacent literature, novelty resting mainly on a new system or dataset, or claims that depend on an incomplete prior-art search.

02

Conceptual depth

What it asks: Does the work develop a meaningful model, framework, principle, or testable explanation—or does it stop at reporting a pattern?

What strengthens it: A coherent idea that connects the experiments, explains why the findings matter, generates predictions, and survives plausible alternative interpretations.

What lowers confidence: A collection of results without a unifying argument, a conceptual claim introduced only in the discussion, or a framework broad enough to explain any outcome.

03

Mechanistic depth

What it asks: How directly does the evidence establish how or why an effect occurs, rather than showing association alone?

What strengthens it: Perturbation, rescue, temporal order, dose or dependency evidence, orthogonal validation, exclusion of competing mechanisms, and a causal chain supported at each important step.

What lowers confidence: Correlation described as causation, pathway language inferred from markers alone, a single perturbation with off-target ambiguity, or conclusions that skip intermediate evidence.

04

Technical strength

What it asks: How credible, reproducible, and well-controlled is the experimental execution?

What strengthens it: Appropriate positive and negative controls, independent validation, clear biological versus technical replication, transparent methods, suitable sample quality, and results that hold across relevant systems or conditions.

What lowers confidence: Missing controls, under-described procedures, validation in only one model, unclear exclusions, dependence on one assay, or reproducibility claims unsupported by the reported design.

05

Statistical confidence

What it asks: Are the analysis and reporting appropriate for the design, data structure, and strength of the claim?

What strengthens it: Correct experimental units, justified tests, adequate sample size, multiple-comparison handling, effect sizes and uncertainty, assumption checks where relevant, and transparent treatment of missing data and outliers.

What lowers confidence: Pseudoreplication, unclear n, selective significance reporting, p-values without effect magnitude, uncorrected repeated testing, or analyses that ignore pairing, nesting, censoring, or repeated measures.
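Two of the items above, pseudoreplication and uncorrected repeated testing, can be illustrated with a short sketch. It is not RefereeBio's implementation. The function names and the `(unit_id, value)` input layout are assumptions made for this example, and Holm's step-down procedure stands in as one common way to handle multiple comparisons.

```python
from statistics import mean


def collapse_to_biological_units(measurements):
    """Average technical replicates within each biological unit.

    `measurements` is an iterable of (unit_id, value) pairs (hypothetical
    layout). The returned dict has one entry per biological unit, so its
    length is the n a test should use, not the raw measurement count.
    """
    by_unit = {}
    for unit_id, value in measurements:
        by_unit.setdefault(unit_id, []).append(value)
    return {unit: mean(values) for unit, values in by_unit.items()}


def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order."""
    m = len(p_values)
    # Visit p-values from smallest to largest, remembering original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never drops below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    # Three measurements, but only two animals: n is 2, not 3.
    raw = [("mouse1", 1.0), ("mouse1", 3.0), ("mouse2", 5.0)]
    print(collapse_to_biological_units(raw))
    print(holm_adjust([0.01, 0.04, 0.03]))
```

Averaging within units is the simplest remedy for pseudoreplication. Designs with nesting or repeated measures may be better served by a mixed-effects model, which this sketch does not attempt.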

06

Figure strength

What it asks: Do the figures clearly, completely, and logically carry the manuscript's evidentiary argument?

What strengthens it: A readable sequence, representative and quantitative evidence presented together, complete legends, visible uncertainty and sample information, consistent labels, and panels that directly support the claims made in the text.

What lowers confidence: Decorative rather than evidentiary panels, unreadable axes, missing controls, selective examples, unsupported schematic conclusions, figure–text mismatches, or key results available only in prose.

07

Scope fit

What it asks: Does this manuscript belong in the selected journal's audience, remit, article mix, and expected level of advance? Unlike the other dimensions, scope fit is journal-dependent.

What strengthens it: Direct relevance to the journal's readers, the right balance of breadth and specialization, an article type the venue publishes, and evidence depth consistent with that journal's editorial expectations.

What lowers confidence: Topic overlap without audience relevance, a manuscript that is too narrow or too broad for the venue, a mismatch in article type, or a level of mechanistic and conceptual support below what that journal typically demands.

How to read the scores: use the pattern across dimensions to find the limiting factor. Changes across versions and the evidence behind each score are more informative than the absolute number. Scope fit should never be used as a substitute for reading the journal's current author instructions.
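The reading guidance above can be sketched in a few lines of Python. The numeric scale, the dimension keys, and the rule that the limiting factor is the lowest-scoring manuscript-intrinsic dimension are all assumptions made for illustration. The methodology does not state a scale or an aggregation rule, and the evidence behind each score still has to be read alongside the numbers.

```python
# Hypothetical keys for the six manuscript-intrinsic dimensions.
# Scope fit is journal-dependent, so it is kept apart from them.
INTRINSIC = (
    "novelty",
    "conceptual_depth",
    "mechanistic_depth",
    "technical_strength",
    "statistical_confidence",
    "figure_strength",
)
JOURNAL_DEPENDENT = ("scope_fit",)


def limiting_factors(scores):
    """Return the intrinsic dimensions tied for the lowest score."""
    lowest = min(scores[d] for d in INTRINSIC)
    return [d for d in INTRINSIC if scores[d] == lowest]


def version_changes(previous, current):
    """Return the per-dimension change between two manuscript versions."""
    dimensions = INTRINSIC + JOURNAL_DEPENDENT
    return {
        d: current[d] - previous[d]
        for d in dimensions
        if d in previous and d in current
    }


if __name__ == "__main__":
    # Invented scores on an assumed 0-10 integer scale.
    v1 = {"novelty": 7, "conceptual_depth": 6, "mechanistic_depth": 5,
          "technical_strength": 4, "statistical_confidence": 3,
          "figure_strength": 6, "scope_fit": 5}
    v2 = {"novelty": 7, "conceptual_depth": 6, "mechanistic_depth": 5,
          "technical_strength": 6, "statistical_confidence": 4,
          "figure_strength": 6, "scope_fit": 5}
    print(limiting_factors(v2))
    print(version_changes(v1, v2))
```

Returning every tied dimension, not just one, keeps the output a pattern to interpret instead of a single verdict.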