Section 5 of the review
The infrastructure gap
Reproducibility, benchmarking, and model reuse, assessed across the 58 reviewed methods with a uniform scoring rubric. The headline numbers below summarize the field as a whole; per-method audit data is available on each method page and in Supplementary Table S2.
- 43 of 58: Release public code
- 27 of 58: Independently rerunnable (Repro 4/4)
- 14 of 43 with code: Cross FAIR4RS threshold (≥ 3/5)
- 121 of 129: Validation datasets used by exactly one method
Scoring rubric
Each method is scored on two complementary axes whose sub-criteria are applied uniformly across the 58 reviewed methods.
Reproducibility (0–4)
- Public tool availability
- README with instructions
- Bundled example data
- Step-by-step tutorial
A score of 4/4 is treated as independently runnable.
FAIR4RS (0–5)
This axis applies the FAIR Principles for Research Software (FAIR4RS). One point is awarded for each of: an OSI-approved license, versioned releases, an archival DOI, an explicit environment specification, and citation metadata. A score ≥ 3/5 is treated as crossing the FAIR4RS threshold.
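The two axes can be expressed as simple checklist sums. The following is a minimal sketch of that arithmetic, not the review's actual audit tooling: the `MethodAudit` record, its field names, and the example values are all hypothetical and chosen only to mirror the sub-criteria listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MethodAudit:
    """Checklist for one method. All field names are hypothetical."""
    # Reproducibility sub-criteria (one point each, 0-4)
    public_tool: bool
    readme_with_instructions: bool
    bundled_example_data: bool
    step_by_step_tutorial: bool
    # FAIR4RS sub-criteria (one point each, 0-5)
    osi_license: bool
    versioned_releases: bool
    archival_doi: bool
    environment_spec: bool
    citation_metadata: bool


def repro_score(a: MethodAudit) -> int:
    """Count satisfied reproducibility criteria (0-4)."""
    return sum([a.public_tool, a.readme_with_instructions,
                a.bundled_example_data, a.step_by_step_tutorial])


def fair4rs_score(a: MethodAudit) -> int:
    """Count satisfied FAIR4RS criteria (0-5)."""
    return sum([a.osi_license, a.versioned_releases, a.archival_doi,
                a.environment_spec, a.citation_metadata])


def independently_runnable(a: MethodAudit) -> bool:
    """A reproducibility score of 4/4 is treated as independently runnable."""
    return repro_score(a) == 4


def crosses_fair4rs_threshold(a: MethodAudit) -> bool:
    """A FAIR4RS score of at least 3/5 crosses the threshold."""
    return fair4rs_score(a) >= 3


# Hypothetical method: has code, README, and example data but no tutorial;
# licensed, versioned, and environment-pinned but no DOI or citation file.
example = MethodAudit(
    public_tool=True, readme_with_instructions=True,
    bundled_example_data=True, step_by_step_tutorial=False,
    osi_license=True, versioned_releases=True, archival_doi=False,
    environment_spec=True, citation_metadata=False,
)
print(repro_score(example), independently_runnable(example))
print(fair4rs_score(example), crosses_fair4rs_threshold(example))
```

Under these assumptions a method can cross the FAIR4RS threshold while still falling short of 4/4 on reproducibility, which is why the two axes are reported separately.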
Validation-data reuse
Across the 58 reviewed methods, 129 distinct validation datasets are catalogued. Of these, 121 (93.8%) are used by exactly one method. Only eight datasets are shared by two or more methods, and none is used by more than five.
- ENCODE (compendium)
- LINCS L1000 (compendium)
- CMAP (compendium)
- FANTOM5 (compendium)
- GPL570 (compendium)
- GSE70138 (GEO record)
- GSE140203 (GEO record)
- GSE7307 (GEO record)
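The reuse figures above come down to counting how many methods use each dataset. The sketch below shows one way such a tally could be computed from a dataset-to-methods mapping; the function name, the mapping layout, and the toy entries (`dataset_A`, `method_1`, and so on) are hypothetical and do not reproduce the review's catalogue. It assumes every catalogued dataset is used by at least one method.

```python
def reuse_summary(dataset_to_methods: dict[str, list[str]]) -> dict[str, float]:
    """Summarize validation-dataset reuse from a dataset -> methods mapping."""
    # Number of distinct methods per dataset (duplicates are ignored).
    usage = {name: len(set(methods))
             for name, methods in dataset_to_methods.items()}
    total = len(usage)
    single_use = sum(1 for n in usage.values() if n == 1)
    return {
        "total": total,
        "single_use": single_use,
        "shared": total - single_use,
        "single_use_pct": round(100 * single_use / total, 1) if total else 0.0,
        "max_methods": max(usage.values(), default=0),
    }


# Toy catalogue with hypothetical names, for illustration only.
toy_catalogue = {
    "dataset_A": ["method_1"],
    "dataset_B": ["method_2"],
    "dataset_C": ["method_1", "method_3"],
    "dataset_D": ["method_4"],
}
summary = reuse_summary(toy_catalogue)
print(summary)

# The same percentage arithmetic applied to the reported counts.
reported_pct = round(100 * 121 / 129, 1)
print(reported_pct)
```

Applied to the reported counts, the same arithmetic gives 121 / 129 ≈ 93.8% single-use datasets and 129 − 121 = 8 shared ones, matching the eight entries listed above.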
Interactive figures
Interactive versions of Figure 2 (modality coverage / UpSet) and Figure 3 (reproducibility by level) are scheduled for the next site milestone, alongside a dedicated datasets page showing the full reuse graph. For now, refer to Figures 2 and 3 in the published review.