Expanding the Human Proteome with Microproteins and Peptideins
Why It Matters
Standardizing microprotein annotation will expand the human proteome, unlocking new targets for drug discovery and immunotherapy.
Key Takeaways
- TransCODE consortium defined standards to annotate ncORF‑encoded microproteins as reference proteins
- PeptideAtlas found 183 microproteins; 30 validated with synthetic peptides
- HLA‑I immunopeptidomics revealed 1,785 ncORFs, showing microproteins are common antigen sources
- ORBL tool measures evolutionary “ORFness,” distinguishing constrained microproteins from random ORFs
- C‑terminal peptides from microproteins are enriched 20‑fold in HLA presentation relative to canonical proteins
Pulse Analysis
The human proteome has long been thought to derive from roughly 19,500 protein‑coding genes, but mounting evidence of translation from thousands of non‑canonical open reading frames (ncORFs) challenges that view. The resulting tiny polypeptides, often called microproteins or small ORF‑encoded peptides, populate what researchers term the “dark proteome.” Their small size, low abundance, and limited evolutionary conservation make them hard to detect and have historically kept them out of reference databases such as GENCODE and UniProt. Recognizing their potential to reveal novel biology and therapeutic targets, the TransCODE consortium set out to create a unified annotation framework.
Leveraging the PeptideAtlas infrastructure, the consortium aggregated 3.5 billion tryptic MS/MS spectra and 240 million HLA‑focused spectra and applied ultra‑stringent false‑discovery‑rate thresholds (<0.1% at the protein level). This effort surfaced 183 microproteins in conventional proteomics, 30 of which received orthogonal validation via synthetic peptide matching and parallel reaction monitoring. The HLA‑I immunopeptidomics arm uncovered peptides from 1,785 ncORFs, highlighting microproteins as a surprisingly common source of antigenic peptides. Peptides from C‑terminal regions were especially prominent, showing a 20‑fold enrichment versus canonical proteins. These findings suggest that microproteins could fuel next‑generation cancer vaccines and personalized immunotherapies.
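The summary above does not say how the false‑discovery rate was estimated. A common approach in proteomics is target‑decoy estimation, in which matches to deliberately wrong "decoy" sequences stand in for false positives. The sketch below illustrates that general idea only; the function name `fdr_cutoff` and the example scores are hypothetical and are not taken from PeptideAtlas or the consortium's pipeline.

```python
def fdr_cutoff(scored, max_fdr=0.001):
    """Return the lowest score at which the estimated FDR is <= max_fdr.

    scored: list of (score, is_decoy) tuples; a higher score is a better match.
    The FDR at a cutoff is estimated as decoys / targets among all
    identifications scoring at or above that cutoff.
    Returns None if no cutoff satisfies the threshold.
    """
    ranked = sorted(scored, key=lambda item: item[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Record this score if the running estimate is still acceptable.
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


# Hypothetical identifications: (match score, is_decoy)
hits = [(10, False), (9, False), (8, True), (7, False)]
print(fdr_cutoff(hits, max_fdr=0.0))   # accepts only hits scoring above the decoy
print(fdr_cutoff(hits, max_fdr=0.5))   # a looser threshold admits more hits
```

A threshold as low as 0.1% (`max_fdr=0.001`) only becomes meaningful with thousands of target identifications, which is one reason very large spectral collections matter for this kind of analysis.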
Beyond detection, the study introduced ORBL, a novel metric that quantifies evolutionary “ORFness” by tracking conservation of start/stop codons and reading‑frame integrity across mammals. By separating truly constrained microproteins from stochastic translation events, ORBL offers a principled way to prioritize candidates for functional studies. The consortium’s roadmap calls for broader adoption of these standards, integration of peptidein classifications into major databases, and systematic exploration of microprotein functions in disease contexts. As annotation pipelines evolve, the expanded proteome will likely reshape drug target discovery, biomarker development, and our fundamental understanding of human biology.
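To make the idea of tracking start codons, stop codons, and reading‑frame integrity across species concrete, here is a toy illustration. It is not the ORBL algorithm, whose details are not given here; the functions `orf_intact` and `intact_fraction` and the four short sequences are invented for illustration, and the score is simply the fraction of species in which the ORF stays intact.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_intact(seq):
    """True if seq starts with ATG, ends with a stop codon, has a length
    divisible by 3, and has no in-frame stop before the final codon."""
    seq = seq.upper()
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return (
        codons[0] == "ATG"
        and codons[-1] in STOP_CODONS
        and not any(c in STOP_CODONS for c in codons[1:-1])
    )


def intact_fraction(orthologs):
    """Fraction of species whose orthologous sequence keeps an intact ORF."""
    if not orthologs:
        return 0.0
    return sum(orf_intact(s) for s in orthologs.values()) / len(orthologs)


# Hypothetical orthologous sequences, invented for this example.
example = {
    "human": "ATGGCTTGGTAA",
    "mouse": "ATGGCATGGTGA",
    "dog": "ATGGCTTAGTAA",  # premature in-frame stop codon
    "cow": "ACGGCTTGGTAA",  # start codon lost
}
print(intact_fraction(example))  # 0.5
```

An ORF that stays intact in most species is more likely to be under evolutionary constraint than one that is disrupted in many lineages. A real metric would also need to account for phylogenetic relationships and alignment quality, which this sketch ignores.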