Phage genomes are typically compact and modular, encoding structural proteins for heads and tails, replication and packaging enzymes, lysis proteins and, in some cases, tRNAs and auxiliary metabolic genes. Compared with cellular genomes, they show a pronounced mosaic architecture, the result of frequent recombination across lineages. Gene prediction and functional annotation rely on sequence similarity to known phage genes, structural motif modelling, and searches against reference databases; even so, a large fraction of predicted genes remain of unknown function. Read-based metagenomics and single-phage sequencing enable the discovery of novel lineages and genome architectures, while long-read sequencing helps resolve complete genomes and terminal repeats.
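As an illustration of what resolving terminal repeats can mean on an assembled sequence, the following minimal Python sketch looks for the longest stretch that occurs at both ends of a contig. The function name `find_direct_terminal_repeat` and the `min_len` threshold are hypothetical choices for this example; dedicated tools typically also use read coverage, and identical contig ends can equally arise from a circular genome assembled with overlapping ends, so a match here is only a hint.

```python
def find_direct_terminal_repeat(genome: str, min_len: int = 10) -> str:
    """Return the longest prefix of `genome` that is also its suffix.

    Returns an empty string if no shared end of at least `min_len`
    bases is found. Exact matching only; sequencing errors are ignored.
    """
    seq = genome.upper()
    # The two copies must not overlap, so cap the repeat at half the length.
    max_len = len(seq) // 2
    # Try the longest candidate first so the first hit is the longest repeat.
    for k in range(max_len, min_len - 1, -1):
        if seq[:k] == seq[-k:]:
            return seq[:k]
    return ""


if __name__ == "__main__":
    # Toy contig: a 10-base repeat flanking a 12-base unique region.
    contig = "ACGTACGTAA" + "TTGGCCAATTGG" + "ACGTACGTAA"
    print(find_direct_terminal_repeat(contig))
```

The scan is quadratic in the worst case, which is acceptable for genomes of phage size; it is meant to show the idea rather than to replace coverage-based termini analysis.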
Taxonomy and data resources: the International Committee on Taxonomy of Viruses (ICTV) governs phage classification, which increasingly relies on whole-genome comparisons and hallmark gene content. Public resources include NCBI RefSeq Viral, IMG/VR, PhagesDB (the Actinobacteriophage Database), and extensive viromics datasets. Host prediction methods use matches between CRISPR spacers and phage sequences, sequence similarity between phage and host genomes, and experimental host-range analyses to map phages to their bacterial targets.
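The CRISPR spacer approach can be sketched in a few lines of Python. The example below is a minimal illustration that assumes spacers have already been extracted from host CRISPR arrays and counts exact matches on either strand; the names `predict_hosts` and `host_spacers` are hypothetical, and practical pipelines usually use an aligner and tolerate a small number of mismatches.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def predict_hosts(phage_genome: str, host_spacers: dict[str, list[str]]) -> dict[str, int]:
    """Count, per host, the spacers that match the phage genome exactly.

    `host_spacers` maps a host name to its list of CRISPR spacer sequences.
    Hosts with no matching spacer are left out of the result.
    """
    genome = phage_genome.upper()
    hits: dict[str, int] = {}
    for host, spacers in host_spacers.items():
        matched = 0
        for spacer in spacers:
            s = spacer.upper()
            # A spacer may match either strand of the phage genome.
            if s in genome or reverse_complement(s) in genome:
                matched += 1
        if matched:
            hits[host] = matched
    return hits


if __name__ == "__main__":
    # Toy data: host_A carries two spacers targeting the phage, host_B none.
    phage = "TTTT" + "ACGGATCCGTTAGCATCGGA" + "CCCC" + "GTAAACCCGGGTTT" + "AAAA"
    spacers = {
        "host_A": ["ACGGATCCGTTAGCATCGGA", "AAACCCGGGTTTAC"],
        "host_B": ["TTTTTTTTGGGGGGGG"],
    }
    print(predict_hosts(phage, spacers))
```

A higher count for a host is suggestive rather than conclusive: spacers record past encounters, so a match indicates that the host lineage was exposed to a related phage, not that the phage infects it productively today.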
Applications and methods: phagenomics underpins phage therapy research and biocontrol, where phages are used to target pathogenic bacteria. In ecology, viromics reveals phage roles in regulating bacterial populations, nutrient cycling, and horizontal gene transfer. Common approaches include culture-based isolation of lytic or temperate phages; culture-independent viromic sequencing with short- and long-read technologies; genome assembly; and comparative genomics to identify auxiliary genes and regulatory elements.
Challenges and outlook: ongoing work addresses distinguishing infectious from defective particles, resolving highly mosaic genomes, and standardizing quality metrics for assemblies. Because many phage genes still lack functional annotation, interpretation of genome content remains limited. Advances in sequencing, databases, and computational methods are expected to deepen understanding of phage biology and to support applications in medicine and the environment.