GenomNER

GenomNER is a software framework for automatic recognition and tagging of genomic entities in text and sequence data. It aims to extract genes, proteins, regulatory elements, variants, and related concepts from scientific literature and genomic resources, supporting researchers in curation and data integration.

GenomNER emerged from collaboration between computational linguists and genomics researchers and is maintained as an open-source project. It provides models and tools to perform NER on domain-specific corpora and to map entities to standard identifiers in resources such as Entrez Gene and UniProt.

Techniques: The system combines tokenization, tagging, and classification using traditional methods (CRF) and neural models (BiLSTM-CRF, transformers). It can be trained on annotated corpora with gene and protein names and adapted to biomedical text, enabling cross-resource transfer.

Output and formats: GenomNER produces labeled spans with entity type, text, and cross-references to databases. It exports BIO tags, JSON, and stand-off annotations suitable for integration with genome browsers and downstream pipelines.

Availability: GenomNER is released under an open-source license and hosted on a public repository with documentation and example datasets.

Impact: The tool supports scalable annotation workflows for literature and genomic resources and has been cited in studies of automated curation.
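
Example: The three export formats mentioned under "Output and formats" (BIO tags, JSON, and stand-off annotations) can be illustrated with a minimal Python sketch. This is not GenomNER's actual API: the names Span, make_span, tokenize, to_bio, to_json, and to_standoff are hypothetical, the tokenizer is a simple regular expression, and the stand-off lines follow a brat-style layout chosen here only as an assumption. The cross-reference strings are illustrative values.

```python
import json
import re
from dataclasses import asdict, dataclass
from typing import List, Optional, Tuple


@dataclass
class Span:
    """A labeled span (hypothetical structure, not GenomNER's own)."""
    start: int                  # character offset, inclusive
    end: int                    # character offset, exclusive
    label: str                  # entity type, e.g. "GENE"
    text: str                   # surface text of the mention
    xref: Optional[str] = None  # illustrative database cross-reference


def make_span(text: str, surface: str, label: str,
              xref: Optional[str] = None) -> Span:
    """Build a span from the first occurrence of `surface` in `text`."""
    start = text.index(surface)
    return Span(start, start + len(surface), label, surface, xref)


def tokenize(text: str) -> List[Tuple[str, int, int]]:
    """Split into word and punctuation tokens, keeping character offsets."""
    return [(m.group(), m.start(), m.end())
            for m in re.finditer(r"\w+|[^\w\s]", text)]


def to_bio(text: str, spans: List[Span]) -> List[Tuple[str, str]]:
    """Assign B-/I-/O tags to tokens that fall inside a span."""
    tagged = []
    for token, t_start, t_end in tokenize(text):
        tag = "O"
        for span in spans:
            if t_start >= span.start and t_end <= span.end:
                prefix = "B" if t_start == span.start else "I"
                tag = f"{prefix}-{span.label}"
                break
        tagged.append((token, tag))
    return tagged


def to_json(spans: List[Span]) -> str:
    """Serialize spans as a JSON array of objects."""
    return json.dumps([asdict(s) for s in spans], indent=2)


def to_standoff(spans: List[Span]) -> List[str]:
    """Emit brat-style stand-off lines: ID, type with offsets, text."""
    return [f"T{i}\t{s.label} {s.start} {s.end}\t{s.text}"
            for i, s in enumerate(spans, start=1)]


if __name__ == "__main__":
    sentence = "BRCA1 binds tumor protein p53 in vitro."
    found = [
        make_span(sentence, "BRCA1", "GENE", "EntrezGene:672"),
        make_span(sentence, "tumor protein p53", "PROTEIN", "UniProt:P04637"),
    ]
    for token, tag in to_bio(sentence, found):
        print(f"{token}\t{tag}")
    print(to_json(found))
    print("\n".join(to_standoff(found)))
```

Keeping character offsets on every span is what lets the same annotations be rendered as token-level BIO tags or as stand-off records without re-running recognition.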