Zeroshot

Zeroshot, often written zero-shot, refers to the ability of a model to perform a task or recognize classes without any task-specific training examples. In machine learning and artificial intelligence, zeroshot learning aims to generalize to unseen categories or tasks by leveraging auxiliary information such as semantic attributes, word or sentence embeddings, or natural language descriptions that relate seen and unseen concepts. In natural language processing, particularly with large language models, zero-shot performance is achieved through prompts that specify the task without providing task-specific examples.
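
As a concrete illustration, here is a minimal sketch of a zero-shot prompt for a sentiment classification task, shown next to a few-shot version for contrast; the label set and review texts are invented for illustration.

```python
# Minimal sketch: a zero-shot prompt states the task and label set but
# contains no solved examples; a few-shot prompt adds labeled examples.
# The labels and review texts below are hypothetical.
labels = ["positive", "negative", "neutral"]
review = "The battery lasts all day, but the screen is dim."

zero_shot_prompt = (
    f"Classify the sentiment of the following review as {', '.join(labels)}.\n"
    f"Review: {review}\n"
    "Sentiment:"
)

few_shot_prompt = (
    "Review: Great sound quality and easy setup.\nSentiment: positive\n"
    "Review: Stopped working after two days.\nSentiment: negative\n"
    f"Review: {review}\nSentiment:"
)

print(zero_shot_prompt)
```

Either prompt would be passed to a general-purpose language model as-is; the zero-shot variant relies entirely on the model's pre-trained knowledge and the task description.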

Zero-shot is distinct from few-shot learning, where a small number of labeled examples for the target task are provided.

In computer vision, zeroshot learning maps visual representations to a semantic space and uses compatibility scores to assign an instance to an unseen class. Techniques include attribute-based models that describe classes with predefined attributes, embedding-based methods that learn a joint space for images and semantics, and generative approaches that synthesize feature representations for unseen classes.
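
As a minimal sketch of the attribute-based approach, the example below scores an image against unseen classes described by predefined attributes; the class names, attribute list, and stub attribute predictor are hypothetical placeholders for a model that would be trained on seen classes.

```python
import numpy as np

# Unseen classes described by predefined attributes
# (here: has_stripes, has_hooves, lives_in_water).
class_attributes = {
    "zebra":   np.array([1.0, 1.0, 0.0]),
    "dolphin": np.array([0.0, 0.0, 1.0]),
    "tiger":   np.array([1.0, 0.0, 0.0]),
}

def predict_attributes(image_features):
    """Stand-in for an attribute predictor trained on seen classes.
    Here it is just a fixed random linear map followed by a sigmoid."""
    rng = np.random.default_rng(0)
    weights = rng.normal(size=(image_features.shape[0], 3))
    return 1.0 / (1.0 + np.exp(-(image_features @ weights)))

def zero_shot_classify(image_features):
    """Assign the unseen class whose attribute signature best matches
    the predicted attribute scores (dot-product compatibility)."""
    scores = predict_attributes(image_features)
    compatibility = {name: float(scores @ attrs)
                     for name, attrs in class_attributes.items()}
    return max(compatibility, key=compatibility.get)

# Example usage with random placeholder image features.
features = np.random.default_rng(1).normal(size=128)
print(zero_shot_classify(features))
```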

In natural language processing and multimodal systems, zero-shot capabilities emerge from prompting or from architectures that leverage learned general knowledge, enabling tasks to be performed without labeled data for the target domain.
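
A minimal sketch of the embedding-based, multimodal case follows: candidate labels are wrapped in natural-language prompts, embedded into a shared space, and matched to an image embedding by cosine similarity. The encoders below are deterministic random stand-ins for pre-trained image and text encoders, and the labels are invented for illustration.

```python
import numpy as np

SHARED_DIM = 64  # dimensionality of the shared image-text space

def encode_image(image_features):
    """Stand-in image encoder: a fixed random projection, L2-normalized."""
    proj = np.random.default_rng(42).normal(size=(image_features.shape[0], SHARED_DIM))
    vec = image_features @ proj
    return vec / np.linalg.norm(vec)

def encode_text(prompt):
    """Stand-in text encoder: a deterministic pseudo-random unit vector per prompt."""
    seed = sum(prompt.encode("utf-8"))
    vec = np.random.default_rng(seed).normal(size=SHARED_DIM)
    return vec / np.linalg.norm(vec)

def zero_shot_label(image_features, labels):
    """Pick the label whose prompt embedding is closest to the image embedding."""
    prompts = [f"a photo of a {label}" for label in labels]
    text_vecs = np.stack([encode_text(p) for p in prompts])
    image_vec = encode_image(image_features)
    similarities = text_vecs @ image_vec  # cosine similarity on unit vectors
    return labels[int(np.argmax(similarities))]

# Example usage with random placeholder image features and unseen labels.
features = np.random.default_rng(7).normal(size=256)
print(zero_shot_label(features, ["dog", "bicycle", "pizza"]))
```

The same compatibility-score pattern underlies contrastively trained image-text models, where the two encoders are learned jointly; changing the prompt template (for example, "a photo of a {label}" versus the bare label) can change the scores, which is one reason zero-shot results are sensitive to prompt choice.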

Common applications include image classification with unseen categories, cross-domain transfer, zero-shot transfer in robotics, and zero-shot evaluation of multimodal systems. Challenges include the semantic gap between the provided descriptions and real data, biases toward concepts with stronger representations, and scalability to large or nuanced concept sets. Evaluation protocols vary and can be sensitive to the chosen semantic representations, prompts, or embedding spaces.