People
- Xinlei Chen
- Alan Ritter
- Abhinav Gupta
- Tom Mitchell
Abstract
We present a co-clustering framework that can be used to discover multiple semantic and visual senses of a given Noun Phrase (NP). Unlike traditional clustering approaches, which assume a one-to-one mapping between the clusters in the text-based feature space and the visual space, we adopt a one-to-many mapping between the two spaces. This is primarily because each semantic sense (concept) can correspond to different visual senses due to viewpoint and appearance variations. Our structure-EM-style optimization not only extracts the multiple senses in both the semantic and visual feature spaces, but also discovers the mapping between the senses. We introduce a challenging dataset (CMU Polysemy-30) for this problem, consisting of 30 NPs (~5600 labeled instances out of ~22K total instances). We have also conducted a large-scale experiment that performs sense disambiguation for ~2000 NPs.
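To make the one-to-many idea concrete, here is a minimal sketch of a coupled clustering loop in that spirit. This is not the authors' algorithm: the KMeans initialization, the majority-vote structure update, and the fixed coupling penalty `couple` are all simplifying assumptions.

import numpy as np
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans

def cocluster(text_feats, vis_feats, k_sem=3, k_vis=9, n_iters=10,
              couple=1.0, seed=0):
    # Initialize the semantic and visual clusterings independently.
    sem_km = KMeans(n_clusters=k_sem, random_state=seed).fit(text_feats)
    vis_km = KMeans(n_clusters=k_vis, random_state=seed).fit(vis_feats)
    sem, vis = sem_km.labels_.copy(), vis_km.labels_.copy()
    sem_c, vis_c = sem_km.cluster_centers_.copy(), vis_km.cluster_centers_.copy()
    mapping = np.zeros(k_vis, dtype=int)
    for _ in range(n_iters):
        # Structure step: map every visual cluster to the semantic sense
        # most of its members carry. Several visual clusters may share
        # one sense, which gives the one-to-many mapping.
        for v in range(k_vis):
            if (vis == v).any():
                mapping[v] = np.bincount(sem[vis == v], minlength=k_sem).argmax()
        # E-step (semantic): text distance plus a penalty for disagreeing
        # with the sense predicted by the instance's visual cluster.
        d = cdist(text_feats, sem_c)
        d += couple * (np.arange(k_sem)[None, :] != mapping[vis][:, None])
        sem = d.argmin(axis=1)
        # E-step (visual): image distance plus the analogous penalty.
        d = cdist(vis_feats, vis_c)
        d += couple * (mapping[None, :] != sem[:, None])
        vis = d.argmin(axis=1)
        # M-step: recompute centroids of the non-empty clusters.
        for k in range(k_sem):
            if (sem == k).any():
                sem_c[k] = text_feats[sem == k].mean(axis=0)
        for v in range(k_vis):
            if (vis == v).any():
                vis_c[v] = vis_feats[vis == v].mean(axis=0)
    return sem, vis, mapping

With couple=0 the loop degenerates to two independent KMeans runs; increasing it pulls the semantic and visual partitions toward agreement under the learned mapping.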
Keywords
- Sense Discovery, Sense Disambiguation
- Clustering, Co-Clustering, Unsupervised learning
- Vision for the Web
Paper

CVPR paper (pdf, 9.4MB)
Supplementary Material (pdf, 316MB)
Poster (pdf, 4.5MB)
Citation
Xinlei Chen, Alan Ritter, Abhinav Gupta and Tom Mitchell. Sense Discovery via Co-Clustering on Images and Text. In CVPR 2015.
@inproceedings{chen_cvpr15,
  Author = {Xinlei Chen and Alan Ritter and Abhinav Gupta and Tom Mitchell},
  Title = {{S}ense {D}iscovery via {C}o-{C}lustering on {I}mages and {T}ext},
  Booktitle = {Computer Vision and Pattern Recognition (CVPR)},
  Year = {2015}
}
Downloads
We provide the CMU Polysemy-30 dataset used in the paper here (tgz, 938MB). A few notes on its layout (a loading sketch follows the list):
- XX/info.mat: meta information for the downloaded images/websites, most importantly URLs;
- XX/images/: images;
- XX/docs/: raw website texts, including boilerplate; each image is paired with its text;
- XX/texts/: website texts distilled with Boilerpipe;
- XX/dfeas/: text features representing each website, formatted as [Word]_[POS-Tag];
- XX/labels.mat: labels for each image/website pair.
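For convenience, a hypothetical loading sketch for one NP folder follows. The directory names match the list above, but the per-file naming convention (image, text, and feature files sharing a stem; .txt extensions) and the contents of the .mat files are assumptions; inspect the archive (e.g., with scipy.io.whosmat) before relying on them.

import os
from scipy.io import loadmat

def load_np_folder(root):
    # info.mat / labels.mat variable names are not documented here,
    # so treat the returned dicts as opaque until inspected.
    meta = loadmat(os.path.join(root, 'info.mat'))
    labels = loadmat(os.path.join(root, 'labels.mat'))
    pairs = []
    for fname in sorted(os.listdir(os.path.join(root, 'images'))):
        stem = os.path.splitext(fname)[0]
        # Assumption: distilled text and dfeas files share the image stem.
        with open(os.path.join(root, 'texts', stem + '.txt')) as f:
            text = f.read()
        with open(os.path.join(root, 'dfeas', stem + '.txt')) as f:
            # Tokens are formatted as [Word]_[POS-Tag]; split on the last
            # underscore so words that themselves contain '_' stay intact.
            feats = [tok.rsplit('_', 1) for tok in f.read().split()]
        pairs.append((os.path.join(root, 'images', fname), text, feats))
    return meta, labels, pairs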
Funding
This research was supported by:
- ONR N000141010934
- Yahoo!
- Google
- XC is supported by a Yahoo-InMind Fellowship
- AG is supported by a Bosch Young Faculty Fellowship