People

Abstract

We present a co-clustering framework that can be used to discover multiple semantic and visual senses of a given Noun Phrase (NP). Unlike traditional clustering approaches which assume a one-to-one mapping between the clusters in the text-based feature space and the visual space, we adopt a one-to-many mapping between the two spaces. This is primarily because each semantic sense (concept) can correspond to different visual senses due to viewpoint and appearance variations. Our structure-EM style optimization not only extracts the multiple senses in both semantic and visual feature space, but also discovers the mapping between the senses. We introduce a challenging dataset (CMU Polysemy-30) for this problem consisting of 30 NPs (~5600 labeled instances out of ~22K total instances). We have also conducted a large-scale experiment that performs sense disambiguation for ~2000 NPs.

Keywords

Paper


CVPR paper (pdf, 9.4MB)
Supplementary Material (pdf, 316MB)
Poster (pdf, 4.5MB)

Citation

Xinlei Chen, Alan Ritter, Abhinav Gupta and Tom Mitchell. Sense Discovery via Co-Clustering on Images and Text. In CVPR 2015.

@inproceedings{chen_cvpr15,
    Author = {Xinlei Chen and Alan Ritter and Abhinav Gupta and Tom Mitchell},
    Title = {{S}ense {D}iscovery via {C}o-{C}lustering on {I}mages and {T}ext},
    Booktitle = {Computer Vision and Pattern Recognition (CVPR)},
    Year = 2015,
}

Related Papers

Downloads

We provide the CMU Polysemy-30 dataset used in the paper here (tgz, 938MB). A few notes:

Funding

This research was supported by: