Hey! I am a Research Scientist at Facebook AI Research, Menlo Park.

I was a PhD student at the Language Technologies Institute, Carnegie Mellon University, from September 2012 to February 2018, working mainly with Prof. Abhinav Gupta on computer vision, computational linguistics, and the combination of both. During my time at CMU, I also worked with Prof. Tom Mitchell. In spring 2014, I did an internship at MSR with Prof. C. Lawrence Zitnick. Then in summer 2016, I did an internship in Prof. William T. Freeman's VisCAM group at Google. Before graduation, I also spent time with the Google Cloud AI team, working with Prof. Fei-Fei Li and Dr. Jia Li.

I graduated with a bachelor's degree in computer science from Zhejiang University, China. During my undergraduate studies, I was mainly supervised by Prof. Deng Cai in the State Key Laboratory of CAD & CG. I was a summer intern at UCLA in 2011, working mainly with Prof. Jenn Wortman Vaughan.

Meta Info

  • E-mail: xinleic [at] fb [dot] com
  • Google Scholar: here
  • Github: here
  • Resume: here

Publications

Recent Work

Info Links
Xinlei Chen, Ross Girshick, Kaiming He, Piotr Dollár. TensorMask: A Foundation for Dense Object Segmentation. ArXiv, 2019. [Link]
Jianwei Yang*, Zhile Ren*, Mingze Xu, Xinlei Chen, David Crandall, Devi Parikh, Dhruv Batra. Embodied Visual Recognition. ArXiv, 2019. [Link]
Yuyin Zhou*, Zhe Li*, Song Bai, Chong Wang, Xinlei Chen, Mei Han, Elliot Fishman, Alan Yuille. Prior-aware Neural Network for Partially-Supervised Multi-Organ Segmentation. ArXiv, 2019. [Link]
Meet Shah, Xinlei Chen, Marcus Rohrbach, Devi Parikh. Cycle-Consistency for Robust Visual Question Answering. The 32nd Conference on Computer Vision and Pattern Recognition (CVPR), 2019. Oral. [Link]
Luowei Zhou, Yannis Kalantidis, Xinlei Chen, Jason J. Corso, Marcus Rohrbach. Grounded Video Description. The 32nd Conference on Computer Vision and Pattern Recognition (CVPR), 2019. Oral. [Link] [Code(Data)]
Licheng Yu, Xinlei Chen, Georgia Gkioxari, Mohit Bansal, Tamara L. Berg, Dhruv Batra. Multi-Target Embodied Question Answering. The 32nd Conference on Computer Vision and Pattern Recognition (CVPR), 2019. [Link] [Video]
Amanpreet Singh, Vivek Natarajan, Meet Shah, Yu Jiang, Xinlei Chen, Dhruv Batra, Devi Parikh, Marcus Rohrbach. Towards VQA Models That Can Read. The 32nd Conference on Computer Vision and Pattern Recognition (CVPR), 2019. [Link] [Project]
Harsh Agrawal, Karan Desai, Xinlei Chen, Rishabh Jain, Dhruv Batra, Devi Parikh, Stefan Lee, Peter Anderson. nocaps: novel object captioning at scale. ArXiv, 2018. [Link] [Project]
Jin-Hwa Kim, Nikita Kitaev, Xinlei Chen, Marcus Rohrbach, Yuandong Tian, Dhruv Batra, Devi Parikh. CoDraw: Collaborative Drawing as a Testbed for Grounded Goal-driven Communication. The 57th Annual Meeting of the Association for Computational Linguistics (ACL), 2019. [Link]
Yu Jiang*, Vivek Natarajan*, Xinlei Chen*, Marcus Rohrbach, Dhruv Batra, Devi Parikh. Pythia v0.1: the Winning Entry to the VQA Challenge 2018. ArXiv, 2018. [Link] [Code]

Previous Work

Info Links
Xinlei Chen, Li-Jia Li, Li Fei-Fei, Abhinav Gupta. Iterative Visual Reasoning Beyond Convolutions. The 31st Conference on Computer Vision and Pattern Recognition (CVPR), 2018. Spotlight. [Link] [PDF] [Code]
Xinlei Chen. Visual Knowledge Learning. Doctoral Dissertation, CMU-LTI-18-001. [PDF] [Slides]
Xinlei Chen, Abhinav Gupta. Spatial Memory for Context Reasoning in Object Detection. The 16th International Conference on Computer Vision (ICCV), 2017. [Link] [PDF] [Poster]
Xinlei Chen, Abhinav Gupta. An Implementation of Faster RCNN with Study for Region Sampling. ArXiv, 2017. [Link] [Code]
Aayush Bansal, Xinlei Chen, Bryan Russell, Abhinav Gupta, Deva Ramanan. PixelNet: Representation of the pixels, by the pixels, and for the pixels. ArXiv, 2017. [Link] [Project] [Code] [Code(Old)]
Gunnar A. Sigurdsson, Xinlei Chen, Abhinav Gupta. Learning Visual Storylines with Skipping Recurrent Neural Networks. The 14th European Conference on Computer Vision (ECCV), 2016. [Link] [PDF] [Poster] [Code]
Jiwei Li, Xinlei Chen, Eduard Hovy, Dan Jurafsky. Visualizing and Understanding Neural Models in NLP. Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2016. [Link] [PDF] [Code]
Xinlei Chen, Abhinav Gupta. Webly Supervised Learning of Convolutional Networks. The 15th International Conference on Computer Vision (ICCV), 2015. Oral. [Link] [PDF] [Slides] [Poster] [Video] [Talk] [Code] [Project]
Xinlei Chen, Hao Fang, Tsung-Yi Lin, Ramakrishna Vedantam, Saurabh Gupta, Piotr Dollár, C. Lawrence Zitnick. Microsoft COCO Captions: Data Collection and Evaluation Server. ArXiv, 2015. [Link] [Code]
Xinlei Chen, C. Lawrence Zitnick. Mind's Eye: A Recurrent Visual Representation for Image Caption Generation. The 28th Conference on Computer Vision and Pattern Recognition (CVPR), 2015. [PDF] [Poster]
Xinlei Chen, Alan Ritter, Abhinav Gupta, Tom Mitchell. Sense Discovery via Co-Clustering on Images and Text. The 28th Conference on Computer Vision and Pattern Recognition (CVPR), 2015. [PDF] [Poster] [Project]
Tom M. Mitchell, William Cohen, Estevam Hruschka, Partha Talukdar, Justin Betteridge, Andrew Carlson, Bhavana Dalvi, Matt Gardner, Bryan Kisiel, Jayant Krishnamurthy, Ni Lao, Kathryn Mazaitis, Thahir Mohamed, Ndapa Nakashole, Emmanouil Platanios, Alan Ritter, Mehdi Samadi, Burr Settles, Richard Wang, Derry Wijaya, Abhinav Gupta, Xinlei Chen, Abulhair Saparov, Malcolm Greaves, Joel Welling. Never-Ending Learning. The 29th AAAI Conference on Artificial Intelligence (AAAI), 2015. [PDF]
Elissa M. Aminoff, Mariya Toneva, Abhinav Shrivastava, Xinlei Chen, Ishan Misra, Abhinav Gupta, Michael Tarr. Applying artificial vision models to human scene understanding. Frontiers in Computational Neuroscience, 2015. [PDF]
Xinlei Chen, Abhinav Shrivastava, Abhinav Gupta. Enriching Visual Knowledge Bases via Object Discovery and Segmentation. The 27th Conference on Computer Vision and Pattern Recognition (CVPR), 2014. [PDF] [Poster] [Project] [Code]
Xinlei Chen, Abhinav Shrivastava, Abhinav Gupta. NEIL: Extracting Visual Knowledge from Web Data. The 14th International Conference on Computer Vision (ICCV), 2013. Oral. [PDF] [Slides] [Poster] [Web] [Talk] [Code] [Test Code]
Xinlei Chen, Deng Cai. Large Scale Spectral Clustering with Landmark-based Representation. The 25th AAAI Conference on Artificial Intelligence (AAAI), 2011. [PDF] [Code]
Jiajun Lv, Xinlei Chen, Jin Huang, Hujun Bao. Semi-supervised Mesh Segmentation and Labeling. The 20th Pacific Conference on Computer Graphics and Applications (PG), 2012. [PDF] [Project]