This project will explore streaming multi-modal Knowledge Base Population (KBP) by considering both text and images, with the goal of proposing novel ideas that address Alibaba’s technical challenges. Specifically, we aim to identify entities using both unstructured text (e.g., image descriptions) and the corresponding images. By building models that leverage multi-modal information, we expect entity recognition to be more accurate than with text alone. The recognized entities may then be linked to the corresponding regions in the images.
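As a minimal sketch of the idea, consider late fusion of per-modality confidences: a text model and an image model each score a candidate entity, and the combined score decides recognition. All names, scores, and the weighting scheme below are illustrative assumptions, not the project's actual method.

```python
# Hypothetical sketch: late fusion of text and image evidence for entity
# recognition. Scores and candidates are illustrative, not real model output.

def fuse_scores(text_score: float, image_score: float, alpha: float = 0.6) -> float:
    """Weighted combination of modality confidences (alpha weights the text side)."""
    return alpha * text_score + (1 - alpha) * image_score

def recognize(candidates, threshold: float = 0.5):
    """Accept a candidate entity if the fused confidence clears the threshold."""
    accepted = []
    for name, text_score, image_score in candidates:
        if fuse_scores(text_score, image_score) >= threshold:
            accepted.append(name)
    return accepted

# Toy example: "jaguar" is ambiguous from text alone, but image evidence
# (say, a detected animal region) disambiguates the two readings.
candidates = [
    ("jaguar (animal)", 0.45, 0.90),   # text uncertain, image strong
    ("jaguar (car)",    0.45, 0.10),   # text uncertain, image weak
]
print(recognize(candidates))  # -> ['jaguar (animal)']
```

This illustrates why multi-modal input can beat text-only recognition: identical text scores lead to different decisions once image evidence is fused in.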
The research will explore the following avenues:
- Using semi-structured data to guide KBP, both by automatically generating training data and by enhancing the KBP process with information about object-attribute pairs
- Exploring semi-supervised learning to recover the underlying data structure from incomplete labels. Combined with the KBP models, this information can be used to build knowledge graphs automatically, a task that currently requires significant human labour even with state-of-the-art technologies.
- Developing statistical inference methodology and tools to automate information extraction and knowledge base construction.
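The first avenue above can be sketched as a simple distant-supervision step: known object-attribute pairs from semi-structured sources (e.g., product listings) are matched against free text to auto-generate labelled examples. The matching rule, labels, and data below are hypothetical placeholders for whatever the project would actually use.

```python
# Hypothetical sketch of generating training data from semi-structured
# object-attribute pairs via string matching. All data is illustrative.

def generate_training_data(texts, attribute_pairs):
    """Label each text with the known objects and attributes it mentions."""
    labelled = []
    for text in texts:
        labels = []
        for obj, attr in attribute_pairs:
            if obj in text:
                labels.append((obj, "OBJECT"))
            if attr in text:
                labels.append((attr, "ATTRIBUTE"))
        labelled.append((text, labels))
    return labelled

pairs = [("handbag", "leather"), ("sneaker", "white")]
texts = ["A leather handbag with a gold buckle", "Classic white sneaker"]
for text, labels in generate_training_data(texts, pairs):
    print(text, "->", labels)
```

Labels produced this way are noisy, which is exactly why the second avenue (semi-supervised learning over incomplete labels) complements this step.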
Sponsored by Alibaba