Title: Universal Information Extraction for Information Retrieval

Speaker: Heng Ji, Rensselaer Polytechnic Institute

 

Abstract

The big data boom in recent years covers a wide spectrum of heterogeneous data types, from text to image, video, speech, and multimedia. Most of the valuable information in such "big data" is encoded in natural language, which makes it accessible to some people (for example, those who can read that particular language) but much less amenable to computer processing beyond simple keyword search. Information Extraction (IE) and Information Retrieval (IR) on a massive scale share the same goal: creating the next generation of information access, in which humans can communicate with computers in any natural language and go beyond keyword search, by extracting and presenting the important and relevant information embedded in big data. IE aims to extract structured facts from a wide spectrum of heterogeneous unstructured data types. Traditional IE techniques are limited to a certain source X (X = a particular language, domain, limited number of pre-defined fact types, single data modality, ...). When moving from X to a new source Y, we need to start from scratch again by annotating a substantial amount of training data and developing Y-specific extraction capabilities. In this talk, I will present a new Universal IE paradigm that combines the merits of traditional IE (high quality and fine granularity) and Open IE (high scalability). This framework is able to discover schemas and extract facts from any input data in any domain, without any annotated training data, by integrating distributional semantics and symbolic semantics. It can also be extended to hundreds of languages, thousands of fact types, and multiple data modalities by constructing a multilingual, multimedia, multi-task common semantic space and then performing zero-shot transfer learning across sources. I will also discuss possible research directions toward a symbiosis between universal IE and IR, using open-domain knowledge graphs constructed from this common space as an intermediate representation.

  

Short bio

Heng Ji is the Edward P. Hamilton Chair Professor in Computer Science at Rensselaer Polytechnic Institute. She received her Ph.D. in Computer Science from New York University. Her research interests focus on Natural Language Processing, especially Information Extraction and Knowledge Base Population. She was selected as a "Young Scientist" and a member of the Global Future Council on the Future of Computing by the World Economic Forum in 2016, 2017 and 2018. She received the "AI's 10 to Watch" Award from IEEE Intelligent Systems in 2013, the NSF CAREER award in 2009, Google Research Awards in 2009 and 2014, the IBM Watson Faculty Award in 2012 and 2014, and Bosch Research Awards in 2015, 2016 and 2017. She has coordinated the NIST TAC Knowledge Base Population task since 2010, and has served as the Program Committee Co-Chair of several conferences, including NAACL-HLT 2018.

Title: The AI-Driven Life: Clova AI, Search and AiRS

Speaker: Jaeho Choi, Naver Corp.

  

Abstract

Since the Internet and mobile devices appeared, search and recommendation have become an indispensable part of our lives. Traditionally, IR (information retrieval) stands for the task of finding relevant documents in a collection of text, given a query. Today, users' expectations have turned search into a task that uses AI to analyze and combine informative signals and find the correct answer at once. Atypical input, such as images from a smartphone camera or speech from a smart speaker's microphone, makes it challenging for Clova AI to understand search intent, because it differs from the text queries that conventional search engines mainly handle. In addition, AI recommendation is always with us: the headline news every morning, music playlists on your devices, restaurants for a decent lunch, TV shows and movies you may like, and so on.

As with search, AI recommendation can make our daily life more convenient by understanding our preferences and lifestyle patterns and finding the information we want.

To this end, Naver AiRS (Ai Recommender System) offers personalized recommendations of news, blog posts, webtoons, and video content to tens of millions of people every day. The ultimate goal of Clova AI, Search and AiRS is "The AI-Driven Life" that makes daily life easier through AI.

  

Short bio

Jaeho Choi is the leader of AiRS (Ai Recommender System) at the Search Department of Naver Corp. in South Korea. He graduated from Seoul National University and joined Naver as a software engineer in 2003. From 2011 to 2012, he was supervised by W. Bruce Croft, a distinguished professor at UMass Amherst, where he earned his master's degree. In 2013, he returned to Naver and led a personalized recommendation project based on mobile search behavior, similar to Google Now. In 2016-17, his team launched AiRS News, an AI-based personalized news recommendation service, and extended AI content curation across the Naver portal site. In 2018, LINE Today adapted AiRS and launched its For You recommendations in Taiwan, Thailand, Indonesia and Hong Kong. His team is currently working on personalization for LINE News in Japan.