Question Answering as Programming for Solving Time-Sensitive Questions

¹Tsinghua University, ²Microsoft

Abstract

Question answering plays a pivotal role in daily life because it is how we acquire knowledge about the world. However, because real-world facts are dynamic and ever-changing, the answer can be completely different when the time constraint in the question changes. Recently, Large Language Models (LLMs) have shown remarkable intelligence in question answering, but our experiments reveal that such time-sensitive questions still pose a significant challenge to existing LLMs. We attribute this to LLMs' inability to reason rigorously when they rely only on surface-level text semantics.

To overcome this limitation, rather than requiring LLMs to answer the question directly, we reframe the Question Answering task as Programming (QAaP). Concretely, by leveraging modern LLMs' strong capability in understanding both natural language and programming languages, we use LLMs to represent diversely expressed text as well-structured code and to select the best-matching answer from multiple candidates through programming. We evaluate our QAaP framework on several time-sensitive question answering datasets and achieve improvements of up to 14.5% over strong baselines.

Figure: The whole framework of QAaP.

Method

Our method consists of two phases:

1. Represent all as codes. We harness the strong coding ability of LLMs to transform the question and context into well-structured code. We first Parse the given question into a Python dict, then Extract relevant information from the provided context and store these items in a Python list.
2. Choose answer through programming. The extracted contents must be checked for faithfulness to the corresponding context. Furthermore, because there may be multiple potential answers, we need to reason out the one that best matches the question. Since all the obtained information is represented as code, we can easily construct two functions, Check and Match, to reduce hallucination and ensure accuracy.
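The two phases can be illustrated with a minimal Python sketch. It is not the paper's implementation: the dict fields, the example people and teams, the literal-substring test in `check`, and the overlap-in-days score in `match` are all assumptions chosen for illustration, and the `query` and `information` values are written by hand where the real framework would obtain them from an LLM.

```python
from datetime import date

# Hypothetical output of Parse for the question
# "Which team did Alice Example play for in 2019?"
query = {
    "subject": "Alice Example",
    "relation": "play for",
    "object": None,  # the unknown answer
    "time": {"start": date(2019, 1, 1), "end": date(2019, 12, 31)},
}

# Made-up context passage.
context = (
    "Alice Example joined Team A in 2015 and moved to Team B in 2018, "
    "where she stayed until 2020."
)

# Hypothetical output of Extract: one dict per candidate fact.
information = [
    {"object": "Team A",
     "time": {"start": date(2015, 1, 1), "end": date(2017, 12, 31)}},
    {"object": "Team B",
     "time": {"start": date(2018, 1, 1), "end": date(2020, 12, 31)}},
    # A hallucinated item: "Team C" never appears in the context.
    {"object": "Team C",
     "time": {"start": date(2019, 1, 1), "end": date(2019, 12, 31)}},
]


def check(item, context):
    """Assumed faithfulness test: the candidate must appear verbatim in the context."""
    return item["object"] is not None and item["object"] in context


def overlap_days(a, b):
    """Number of days shared by two closed date intervals (0 if disjoint)."""
    start = max(a["start"], b["start"])
    end = min(a["end"], b["end"])
    return max((end - start).days + 1, 0)


def match(query, information, context):
    """Return the checked candidate whose time span best overlaps the query's."""
    candidates = [item for item in information if check(item, context)]
    if not candidates:
        return None
    best = max(candidates, key=lambda item: overlap_days(query["time"], item["time"]))
    if overlap_days(query["time"], best["time"]) == 0:
        return None
    return best["object"]


answer = match(query, information, context)
```

In this sketch `check` discards the "Team C" item because it does not occur in the context, and `match` prefers "Team B" because its time span covers the whole queried year while "Team A" does not overlap it at all.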

Results

Example

BibTeX

@inproceedings{zhu2023qaap,
  author       = {Zhu, Xinyu and Yang, Cheng and Chen, Bei and Li, Siheng and Lou, Jian-Guang and Yang, Yujiu},
  title        = {Question Answering as Programming for Solving Time-Sensitive Questions},
  booktitle    = {{EMNLP}},
  pages        = {12775--12790},
  publisher    = {Association for Computational Linguistics},
  year         = {2023}
}