[Paper Review] Syntactic Question Abstraction and Retrieval for Data-Scarce Semantic Parsing, Wonseok Hwang et al., 2020
Abstract
Deep learning approaches to semantic parsing require a large amount of labeled data, but annotating complex logical forms is costly.
SQAR [ Syntactic Question Abstraction & Retrieval ]
- translates a natural language (NL) query to a SQL logical form (LF) with fewer than 1,000 annotated examples.
- retrieves a logical pattern from the train data by computing the similarity between NL queries, then grounds lexical information on the retrieved pattern to generate the final LF.
- By using query similarity to retrieve logical patterns, SQAR can leverage a paraphrasing dataset, achieving up to 5.9% higher LF accuracy compared to the case where SQAR is trained using only WikiSQL data.
- In contrast to a simple pattern classification approach, SQAR can generate unseen logical patterns upon the addition of new examples, without re-training the model.
- The paper also discusses an ideal way to create cost-efficient and robust train sets under a data-hungry setting.
1. Introduction
Semantic parsing task: translating natural language (NL) into machine-understandable formal logical forms (LF).
▷ Summary of contributions
- SQAR achieves state-of-the-art performance on the WikiSQL test data under a data-scarce (data-hungry) environment.
- SQAR can leverage NL query similarity to improve LF generation accuracy.
- The retrieval-based parser can handle unseen logical patterns without re-training.
- It is important to design the train data distribution carefully, rather than merely following the natural data distribution.
2. Related Work
(1) WikiSQL
: A large semantic parsing dataset consisting of 80,654 natural language questions and corresponding SQL annotations.
(2) Query Similarity
[Berant and Liang, 2014] built a semantic parser that uses the similarity between the input question and canonical NL representations generated from LFs.
↔ In SQAR, LFs and their corresponding canonical NL forms do not need to be generated, since input questions are directly compared to the questions in the training data.
(The paper also compares SQAR with several other related works.)
3. Model
The model generates a logical form L (a SQL query) for a given NL query Q and its corresponding table headers H.
A logical pattern l is retrieved from the train set by finding the NL query most similar to Q (the retriever).
3.1. Syntactic Question Abstractor (b)
- The logical patterns of the WikiSQL dataset
- 6 aggregation operators (none, max, min, count, sum, avg) + 3 where operators (=, <, >); the 6 aggregation operators give 6 select clause patterns.
- The where clause contains 0 to 4 conditions, each combined by an 'and' unit; counting the unordered operator combinations gives 35 where clause patterns (see the counting sketch below).
- In total, there are 210 possible SQL patterns (6 select clause patterns × 35 where clause patterns).
- The syntactic question abstractor generates two vector representations, q and g, of an input NL query Q.
- q: represents the syntactic information of Q and is used in the retriever module.
- g: represents the lexical information of Q and is used in the grounder.
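Where the 35 where-clause patterns come from is not spelled out in the post; below is a minimal sketch of the counting. The decomposition (unordered operator choices per condition, counted as multisets) is my assumption, but it is consistent with the 210 total stated above.

```python
from math import comb

# Select clause: one of 6 aggregation operators -> 6 patterns.
# Where clause: k = 0..4 conditions, each using one of 3 operators (=, <, >).
# Conditions are joined by 'and', so order does not matter: count multisets
# of operators, C(3 + k - 1, k), and sum over k.
where_patterns = sum(comb(3 + k - 1, k) for k in range(5))  # 1 + 3 + 6 + 10 + 15
print(where_patterns)       # 35 where clause patterns
print(6 * where_patterns)   # 210 possible SQL patterns
```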
Input: the NL query Q & the queries in the train set {Q_t,i}
▶ mapped to a vector space (via a BERT encoder)
▶ represented by q & {q_t,i}
- The input of the BERT Encoder
- consists of " [CLS], E, [SEP], Q, [SEP], H, [SEP]" tokens.
- E = SQL language element tokens, Q = question tokens, H = the tokens of table headers.
- q and g are extracted from the (linearly projected) encoding vector of [CLS] token.
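A minimal sketch of this encoding step, assuming a HuggingFace BERT backbone; the projection sizes and the names proj_q / proj_g are assumptions, since the post only states that q and g come from linear projections of the [CLS] encoding.

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")

d_model, d_q, d_g = 768, 100, 100           # projection sizes are assumptions
proj_q = torch.nn.Linear(d_model, d_q)      # -> syntactic vector q (retriever)
proj_g = torch.nn.Linear(d_model, d_g)      # -> lexical vector g (grounder)

sql_elements = "select max min count sum avg"   # E: SQL language element tokens
question = "what is the highest score"          # Q: question tokens
headers = "player score team"                   # H: table header tokens

# Build "[CLS] E [SEP] Q [SEP] H [SEP]"; the tokenizer adds [CLS] and the
# final [SEP], and literal "[SEP]" strings map to the separator token.
text = f"{sql_elements} [SEP] {question} [SEP] {headers}"
inputs = tokenizer(text, return_tensors="pt")
cls_vec = bert(**inputs).last_hidden_state[:, 0]   # encoding of the [CLS] token

q = proj_q(cls_vec)   # used by the retriever for similarity search
g = proj_g(cls_vec)   # used by the grounder
```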
3.2. Retriever (c)
- Similarity Search: The logical pattern is found by measuring the Euclidean L2 distance between q and {q_t,i}.
▶ In SQAR, at most n = 10 closest q_t,i are retrieved, and the most frequently appearing logical pattern l* among them is selected for the grounding process (a minimal sketch follows below).
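A minimal sketch of this retrieval step, assuming plain L2 nearest-neighbor search with a majority vote over the n retrieved patterns; the function and variable names are hypothetical.

```python
import numpy as np
from collections import Counter

def retrieve_pattern(q, train_q, train_patterns, n=10):
    """q: (d,) query vector; train_q: (N, d); train_patterns: N pattern ids."""
    dists = np.linalg.norm(train_q - q, axis=1)   # L2 distance to every train query
    nearest = np.argsort(dists)[:n]               # indices of the n closest queries
    votes = Counter(train_patterns[i] for i in nearest)
    return votes.most_common(1)[0][0]             # most frequent pattern -> l*
```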
▶ SQAR is trained using a negative sampling method.
[ Negative Sampling Method ]
(1) One positive sample and 5 negative samples are randomly drawn from the train set.
(2) The six L2 distances are calculated as above and interpreted as probabilities using a softmax function; a cross-entropy loss is employed for training.
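A hedged sketch of this training objective; taking the softmax over negated distances (so the closest candidate gets the highest probability) is an assumption consistent with the description above.

```python
import torch
import torch.nn.functional as F

def retriever_loss(q, q_pos, q_negs):
    """q, q_pos: (d,); q_negs: (5, d) -- vectors from the question abstractor."""
    candidates = torch.cat([q_pos.unsqueeze(0), q_negs], dim=0)   # (6, d)
    dists = torch.norm(candidates - q, dim=1)                     # six L2 distances
    logits = -dists                           # smaller distance -> higher probability
    target = torch.zeros(1, dtype=torch.long)  # index 0 is the positive sample
    return F.cross_entropy(logits.unsqueeze(0), target)
```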
3.3. Grounder (d)
To ground the retrieved logical pattern l*, the following LSTM-based network is used in the grounder.
- P_(t-1): one-hot input vector at time t-1
- h_(t-1), c_(t-1): hidden and cell vectors of the LSTM decoder
- W: affine transformation
- d_h: hidden dimension of the LSTM (d_h = 100 in this study)
- p_t(i): the probability of observing the i-th input token at time t
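The grounder's equation appears as an image in the original post, so it is not reproduced here; below is a minimal sketch of one pointer-style decoding step consistent with the symbol definitions above. The dot-product scoring and the input size are assumptions.

```python
import torch

d_h, n_in = 100, 32                      # d_h = 100 as in the study; n_in assumed
lstm = torch.nn.LSTMCell(n_in, d_h)      # LSTM decoder
W = torch.nn.Linear(d_h, d_h)            # W: affine transformation

def decode_step(P_prev, h_prev, c_prev, enc):
    """P_prev: (1, n_in) one-hot at t-1; enc: (T, d_h) encoded input tokens."""
    h_t, c_t = lstm(P_prev, (h_prev, c_prev))   # update hidden/cell states
    scores = W(enc) @ h_t.squeeze(0)            # one score per input token
    p_t = torch.softmax(scores, dim=0)          # p_t(i): prob. of token i at time t
    return p_t, h_t, c_t
```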
▶ Compared to a conventional pointer network, the grounder has 3 custom properties:
- The logical pattern l* is already found by the retriever, so the grounder does not need to learn to generate it.
- To generate the condition values of the where clause, the grounder infers only the begin and end token positions in the given question and extracts the values as spans (a sketch follows below).
- Generating the same column multiple times in the where clause is avoided by constraining the search space.
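A minimal sketch of the span-style value extraction in the second property above; the begin/end scoring layers are hypothetical.

```python
import torch

d_h = 100
score_begin = torch.nn.Linear(d_h, 1)   # hypothetical begin-position scorer
score_end = torch.nn.Linear(d_h, 1)     # hypothetical end-position scorer

def extract_value(enc_question, question_tokens):
    """enc_question: (T, d_h) token encodings; returns the copied value span."""
    b = score_begin(enc_question).squeeze(-1).argmax().item()
    e = score_end(enc_question).squeeze(-1).argmax().item()
    b, e = min(b, e), max(b, e)          # keep the span well-formed
    return " ".join(question_tokens[b:e + 1])
```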
4. Experiments