Commit 6dcd35c, committed by rohitsroch
1 Parent(s): 46dc28a

Update README.md

README.md CHANGED
@@ -9,6 +9,9 @@ datasets:
 - wikisql
 ---
 
+widget:
+- text: "Is this review positive or negative? Review: Best cast iron skillet you will ever buy."
+
 ## Paper
 
 ## [NatSight: A framework for building domain agnostic Natural Language Interface to Databases for next-gen Augmented Analytics](https://dcal.iimb.ac.in/baiconf2022/full_papers/2346.pdf)
@@ -32,6 +35,31 @@ Experiment results on benchmark datasets show that our approach achieves a state
 For weights initialization, we used [facebook/bart-base](https://huggingface.co/facebook/bart-base) and fine-tune as sequence-to-sequence task.
 
 
+## Using Transformers🤗
+
+```python
+from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+
+tokenizer = AutoTokenizer.from_pretrained("course5i/NatSight-bart-base-wikisql")
+model = AutoModelForSeq2SeqLM.from_pretrained("course5i/NatSight-bart-base-wikisql")
+
+# define input
+raw_nat_query = "What was the number of race that Kevin Curtain won?"
+query_mention_schema = "c0 | number <eom> v4 | Kevin Curtain"
+table_header_schema = "c0 | No <eom> c1 | Date <eom> c2 | Round <eom> c3 | Circuit <eom> c4 | Pole_Position <eom> c5 | Fastest_Lap <eom> c6 | Race_winner <eom> c7 | Report"
+
+encoder_input = raw_nat_query + " </s> " + query_mention_schema + " </s> " + table_header_schema
+input_ids = tokenizer.encode(encoder_input, return_tensors="pt", add_special_tokens=True)
+
+generated_ids = model.generate(input_ids=input_ids, num_beams=5, max_length=128)
+preds = [tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=True) for g in generated_ids]
+output = preds[0]
+
+print("Output generic SQL query: {}".format(output))
+
+# output
+"SELECT COUNT(c0) FROM TABLE WHERE c4 = v4"
+
 ## Intended uses & limitations
 
 More information needed
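
The usage snippet added in this commit assembles the encoder input by hand and prints a generic SQL query that still contains `cN`/`vN` placeholders. The sketch below shows one way the same input string could be built from structured pairs, and how the placeholders might be mapped back to real column names and values. The helper names (`format_schema`, `build_encoder_input`, `resolve_placeholders`) are hypothetical, and the placeholder resolution, including quoting `vN` values as string literals, is an assumption for illustration rather than something taken from the README or the paper.

```python
import re

SEP = " </s> "   # separator between the three input parts, as in the README snippet
EOM = " <eom> "  # delimiter between schema entries, as in the README snippet


def format_schema(pairs):
    """Join (placeholder, text) pairs as 'c0 | No <eom> c1 | Date'."""
    return EOM.join(f"{key} | {text}" for key, text in pairs)


def build_encoder_input(question, mentions, headers):
    """Concatenate question, query mentions, and table headers into one string."""
    return SEP.join([question, format_schema(mentions), format_schema(headers)])


def resolve_placeholders(generic_sql, mentions, headers):
    """Hypothetical post-processing: replace cN/vN placeholders with real names.

    Assumption: cN placeholders refer to table headers, and vN placeholders
    refer to literal values from the query mentions, quoted as strings.
    """
    lookup = {key: text for key, text in headers}
    lookup.update({key: f"'{text}'" for key, text in mentions if key.startswith("v")})
    # Unknown placeholders are left untouched.
    return re.sub(r"\b[cv]\d+\b", lambda m: lookup.get(m.group(0), m.group(0)), generic_sql)


question = "What was the number of race that Kevin Curtain won?"
mentions = [("c0", "number"), ("v4", "Kevin Curtain")]
headers = [
    ("c0", "No"), ("c1", "Date"), ("c2", "Round"), ("c3", "Circuit"),
    ("c4", "Pole_Position"), ("c5", "Fastest_Lap"), ("c6", "Race_winner"), ("c7", "Report"),
]

encoder_input = build_encoder_input(question, mentions, headers)

# Generic query copied from the README snippet's "# output" comment.
generic_sql = "SELECT COUNT(c0) FROM TABLE WHERE c4 = v4"
resolved_sql = resolve_placeholders(generic_sql, mentions, headers)
print(resolved_sql)
# SELECT COUNT(No) FROM TABLE WHERE Pole_Position = 'Kevin Curtain'
```

`encoder_input` here is identical to the string the README builds by concatenation, so it can be passed to `tokenizer.encode` unchanged. `TABLE` is left as the generic table name because the snippet gives no real one.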