Support for BigQuery

#2
by shashnk - opened

Hi,

I am trying to use this model to generate queries from text in the BigQuery SQL format but it seems that it is not able to generate. Is there any way it can generate the same?

Thanks.

ChatDB org

Do you have an example snippet?

Hi @cfahlgren1

Thanks for your quick reply.

I was trying out your models chatdb/natural-sql-7b and cfahlgren1/NaturalSQL-6.7B-v0 for natural language to SQL conversion to run the generated query on the BQ table.
What I found out was that the model is generating query specific to Postgres.

For the question, what is the total Sales generated for current fiscal year?

The model , chatdb/natural-sql-7b generated output as
SELECT SUM("sales") AS total_sales
     FROM project_id.dataset_id.table_id
     WHERE "fiscal_year" = 2024;

which although is correct, is a format supported by Postgres and not BigQuery.
For using it in BigQuery, it should be

SELECT SUM(sales) AS total_sales
     FROM project_id.dataset_id.table_id
     WHERE fiscal_year =  2024

Similarly, for the question provide me top 2 department name with highest quantity in fiscal week 8 and fiscal year 2024 and with the department name childrens?

The model cfahlgren1/NaturalSQL-6.7B-v0 generated output as
SELECT l2_name, SUM(quantity) AS total_quantity
FROM project_id.dataset_id.table_id
WHERE fiscal_week = 8 AND fiscal_year = 2024 AND department ILIKE '%childrens%'
GROUP BY l2_name
ORDER BY total_quantity DESC
LIMIT 2;

The query uses function ILIKE which is available on Postgres but not Bigquery (BigQuery suppports LIKE operation instead).

The prompt that I provided to the model was

prompt = f"""
    ### Task 

    Generate a Google BigQuery SQL query to answer the following question: `{question}` 
    
    ### Big Query SQL Schema 
    The query will run on a database with the following schema: 
    For the table project_id.dataset_id.table_id_1, 
    {schema_1}
    For another table project_id.dataset_id.table_id, 
    {schema_2}
    
    Do not add quotes on variable name

    Contextual Information on Fiscal Period that can be implemented in the filter:
    The Current Fiscal Year is {current_fy}
    The Current Fiscal Month is {current_fm}
    The Current Fiscal Week is {current_fw}
    
    ### Answer 
    Here is the SQL query that answers the question: `{question}` 
    ```sql
    """

Can you please let me know if there're any tweaks in the prompt that'd work or if you have a separate model trained on BigQuery SQL dataset.

Thanks!

ChatDB org

@shashnk Thanks for the very detailed response. Natural-SQL was just finetuned on PostgreSQL queries. I can let you know if I get around to releasing one with multiple dialects. I definitely want to.

This comment has been hidden
cfahlgren1 changed discussion status to closed

Sign up or log in to comment