feat(bigquery): support LOAD DATA INTO TEMP TABLE syntax #7675

Open
simonlourson wants to merge 1 commit into tobymao:main from simonlourson:feat/bigquery-load-data-into-temp-table

@simonlourson simonlourson commented May 23, 2026

Summary

Adds support for BigQuery's LOAD DATA INTO TEMP TABLE syntax. It previously failed to parse because the base _parse_load calls
_match_pair(INTO, TABLE), which requires INTO to be immediately followed by TABLE and leaves no room for a TEMP/TEMPORARY keyword in
between.
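
To illustrate the failure mode described above, here is a minimal sketch using a toy token cursor. `TokenCursor`, `strict_target`, and `lenient_target` are hypothetical names invented for this example; they only mimic the idea of a pair match versus sequential optional matches and are not sqlglot's actual Parser API.

```python
from typing import List, Tuple


class TokenCursor:
    """Toy stand-in for a parser's token cursor (hypothetical, not sqlglot's Parser)."""

    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens
        self.index = 0

    def match(self, expected: str) -> bool:
        """Consume the current token if it equals `expected`."""
        if self.index < len(self.tokens) and self.tokens[self.index] == expected:
            self.index += 1
            return True
        return False

    def match_pair(self, first: str, second: str) -> bool:
        """Consume two tokens only if they appear back to back."""
        if self.tokens[self.index : self.index + 2] == [first, second]:
            self.index += 2
            return True
        return False


def strict_target(tokens: List[str]) -> str:
    """Pair-based matching: INTO must be immediately followed by TABLE."""
    cursor = TokenCursor(tokens)
    cursor.match_pair("INTO", "TABLE")
    # Whatever token is current now would be treated as the table name.
    return cursor.tokens[cursor.index]


def lenient_target(tokens: List[str]) -> Tuple[str, bool]:
    """Sequential matching: INTO, then an optional TEMP, then an optional TABLE."""
    cursor = TokenCursor(tokens)
    temp = False
    if cursor.match("INTO"):
        temp = cursor.match("TEMP")
        cursor.match("TABLE")
    return cursor.tokens[cursor.index], temp


tokens = ["INTO", "TEMP", "TABLE", "t1"]
print(strict_target(tokens))   # the pair match fails, so the cursor never moves past INTO
print(lenient_target(tokens))  # the cursor reaches t1 and records that TEMP was seen
```

With the pair-based version, the cursor is still sitting on INTO when the table name is expected, which is why the statement could not be parsed as a LoadData expression.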

Changes

  • sqlglot/expressions/dml.py: Added "temp": False to LoadData.arg_types
  • sqlglot/parsers/bigquery.py: Added a BigQuery-specific _parse_load_data() override that handles all three variants and registered it
    under TokenType.LOAD in STATEMENT_PARSERS
  • sqlglot/generator.py: Updated loaddata_sql() to emit INTO TEMP TABLE when temp=True
  • tests/dialects/test_bigquery.py: Added LOAD DATA INTO TEMP TABLE to the existing round-trip test

All three BigQuery LOAD DATA variants are supported and round-trip correctly:

-- existing, unchanged
LOAD DATA OVERWRITE mydataset.table1 FROM FILES(...)
LOAD DATA INTO TABLE mydataset.table1 FROM FILES(...)

-- new
LOAD DATA INTO TEMP TABLE mydataset.table1 FROM FILES(...)
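
As a rough illustration of how a `temp` flag could drive generation of the three forms listed above, here is a small sketch. `LoadDataNode` and `to_sql` are hypothetical names made up for this example; this is not the actual `exp.LoadData` class or `loaddata_sql()` implementation, and it covers only the three variants shown in this description.

```python
from dataclasses import dataclass


@dataclass
class LoadDataNode:
    """Hypothetical, simplified stand-in for a LOAD DATA expression node."""

    table: str
    overwrite: bool = False
    temp: bool = False

    def to_sql(self) -> str:
        """Render the statement prefix for the three variants in the PR description."""
        if self.overwrite:
            target = f"OVERWRITE {self.table}"
        elif self.temp:
            target = f"INTO TEMP TABLE {self.table}"
        else:
            target = f"INTO TABLE {self.table}"
        # The FROM FILES(...) options are left elided, as in the examples above.
        return f"LOAD DATA {target} FROM FILES(...)"


print(LoadDataNode("mydataset.table1", overwrite=True).to_sql())
print(LoadDataNode("mydataset.table1").to_sql())
print(LoadDataNode("mydataset.table1", temp=True).to_sql())
```

Because `temp` defaults to False, nodes built without the flag keep rendering as before, which mirrors the intent of adding "temp": False to the arg types.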

Ref: https://docs.cloud.google.com/bigquery/docs/reference/standard-sql/load-statements


@georgesittas georgesittas left a comment

Thank you for the PR. I have one small ask, but it should be good to go otherwise.

Comment on lines +714 to +732
def _parse_load_data(self) -> exp.LoadData | exp.Command:
    if self._match_text_seq("DATA"):
        overwrite = self._match(TokenType.OVERWRITE)
        temp = False
        if self._match(TokenType.INTO):
            temp = self._match(TokenType.TEMPORARY)
            self._match(TokenType.TABLE)

        return self.expression(
            exp.LoadData(
                this=self._parse_table(schema=True),
                overwrite=overwrite,
                temp=temp,
                files=self._match_text_seq("FROM", "FILES")
                and exp.Properties(expressions=self._parse_wrapped_properties()),
            )
        )
    return self._parse_as_command(self._prev)

Let's refactor Parser._parse_load instead of overriding it in BigQuery. The core logic is more or less the same.

@georgesittas georgesittas self-assigned this May 25, 2026

3 participants