Can Google BigQuery query external data sources such as GCS?
Yes. BigQuery can query data stored in Google Cloud Storage (GCS) in place:
1. External Tables:
Concept: An external table is a virtual table that points to data residing in GCS. It doesn't physically store the data in BigQuery; instead, it defines the schema and location of the data in GCS.
Process:
Create an external table: You define the table's schema, data format (e.g., CSV, JSON, Avro), and the GCS location of the data files.
Query the external table: You can then query the external table as if it were a regular BigQuery table. BigQuery will automatically read the data from GCS when the query is executed.
Benefits:
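No need to load data into BigQuery: The data stays in GCS, so there is no load step and no second copy held in BigQuery storage.
Always query the latest data: Each query reads the files as they exist in GCS at execution time, so changes to those files are picked up without reloading.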
Considerations:
Performance: Queries on external tables can be slower than queries on native BigQuery tables due to the need to read data from GCS.
Data format limitations: Only certain file formats can be read from GCS, such as CSV, newline-delimited JSON, Avro, Parquet, and ORC.
CREATE EXTERNAL TABLE `your_project.your_dataset.your_external_table`
OPTIONS (
  format = 'CSV',
  uris = ['gs://your_bucket/your_data/*.csv']
);

SELECT *
FROM `your_project.your_dataset.your_external_table`
WHERE column_name = 'value';
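The statement above leaves out the schema, so BigQuery has to infer it from the files. As a minimal sketch, the variant below declares the schema explicitly and then copies the external data into a native table, which is one way to address the performance consideration. The column `amount`, the table name `your_native_table`, and the assumption that each CSV file has a single header row are placeholders for illustration, not details from the original example.

```sql
-- Sketch only: column names and types are assumed, adjust them to your files.
CREATE OR REPLACE EXTERNAL TABLE `your_project.your_dataset.your_external_table` (
  column_name STRING,
  amount NUMERIC  -- hypothetical second column
)
OPTIONS (
  format = 'CSV',
  uris = ['gs://your_bucket/your_data/*.csv'],
  skip_leading_rows = 1  -- assumes each file starts with one header row
);

-- Optional: materialize the external data into a native BigQuery table.
-- `your_native_table` is a hypothetical name.
CREATE OR REPLACE TABLE `your_project.your_dataset.your_native_table` AS
SELECT *
FROM `your_project.your_dataset.your_external_table`;
```

The native copy is a snapshot: it does not pick up later changes to the GCS files, so it trades freshness for query speed.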
Slide:
[ Link ]
Document:
[ Link ]
#Google#BigQuery