# Google BigQuery
Supported drivers:

| dbapi | default | driver | connection class |
|---|---|---|---|
| `google-cloud-bigquery` | ✔ | `bigquery+google` | `google.cloud.bigquery.dbapi.Connection` |
## google-cloud-bigquery

`google-cloud-bigquery` is the default dbapi driver for Google BigQuery in pydapper.
### Installation

```bash
pip install pydapper[google-cloud-bigquery]
```

```bash
poetry add pydapper -E google-cloud-bigquery
```
### Google cloud storage alternate installation

`google-cloud-bigquery` also supports a more performant read API that uses pyarrow and remote procedure calls (RPC). No config is required to get the performance benefits; simply install the `google-cloud-bigquery-storage` extra as well. Read more about it in the google docs.
```bash
pip install pydapper[google-cloud-bigquery, google-cloud-bigquery-storage]
```

```bash
poetry add pydapper -E google-cloud-bigquery -E google-cloud-bigquery-storage
```
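If you want to confirm the extra is available in your environment, one quick sanity check (an assumption about the package layout, not something from the pydapper docs) is importing the module that the `google-cloud-bigquery-storage` distribution provides:

```python
# the google-cloud-bigquery-storage distribution ships this module;
# an ImportError here means the extra is not installed
import google.cloud.bigquery_storage  # noqa: F401
```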
### DSN format

For BigQuery, config is not actually passed through the DSN, so the DSN is extremely easy to define: it simply tells pydapper which driver to use. Please see the `connect` and `using` examples below for how to pass config.
dsn = "bigquery+google:////"
dsn = "bigquery:////"
### Example - connect

By default, the google client will look for the `GOOGLE_APPLICATION_CREDENTIALS` environment variable.
```python
import pydapper

# export GOOGLE_APPLICATION_CREDENTIALS=/path/to/keystore.json

with pydapper.connect("bigquery+google:////") as commands:
    print(type(commands))
    # <class 'pydapper.bigquery.google_bigquery_client.GoogleBigqueryClientCommands'>

    print(type(commands.connection))
    # <class 'google.cloud.bigquery.dbapi.connection.Connection'>

    raw_cursor = commands.cursor()
    print(type(raw_cursor))
    # <class 'google.cloud.bigquery.dbapi.cursor.Cursor'>
```
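To give a feel for what you might actually do inside that `with` block, here is a minimal sketch assuming pydapper's usual `query(sql, model=...)` signature; the SQL is a table-free SELECT so it runs against any project:

```python
# a minimal sketch: run a query through `commands`; `model=dict` maps
# each row to a plain dict (a dataclass or pydantic model works too)
rows = commands.query("SELECT 1 AS id, 'pydapper' AS name", model=dict)
print(rows)  # [{'id': 1, 'name': 'pydapper'}]
```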
Alternatively, you can construct a `Client` yourself and pass it into `connect`...
```python
import json
import pathlib

from google.cloud.bigquery import Client

import pydapper

credentials = (
    pathlib.Path("~", "src", "pydapper", "tests", "test_bigquery", "auth", "key.json").expanduser().read_text()
)
client = Client.from_service_account_info(json.loads(credentials))

with pydapper.connect("bigquery+google:////", client=client) as commands:
    print(type(commands))
    # <class 'pydapper.bigquery.google_bigquery_client.GoogleBigqueryClientCommands'>

    print(type(commands.connection))
    # <class 'google.cloud.bigquery.dbapi.connection.Connection'>

    raw_cursor = commands.cursor()
    print(type(raw_cursor))
    # <class 'google.cloud.bigquery.dbapi.cursor.Cursor'>
```
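Inside the `with` block you have the full commands API; as a hedged sketch, pulling a single value with `execute_scalar` might look like this (the dataset and table names are hypothetical):

```python
# execute_scalar returns the first column of the first row;
# `my_dataset.my_table` is a made-up name for illustration
row_count = commands.execute_scalar("SELECT COUNT(*) FROM my_dataset.my_table")
```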
### Example - using

You might want to use a connection object that you constructed from some factory function or connection pool. In that case, you can pass that object directly into `using`...
```python
import json
import pathlib

from google.cloud.bigquery import Client
from google.cloud.bigquery.dbapi import connect

import pydapper

credentials = (
    pathlib.Path("~", "src", "pydapper", "tests", "test_bigquery", "auth", "key.json").expanduser().read_text()
)
client = Client.from_service_account_info(json.loads(credentials))
dbapi_connection = connect(client=client)

commands = pydapper.using(dbapi_connection)
print(type(commands))
# <class 'pydapper.bigquery.google_bigquery_client.GoogleBigqueryClientCommands'>

print(type(commands.connection))
# <class 'google.cloud.bigquery.dbapi.connection.Connection'>

raw_cursor = commands.cursor()
print(type(raw_cursor))
# <class 'google.cloud.bigquery.dbapi.cursor.Cursor'>
```
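From here, the `commands` object works just like the one from `connect`. As a sketch, mapping query results onto a dataclass might look like this (`my_dataset.owner` is a hypothetical table, and with `using` you presumably manage the connection lifetime yourself):

```python
from dataclasses import dataclass


@dataclass
class Owner:
    id: int
    name: str


# `my_dataset.owner` is a hypothetical table; pydapper builds one Owner
# per row from the selected columns
owners = commands.query("SELECT id, name FROM my_dataset.owner", model=Owner)

# `using` wraps an existing connection, so close it yourself when done
dbapi_connection.close()
```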