
Google BigQuery🔗

Supported drivers:

dbapi                    default   driver            connection class
google-cloud-bigquery    👍        bigquery+google   google.cloud.bigquery.dbapi.Connection

google-cloud-bigquery🔗

google-cloud-bigquery is the default dbapi driver for Google BigQuery in pydapper.

Installation🔗

pip install pydapper[google-cloud-bigquery]
poetry add pydapper -E google-cloud-bigquery

BigQuery Storage API alternate installation🔗

  • google-cloud-bigquery also supports a more performant read API built on pyarrow and remote procedure calls (RPC).
  • No configuration is required to get the performance benefits; simply install the google-cloud-bigquery-storage extra as well (see the sketch after the install commands below).
  • Read more about it in the google docs.
pip install pydapper[google-cloud-bigquery,google-cloud-bigquery-storage]
poetry add pydapper -E google-cloud-bigquery -E google-cloud-bigquery-storage
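
No pydapper code changes are needed to benefit from the faster read path. The sketch below is illustrative only (the public dataset and query are assumptions, not part of pydapper): the same query call is used either way, and once google-cloud-bigquery-storage is installed the DB-API layer streams results through the BigQuery Storage API on its own.

import pydapper

# export GOOGLE_APPLICATION_CREDENTIALS=/path/to/keystore.json

with pydapper.connect("bigquery+google:////") as commands:
    # with google-cloud-bigquery-storage installed, this result set is fetched
    # over the Storage API transparently; without it, the plain REST API is
    # used and the code still works
    rows = commands.query(
        "select name, number from `bigquery-public-data.usa_names.usa_1910_2013` limit 1000",
        model=dict,
    )
    print(len(rows))  # 1000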

DSN format🔗

For BigQuery, configuration is not passed through the DSN, so the DSN is trivial to define: it simply tells pydapper which driver to use. See the connect and using examples below for how to pass configuration.

dsn = "bigquery+google:////"
dsn = "bigquery:////"

Example - connect🔗

By default, the Google client looks for the GOOGLE_APPLICATION_CREDENTIALS environment variable.

import pydapper

# export GOOGLE_APPLICATION_CREDENTIALS=/path/to/keystore.json

with pydapper.connect("bigquery+google:////") as commands:
    print(type(commands))
    # <class 'pydapper.bigquery.google_bigquery_client.GoogleBigqueryClientCommands'>

    print(type(commands.connection))
    # <class 'google.cloud.bigquery.dbapi.connection.Connection'>

    raw_cursor = commands.cursor()
    print(type(raw_cursor))
    # <class 'google.cloud.bigquery.dbapi.cursor.Cursor'>
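
Once connected, the commands object exposes pydapper's usual command methods. Below is a minimal sketch of running statements against BigQuery (the literal select statements are illustrative and require no tables):

import pydapper

# export GOOGLE_APPLICATION_CREDENTIALS=/path/to/keystore.json

with pydapper.connect("bigquery+google:////") as commands:
    # a scalar round trip
    print(commands.execute_scalar("select 1"))  # 1

    # query rows into dicts (any model whose fields match the columns also works)
    print(commands.query("select 'pydapper' as name, 1 as id", model=dict))
    # [{'name': 'pydapper', 'id': 1}]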

Alternatively, you can construct a client yourself and pass it into connect...

import json
import pathlib

from google.cloud.bigquery import Client

import pydapper

credentials = (
    pathlib.Path("~", "src", "pydapper", "tests", "test_bigquery", "auth", "key.json").expanduser().read_text()
)

client = Client.from_service_account_info(json.loads(credentials))

with pydapper.connect("bigquery+google:////", client=client) as commands:
    print(type(commands))
    # <class 'pydapper.bigquery.google_bigquery_client.GoogleBigqueryClientCommands'>

    print(type(commands.connection))
    # <class 'google.cloud.bigquery.dbapi.connection.Connection'>

    raw_cursor = commands.cursor()
    print(type(raw_cursor))
    # <class 'google.cloud.bigquery.dbapi.cursor.Cursor'>
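
The client does not have to come from a service account key file; any google.cloud.bigquery.Client can be passed in. For example, here is a sketch that relies on application default credentials with an explicit project (the project id is a placeholder):

from google.cloud.bigquery import Client

import pydapper

# relies on application default credentials, e.g. after running
#   gcloud auth application-default login
# "my-gcp-project" is a placeholder project id
client = Client(project="my-gcp-project")

with pydapper.connect("bigquery+google:////", client=client) as commands:
    print(commands.execute_scalar("select 1"))  # 1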

Example - using🔗

You might want to use a connection object that you constructed from some factory function or connection pool. In that case, you can pass that object directly into using...

import json
import pathlib

from google.cloud.bigquery import Client
from google.cloud.bigquery.dbapi import connect

import pydapper

credentials = (
    pathlib.Path("~", "src", "pydapper", "tests", "test_bigquery", "auth", "key.json").expanduser().read_text()
)

client = Client.from_service_account_info(json.loads(credentials))
dbapi_connection = connect(client=client)

commands = pydapper.using(dbapi_connection)
print(type(commands))
# <class 'pydapper.bigquery.google_bigquery_client.GoogleBigqueryClientCommands'>

print(type(commands.connection))
# <class 'google.cloud.bigquery.dbapi.connection.Connection'>

raw_cursor = commands.cursor()
print(type(raw_cursor))
# <class 'google.cloud.bigquery.dbapi.cursor.Cursor'>
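
Continuing the example above, the commands object behaves just like one obtained from connect; the only difference is that the DB-API connection was created outside pydapper, so its lifecycle is yours (or your pool's) to manage. A short sketch:

print(commands.query("select 'pydapper' as name, 1 as id", model=dict))
# [{'name': 'pydapper', 'id': 1}]

# the connection was constructed outside pydapper, so close it (or return it
# to its pool) when you are finished with it
dbapi_connection.close()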