Getting Started
Requirements
HarborSQL assumes you already have a working Databricks workspace with Unity Catalog Delta tables and users or service principals that can read those tables.
To migrate that workload to HarborSQL, you need:
- A HarborSQL runtime:
- Docker image:
ghcr.io/harborsql/harborsql:<tag> - Binary archives from GitHub Releases or
ghcr.io/harborsql/harborsql-binaries:<tag>
- Docker image:
- One extra Unity Catalog grant on each schema you want to query from HarborSQL:
GRANT EXTERNAL USE SCHEMA ON SCHEMA <catalog>.<schema> TO `<principal>`;
Your existing Unity Catalog read permissions still apply. HarborSQL does not need static cloud credentials; Unity Catalog vends temporary table credentials at query time. See Unity Catalog Permissions for the full grant model.
Run with Docker
Use the published image when you do not need to build HarborSQL from source:
export TAG="<version>"
docker run --rm \
-p 127.0.0.1:1992:1992 \
-e HARBORSQL_BIND_ADDR="0.0.0.0:1992" \
-e HARBORSQL_DATABRICKS_HOST="https://<workspace-host>" \
ghcr.io/harborsql/harborsql:$TAG
Add HARBORSQL_DEFAULT_CATALOG, HARBORSQL_DEFAULT_SCHEMA, or
HARBORSQL_AWS_REGION only when the defaults do not match your workspace.
See Docker for image tags, Compose, and production notes.
Run the Server
Use Cargo when developing HarborSQL from source:
export HARBORSQL_DATABRICKS_HOST="https://<workspace-host>"
cargo run -- server
The server listens on 127.0.0.1:1992 by default.
Production deployments should serve HarborSQL over HTTPS, or behind a TLS-terminating proxy. The HarborSQL-to-Databricks and Unity Catalog hop must use HTTPS for real Databricks workspaces.
Connect a Databricks SQL Client
HarborSQL targets a focused Databricks SQL connector-compatible Thrift-over-HTTP surface.
HarborSQL Behind HTTPS
When HarborSQL is exposed through HTTPS, for example behind a TLS-terminating
proxy, keep the existing Databricks SQL connector shape and change only
server_hostname to the HarborSQL host:
from databricks import sql
import os
connection = sql.connect(
server_hostname="sql.example.com",
http_path="/sql/1.0/warehouses/<warehouse-id>",
access_token=os.environ["DATABRICKS_TOKEN"],
catalog="workspace",
schema="analytics",
)
with connection.cursor() as cursor:
cursor.execute("SELECT count(*) FROM events")
print(cursor.fetchone())
Local or Plain HTTP HarborSQL
For a local or plain HTTP HarborSQL endpoint, override the connector's internal
connection URI with the full http:// URL:
from databricks import sql
import os
uri = "http://127.0.0.1:1992/sql/1.0/warehouses/local"
connection = sql.connect(
server_hostname="http://127.0.0.1:1992",
http_path="/sql/1.0/warehouses/local",
access_token=os.environ["DATABRICKS_TOKEN"],
catalog="workspace",
schema="analytics",
_connection_uri=uri,
)
with connection.cursor() as cursor:
cursor.execute("SELECT count(*) FROM events")
print(cursor.fetchone())
The local HTTP endpoint is for local protocol testing. Use HTTPS, or a TLS-terminating proxy, for production deployments.