🦊 Welcome to Kyuubi’s online documentation ✨, v1.10.0-SNAPSHOT

Kyuubi Hive JDBC Driver#

New in version 1.4.0: Kyuubi community maintains a forked Hive JDBC driver module and provides both shaded and non-shaded packages.

This package aims to support some functionality that is missing from the original Hive JDBC driver. For Kyuubi engines that support multiple catalogs, it provides additional metadata APIs. The behavior of the original Hive JDBC driver is otherwise preserved.

To access a Hive data warehouse or newer lakehouse formats, such as Apache Iceberg, Apache Hudi, and Delta Lake, using the Kyuubi Hive JDBC driver, you need to configure the following:

Referencing the JDBC Driver Libraries#

Before you use the JDBC driver for Apache Kyuubi, the JDBC application or Java code that connects to your data must be able to access the driver JAR files.

Using the Driver in Java Code#

In the code, specify the artifact kyuubi-hive-jdbc-shaded from Maven Central according to the build tool you use.

Maven#

<dependency>
    <groupId>org.apache.kyuubi</groupId>
    <artifactId>kyuubi-hive-jdbc-shaded</artifactId>
    <version>1.10.0-SNAPSHOT</version>
</dependency>

sbt#

libraryDependencies += "org.apache.kyuubi" % "kyuubi-hive-jdbc-shaded" % "1.10.0-SNAPSHOT"

Gradle#

implementation group: 'org.apache.kyuubi', name: 'kyuubi-hive-jdbc-shaded', version: '1.10.0-SNAPSHOT'

Using the Driver in a JDBC Application#

For JDBC applications, such as BI tools and SQL IDEs, please check the tool-specific guide for detailed information.

Note

Is your favorite tool missing? Report a feature request or help us document it.

Registering the Driver Class#

Before connecting to your data, you must register the JDBC Driver class for your application.

  • org.apache.kyuubi.jdbc.KyuubiHiveDriver

  • org.apache.kyuubi.jdbc.KyuubiDriver (Deprecated)

The following sample code shows how to use the java.sql.DriverManager class to establish a connection for JDBC:

// CONNECTION_URL is a constant defined elsewhere; see "Building the Connection URL".
private static Connection newKyuubiConnection() throws Exception {
  Connection connection = DriverManager.getConnection(CONNECTION_URL);
  return connection;
}
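
To make the registration and connection steps concrete, here is a minimal sketch of a complete class. The host, port (`localhost:10009`), and query are placeholders chosen for illustration, and the helper names `loadDriver` and `firstValue` are hypothetical. The explicit `Class.forName` call is included only to show registration by class name; it is assumed, not confirmed here, that the driver also registers itself automatically when it is on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class KyuubiConnectExample {
    // Driver class name taken from the list above.
    static final String DRIVER_CLASS = "org.apache.kyuubi.jdbc.KyuubiHiveDriver";

    /** Tries to load a driver class; returns false if it is not on the classpath. */
    static boolean loadDriver(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    /** Runs a query and returns the first column of the first row, or null if there are no rows. */
    static String firstValue(String url, String sql) throws SQLException {
        // try-with-resources closes the result set, statement, and connection in reverse order.
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            return rs.next() ? rs.getString(1) : null;
        }
    }

    public static void main(String[] args) throws SQLException {
        if (!loadDriver(DRIVER_CLASS)) {
            System.err.println("Driver not found on the classpath: " + DRIVER_CLASS);
            return;
        }
        // Placeholder URL (assumption): adjust host, port, and schema for your deployment.
        String url = "jdbc:kyuubi://localhost:10009/default";
        System.out.println(firstValue(url, "SELECT 1"));
    }
}
```

Closing the connection through try-with-resources matters here because each connection holds a server-side session.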

Building the Connection URL#

Basic Connection URL format#

Use the connection URL to supply connection information to the Kyuubi server or cluster that you are accessing. The following is the format of the connection URL for the Kyuubi Hive JDBC Driver:

jdbc:subprotocol://host:port[/catalog]/[schema];<clientProperties;><[#|?]sessionProperties>
  • subprotocol: kyuubi or hive2

  • host: DNS or IP address of the kyuubi server

  • port: The number of the TCP port that the server uses to listen for client requests

  • catalog: Optional catalog name to set the current catalog to run the query against.

  • schema: Optional database name to set the current database to run the query against, use default if absent.

  • clientProperties: Optional semicolon(;) separated key=value parameters that are recognized by the client and affect its behavior locally, e.g., user=foo;password=bar.

  • sessionProperties: Optional semicolon(;) separated key=value parameters used to configure the session, operation or background engines. For instance, kyuubi.engine.share.level=CONNECTION means that the background engine instance is used only by the current connection, and spark.ui.enabled=false disables the Spark UI of the engine.

Important

  • The sessionProperties MUST come after a leading number sign(#) or question mark (?).

  • Properties are case-sensitive.

  • Do not duplicate properties in the connection URL.
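
As an illustration of the format above, the following sketch assembles a URL from its parts. The class `KyuubiUrlBuilder` and its method names are hypothetical, not part of the driver. It follows the documented template literally, uses `#` to introduce the session properties, and has not been checked against the driver's own URL parser. Using maps for the properties rules out duplicate keys within each group, and the builder also rejects a key that appears in both groups.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class KyuubiUrlBuilder {
    /** Builds jdbc:subprotocol://host:port[/catalog]/[schema];clientProps#sessionProps. */
    static String build(String subprotocol, String host, int port,
                        String catalog, String schema,
                        Map<String, String> clientProps,
                        Map<String, String> sessionProps) {
        for (String key : sessionProps.keySet()) {
            if (clientProps.containsKey(key)) {
                throw new IllegalArgumentException("Duplicate property: " + key);
            }
        }
        StringBuilder url = new StringBuilder("jdbc:")
            .append(subprotocol).append("://").append(host).append(':').append(port);
        if (catalog != null && !catalog.isEmpty()) {
            url.append('/').append(catalog);
        }
        url.append('/');
        if (schema != null) {
            url.append(schema);
        }
        if (!clientProps.isEmpty()) {
            url.append(';').append(join(clientProps));
        }
        if (!sessionProps.isEmpty()) {
            // Session properties must follow a leading '#' (or '?').
            url.append('#').append(join(sessionProps));
        }
        return url.toString();
    }

    /** Joins entries as key=value pairs separated by semicolons, in insertion order. */
    static String join(Map<String, String> props) {
        StringJoiner joiner = new StringJoiner(";");
        for (Map.Entry<String, String> e : props.entrySet()) {
            joiner.add(e.getKey() + "=" + e.getValue());
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        Map<String, String> client = new LinkedHashMap<>();
        client.put("user", "foo");
        client.put("password", "bar");
        Map<String, String> session = new LinkedHashMap<>();
        session.put("kyuubi.engine.share.level", "CONNECTION");
        session.put("spark.ui.enabled", "false");
        // Host and port are placeholders for illustration.
        System.out.println(build("kyuubi", "localhost", 10009, null, "default", client, session));
    }
}
```

Property keys are passed through unchanged, since they are case-sensitive.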

Connection URL over HTTP#

New in version 1.6.0.

jdbc:subprotocol://host:port/schema;transportMode=http;httpPath=<http_endpoint>
  • http_endpoint is the corresponding HTTP endpoint configured by kyuubi.frontend.thrift.http.path at the server side.

Connection URL over Service Discovery#

jdbc:subprotocol://<zookeeper quorum>/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi
  • zookeeper quorum is the address list of the ZooKeeper cluster configured by kyuubi.ha.addresses at the server side.

  • zooKeeperNamespace is the corresponding namespace configured by kyuubi.ha.namespace at the server side.
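
The sketch below shows one way to assemble a service discovery URL from a list of ZooKeeper addresses. The class name `ServiceDiscoveryUrl` and the host names are hypothetical, and the comma-separated `host:port` list is an assumption about how the quorum is written in the URL.

```java
import java.util.Arrays;
import java.util.List;

class ServiceDiscoveryUrl {
    /** Builds a ZooKeeper service discovery URL for the given quorum and namespace. */
    static String build(List<String> zkHostPorts, String namespace) {
        if (zkHostPorts.isEmpty()) {
            throw new IllegalArgumentException("At least one ZooKeeper address is required");
        }
        // Assumption: quorum members are written as host:port and joined with commas.
        String quorum = String.join(",", zkHostPorts);
        return "jdbc:kyuubi://" + quorum
            + "/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=" + namespace;
    }

    public static void main(String[] args) {
        // Placeholder host names; the namespace should match kyuubi.ha.namespace on the server.
        System.out.println(build(Arrays.asList("zk1:2181", "zk2:2181", "zk3:2181"), "kyuubi"));
    }
}
```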

HiveServer2 Compatibility#

New in version 1.8.0.

JDBC Drivers need to negotiate a protocol version with Kyuubi Server/HiveServer2 when connecting.

Kyuubi Hive JDBC Driver offers protocol version v10 (clientProtocolVersion=9, supported since Hive 2.3.0) to the server by default.

If you need to connect to a HiveServer2 older than 2.3.0, set the client property clientProtocolVersion to a lower number.

jdbc:subprotocol://host:port[/catalog]/[schema];clientProtocolVersion=9;

Tip

All supported protocol versions and corresponding Hive versions can be found in TProtocolVersion.java and its git commits.
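
The example above pairs protocol version v10 with clientProtocolVersion=9, which reflects zero-based numbering of the protocol versions. Assuming that offset holds for every version, the following sketch adds the property to an existing URL. The class and method names are hypothetical. The property is inserted before any `#` or `?` so that it stays among the client properties.

```java
class ProtocolVersionUrl {
    /** Maps a protocol label such as 10 (for v10) to the property value 9. */
    static int clientProtocolVersionFor(int protocolLabel) {
        if (protocolLabel < 1) {
            throw new IllegalArgumentException("Protocol labels start at 1");
        }
        return protocolLabel - 1;
    }

    /** Returns the URL with clientProtocolVersion added to its client properties. */
    static String withProtocol(String url, int protocolLabel) {
        String prop = "clientProtocolVersion=" + clientProtocolVersionFor(protocolLabel);
        int cut = indexOfSessionMarker(url);
        String head = cut < 0 ? url : url.substring(0, cut);
        String tail = cut < 0 ? "" : url.substring(cut);
        // Avoid a doubled separator when the client part already ends with ';'.
        String separator = head.endsWith(";") ? "" : ";";
        return head + separator + prop + tail;
    }

    /** Finds the first '#' or '?', which starts the session properties, or -1 if absent. */
    static int indexOfSessionMarker(String url) {
        int hash = url.indexOf('#');
        int question = url.indexOf('?');
        if (hash < 0) return question;
        if (question < 0) return hash;
        return Math.min(hash, question);
    }

    public static void main(String[] args) {
        // Placeholder URL; protocol label 8 is an arbitrary example of a lower version.
        System.out.println(withProtocol("jdbc:hive2://host:10000/default", 8));
    }
}
```

Which label a given HiveServer2 release expects is not covered here; TProtocolVersion.java remains the reference for that mapping.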

Kerberos Authentication#

Since 1.6.0, the Kyuubi JDBC driver implements Kerberos authentication on top of the JAAS framework instead of Hadoop UserGroupInformation, so it no longer requires Hadoop dependencies to connect to a kerberized Kyuubi Server.

The Kyuubi JDBC driver supports several approaches for connecting to a kerberized Kyuubi Server. First of all, please follow the krb5.conf instructions to set up krb5.conf properly.

Authentication by Principal and Keytab#

New in version 1.6.0.

Tip

This is the simplest approach, with minimal setup requirements for Kerberos authentication.

To authenticate with a principal and keytab, configure them in the JDBC URL.

jdbc:kyuubi://host:port/schema;kyuubiClientPrincipal=<clientPrincipal>;kyuubiClientKeytab=<clientKeytab>;kyuubiServerPrincipal=<serverPrincipal>
  • kyuubiClientPrincipal: Kerberos principal for client authentication

  • kyuubiClientKeytab: path of Kerberos keytab file for client authentication

  • kyuubiClientTicketCache: path of Kerberos ticketCache file for client authentication, available since 1.8.0.

  • kyuubiServerPrincipal: Kerberos principal configured by kyuubi.kinit.principal at the server side. kyuubiServerPrincipal is available as an alias of principal since 1.7.0; use principal for earlier versions.

Authentication by Principal and TGT Cache#

Another typical usage of Kerberos authentication is using kinit to generate the TGT cache first, then the application does Kerberos authentication through the TGT cache.

jdbc:kyuubi://host:port/schema;kyuubiServerPrincipal=<serverPrincipal>

Authentication by Hadoop UserGroupInformation doAs (programming only)#

Tip

This approach allows projects that already use Hadoop UserGroupInformation for Kerberos authentication to connect to a kerberized Kyuubi Server easily. It does not work in versions 1.6.0 through 1.7.0; this was fixed in 1.7.1.

String jdbcUrl = "jdbc:kyuubi://host:port/schema;kyuubiServerPrincipal=<serverPrincipal>";
// loginUserFromKeytabAndReturnUGI returns the logged-in UGI; loginUserFromKeytab returns void.
UserGroupInformation ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(clientPrincipal, clientKeytab);
ugi.doAs((PrivilegedExceptionAction<String>) () -> {
  Connection conn = DriverManager.getConnection(jdbcUrl);
  ...
});

Authentication by Subject (programming only)#

String jdbcUrl = "jdbc:kyuubi://host:port/schema;kyuubiServerPrincipal=<serverPrincipal>;kerberosAuthType=fromSubject";
Subject kerberizedSubject = ...;
Subject.doAs(kerberizedSubject, (PrivilegedExceptionAction<String>) () -> {
  Connection conn = DriverManager.getConnection(jdbcUrl);
  ...
});
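
The snippet above leaves open how the kerberized Subject is obtained. One possible way, shown here only as a sketch, is a JAAS login from a keytab using the JDK's Krb5LoginModule with a programmatic configuration. The class name `KeytabJaasConfiguration`, the application name `kyuubi-client`, and the principal and keytab path in `main` are all hypothetical; your application may already obtain its Subject in a different way.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.Subject;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.AppConfigurationEntry.LoginModuleControlFlag;
import javax.security.auth.login.Configuration;
import javax.security.auth.login.LoginContext;
import javax.security.auth.login.LoginException;

class KeytabJaasConfiguration extends Configuration {
    private final String principal;
    private final String keytabPath;

    KeytabJaasConfiguration(String principal, String keytabPath) {
        this.principal = principal;
        this.keytabPath = keytabPath;
    }

    /** Returns a single Krb5LoginModule entry, regardless of the application name. */
    @Override
    public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
        Map<String, String> options = new HashMap<>();
        options.put("useKeyTab", "true");   // authenticate with a keytab, not a password
        options.put("keyTab", keytabPath);
        options.put("principal", principal);
        options.put("storeKey", "true");    // keep the key in the Subject's credentials
        options.put("doNotPrompt", "true"); // fail instead of prompting interactively
        return new AppConfigurationEntry[] {
            new AppConfigurationEntry(
                "com.sun.security.auth.module.Krb5LoginModule",
                LoginModuleControlFlag.REQUIRED,
                options)
        };
    }

    /** Logs in from the keytab and returns the authenticated Subject. */
    static Subject login(String principal, String keytabPath) throws LoginException {
        LoginContext context = new LoginContext(
            "kyuubi-client", null, null, new KeytabJaasConfiguration(principal, keytabPath));
        context.login();
        return context.getSubject();
    }

    public static void main(String[] args) throws LoginException {
        // Hypothetical principal and keytab path; requires a working krb5.conf and KDC.
        Subject subject = login("alice@EXAMPLE.COM", "/etc/security/keytabs/alice.keytab");
        System.out.println(subject.getPrincipals());
    }
}
```

The returned Subject could then be passed to Subject.doAs as in the snippet above.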