Kyuubi Hive JDBC Driver#
Added in version 1.4.0: Kyuubi community maintains a forked Hive JDBC driver module and provides both shaded and non-shaded packages.
This package aims to support some functionalities missing from the original Hive JDBC driver. For Kyuubi engines that support multiple catalogs, it provides meta APIs for better support. The behaviors of the original Hive JDBC driver are retained.
To access a Hive data warehouse or new Lakehouse formats, such as Apache Iceberg, Apache Hudi, or Delta Lake, using the Kyuubi Hive JDBC driver, you need to configure the following:
The list of driver library files - Referencing the JDBC Driver Libraries.
The Driver or DataSource class - Registering the Driver Class.
The connection URL for the driver - Building the Connection URL.
Referencing the JDBC Driver Libraries#
Before you use the JDBC driver for Apache Kyuubi, the JDBC application or Java code that you are using to connect to your data must be able to access the driver JAR files.
Using the Driver in Java Code#
In the code, specify the artifact kyuubi-hive-jdbc-shaded from Maven Central according to the build tool you use.
Maven#
<dependency>
  <groupId>org.apache.kyuubi</groupId>
  <artifactId>kyuubi-hive-jdbc-shaded</artifactId>
  <version>1.11.0-SNAPSHOT</version>
</dependency>
sbt#
libraryDependencies += "org.apache.kyuubi" % "kyuubi-hive-jdbc-shaded" % "1.11.0-SNAPSHOT"
Gradle#
implementation group: 'org.apache.kyuubi', name: 'kyuubi-hive-jdbc-shaded', version: '1.11.0-SNAPSHOT'
Using the Driver in a JDBC Application#
For JDBC applications, such as BI tools and SQL IDEs, please check the specific guide for detailed information.
Note
Is your favorite tool missing? Report a feature request or help us document it.
Registering the Driver Class#
Before connecting to your data, you must register the JDBC Driver class for your application.
org.apache.kyuubi.jdbc.KyuubiHiveDriver
org.apache.kyuubi.jdbc.KyuubiDriver (Deprecated)
The following sample code shows how to use the java.sql.DriverManager class to establish a connection for JDBC:
private static Connection newKyuubiConnection() throws Exception {
  Connection connection = DriverManager.getConnection(CONNECTION_URL);
  return connection;
}
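If your application registers drivers explicitly, a minimal, self-contained sketch might look like the following. The connection URL, host, and port below are placeholders for illustration only, not documented defaults.
import java.sql.Connection;
import java.sql.DriverManager;

public class KyuubiConnectExample {
  // Placeholder URL; see "Building the Connection URL" below for the full format.
  private static final String CONNECTION_URL = "jdbc:kyuubi://localhost:10009/default";

  public static void main(String[] args) throws Exception {
    // Explicit registration; JDBC 4+ drivers on the classpath are usually
    // discovered automatically via ServiceLoader, so this line is often optional.
    Class.forName("org.apache.kyuubi.jdbc.KyuubiHiveDriver");
    try (Connection connection = DriverManager.getConnection(CONNECTION_URL)) {
      System.out.println("Connected: " + !connection.isClosed());
    }
  }
}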
Building the Connection URL#
Basic Connection URL format#
Use the connection URL to supply connection information to the Kyuubi server or cluster that you are accessing. The following is the format of the connection URL for the Kyuubi Hive JDBC Driver:
jdbc:subprotocol://host:port[/catalog]/[schema];<clientProperties;><[#|?]sessionProperties>
subprotocol: kyuubi or hive2
host: DNS or IP address of the kyuubi server
port: The number of the TCP port that the server uses to listen for client requests
catalog: Optional catalog name to set the current catalog to run the query against.
schema: Optional database name to set the current database to run the query against, use default if absent.
clientProperties: Optional semicolon(;) separated key=value parameters that are identified by the driver and affect the client behavior locally, e.g., user=foo;password=bar.
sessionProperties: Optional semicolon(;) separated key=value parameters used to configure the session, operation or background engines. For instance, kyuubi.engine.share.level=CONNECTION determines that the background engine instance is used only by the current connection; spark.ui.enabled=false disables the Spark UI of the engine.
Important
The sessionProperties MUST come after a leading number sign (#) or question mark (?).
Properties are case-sensitive.
Do not duplicate properties in the connection URL.
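Putting the pieces together, a URL combining client and session properties could look like the following. This is for illustration only: the host, catalog, and schema are placeholders, and 10009 is assumed here to be the server's Thrift binary port.
jdbc:kyuubi://localhost:10009/my_catalog/my_db;user=foo;password=bar#kyuubi.engine.share.level=CONNECTION;spark.ui.enabled=false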
Connection URL over HTTP#
Added in version 1.6.0.
jdbc:subprotocol://host:port/schema;transportMode=http;httpPath=<http_endpoint>
http_endpoint is the corresponding HTTP endpoint configured by kyuubi.frontend.thrift.http.path at the server side.
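For example, assuming the server exposes the Thrift HTTP frontend on port 10010 with kyuubi.frontend.thrift.http.path set to cliservice (both values are illustrative assumptions, not guarantees):
jdbc:kyuubi://host:10010/default;transportMode=http;httpPath=cliservice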
Connection URL over Service Discovery#
jdbc:subprotocol://<zookeeper quorum>/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi
zookeeper quorum is the corresponding zookeeper cluster configured by kyuubi.ha.addresses at the server side.
zooKeeperNamespace is the corresponding namespace configured by kyuubi.ha.namespace at the server side.
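For example, with a hypothetical three-node ZooKeeper quorum and the kyuubi namespace shown above:
jdbc:kyuubi://zk1:2181,zk2:2181,zk3:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi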
HiveServer2 Compatibility#
Added in version 1.8.0.
JDBC Drivers need to negotiate a protocol version with Kyuubi Server/HiveServer2 when connecting.
Kyuubi Hive JDBC Driver offers protocol version v10 (clientProtocolVersion=9, supported since Hive 2.3.0) to the server by default.
If you need to connect to HiveServer2 before 2.3.0, please set client property clientProtocolVersion to a lower number.
jdbc:subprotocol://host:port[/catalog]/[schema];clientProtocolVersion=9;
Tip
All supported protocol versions and corresponding Hive versions can be found in TProtocolVersion.java and its git commits.
Kerberos Authentication#
Since 1.6.0, Kyuubi JDBC driver implements Kerberos authentication based on the JAAS framework instead of Hadoop UserGroupInformation, which means it does not forcibly rely on Hadoop dependencies to connect to a kerberized Kyuubi Server.
Kyuubi JDBC driver supports different approaches to connect to a kerberized Kyuubi Server. First of all, please follow the krb5.conf instruction to set up krb5.conf properly.
Authentication by Principal and Keytab#
Added in version 1.6.0.
Tip
It’s the simplest way, with minimal setup requirements, for Kerberos authentication.
It’s straightforward to use principal and keytab for Kerberos authentication: simply configure them in the JDBC URL, as in the sketch after the parameter list below.
jdbc:kyuubi://host:port/schema;kyuubiClientPrincipal=<clientPrincipal>;kyuubiClientKeytab=<clientKeytab>;kyuubiServerPrincipal=<serverPrincipal>
kyuubiClientPrincipal: Kerberos principal for client authentication.
kyuubiClientKeytab: path of the Kerberos keytab file for client authentication.
kyuubiClientTicketCache: path of the Kerberos ticket cache file for client authentication, available since 1.8.0.
kyuubiServerPrincipal: Kerberos principal configured by kyuubi.kinit.principal at the server side. kyuubiServerPrincipal is available as an alias of principal since 1.7.0; use principal for previous versions.
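A minimal Java sketch, assuming hypothetical host, port, principals, and keytab path; substitute your own values:
import java.sql.Connection;
import java.sql.DriverManager;

public class KerberosKeytabExample {
  public static void main(String[] args) throws Exception {
    // All values below (host, port, principals, keytab path) are placeholders.
    String jdbcUrl = "jdbc:kyuubi://kyuubi-host:10009/default"
        + ";kyuubiClientPrincipal=client@EXAMPLE.COM"
        + ";kyuubiClientKeytab=/path/to/client.keytab"
        + ";kyuubiServerPrincipal=kyuubi/kyuubi-host@EXAMPLE.COM";
    try (Connection conn = DriverManager.getConnection(jdbcUrl)) {
      System.out.println("Connected with Kerberos keytab authentication");
    }
  }
}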
Authentication by Principal and TGT Cache#
Another typical usage of Kerberos authentication is to use kinit to generate the TGT cache first; the application then performs Kerberos authentication through the TGT cache, as sketched below.
jdbc:kyuubi://host:port/schema;kyuubiServerPrincipal=<serverPrincipal>
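A minimal sketch, assuming a valid TGT has already been obtained for the client principal (for example via kinit); host, port, and the server principal are placeholders:
import java.sql.Connection;
import java.sql.DriverManager;

public class KerberosTgtCacheExample {
  public static void main(String[] args) throws Exception {
    // Assumes a TGT already exists in the default ticket cache (e.g. obtained via kinit).
    // Host, port, and the server principal below are placeholders.
    String jdbcUrl = "jdbc:kyuubi://kyuubi-host:10009/default"
        + ";kyuubiServerPrincipal=kyuubi/kyuubi-host@EXAMPLE.COM";
    try (Connection conn = DriverManager.getConnection(jdbcUrl)) {
      System.out.println("Connected with Kerberos TGT cache authentication");
    }
  }
}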
Authentication by Hadoop UserGroupInformation doAs (programming only)#
Tip
This approach allows a project which already uses Hadoop UserGroupInformation for Kerberos authentication to easily connect to a kerberized Kyuubi Server. This approach does not work between [1.6.0, 1.7.0], and was fixed in 1.7.1.
String jdbcUrl = "jdbc:kyuubi://host:port/schema;kyuubiServerPrincipal=<serverPrincipal>"
UserGroupInformation ugi = UserGroupInformation.loginUserFromKeytab(clientPrincipal, clientKeytab);
ugi.doAs((PrivilegedExceptionAction<String>) () -> {
Connection conn = DriverManager.getConnection(jdbcUrl);
...
});
Authentication by Subject (programming only)#
String jdbcUrl = "jdbc:kyuubi://host:port/schema;kyuubiServerPrincipal=<serverPrincipal>;kerberosAuthType=fromSubject"
Subject kerberizedSubject = ...;
Subject.doAs(kerberizedSubject, (PrivilegedExceptionAction<String>) () -> {
Connection conn = DriverManager.getConnection(jdbcUrl);
...
});