Introduction to the Kyuubi Configurations System#

Kyuubi provides several ways to configure the system and corresponding engines.

Environments#

You can configure environment variables in $KYUUBI_HOME/conf/kyuubi-env.sh, e.g., JAVA_HOME; this Java runtime is then used both for the Kyuubi server instance and for the applications it launches. You can also change a variable in the subprocess's env configuration file, e.g., $SPARK_HOME/conf/spark-env.sh, to use a more specific environment for SQL engine applications.

#!/usr/bin/env bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
#
# - JAVA_HOME               Java runtime to use. By default use "java" from PATH.
#
#
# - KYUUBI_CONF_DIR         Directory containing the Kyuubi configurations to use.
#                           (Default: $KYUUBI_HOME/conf)
# - KYUUBI_LOG_DIR          Directory for Kyuubi server-side logs.
#                           (Default: $KYUUBI_HOME/logs)
# - KYUUBI_PID_DIR          Directory stores the Kyuubi instance pid file.
#                           (Default: $KYUUBI_HOME/pid)
# - KYUUBI_MAX_LOG_FILES    Maximum number of Kyuubi server logs can rotate to.
#                           (Default: 5)
# - KYUUBI_JAVA_OPTS        JVM options for the Kyuubi server itself in the form "-Dx=y".
#                           (Default: none).
# - KYUUBI_CTL_JAVA_OPTS    JVM options for the Kyuubi ctl itself in the form "-Dx=y".
#                           (Default: none).
# - KYUUBI_BEELINE_OPTS     JVM options for the Kyuubi BeeLine in the form "-Dx=Y".
#                           (Default: none)
# - KYUUBI_NICENESS         The scheduling priority for Kyuubi server.
#                           (Default: 0)
# - KYUUBI_WORK_DIR_ROOT    Root directory for launching sql engine applications.
#                           (Default: $KYUUBI_HOME/work)
# - HADOOP_CONF_DIR         Directory containing the Hadoop / YARN configuration to use.
# - YARN_CONF_DIR           Directory containing the YARN configuration to use.
#
# - SPARK_HOME              Spark distribution which you would like to use in Kyuubi.
# - SPARK_CONF_DIR          Optional directory where the Spark configuration lives.
#                           (Default: $SPARK_HOME/conf)
# - FLINK_HOME              Flink distribution which you would like to use in Kyuubi.
# - FLINK_CONF_DIR          Optional directory where the Flink configuration lives.
#                           (Default: $FLINK_HOME/conf)
# - FLINK_HADOOP_CLASSPATH  Required Hadoop jars when you use the Kyuubi Flink engine.
# - HIVE_HOME               Hive distribution which you would like to use in Kyuubi.
# - HIVE_CONF_DIR           Optional directory where the Hive configuration lives.
#                           (Default: $HIVE_HOME/conf)
# - HIVE_HADOOP_CLASSPATH   Required Hadoop jars when you use the Kyuubi Hive engine.
#


## Examples ##

# export JAVA_HOME=/usr/jdk64/jdk1.8.0_152
# export SPARK_HOME=/opt/spark
# export FLINK_HOME=/opt/flink
# export HIVE_HOME=/opt/hive
# export FLINK_HADOOP_CLASSPATH=/path/to/hadoop-client-runtime-3.3.2.jar:/path/to/hadoop-client-api-3.3.2.jar
# export HIVE_HADOOP_CLASSPATH=${HADOOP_HOME}/share/hadoop/common/lib/commons-collections-3.2.2.jar:${HADOOP_HOME}/share/hadoop/client/hadoop-client-runtime-3.1.0.jar:${HADOOP_HOME}/share/hadoop/client/hadoop-client-api-3.1.0.jar:${HADOOP_HOME}/share/hadoop/common/lib/htrace-core4-4.1.0-incubating.jar
# export HADOOP_CONF_DIR=/usr/ndp/current/mapreduce_client/conf
# export YARN_CONF_DIR=/usr/ndp/current/yarn/conf
# export KYUUBI_JAVA_OPTS="-Xmx10g -XX:+UnlockDiagnosticVMOptions -XX:ParGCCardsPerStrideChunk=4096 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -XX:+CMSClassUnloadingEnabled -XX:+CMSParallelRemarkEnabled -XX:+UseCondCardMark -XX:MaxDirectMemorySize=1024m  -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=./logs -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -Xloggc:./logs/kyuubi-server-gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=5M -XX:NewRatio=3 -XX:MetaspaceSize=512m"
# export KYUUBI_BEELINE_OPTS="-Xmx2g -XX:+UnlockDiagnosticVMOptions -XX:ParGCCardsPerStrideChunk=4096 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -XX:+CMSClassUnloadingEnabled -XX:+CMSParallelRemarkEnabled -XX:+UseCondCardMark"

For environment variables that only need to be passed to the engine side, you can set them with a Kyuubi configuration item of the form kyuubi.engineEnv.VAR_NAME. For example, with kyuubi.engineEnv.SPARK_DRIVER_MEMORY=4g, the environment variable SPARK_DRIVER_MEMORY with value 4g is passed to the engine side. With kyuubi.engineEnv.SPARK_CONF_DIR=/apache/confs/spark/conf, the value of SPARK_CONF_DIR on the engine side is set to /apache/confs/spark/conf.
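As a minimal sketch of the kyuubi.engineEnv.VAR_NAME convention, the following shell snippet writes the two example settings to a scratch file and lists the variable names that would be passed to the engine side. The scratch directory and the sed-based listing are illustrative assumptions, not part of Kyuubi; in a real deployment the settings would go into $KYUUBI_HOME/conf/kyuubi-defaults.conf.

```shell
# Illustration only: write to a scratch copy rather than the real
# $KYUUBI_HOME/conf/kyuubi-defaults.conf.
conf_dir="$(mktemp -d)"
conf_file="$conf_dir/kyuubi-defaults.conf"

# Quoted heredoc delimiter so that nothing is expanded by the shell.
cat > "$conf_file" <<'EOF'
kyuubi.engineEnv.SPARK_DRIVER_MEMORY   4g
kyuubi.engineEnv.SPARK_CONF_DIR        /apache/confs/spark/conf
EOF

# Strip the kyuubi.engineEnv. prefix to see which variable names are set
# on the engine side (hypothetical helper, not a Kyuubi command).
sed -n 's/^kyuubi\.engineEnv\.\([A-Za-z_][A-Za-z0-9_]*\).*/\1/p' "$conf_file"
```

The listing prints SPARK_DRIVER_MEMORY and SPARK_CONF_DIR, one per line.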

Kyuubi Configurations#

You can configure the Kyuubi properties in $KYUUBI_HOME/conf/kyuubi-defaults.conf. For example:

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

## Kyuubi Configurations

#
# kyuubi.authentication           NONE
# kyuubi.frontend.bind.host       localhost
# kyuubi.frontend.bind.port       10009
#

# Details in https://kyuubi.readthedocs.io/en/master/deployment/settings.html
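To make the template above concrete, here is a hedged sketch that writes the three commented-out properties, uncommented, to a scratch file and prints the effective key/value pairs. The scratch location and the awk one-liner are assumptions for illustration; Kyuubi itself reads $KYUUBI_HOME/conf/kyuubi-defaults.conf. Note that the Frontend table further down marks kyuubi.frontend.bind.port as deprecated and lists kyuubi.frontend.thrift.binary.bind.port with the same default.

```shell
# Illustration only: a scratch stand-in for kyuubi-defaults.conf.
conf_dir="$(mktemp -d)"
conf_file="$conf_dir/kyuubi-defaults.conf"

cat > "$conf_file" <<'EOF'
# Values taken from the commented template
kyuubi.authentication           NONE
kyuubi.frontend.bind.host       localhost
kyuubi.frontend.bind.port       10009
EOF

# Print "key -> value" for every line that is neither blank nor a comment.
awk '!/^[ \t]*(#|$)/ { print $1 " -> " $2 }' "$conf_file"
```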

Authentication#

Key Default Meaning Type Since
kyuubi.authentication NONE A comma-separated list of client authentication types.
  • NOSASL: raw transport.
  • NONE: no authentication check.
  • KERBEROS: Kerberos/GSSAPI authentication.
  • CUSTOM: User-defined authentication.
  • JDBC: JDBC query authentication.
  • LDAP: Lightweight Directory Access Protocol authentication.
The following tree describes the catalog of each option.
  • NOSASL
  • SASL
    • SASL/PLAIN
      • NONE
      • LDAP
      • JDBC
      • CUSTOM
    • SASL/GSSAPI
      • KERBEROS
Note that: for SASL authentication, KERBEROS and PLAIN auth types are supported at the same time, and only the first specified PLAIN auth type is valid.
seq 1.0.0
kyuubi.authentication.custom.class <undefined> User-defined authentication implementation of org.apache.kyuubi.service.authentication.PasswdAuthenticationProvider string 1.3.0
kyuubi.authentication.jdbc.driver.class <undefined> Driver class name for JDBC Authentication Provider. string 1.6.0
kyuubi.authentication.jdbc.password <undefined> Database password for JDBC Authentication Provider. string 1.6.0
kyuubi.authentication.jdbc.query <undefined> Query SQL template with placeholders for JDBC Authentication Provider to execute. Authentication passes if the result set is not empty. The SQL statement must start with the SELECT clause. Available placeholders are ${user} and ${password}. string 1.6.0
kyuubi.authentication.jdbc.url <undefined> JDBC URL for JDBC Authentication Provider. string 1.6.0
kyuubi.authentication.jdbc.user <undefined> Database user for JDBC Authentication Provider. string 1.6.0
kyuubi.authentication.ldap.baseDN <undefined> LDAP base DN. string 1.7.0
kyuubi.authentication.ldap.binddn <undefined> The user with which to bind to the LDAP server, and search for the full domain name of the user being authenticated. This should be the full domain name of the user, and should have search access across all users in the LDAP tree. If not specified, then the user being authenticated will be used as the bind user. For example: CN=bindUser,CN=Users,DC=subdomain,DC=domain,DC=com string 1.7.0
kyuubi.authentication.ldap.bindpw <undefined> The password for the bind user, to be used to search for the full name of the user being authenticated. If the username is specified, this parameter must also be specified. string 1.7.0
kyuubi.authentication.ldap.customLDAPQuery <undefined> A full LDAP query that the LDAP authentication provider executes against the LDAP server. If this query returns a null result set, the LDAP provider fails the authentication request; it succeeds if the user is part of the result set. For example: (&(objectClass=group)(objectClass=top)(instanceType=4)(cn=Domain*)), (&(objectClass=person)(|(sAMAccountName=admin)(|(memberOf=CN=Domain Admins,CN=Users,DC=domain,DC=com)(memberOf=CN=Administrators,CN=Builtin,DC=domain,DC=com)))) string 1.7.0
kyuubi.authentication.ldap.domain <undefined> LDAP domain. string 1.0.0
kyuubi.authentication.ldap.groupClassKey groupOfNames LDAP attribute name on the group entry that is to be used in LDAP group searches. For example: group, groupOfNames or groupOfUniqueNames. string 1.7.0
kyuubi.authentication.ldap.groupDNPattern <undefined> COLON-separated list of patterns to use to find DNs for group entities in this directory. Use %s where the actual group name is to be substituted. For example: CN=%s,CN=Groups,DC=subdomain,DC=domain,DC=com. string 1.7.0
kyuubi.authentication.ldap.groupFilter COMMA-separated list of LDAP Group names (short name not full DNs). For example: HiveAdmins,HadoopAdmins,Administrators seq 1.7.0
kyuubi.authentication.ldap.groupMembershipKey member LDAP attribute name on the group object that contains the list of distinguished names for the user, group, and contact objects that are members of the group. For example: member, uniqueMember or memberUid string 1.7.0
kyuubi.authentication.ldap.guidKey uid LDAP attribute name whose values are unique in this LDAP server. For example: uid or CN. string 1.2.0
kyuubi.authentication.ldap.url <undefined> SPACE character separated LDAP connection URL(s). string 1.0.0
kyuubi.authentication.ldap.userDNPattern <undefined> COLON-separated list of patterns to use to find DNs for users in this directory. Use %s where the actual username is to be substituted. For example: CN=%s,CN=Users,DC=subdomain,DC=domain,DC=com. string 1.7.0
kyuubi.authentication.ldap.userFilter COMMA-separated list of LDAP usernames (just short names, not full DNs). For example: hiveuser,impalauser,hiveadmin,hadoopadmin seq 1.7.0
kyuubi.authentication.ldap.userMembershipKey <undefined> LDAP attribute name on the user object that contains groups of which the user is a direct member, except for the primary group, which is represented by the primaryGroupId. For example: memberOf string 1.7.0
kyuubi.authentication.sasl.qop auth SASL QOP enables higher levels of protection for Kyuubi communication with clients.
  • auth - authentication only (default)
  • auth-int - authentication plus integrity protection
  • auth-conf - authentication plus integrity and confidentiality protection. This is applicable only if Kyuubi is configured to use Kerberos authentication.
string 1.0.0
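As an illustration of how the kyuubi.authentication.jdbc.* keys above fit together, the sketch below writes a JDBC authentication setup to a scratch file and checks the documented rule that the query must start with SELECT. The driver class, JDBC URL, database user, password, table name, and column names are all placeholders chosen for this example, not values from Kyuubi.

```shell
# Illustration only: a scratch stand-in for kyuubi-defaults.conf.
conf_dir="$(mktemp -d)"
conf_file="$conf_dir/kyuubi-defaults.conf"

# The quoted 'EOF' keeps ${user} and ${password} literal, so the shell
# does not expand them before they reach the file.
cat > "$conf_file" <<'EOF'
kyuubi.authentication                    JDBC
kyuubi.authentication.jdbc.driver.class  com.mysql.cj.jdbc.Driver
kyuubi.authentication.jdbc.url           jdbc:mysql://db.example.com:3306/auth_db
kyuubi.authentication.jdbc.user          kyuubi_auth
kyuubi.authentication.jdbc.password      change-me
kyuubi.authentication.jdbc.query         SELECT 1 FROM auth_table WHERE username=${user} AND passwd=${password}
EOF

# Extract the query and verify it begins with SELECT, as the table requires.
query="$(sed -n 's/^kyuubi\.authentication\.jdbc\.query[[:space:]]*//p' "$conf_file")"
case "$query" in
  SELECT*) echo "query starts with SELECT" ;;
  *)       echo "query must start with SELECT" >&2 ;;
esac
```

Using an unquoted heredoc delimiter here would replace the placeholders with the values of the shell variables user and password (usually empty), which is an easy mistake to make when generating this file from a script.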

Backend#

Key Default Meaning Type Since
kyuubi.backend.engine.exec.pool.keepalive.time PT1M Time(ms) that an idle async thread of the operation execution thread pool will wait for a new task to arrive before terminating in SQL engine applications duration 1.0.0
kyuubi.backend.engine.exec.pool.shutdown.timeout PT10S Timeout(ms) for the operation execution thread pool to terminate in SQL engine applications duration 1.0.0
kyuubi.backend.engine.exec.pool.size 100 Number of threads in the operation execution thread pool of SQL engine applications int 1.0.0
kyuubi.backend.engine.exec.pool.wait.queue.size 100 Size of the wait queue for the operation execution thread pool in SQL engine applications int 1.0.0
kyuubi.backend.server.event.json.log.path file:///tmp/kyuubi/events The location where server events go for the built-in JSON logger string 1.4.0
kyuubi.backend.server.event.loggers A comma-separated list of server history loggers, where session/operation etc events go.
  • JSON: the events will be written to the location of kyuubi.backend.server.event.json.log.path
  • JDBC: to be done
  • CUSTOM: User-defined event handlers.
Note that: Kyuubi supports custom event handlers with the Java SPI. To register a custom event handler, the user needs to implement a class which is a child of org.apache.kyuubi.events.handler.CustomEventHandlerProvider which has a zero-arg constructor.
seq 1.4.0
kyuubi.backend.server.exec.pool.keepalive.time PT1M Time(ms) that an idle async thread of the operation execution thread pool will wait for a new task to arrive before terminating in Kyuubi server duration 1.0.0
kyuubi.backend.server.exec.pool.shutdown.timeout PT10S Timeout(ms) for the operation execution thread pool to terminate in Kyuubi server duration 1.0.0
kyuubi.backend.server.exec.pool.size 100 Number of threads in the operation execution thread pool of Kyuubi server int 1.0.0
kyuubi.backend.server.exec.pool.wait.queue.size 100 Size of the wait queue for the operation execution thread pool of Kyuubi server int 1.0.0
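A short sketch of enabling the built-in JSON logger for server events, using the default path from the table above. The scratch file is an assumption for illustration, and the scheme check assumes the server-side path accepts the same 'file://' and 'hdfs://' prefixes that are documented for kyuubi.engine.event.json.log.path.

```shell
# Illustration only: a scratch stand-in for kyuubi-defaults.conf.
conf_dir="$(mktemp -d)"
conf_file="$conf_dir/kyuubi-defaults.conf"

cat > "$conf_file" <<'EOF'
kyuubi.backend.server.event.loggers        JSON
kyuubi.backend.server.event.json.log.path  file:///tmp/kyuubi/events
EOF

# Read the configured location back and report which kind of path it is.
uri="$(sed -n 's/^kyuubi\.backend\.server\.event\.json\.log\.path[[:space:]]*//p' "$conf_file")"
case "$uri" in
  file://*) echo "local path: ${uri#file://}" ;;
  hdfs://*) echo "HDFS path: $uri" ;;
  *)        echo "unrecognized scheme: $uri" >&2 ;;
esac
```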

Batch#

Key Default Meaning Type Since
kyuubi.batch.application.check.interval PT5S The interval to check batch job application information. duration 1.6.0
kyuubi.batch.application.starvation.timeout PT3M Threshold above which to warn that a batch application may be starved. duration 1.7.0
kyuubi.batch.conf.ignore.list A comma-separated list of ignored keys for batch conf. If the batch conf contains any of them, the key and the corresponding value will be removed silently during batch job submission. Note that this rule is for server-side protection, defined by administrators to prevent some essential configs from being tampered with. You can also pre-define some config for batch job submission with the prefix: kyuubi.batchConf.[batchType]. For example, you can pre-define spark.master for the Spark batch job with key kyuubi.batchConf.spark.spark.master. seq 1.6.0
kyuubi.batch.session.idle.timeout PT6H Batch session idle timeout; the session will be closed when it has not been accessed for this duration. duration 1.6.2
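The kyuubi.batchConf.[batchType] prefix can be easier to read with a worked example. The sketch below writes a pre-defined spark.master and an ignore list to a scratch file, then strips the prefix to show the key the batch job would see. The value yarn and the ignored key spark.yarn.queue are arbitrary choices for illustration, and the prefix stripping is a shell demonstration of the naming rule, not Kyuubi code.

```shell
# Illustration only: a scratch stand-in for kyuubi-defaults.conf.
conf_dir="$(mktemp -d)"
conf_file="$conf_dir/kyuubi-defaults.conf"

cat > "$conf_file" <<'EOF'
kyuubi.batch.conf.ignore.list         spark.yarn.queue
kyuubi.batchConf.spark.spark.master   yarn
EOF

# kyuubi.batchConf.<batchType>.<key>: removing the prefix leaves the key
# that applies to the batch job of that type.
batch_type="spark"
full_key="kyuubi.batchConf.spark.spark.master"
job_key="${full_key#kyuubi.batchConf.${batch_type}.}"
echo "$job_key"
```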

Credentials#

Key Default Meaning Type Since
kyuubi.credentials.check.interval PT5M The interval to check the expiration of cached <user, CredentialsRef> pairs. duration 1.6.0
kyuubi.credentials.hadoopfs.enabled true Whether to renew Hadoop filesystem delegation tokens boolean 1.4.0
kyuubi.credentials.hadoopfs.uris Extra Hadoop filesystem URIs for which to request delegation tokens. The filesystem that hosts fs.defaultFS does not need to be listed here. seq 1.4.0
kyuubi.credentials.hive.enabled true Whether to renew Hive metastore delegation token boolean 1.4.0
kyuubi.credentials.idle.timeout PT6H The inactive users' credentials will be expired after a configured timeout duration 1.6.0
kyuubi.credentials.renewal.interval PT1H How often Kyuubi renews one user's delegation tokens duration 1.4.0
kyuubi.credentials.renewal.retry.wait PT1M How long to wait before retrying to fetch new credentials after a failure. duration 1.4.0
kyuubi.credentials.update.wait.timeout PT1M How long to wait until the credentials are ready. duration 1.5.0

Ctl#

Key Default Meaning Type Since
kyuubi.ctl.batch.log.on.failure.timeout PT10S The timeout for fetching remaining batch logs if the batch failed. duration 1.6.1
kyuubi.ctl.batch.log.query.interval PT3S The interval for fetching batch logs. duration 1.6.0
kyuubi.ctl.rest.auth.schema basic The authentication schema. Valid values are: basic, spnego. string 1.6.0
kyuubi.ctl.rest.base.url <undefined> The REST API base URL, which contains the scheme (http:// or https://), hostname, port number string 1.6.0
kyuubi.ctl.rest.connect.timeout PT30S The timeout[ms] for establishing the connection with the kyuubi server. A timeout value of zero is interpreted as an infinite timeout. duration 1.6.0
kyuubi.ctl.rest.request.attempt.wait PT3S How long to wait between attempts of ctl rest request. duration 1.6.0
kyuubi.ctl.rest.request.max.attempts 3 The max attempts number for ctl rest request. int 1.6.0
kyuubi.ctl.rest.socket.timeout PT2M The timeout[ms] for waiting for data packets after connection is established. A timeout value of zero is interpreted as an infinite timeout. duration 1.6.0
kyuubi.ctl.rest.spnego.host <undefined> The SPNEGO host, which must be configured when the auth schema is spnego. string 1.6.0

Delegation#

Key Default Meaning Type Since
kyuubi.delegation.key.update.interval PT24H Not used yet. duration 1.0.0
kyuubi.delegation.token.gc.interval PT1H Not used yet. duration 1.0.0
kyuubi.delegation.token.max.lifetime PT168H Not used yet. duration 1.0.0
kyuubi.delegation.token.renew.interval PT168H Not used yet. duration 1.0.0

Engine#

Key Default Meaning Type Since
kyuubi.engine.connection.url.use.hostname true (deprecated) When true, the engine registers with hostname to zookeeper. When Spark runs on K8s with cluster mode, set to false to ensure that server can connect to engine boolean 1.3.0
kyuubi.engine.deregister.exception.classes A comma-separated list of exception classes. If there is any exception thrown, whose class matches the specified classes, the engine would deregister itself. seq 1.2.0
kyuubi.engine.deregister.exception.messages A comma-separated list of exception messages. If there is any exception thrown, whose message or stacktrace matches the specified message list, the engine would deregister itself. seq 1.2.0
kyuubi.engine.deregister.exception.ttl PT30M Time to live (TTL) for the exception patterns specified in kyuubi.engine.deregister.exception.classes and kyuubi.engine.deregister.exception.messages to deregister engines. Once the total error count hits kyuubi.engine.deregister.job.max.failures within the TTL, an engine will deregister itself and wait for self-termination. Otherwise, the engine is assumed to have recovered from temporary failures. duration 1.2.0
kyuubi.engine.deregister.job.max.failures 4 Number of failures of job before deregistering the engine. int 1.2.0
kyuubi.engine.event.json.log.path file:///tmp/kyuubi/events The location where all the engine events go for the built-in JSON logger.
  • Local Path: start with 'file://'
  • HDFS Path: start with 'hdfs://'
string 1.3.0
kyuubi.engine.event.loggers SPARK A comma-separated list of engine history loggers, where engine/session/operation etc events go.
  • SPARK: the events will be written to the Spark listener bus.
  • JSON: the events will be written to the location of kyuubi.engine.event.json.log.path
  • JDBC: to be done
  • CUSTOM: User-defined event handlers.
Note that: Kyuubi supports custom event handlers with the Java SPI. To register a custom event handler, the user needs to implement a subclass of org.apache.kyuubi.events.handler.CustomEventHandlerProvider which has a zero-arg constructor.
seq 1.3.0
kyuubi.engine.flink.extra.classpath <undefined> The extra classpath for the Flink SQL engine, for configuring the location of hadoop client jars, etc string 1.6.0
kyuubi.engine.flink.java.options <undefined> The extra Java options for the Flink SQL engine string 1.6.0
kyuubi.engine.flink.memory 1g The heap memory for the Flink SQL engine string 1.6.0
kyuubi.engine.hive.event.loggers JSON A comma-separated list of engine history loggers, where engine/session/operation etc events go.
  • JSON: the events will be written to the location of kyuubi.engine.event.json.log.path
  • JDBC: to be done
  • CUSTOM: to be done.
seq 1.7.0
kyuubi.engine.hive.extra.classpath <undefined> The extra classpath for the Hive query engine, for configuring the location of the hadoop client jars, etc. string 1.6.0
kyuubi.engine.hive.java.options <undefined> The extra Java options for the Hive query engine string 1.6.0
kyuubi.engine.hive.memory 1g The heap memory for the Hive query engine string 1.6.0
kyuubi.engine.initialize.sql SHOW DATABASES Semicolon-separated list of SQL statements to run in the newly created engine before queries, e.g., use SHOW DATABASES to eagerly activate HiveClient. This configuration cannot be used in the JDBC URL due to a limitation of the Beeline/JDBC driver. seq 1.2.0
kyuubi.engine.jdbc.connection.password <undefined> The password used for connecting to the server string 1.6.0
kyuubi.engine.jdbc.connection.properties The additional properties used for connecting to the server seq 1.6.0
kyuubi.engine.jdbc.connection.provider <undefined> The connection provider used for getting a connection from the server string 1.6.0
kyuubi.engine.jdbc.connection.url <undefined> The server URL that the engine will connect to string 1.6.0
kyuubi.engine.jdbc.connection.user <undefined> The user used for connecting to the server string 1.6.0
kyuubi.engine.jdbc.driver.class <undefined> The driver class for JDBC engine connection string 1.6.0
kyuubi.engine.jdbc.extra.classpath <undefined> The extra classpath for the JDBC query engine, for configuring the location of the JDBC driver, etc. string 1.6.0
kyuubi.engine.jdbc.java.options <undefined> The extra Java options for the JDBC query engine string 1.6.0
kyuubi.engine.jdbc.memory 1g The heap memory for the JDBC query engine string 1.6.0
kyuubi.engine.jdbc.type <undefined> The short name of JDBC type string 1.6.0
kyuubi.engine.operation.convert.catalog.database.enabled true When set to true, the engine converts the JDBC methods of set/get Catalog and set/get Schema to the implementation of different engines boolean 1.6.0
kyuubi.engine.operation.log.dir.root engine_operation_logs Root directory for query operation log at engine-side. string 1.4.0
kyuubi.engine.pool.name engine-pool The name of the engine pool. string 1.5.0
kyuubi.engine.pool.selectPolicy RANDOM The policy for selecting an engine from the corresponding engine pool for a session.
  • RANDOM - Randomly select an engine from the pool
  • POLLING - Select engines from the pool in polling order
string 1.7.0
kyuubi.engine.pool.size -1 The size of the engine pool. Note that, if the size is less than 1, the engine pool will not be enabled; otherwise, the size of the engine pool will be min(this, kyuubi.engine.pool.size.threshold). int 1.4.0
kyuubi.engine.pool.size.threshold 9 This parameter is introduced as a server-side parameter controlling the upper limit of the engine pool. int 1.4.0
kyuubi.engine.session.initialize.sql Semicolon-separated list of SQL statements to run in the newly created engine session before queries. This configuration cannot be used in the JDBC URL due to a limitation of the Beeline/JDBC driver. seq 1.3.0
kyuubi.engine.share.level USER Engines will be shared in different levels, available configs are:
  • CONNECTION: engine will not be shared but only used by the current client connection
  • USER: engine will be shared by all sessions created by a unique username, see also kyuubi.engine.share.level.subdomain
  • GROUP: the engine will be shared by all sessions created by all users belonging to the same primary group. The engine will be launched with the group name as the effective username, so the group name acts as a special user who is able to access the computing resources/data of the team. It follows the Hadoop GroupsMapping to map a user to a primary group. If the primary group is not found, it falls back to the USER level.
  • SERVER: the App will be shared by Kyuubi servers
string 1.2.0
kyuubi.engine.share.level.sub.domain <undefined> (deprecated) - Use kyuubi.engine.share.level.subdomain instead string 1.2.0
kyuubi.engine.share.level.subdomain <undefined> Allow end-users to create a subdomain for the share level of an engine. A subdomain is a case-insensitive string value that must be a valid zookeeper subpath. For example, for the USER share level, an end-user can share a certain engine within a subdomain, not for all of its clients. End-users are free to create multiple engines in the USER share level. When the engine pool is disabled, 'default' is used if absent. string 1.4.0
kyuubi.engine.single.spark.session false When set to true, this engine is running in a single session mode. All the JDBC/ODBC connections share the temporary views, function registries, SQL configuration and the current database. boolean 1.3.0
kyuubi.engine.spark.event.loggers SPARK A comma-separated list of engine loggers, where engine/session/operation etc events go.
  • SPARK: the events will be written to the Spark listener bus.
  • JSON: the events will be written to the location of kyuubi.engine.event.json.log.path
  • JDBC: to be done
  • CUSTOM: to be done.
seq 1.7.0
kyuubi.engine.spark.python.env.archive <undefined> Portable Python env archive used for Spark engine Python language mode. string 1.7.0
kyuubi.engine.spark.python.env.archive.exec.path bin/python The Python exec path under the Python env archive. string 1.7.0
kyuubi.engine.spark.python.home.archive <undefined> Spark archive containing $SPARK_HOME/python directory, which is used to init session Python worker for Python language mode. string 1.7.0
kyuubi.engine.trino.event.loggers JSON A comma-separated list of engine history loggers, where engine/session/operation etc events go.
  • JSON: the events will be written to the location of kyuubi.engine.event.json.log.path
  • JDBC: to be done
  • CUSTOM: to be done.
seq 1.7.0
kyuubi.engine.trino.extra.classpath <undefined> The extra classpath for the Trino query engine, for configuring other libs which may be needed by the Trino engine string 1.6.0
kyuubi.engine.trino.java.options <undefined> The extra Java options for the Trino query engine string 1.6.0
kyuubi.engine.trino.memory 1g The heap memory for the Trino query engine string 1.6.0
kyuubi.engine.type SPARK_SQL Specify the engine type to use, among those supported by Kyuubi. The engine type is bound to the SESSION scope. This configuration is experimental. Currently, available configs are:
  • SPARK_SQL: specifying this engine type launches a Spark engine, which provides all the capabilities of Apache Spark. Note that it is the default engine type.
  • FLINK_SQL: specifying this engine type launches a Flink engine, which provides all the capabilities of Apache Flink.
  • TRINO: specifying this engine type launches a Trino engine, which provides all the capabilities of Trino.
  • HIVE_SQL: specifying this engine type launches a Hive engine, which provides all the capabilities of Hive Server2.
  • JDBC: specifying this engine type launches a JDBC engine, which provides a MySQL protocol connector; for now only the Doris dialect is supported.
string 1.4.0
kyuubi.engine.ui.retainedSessions 200 The number of SQL client sessions kept in the Kyuubi Query Engine web UI. int 1.4.0
kyuubi.engine.ui.retainedStatements 200 The number of statements kept in the Kyuubi Query Engine web UI. int 1.4.0
kyuubi.engine.ui.stop.enabled true When true, allows Kyuubi engine to be killed from the Spark Web UI. boolean 1.3.0
kyuubi.engine.user.isolated.spark.session true When set to false, if the engine is running in a group or server share level, all the JDBC/ODBC connections will be isolated against the user, including the temporary views, function registries, SQL configuration, and the current database. Note that it has no effect if the share level is connection or user. boolean 1.6.0
kyuubi.engine.user.isolated.spark.session.idle.interval PT1M The interval to check whether the user-isolated Spark session has timed out. duration 1.6.0
kyuubi.engine.user.isolated.spark.session.idle.timeout PT6H If kyuubi.engine.user.isolated.spark.session is false, we will release the Spark session if its corresponding user is inactive after this configured timeout. duration 1.6.0
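The sizing rule given for kyuubi.engine.pool.size in the table above can be worked through with a small calculation. This is a sketch of the documented rule only (the pool is not enabled when the size is less than 1; otherwise the size is min(size, threshold)). The requested size of 12 is a made-up value, and 9 is the documented default of kyuubi.engine.pool.size.threshold.

```shell
pool_size=12   # hypothetical kyuubi.engine.pool.size
threshold=9    # kyuubi.engine.pool.size.threshold (documented default)

if [ "$pool_size" -lt 1 ]; then
  # Sizes below 1 (including the default -1) leave the pool disabled.
  echo "engine pool disabled"
else
  # Otherwise the server-side threshold caps the requested size.
  effective=$(( pool_size < threshold ? pool_size : threshold ))
  echo "effective pool size: $effective"
fi
```

With these values the snippet prints "effective pool size: 9"; with the default pool_size of -1 it would report that the pool is disabled.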

Event#

Key Default Meaning Type Since
kyuubi.event.async.pool.keepalive.time PT1M Time(ms) that an idle async thread of the async event handler thread pool will wait for a new task to arrive before terminating duration 1.7.0
kyuubi.event.async.pool.size 8 Number of threads in the async event handler thread pool int 1.7.0
kyuubi.event.async.pool.wait.queue.size 100 Size of the wait queue for the async event handler thread pool int 1.7.0

Frontend#

Key Default Meaning Type Since
kyuubi.frontend.backoff.slot.length PT0.1S (deprecated) Time to back off during login to the thrift frontend service. duration 1.0.0
kyuubi.frontend.bind.host <undefined> Hostname or IP of the machine on which to run the frontend services. string 1.0.0
kyuubi.frontend.bind.port 10009 (deprecated) Port of the machine on which to run the thrift frontend service via the binary protocol. int 1.0.0
kyuubi.frontend.connection.url.use.hostname true When true, frontend services prefer hostname, otherwise, ip address. Note that, the default value is set to false when engine running on Kubernetes to prevent potential network issues. boolean 1.5.0
kyuubi.frontend.login.timeout PT20S (deprecated) Timeout for Thrift clients during login to the thrift frontend service. duration 1.0.0
kyuubi.frontend.max.message.size 104857600 (deprecated) Maximum message size in bytes a Kyuubi server will accept. int 1.0.0
kyuubi.frontend.max.worker.threads 999 (deprecated) Maximum number of threads in the frontend worker thread pool for the thrift frontend service int 1.0.0
kyuubi.frontend.min.worker.threads 9 (deprecated) Minimum number of threads in the frontend worker thread pool for the thrift frontend service int 1.0.0
kyuubi.frontend.mysql.bind.host <undefined> Hostname or IP of the machine on which to run the MySQL frontend service. string 1.4.0
kyuubi.frontend.mysql.bind.port 3309 Port of the machine on which to run the MySQL frontend service. int 1.4.0
kyuubi.frontend.mysql.max.worker.threads 999 Maximum number of threads in the command execution thread pool for the MySQL frontend service int 1.4.0
kyuubi.frontend.mysql.min.worker.threads 9 Minimum number of threads in the command execution thread pool for the MySQL frontend service int 1.4.0
kyuubi.frontend.mysql.netty.worker.threads <undefined> Number of threads in the netty worker event loop of the MySQL frontend service. Uses min(cpu_cores, 8) by default. int 1.4.0
kyuubi.frontend.mysql.worker.keepalive.time PT1M Time(ms) that an idle async thread of the command execution thread pool will wait for a new task to arrive before terminating in MySQL frontend service duration 1.4.0
kyuubi.frontend.protocols THRIFT_BINARY A comma-separated list for all frontend protocols
  • THRIFT_BINARY - HiveServer2 compatible thrift binary protocol.
  • THRIFT_HTTP - HiveServer2 compatible thrift http protocol.
  • REST - Kyuubi defined REST API(experimental).
  • MYSQL - MySQL compatible text protocol(experimental).
  • TRINO - Trino compatible http protocol(experimental).
seq 1.4.0
kyuubi.frontend.proxy.http.client.ip.header X-Real-IP The HTTP header to record the real client IP address. If your server is behind a load balancer or other proxy, the server will see the load balancer or proxy IP address as the client IP address. To get around this common issue, most load balancers and proxies can record the real remote IP address in an HTTP header that is added to the request for other devices to use. Note that because the header value can be set to any IP address, it will not be used for authentication. string 1.6.0
kyuubi.frontend.rest.bind.host <undefined> Hostname or IP of the machine on which to run the REST frontend service. string 1.4.0
kyuubi.frontend.rest.bind.port 10099 Port of the machine on which to run the REST frontend service. int 1.4.0
kyuubi.frontend.rest.max.worker.threads 999 Maximum number of threads in the frontend worker thread pool for the rest frontend service int 1.6.2
kyuubi.frontend.ssl.keystore.algorithm <undefined> SSL certificate keystore algorithm. string 1.7.0
kyuubi.frontend.ssl.keystore.password <undefined> SSL certificate keystore password. string 1.7.0
kyuubi.frontend.ssl.keystore.path <undefined> SSL certificate keystore location. string 1.7.0
kyuubi.frontend.ssl.keystore.type <undefined> SSL certificate keystore type. string 1.7.0
kyuubi.frontend.thrift.backoff.slot.length PT0.1S Time to back off during login to the thrift frontend service. duration 1.4.0
kyuubi.frontend.thrift.binary.bind.host <undefined> Hostname or IP of the machine on which to run the thrift frontend service via the binary protocol. string 1.4.0
kyuubi.frontend.thrift.binary.bind.port 10009 Port of the machine on which to run the thrift frontend service via the binary protocol. int 1.4.0
kyuubi.frontend.thrift.binary.ssl.disallowed.protocols SSLv2,SSLv3 SSL versions to disallow for Kyuubi thrift binary frontend. seq 1.7.0
kyuubi.frontend.thrift.binary.ssl.enabled false Set this to true for using SSL encryption in thrift binary frontend server. boolean 1.7.0
kyuubi.frontend.thrift.binary.ssl.include.ciphersuites A comma-separated list of SSL cipher suite names to include for the thrift binary frontend. seq 1.7.0
kyuubi.frontend.thrift.http.allow.user.substitution true Allow alternate user to be specified as part of open connection request when using HTTP transport mode. boolean 1.6.0
kyuubi.frontend.thrift.http.bind.host <undefined> Hostname or IP of the machine on which to run the thrift frontend service via http protocol. string 1.6.0
kyuubi.frontend.thrift.http.bind.port 10010 Port of the machine on which to run the thrift frontend service via http protocol. int 1.6.0
kyuubi.frontend.thrift.http.compression.enabled true Enable thrift http compression via Jetty compression support boolean 1.6.0
kyuubi.frontend.thrift.http.cookie.auth.enabled true When true, Kyuubi in HTTP transport mode uses a cookie-based authentication mechanism boolean 1.6.0
kyuubi.frontend.thrift.http.cookie.domain <undefined> Domain for the Kyuubi generated cookies string 1.6.0
kyuubi.frontend.thrift.http.cookie.is.httponly true HttpOnly attribute of the Kyuubi generated cookie. boolean 1.6.0
kyuubi.frontend.thrift.http.cookie.max.age 86400 Maximum age in seconds for server side cookie used by Kyuubi in HTTP mode. int 1.6.0
kyuubi.frontend.thrift.http.cookie.path <undefined> Path for the Kyuubi generated cookies string 1.6.0
kyuubi.frontend.thrift.http.max.idle.time PT30M Maximum idle time for a connection on the server when in HTTP mode. duration 1.6.0
kyuubi.frontend.thrift.http.path cliservice Path component of URL endpoint when in HTTP mode. string 1.6.0
kyuubi.frontend.thrift.http.request.header.size 6144 Request header size in bytes, when using HTTP transport mode. Jetty defaults used. int 1.6.0
kyuubi.frontend.thrift.http.response.header.size 6144 Response header size in bytes, when using HTTP transport mode. Jetty defaults used. int 1.6.0
kyuubi.frontend.thrift.http.ssl.exclude.ciphersuites A comma-separated list of SSL cipher suite names to exclude for the thrift http frontend. seq 1.7.0
kyuubi.frontend.thrift.http.ssl.keystore.password <undefined> SSL certificate keystore password. string 1.6.0
kyuubi.frontend.thrift.http.ssl.keystore.path <undefined> SSL certificate keystore location. string 1.6.0
kyuubi.frontend.thrift.http.ssl.protocol.blacklist SSLv2,SSLv3 SSL Versions to disable when using HTTP transport mode. seq 1.6.0
kyuubi.frontend.thrift.http.use.SSL false Set this to true for using SSL encryption in http mode. boolean 1.6.0
kyuubi.frontend.thrift.http.xsrf.filter.enabled false If enabled, Kyuubi will block any requests made to it over HTTP if an X-XSRF-HEADER header is not present boolean 1.6.0
kyuubi.frontend.thrift.login.timeout PT20S Timeout for Thrift clients during login to the thrift frontend service. duration 1.4.0
kyuubi.frontend.thrift.max.message.size 104857600 Maximum message size in bytes a Kyuubi server will accept. int 1.4.0
kyuubi.frontend.thrift.max.worker.threads 999 Maximum number of threads in the frontend worker thread pool for the thrift frontend service int 1.4.0
kyuubi.frontend.thrift.min.worker.threads 9 Minimum number of threads in the frontend worker thread pool for the thrift frontend service int 1.4.0
kyuubi.frontend.thrift.worker.keepalive.time PT1M Keep-alive time (in milliseconds) for an idle worker thread duration 1.4.0
kyuubi.frontend.trino.bind.host <undefined> Hostname or IP of the machine on which to run the TRINO frontend service. string 1.7.0
kyuubi.frontend.trino.bind.port 10999 Port of the machine on which to run the TRINO frontend service. int 1.7.0
kyuubi.frontend.trino.max.worker.threads 999 Maximum number of threads in the frontend worker thread pool for the Trino frontend service int 1.7.0
kyuubi.frontend.worker.keepalive.time PT1M (deprecated) Keep-alive time (in milliseconds) for an idle worker thread duration 1.0.0
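
Many defaults in these tables are written as ISO-8601 durations, such as PT1M, PT0.1S, or PT30M. As a reading aid, the following Python sketch converts the time-only subset of that notation into milliseconds. It is an illustration only and not Kyuubi's own parser; the name `duration_to_millis` is hypothetical, and day, month, and year components are deliberately not handled.

```python
import re

# Time-only subset of ISO-8601 durations (hours, minutes, seconds),
# which covers values such as PT1M, PT0.1S and PT30M.
_DURATION = re.compile(
    r"^PT(?:(?P<h>\d+)H)?(?:(?P<m>\d+)M)?(?:(?P<s>\d+(?:\.\d+)?)S)?$"
)


def duration_to_millis(text: str) -> int:
    """Convert a duration like 'PT1M' into milliseconds (hypothetical helper)."""
    match = _DURATION.match(text)
    # A bare "PT" matches the pattern but carries no value, so reject it.
    if match is None or text == "PT":
        raise ValueError(f"unsupported duration: {text!r}")
    hours = int(match.group("h") or 0)
    minutes = int(match.group("m") or 0)
    seconds = float(match.group("s") or 0)
    return round((hours * 3600 + minutes * 60 + seconds) * 1000)
```

For instance, `duration_to_millis("PT20S")` corresponds to the default of kyuubi.frontend.thrift.login.timeout expressed in milliseconds.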

Ha#

Key Default Meaning Type Since
kyuubi.ha.addresses The connection string for the discovery ensemble string 1.6.0
kyuubi.ha.client.class org.apache.kyuubi.ha.client.zookeeper.ZookeeperDiscoveryClient Class name for service discovery client.
  • Zookeeper: org.apache.kyuubi.ha.client.zookeeper.ZookeeperDiscoveryClient
  • Etcd: org.apache.kyuubi.ha.client.etcd.EtcdDiscoveryClient
string 1.6.0
kyuubi.ha.etcd.lease.timeout PT10S Timeout for the etcd keep-alive lease. The Kyuubi server detects the unexpected loss of an engine within at most this duration. duration 1.6.0
kyuubi.ha.etcd.ssl.ca.path <undefined> Where the etcd CA certificate file is stored. string 1.6.0
kyuubi.ha.etcd.ssl.client.certificate.path <undefined> Where the etcd SSL certificate file is stored. string 1.6.0
kyuubi.ha.etcd.ssl.client.key.path <undefined> Where the etcd SSL key file is stored. string 1.6.0
kyuubi.ha.etcd.ssl.enabled false When set to true, will build an SSL secured etcd client. boolean 1.6.0
kyuubi.ha.namespace kyuubi The root directory for the service to deploy its instance uri string 1.6.0
kyuubi.ha.zookeeper.acl.enabled false Set to true if the ZooKeeper ensemble is kerberized boolean 1.0.0
kyuubi.ha.zookeeper.auth.digest <undefined> The digest auth string used for ZooKeeper authentication, like: username:password. string 1.3.2
kyuubi.ha.zookeeper.auth.keytab <undefined> Location of the Kyuubi server's keytab used for ZooKeeper authentication. string 1.3.2
kyuubi.ha.zookeeper.auth.principal <undefined> Name of the Kerberos principal used for ZooKeeper authentication. string 1.3.2
kyuubi.ha.zookeeper.auth.type NONE The type of ZooKeeper authentication, all candidates are
  • NONE
  • KERBEROS
  • DIGEST
string 1.3.2
kyuubi.ha.zookeeper.connection.base.retry.wait 1000 Initial amount of time to wait between retries to the ZooKeeper ensemble int 1.0.0
kyuubi.ha.zookeeper.connection.max.retries 3 Max retry times for connecting to the ZooKeeper ensemble int 1.0.0
kyuubi.ha.zookeeper.connection.max.retry.wait 30000 The maximum wait time between retries that the BOUNDED_EXPONENTIAL_BACKOFF policy can reach, or the maximum elapsed time for the UNTIL_ELAPSED policy, when connecting to the ZooKeeper ensemble int 1.0.0
kyuubi.ha.zookeeper.connection.retry.policy EXPONENTIAL_BACKOFF The retry policy for connecting to the ZooKeeper ensemble, all candidates are:
  • ONE_TIME
  • N_TIME
  • EXPONENTIAL_BACKOFF
  • BOUNDED_EXPONENTIAL_BACKOFF
  • UNTIL_ELAPSED
string 1.0.0
kyuubi.ha.zookeeper.connection.timeout 15000 The timeout(ms) of creating the connection to the ZooKeeper ensemble int 1.0.0
kyuubi.ha.zookeeper.engine.auth.type NONE The type of ZooKeeper authentication for the engine, all candidates are
  • NONE
  • KERBEROS
  • DIGEST
string 1.3.2
kyuubi.ha.zookeeper.namespace kyuubi (deprecated) The root directory for the service to deploy its instance uri string 1.0.0
kyuubi.ha.zookeeper.node.creation.timeout PT2M Timeout for creating ZooKeeper node duration 1.2.0
kyuubi.ha.zookeeper.publish.configs false When set to true, publish Kerberos configs to Zookeeper. Note that the Hive driver needs to be greater than 1.3 or 2.0 or apply HIVE-11581 patch. boolean 1.4.0
kyuubi.ha.zookeeper.quorum (deprecated) The connection string for the ZooKeeper ensemble string 1.0.0
kyuubi.ha.zookeeper.session.timeout 60000 The timeout (ms) after which an idle connected session expires int 1.0.0
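
To make the interplay of kyuubi.ha.zookeeper.connection.base.retry.wait, kyuubi.ha.zookeeper.connection.max.retries, and kyuubi.ha.zookeeper.connection.max.retry.wait more concrete, here is a simplified, deterministic model of a bounded exponential backoff. It is only a sketch under the assumption that the wait doubles on each attempt and is capped at the maximum; the actual client library may compute its delays differently (for example, with randomization), and the function name is hypothetical.

```python
from typing import List


def bounded_backoff_delays(
    base_wait_ms: int = 1000,   # default of ...connection.base.retry.wait
    max_retries: int = 3,       # default of ...connection.max.retries
    max_wait_ms: int = 30000,   # default of ...connection.max.retry.wait
) -> List[int]:
    """Return the modeled wait (ms) before each retry, capped at max_wait_ms."""
    return [
        min(base_wait_ms * 2 ** attempt, max_wait_ms)
        for attempt in range(max_retries)
    ]
```

With the defaults, the modeled waits stay well below the cap; the cap only matters once the number of retries is raised.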

Kinit#

Key Default Meaning Type Since
kyuubi.kinit.interval PT1H How often the Kyuubi server runs kinit -kt [keytab] [principal] to renew the local Kerberos credentials cache duration 1.0.0
kyuubi.kinit.keytab <undefined> Location of Kyuubi server's keytab. string 1.0.0
kyuubi.kinit.max.attempts 10 How many times the kinit process retries int 1.0.0
kyuubi.kinit.principal <undefined> Name of the Kerberos principal. string 1.0.0
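
The kinit settings can be read as a command plus a retry budget. The sketch below builds the `kinit -kt [keytab] [principal]` command line from the table and retries it up to a maximum number of attempts. It is an illustration, not Kyuubi's implementation: the helper names are hypothetical, and the command runner is injected so that the example does not need a Kerberos installation.

```python
from typing import Callable, List


def kinit_command(keytab: str, principal: str) -> List[str]:
    """Build the argument vector for: kinit -kt [keytab] [principal]."""
    return ["kinit", "-kt", keytab, principal]


def run_with_retries(
    run: Callable[[List[str]], int],
    command: List[str],
    max_attempts: int = 10,  # default of kyuubi.kinit.max.attempts
) -> int:
    """Call run(command) until it returns exit code 0.

    Returns the 1-based attempt number that succeeded and raises
    RuntimeError once max_attempts attempts have failed.
    """
    for attempt in range(1, max_attempts + 1):
        if run(command) == 0:
            return attempt
    raise RuntimeError(f"kinit failed after {max_attempts} attempts")
```

In a real setup, `run` could wrap `subprocess.call`; the keytab path and principal are whatever kyuubi.kinit.keytab and kyuubi.kinit.principal are set to.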

Kubernetes#

Key Default Meaning Type Since
kyuubi.kubernetes.authenticate.caCertFile <undefined> Path to the CA cert file for connecting to the Kubernetes API server over TLS from Kyuubi. Specify this as a path as opposed to a URI (i.e. do not provide a scheme) string 1.7.0
kyuubi.kubernetes.authenticate.clientCertFile <undefined> Path to the client cert file for connecting to the Kubernetes API server over TLS from Kyuubi. Specify this as a path as opposed to a URI (i.e. do not provide a scheme) string 1.7.0
kyuubi.kubernetes.authenticate.clientKeyFile <undefined> Path to the client key file for connecting to the Kubernetes API server over TLS from Kyuubi. Specify this as a path as opposed to a URI (i.e. do not provide a scheme) string 1.7.0
kyuubi.kubernetes.authenticate.oauthToken <undefined> The OAuth token to use when authenticating against the Kubernetes API server. Note that, unlike the other authentication options, this must be the exact string value of the token to use for the authentication. string 1.7.0
kyuubi.kubernetes.authenticate.oauthTokenFile <undefined> Path to the file containing the OAuth token to use when authenticating against the Kubernetes API server. Specify this as a path as opposed to a URI (i.e. do not provide a scheme) string 1.7.0
kyuubi.kubernetes.context <undefined> The desired context from your kubernetes config file used to configure the K8s client for interacting with the cluster. string 1.6.0
kyuubi.kubernetes.master.address <undefined> The internal Kubernetes master (API server) address to be used for kyuubi. string 1.7.0
kyuubi.kubernetes.namespace default The namespace that will be used for running the Kyuubi pods and finding engines. string 1.7.0
kyuubi.kubernetes.trust.certificates false If set to true then client can submit to kubernetes cluster only with token boolean 1.7.0

Metadata#

Key Default Meaning Type Since
kyuubi.metadata.cleaner.enabled true Whether to clean the metadata periodically. If it is enabled, Kyuubi will clean the metadata that is in a terminated state and older than the maximum age. boolean 1.6.0
kyuubi.metadata.cleaner.interval PT30M The interval to check and clean expired metadata. duration 1.6.0
kyuubi.metadata.max.age PT72H The maximum age of metadata, the metadata exceeding the age will be cleaned. duration 1.6.0
kyuubi.metadata.recovery.threads 10 The number of threads for recovery from the metadata store when the Kyuubi server restarts. int 1.6.0
kyuubi.metadata.request.retry.interval PT5S The interval to check and trigger the metadata request retry tasks. duration 1.6.0
kyuubi.metadata.request.retry.queue.size 65536 The maximum queue size for buffering metadata requests in memory when the external metadata storage is down. Requests will be dropped if the queue exceeds this size. int 1.6.0
kyuubi.metadata.request.retry.threads 10 Number of threads in the metadata request retry manager thread pool. The metadata store might be unavailable sometimes, causing requests to fail. To tolerate this case and unblock the main thread, failed requests are retried asynchronously. int 1.6.0
kyuubi.metadata.store.class org.apache.kyuubi.server.metadata.jdbc.JDBCMetadataStore Fully qualified class name for server metadata store. string 1.6.0
kyuubi.metadata.store.jdbc.database.schema.init true Whether to init the JDBC metadata store database schema. boolean 1.6.0
kyuubi.metadata.store.jdbc.database.type DERBY The database type for server jdbc metadata store.
  • DERBY: Apache Derby, JDBC driver org.apache.derby.jdbc.AutoloadedDriver.
  • MYSQL: MySQL, JDBC driver com.mysql.jdbc.Driver.
  • CUSTOM: User-defined database type, need to specify corresponding JDBC driver.
  Note that the JDBC datasource is powered by HikariCP. To set datasource properties, specify them with the prefix kyuubi.metadata.store.jdbc.datasource. For example, kyuubi.metadata.store.jdbc.datasource.connectionTimeout=10000.
string 1.6.0
kyuubi.metadata.store.jdbc.driver <undefined> JDBC driver class name for server jdbc metadata store. string 1.6.0
kyuubi.metadata.store.jdbc.password The password for server JDBC metadata store. string 1.6.0
kyuubi.metadata.store.jdbc.url jdbc:derby:memory:kyuubi_state_store_db;create=true The JDBC url for server JDBC metadata store. By default, it is a DERBY in-memory database url, and the state information is not shared across kyuubi instances. To enable high availability for multiple kyuubi instances, please specify a production JDBC url. string 1.6.0
kyuubi.metadata.store.jdbc.user The username for server JDBC metadata store. string 1.6.0
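
The datasource prefix convention mentioned for kyuubi.metadata.store.jdbc.database.type can be pictured as a simple key filter: entries that start with the prefix are handed to the connection pool with the prefix removed. The following sketch only illustrates that naming convention; it is not Kyuubi's code, and the function name is hypothetical.

```python
from typing import Dict

DATASOURCE_PREFIX = "kyuubi.metadata.store.jdbc.datasource."


def datasource_properties(conf: Dict[str, str]) -> Dict[str, str]:
    """Select the prefixed entries and strip the prefix from their keys."""
    return {
        key[len(DATASOURCE_PREFIX):]: value
        for key, value in conf.items()
        if key.startswith(DATASOURCE_PREFIX)
    }
```

Given the example from the table, a configuration containing kyuubi.metadata.store.jdbc.datasource.connectionTimeout=10000 would yield a pool property named connectionTimeout with the value 10000.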

Metrics#

Key Default Meaning Type Since
kyuubi.metrics.console.interval PT5S How often metrics are reported to the console duration 1.2.0
kyuubi.metrics.enabled true Set to true to enable kyuubi metrics system boolean 1.2.0
kyuubi.metrics.json.interval PT5S How often metrics are reported to the JSON file duration 1.2.0
kyuubi.metrics.json.location metrics Where the JSON metrics file is located string 1.2.0
kyuubi.metrics.prometheus.path /metrics URI context path of prometheus metrics HTTP server string 1.2.0
kyuubi.metrics.prometheus.port 10019 Prometheus metrics HTTP server port int 1.2.0
kyuubi.metrics.reporters JSON A comma-separated list for all metrics reporters
  • CONSOLE - ConsoleReporter which outputs measurements to CONSOLE periodically.
  • JMX - JmxReporter which listens for new metrics and exposes them as MBeans.
  • JSON - JsonReporter which outputs measurements to json file periodically.
  • PROMETHEUS - PrometheusReporter which exposes metrics in Prometheus format.
  • SLF4J - Slf4jReporter which outputs measurements to system log periodically.
seq 1.2.0
kyuubi.metrics.slf4j.interval PT5S How often metrics are reported to the SLF4J logger duration 1.2.0

Operation#

Key Default Meaning Type Since
kyuubi.operation.idle.timeout PT3H Operation will be closed when it's not accessed for this duration of time duration 1.0.0
kyuubi.operation.interrupt.on.cancel true When true, all running tasks will be interrupted if one cancels a query. When false, all running tasks will remain until finished. boolean 1.2.0
kyuubi.operation.language SQL Choose a programming language for the following inputs
  • SQL: (Default) Run all following statements as SQL queries.
  • SCALA: Run all following input as Scala code
  • PYTHON: (Experimental) Run all following input as Python code with the Spark engine
string 1.5.0
kyuubi.operation.log.dir.root server_operation_logs Root directory for query operation log at server-side. string 1.4.0
kyuubi.operation.plan.only.excludes ResetCommand,SetCommand,SetNamespaceCommand,UseStatement,SetCatalogAndNamespace Comma-separated list of query plan names, in the form of simple class names, e.g., for SET abc=xyz, the value will be SetCommand. Auxiliary plans, such as switching databases, setting properties, or creating temporary views, which are used to set up the evaluation environment for analyzing actual queries, can be excluded with this config so that they take effect. See also kyuubi.operation.plan.only.mode. seq 1.5.0
kyuubi.operation.plan.only.mode none Configures the statement execution mode. The value can be 'parse', 'analyze', 'optimize', 'optimize_with_stats', 'physical', 'execution', or 'none'. When it is 'none', the statement is fully executed; otherwise, only the corresponding plan is produced without executing the query. Different engines currently support different modes: the Spark engine supports all modes, the Flink engine supports 'parse', 'physical', and 'execution', and other engines do not support planOnly currently. string 1.4.0
kyuubi.operation.plan.only.output.style plain Configures the planOnly output style. The value can be 'plain' or 'json', and the default value is 'plain'. This configuration supports only the output styles of the Spark engine string 1.7.0
kyuubi.operation.progress.enabled false Whether to enable the operation progress. When true, the operation progress will be returned in GetOperationStatus. boolean 1.6.0
kyuubi.operation.query.timeout <undefined> Timeout for query executions at server-side. It takes effect together with the client-side timeout (java.sql.Statement.setQueryTimeout): a running query is cancelled automatically when it times out. It is off by default, which means only the client side controls whether the query should time out. If set, the client-side timeout is capped at this value. To cancel queries right away without waiting for tasks to finish, consider enabling kyuubi.operation.interrupt.on.cancel as well. duration 1.2.0
kyuubi.operation.result.format thrift Specify the result format, available configs are:
  • THRIFT: the result will convert to TRow at the engine driver side.
  • ARROW: the result will be encoded as Arrow at the executor side before being collected by the driver, and deserialized at the client side. Note that it only takes effect for kyuubi-hive-jdbc clients now.
string 1.7.0
kyuubi.operation.result.max.rows 0 Max rows of Spark query results. Rows exceeding the limit would be ignored. Set this value to 0 to disable the max rows limit. int 1.6.0
kyuubi.operation.scheduler.pool <undefined> The scheduler pool of job. Note that, this config should be used after changing Spark config spark.scheduler.mode=FAIR. string 1.1.1
kyuubi.operation.spark.listener.enabled true When set to true, Spark engine registers an SQLOperationListener before executing the statement, logging a few summary statistics when each stage completes. boolean 1.6.0
kyuubi.operation.status.polling.timeout PT5S Timeout (ms) for long polling the status of an asynchronously running SQL query duration 1.0.0
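
The description of kyuubi.operation.query.timeout says that the server-side value caps the client-side timeout. One way to picture that rule is the small function below. It is a sketch, not the server's code: the function name is hypothetical, timeouts are given in seconds, and treating a missing or non-positive value as "no timeout" is an assumption of this example.

```python
from typing import Optional


def effective_timeout(
    client_seconds: Optional[int],
    server_seconds: Optional[int],
) -> Optional[int]:
    """Return the timeout that applies, or None if neither side sets one."""
    # Assumption: None, zero, or negative values mean "no timeout on this side".
    client = client_seconds if client_seconds and client_seconds > 0 else None
    server = server_seconds if server_seconds and server_seconds > 0 else None
    if client is None:
        return server
    if server is None:
        return client
    # Both sides set a timeout: the server-side value is the upper bound.
    return min(client, server)
```

With a server-side timeout of 60 seconds, a client asking for 300 seconds is modeled as being limited to 60, while a client asking for 10 seconds keeps its own shorter limit.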

Server#

Key Default Meaning Type Since
kyuubi.server.batch.limit.connections.per.ipaddress <undefined> Maximum kyuubi server batch connections per ipaddress. Any user exceeding this limit will not be allowed to connect. int 1.7.0
kyuubi.server.batch.limit.connections.per.user <undefined> Maximum kyuubi server batch connections per user. Any user exceeding this limit will not be allowed to connect. int 1.7.0
kyuubi.server.batch.limit.connections.per.user.ipaddress <undefined> Maximum kyuubi server batch connections per user:ipaddress combination. Any user-ipaddress exceeding this limit will not be allowed to connect. int 1.7.0
kyuubi.server.info.provider ENGINE The server information provider name, some clients may rely on this information to check the server compatibilities and functionalities.
  • SERVER: Return Kyuubi server information.
  • ENGINE: Return Kyuubi engine information.
string 1.6.1
    kyuubi.server.limit.connections.per.ipaddress <undefined> Maximum kyuubi server connections per ipaddress. Any user exceeding this limit will not be allowed to connect. int 1.6.0
    kyuubi.server.limit.connections.per.user <undefined> Maximum kyuubi server connections per user. Any user exceeding this limit will not be allowed to connect. int 1.6.0
    kyuubi.server.limit.connections.per.user.ipaddress <undefined> Maximum kyuubi server connections per user:ipaddress combination. Any user-ipaddress exceeding this limit will not be allowed to connect. int 1.6.0
    kyuubi.server.limit.connections.user.unlimited.list The maximum connections of the users in this list will not be limited. seq 1.7.0
    kyuubi.server.name <undefined> The name of Kyuubi Server. string 1.5.0
    kyuubi.server.redaction.regex <undefined> Regex to decide which Kyuubi properties contain sensitive information. When this regex matches a property key or value, the value is redacted from the various logs. 1.6.0
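
The effect of kyuubi.server.redaction.regex can be illustrated with a few lines of Python: whenever the pattern matches a property key or its value, the value is replaced before it is logged. This is a sketch of the described behavior, not the server's implementation; the function name, the example pattern, and the replacement text are all assumptions made for the illustration.

```python
import re
from typing import Dict

# Hypothetical placeholder text; the real server may print something else.
REDACTED = "*********(redacted)"


def redact(conf: Dict[str, str], pattern: str) -> Dict[str, str]:
    """Replace the value of every entry whose key or value matches pattern."""
    regex = re.compile(pattern)
    return {
        key: REDACTED if regex.search(key) or regex.search(value) else value
        for key, value in conf.items()
    }
```

For example, with the assumed pattern `(?i)password`, the value of kyuubi.metadata.store.jdbc.password would be hidden while kyuubi.server.name stays readable.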

    Session#

    Key Default Meaning Type Since
    kyuubi.session.check.interval PT5M The check interval for session timeout. duration 1.0.0
    kyuubi.session.conf.advisor <undefined> A config advisor plugin for Kyuubi Server. This plugin can provide some custom configs for different users or session configs and overwrite the session configs before opening a new session. This config value should be a subclass of org.apache.kyuubi.plugin.SessionConfAdvisor which has a zero-arg constructor. string 1.5.0
    kyuubi.session.conf.file.reload.interval PT10M When FileSessionConfAdvisor is used, this configuration defines the expired time of $KYUUBI_CONF_DIR/kyuubi-session-<profile>.conf in the cache. After exceeding this value, the file will be reloaded. duration 1.7.0
    kyuubi.session.conf.ignore.list A comma-separated list of ignored keys. If the client connection contains any of them, the key and the corresponding value will be removed silently during engine bootstrap and connection setup. Note that this rule is a server-side protection defined by administrators to prevent some essential configs from being tampered with, but it will not forbid users from setting dynamic configurations via SET syntax. seq 1.2.0
    kyuubi.session.conf.profile <undefined> Specify a profile to load session-level configurations from $KYUUBI_CONF_DIR/kyuubi-session-<profile>.conf. This configuration will be ignored if the file does not exist. This configuration only takes effect when kyuubi.session.conf.advisor is set as org.apache.kyuubi.session.FileSessionConfAdvisor. string 1.7.0
    kyuubi.session.conf.restrict.list A comma-separated list of restricted keys. If the client connection contains any of them, the connection will be rejected explicitly during engine bootstrap and connection setup. Note that this rule is a server-side protection defined by administrators to prevent some essential configs from being tampered with, but it will not forbid users from setting dynamic configurations via SET syntax. seq 1.2.0
    kyuubi.session.engine.alive.probe.enabled false Whether to enable the engine alive probe. If true, we will create a companion thrift client that keeps sending simple requests to check whether the engine is alive. boolean 1.6.0
    kyuubi.session.engine.alive.probe.interval PT10S The interval for engine alive probe. duration 1.6.0
    kyuubi.session.engine.alive.timeout PT2M The timeout for engine alive. If no alive probe succeeds within the last timeout window, the engine will be marked as not alive. duration 1.6.0
    kyuubi.session.engine.check.interval PT1M The check interval for engine timeout duration 1.0.0
    kyuubi.session.engine.flink.main.resource <undefined> The package used to create Flink SQL engine remote job. If it is undefined, Kyuubi will use the default string 1.4.0
    kyuubi.session.engine.flink.max.rows 1000000 Max rows of Flink query results. For batch queries, rows exceeding the limit would be ignored. For streaming queries, the query would be canceled if the limit is reached. int 1.5.0
    kyuubi.session.engine.hive.main.resource <undefined> The package used to create Hive engine remote job. If it is undefined, Kyuubi will use the default string 1.6.0
    kyuubi.session.engine.idle.timeout PT30M Engine idle timeout. The engine will self-terminate when it's not accessed for this duration. 0 or negative means not to self-terminate. duration 1.0.0
    kyuubi.session.engine.initialize.timeout PT3M Timeout for starting the background engine, e.g. SparkSQLEngine. duration 1.0.0
    kyuubi.session.engine.launch.async true When opening kyuubi session, whether to launch the backend engine asynchronously. When true, the Kyuubi server will set up the connection with the client without delay as the backend engine will be created asynchronously. boolean 1.4.0
    kyuubi.session.engine.log.timeout PT24H If we use Spark as the engine, then the session submit log is the console output of spark-submit. The session submit log is retained for this duration. duration 1.1.0
    kyuubi.session.engine.login.timeout PT15S The timeout of creating the connection to remote sql query engine duration 1.0.0
    kyuubi.session.engine.open.max.attempts 9 The number of times an open engine will retry when encountering a special error. int 1.7.0
    kyuubi.session.engine.open.retry.wait PT10S How long to wait before retrying to open the engine after failure. duration 1.7.0
    kyuubi.session.engine.share.level USER (deprecated) Use kyuubi.engine.share.level instead string 1.0.0
    kyuubi.session.engine.spark.main.resource <undefined> The package used to create Spark SQL engine remote application. If it is undefined, Kyuubi will use the default string 1.0.0
    kyuubi.session.engine.spark.max.lifetime PT0S Max lifetime for Spark engine, the engine will self-terminate when it reaches the end of life. 0 or negative means not to self-terminate. duration 1.6.0
    kyuubi.session.engine.spark.progress.timeFormat yyyy-MM-dd HH:mm:ss.SSS The time format of the progress bar string 1.6.0
    kyuubi.session.engine.spark.progress.update.interval PT1S Update period of progress bar. duration 1.6.0
    kyuubi.session.engine.spark.showProgress false When true, show the progress bar in the Spark's engine log. boolean 1.6.0
    kyuubi.session.engine.startup.error.max.size 8192 During engine bootstrapping, if an error occurs, this config limits the length of the error message (in characters). int 1.1.0
    kyuubi.session.engine.startup.maxLogLines 10 The maximum number of engine log lines when errors occur during the engine startup phase. Note that this config takes effect on the client side to help track engine startup issues. int 1.4.0
    kyuubi.session.engine.startup.waitCompletion true Whether to wait for completion after the engine starts. If false, the startup process will be destroyed after the engine is started. Note that only use it when the driver is not running locally, such as in yarn-cluster mode; Otherwise, the engine will be killed. boolean 1.5.0
    kyuubi.session.engine.trino.connection.catalog <undefined> The default catalog that Trino engine will connect to string 1.5.0
    kyuubi.session.engine.trino.connection.url <undefined> The server url that Trino engine will connect to string 1.5.0
    kyuubi.session.engine.trino.main.resource <undefined> The package used to create Trino engine remote job. If it is undefined, Kyuubi will use the default string 1.5.0
    kyuubi.session.engine.trino.showProgress true When true, show the progress bar and final info in the Trino engine log. boolean 1.6.0
    kyuubi.session.engine.trino.showProgress.debug false When true, show the progress debug info in the Trino engine log. boolean 1.6.0
    kyuubi.session.group.provider hadoop A group provider plugin for Kyuubi Server. This plugin can provide primary group and groups information for different users or session configs. This config value should be a subclass of org.apache.kyuubi.plugin.GroupProvider which has a zero-arg constructor. Kyuubi provides the following built-in implementations:
      • hadoop: delegate the user group mapping to hadoop UserGroupInformation.
    string 1.7.0
    kyuubi.session.idle.timeout PT6H Session idle timeout. The session will be closed when it's not accessed for this duration duration 1.2.0
    kyuubi.session.local.dir.allow.list The list of local directories that the Kyuubi session application is allowed to access. End-users might set parameters such as spark.files, which upload local files when launching the Kyuubi engine. If the local dir allow list is defined, Kyuubi will check whether the path to upload is in the allow list. Note that if it is empty, there is no limitation. Please use absolute paths. seq 1.6.0
    kyuubi.session.name <undefined> A human readable name of the session and we use empty string by default. This name will be recorded in the event. Note that, we only apply this value from session conf. string 1.4.0
    kyuubi.session.timeout PT6H (deprecated) Session timeout. The session will be closed when it's not accessed for this duration duration 1.0.0
    kyuubi.session.user.sign.enabled false Whether to verify the integrity of session user name on the engine side, e.g. Authz plugin in Spark. boolean 1.7.0
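
The difference between kyuubi.session.conf.ignore.list and kyuubi.session.conf.restrict.list is easy to miss: ignored keys are dropped silently, while restricted keys cause the connection to be rejected. The sketch below models only that difference. It is not the server's implementation, and the function name and the choice of exception type are assumptions of this example.

```python
from typing import Dict, Set


def apply_session_conf_rules(
    client_conf: Dict[str, str],
    ignore_list: Set[str],
    restrict_list: Set[str],
) -> Dict[str, str]:
    """Reject restricted keys, then silently drop ignored keys."""
    rejected = sorted(key for key in client_conf if key in restrict_list)
    if rejected:
        # Restricted keys: the connection is refused explicitly.
        raise PermissionError(f"restricted session configs: {rejected}")
    # Ignored keys: removed without any error.
    return {k: v for k, v in client_conf.items() if k not in ignore_list}
```

For example, with spark.master on the ignore list, a client that sends spark.master and spark.executor.memory connects successfully but only spark.executor.memory is kept; with spark.master on the restrict list, the same connection attempt is refused.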

    Spnego#

    Key Default Meaning Type Since
    kyuubi.spnego.keytab <undefined> Keytab file for SPNego principal string 1.6.0
    kyuubi.spnego.principal <undefined> SPNego service principal, typical value would look like HTTP/_HOST@EXAMPLE.COM. SPNego service principal would be used when restful Kerberos security is enabled. This needs to be set only if SPNEGO is to be used in authentication. string 1.6.0

    Zookeeper#

    Key Default Meaning Type Since
    kyuubi.zookeeper.embedded.client.port 2181 clientPort for the embedded ZooKeeper server to listen for client connections, a client here could be Kyuubi server, engine, and JDBC client int 1.2.0
    kyuubi.zookeeper.embedded.client.port.address <undefined> clientPortAddress for the embedded ZooKeeper server to string 1.2.0
    kyuubi.zookeeper.embedded.data.dir embedded_zookeeper dataDir for the embedded ZooKeeper server, where it stores the in-memory database snapshots and, unless specified otherwise, the transaction log of updates to the database. string 1.2.0
    kyuubi.zookeeper.embedded.data.log.dir embedded_zookeeper dataLogDir for the embedded ZooKeeper server, where it writes the transaction log. string 1.2.0
    kyuubi.zookeeper.embedded.directory embedded_zookeeper The temporary directory for the embedded ZooKeeper server string 1.0.0
    kyuubi.zookeeper.embedded.max.client.connections 120 maxClientCnxns for the embedded ZooKeeper server to limit the number of concurrent connections of a single client identified by IP address int 1.2.0
    kyuubi.zookeeper.embedded.max.session.timeout 60000 maxSessionTimeout in milliseconds that the embedded ZooKeeper server will allow the client to negotiate. Defaults to 20 times the tickTime int 1.2.0
    kyuubi.zookeeper.embedded.min.session.timeout 6000 minSessionTimeout in milliseconds that the embedded ZooKeeper server will allow the client to negotiate. Defaults to 2 times the tickTime int 1.2.0
    kyuubi.zookeeper.embedded.port 2181 The port of the embedded ZooKeeper server int 1.0.0
    kyuubi.zookeeper.embedded.tick.time 3000 tickTime in milliseconds for the embedded ZooKeeper server int 1.2.0
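
The embedded ZooKeeper defaults are related: the minimum and maximum session timeouts are 2 and 20 times the tickTime. The sketch below reproduces that arithmetic and shows how a requested session timeout ends up within those bounds during negotiation. The function names are hypothetical, and the code is a model for illustration rather than ZooKeeper's own logic.

```python
from typing import Tuple


def session_timeout_bounds(tick_time_ms: int = 3000) -> Tuple[int, int]:
    """Return (min, max) session timeout: 2x and 20x the tickTime."""
    return 2 * tick_time_ms, 20 * tick_time_ms


def negotiate_session_timeout(
    requested_ms: int,
    min_ms: int = 6000,
    max_ms: int = 60000,
) -> int:
    """Clamp the timeout requested by a client into [min_ms, max_ms]."""
    return max(min_ms, min(requested_ms, max_ms))
```

With the default tickTime of 3000, the bounds match the defaults listed in the table, 6000 and 60000.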

    Spark Configurations#

    Via spark-defaults.conf#

    Setting them in $SPARK_HOME/conf/spark-defaults.conf supplies default values for the SQL engine application. Available properties can be found in the official Spark online documentation for Spark Configurations.

    Via kyuubi-defaults.conf#

    Setting them in $KYUUBI_HOME/conf/kyuubi-defaults.conf also supplies default values for the SQL engine application. These properties override all settings in $SPARK_HOME/conf/spark-defaults.conf.
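
The override order between the two files can be modeled as a dictionary merge in which entries from kyuubi-defaults.conf win over entries from spark-defaults.conf. This is only a sketch of the precedence rule stated above; the function name is hypothetical and no file parsing is shown.

```python
from typing import Dict


def merge_defaults(
    spark_defaults: Dict[str, str],
    kyuubi_defaults: Dict[str, str],
) -> Dict[str, str]:
    """Merge both sources; kyuubi-defaults.conf entries take precedence."""
    merged = dict(spark_defaults)
    merged.update(kyuubi_defaults)
    return merged
```

If both files define spark.executor.memory, the merged result carries the value from kyuubi-defaults.conf, while keys that appear in only one file are kept unchanged.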

    Via JDBC Connection URL#

    Setting them in the JDBC Connection URL supplies session-specific values for each SQL engine. For example: jdbc:hive2://localhost:10009/default;#spark.sql.shuffle.partitions=2;spark.executor.memory=5g

    • Runtime SQL Configuration

    • Static SQL and Spark Core Configuration

      • For Static SQL Configurations and other spark core configs, e.g. spark.executor.memory, they will take effect if there is no existing SQL engine application. Otherwise, they will just be ignored
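
The shape of the connection URL in the example above can be assembled programmatically. The sketch below joins Spark settings with semicolons after the `;#` marker, following the example URL. The function name is hypothetical, and the code does not escape special characters in keys or values, so it is a starting point rather than a complete URL builder.

```python
from typing import Dict


def build_jdbc_url(
    host: str,
    port: int,
    database: str,
    spark_confs: Dict[str, str],
) -> str:
    """Build a jdbc:hive2 URL with session-specific Spark settings."""
    base = f"jdbc:hive2://{host}:{port}/{database}"
    if not spark_confs:
        return base
    # Settings follow the ';#' marker and are separated by ';'.
    conf_part = ";".join(f"{key}={value}" for key, value in spark_confs.items())
    return f"{base};#{conf_part}"
```

Called with localhost, 10009, default, and the two settings spark.sql.shuffle.partitions=2 and spark.executor.memory=5g, it produces the same URL as the example.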

    Via SET Syntax#

    Please refer to the Spark official online documentation for SET Command

    Logging#

    Kyuubi uses log4j for logging. You can configure it using $KYUUBI_HOME/conf/log4j2.xml.

    <?xml version="1.0" encoding="UTF-8"?>
    <!--
      ~ Licensed to the Apache Software Foundation (ASF) under one or more
      ~ contributor license agreements.  See the NOTICE file distributed with
      ~ this work for additional information regarding copyright ownership.
      ~ The ASF licenses this file to You under the Apache License, Version 2.0
      ~ (the "License"); you may not use this file except in compliance with
      ~ the License.  You may obtain a copy of the License at
      ~
      ~     http://www.apache.org/licenses/LICENSE-2.0
      ~
      ~ Unless required by applicable law or agreed to in writing, software
      ~ distributed under the License is distributed on an "AS IS" BASIS,
      ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      ~ See the License for the specific language governing permissions and
      ~ limitations under the License.
      -->
    
    <!-- Provide log4j2.xml.template to fix `ERROR Filters contains invalid attributes "onMatch", "onMismatch"`, see KYUUBI-2247 -->
    <!-- Extra logging related to initialization of Log4j.
     Set to debug or trace if log4j initialization is failing. -->
    <Configuration status="INFO">
        <Properties>
            <Property name="restAuditLogPath">rest-audit.log</Property>
            <Property name="restAuditLogFilePattern">rest-audit-%d{yyyy-MM-dd}-%i.log</Property>
        </Properties>
        <Appenders>
            <Console name="stdout" target="SYSTEM_OUT">
                <PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss.SSS} %p %c: %m%n"/>
                <Filters>
                    <RegexFilter regex=".*Thrift error occurred during processing of message.*" onMatch="DENY" onMismatch="NEUTRAL"/>
                </Filters>
            </Console>
            <RollingFile name="restAudit" fileName="${sys:restAuditLogPath}"
                         filePattern="${sys:restAuditLogFilePattern}">
                <PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss.SSS} %p %c{1}: %m%n"/>
                <Policies>
                    <SizeBasedTriggeringPolicy size="51200KB" />
                </Policies>
                <DefaultRolloverStrategy max="10"/>
            </RollingFile>
        </Appenders>
        <Loggers>
            <Root level="INFO">
                <AppenderRef ref="stdout"/>
            </Root>
            <Logger name="org.apache.kyuubi.ctl.ServiceControlCli" level="error" additivity="false">
                <AppenderRef ref="stdout"/>
            </Logger>
            <!--
            <Logger name="org.apache.kyuubi.server.mysql.codec" level="trace" additivity="false">
                <AppenderRef ref="stdout"/>
            </Logger>
            -->
            <Logger name="org.apache.hive.beeline.KyuubiBeeLine" level="error" additivity="false">
                <AppenderRef ref="stdout"/>
            </Logger>
            <Logger name="org.apache.kyuubi.server.http.authentication.AuthenticationAuditLogger" additivity="false">
                <AppenderRef ref="restAudit" />
            </Logger>
        </Loggers>
    </Configuration>
    

    Other Configurations#

    Hadoop Configurations#

    Set HADOOP_CONF_DIR to the directory containing the Hadoop configuration files, or specify the Hadoop settings as Spark properties with a spark.hadoop. prefix. Please refer to the Spark official online documentation for Inheriting Hadoop Cluster Configuration. Also, please refer to the Apache Hadoop online documentation for an overview of how to configure Hadoop.

    Hive Configurations#

    These configurations are used by the SQL engine application to talk to the Hive MetaStore and can be configured in a hive-site.xml. Place it in the $SPARK_HOME/conf directory, or specify the settings as Spark properties with a spark.hadoop. prefix.

    User Defaults#

    In Kyuubi, we can configure user default settings to meet the needs of individual users. These user defaults override system defaults, but are themselves overridden by settings from the JDBC Connection URL or the SET command where those apply. They take effect ONLY when the SQL engine application is created. User default settings take the form ___{username}___.{config key}: three consecutive underscores (_) on each side of the username, followed by a dot (.) that separates the prefix from the config key. For example:

    # For system defaults
    spark.master=local
    spark.sql.adaptive.enabled=true
    # For a user named kent
    ___kent___.spark.master=yarn
    ___kent___.spark.sql.adaptive.enabled=false
    # For a user named bob
    ___bob___.spark.master=spark://master:7077
    ___bob___.spark.executor.memory=8g
    

    In the above case, if there are no related configurations from the JDBC Connection URL, kent will run his SQL engine application on YARN with Spark AQE turned off, while bob will run his SQL engine application on a Spark standalone cluster with 8g of heap memory for each executor and follow the Kyuubi system default for Spark AQE. Users who do not have custom configurations will use the system defaults.
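
    The resolution rules described above can be sketched as follows. This is not Kyuubi's implementation: the function name, the regular expression, and the url_overrides parameter are assumptions made for illustration, and the input dictionary mirrors the example settings above.

```python
import re

# Matches keys of the form ___{username}___.{config key} (illustrative pattern).
USER_KEY = re.compile(r"^___(?P<user>.+?)___\.(?P<key>.+)$")


def resolve_for_user(conf, user, url_overrides=None):
    """Return the settings that would apply to `user` (hypothetical helper).

    Priority, lowest to highest: system defaults, user defaults, and then
    values supplied through the JDBC connection URL.
    """
    resolved = {}
    user_defaults = {}
    for key, value in conf.items():
        match = USER_KEY.match(key)
        if match is None:
            resolved[key] = value                     # system default
        elif match.group("user") == user:
            user_defaults[match.group("key")] = value  # this user's default
        # defaults belonging to other users are skipped
    resolved.update(user_defaults)
    resolved.update(url_overrides or {})
    return resolved


conf = {
    "spark.master": "local",
    "spark.sql.adaptive.enabled": "true",
    "___kent___.spark.master": "yarn",
    "___kent___.spark.sql.adaptive.enabled": "false",
    "___bob___.spark.master": "spark://master:7077",
    "___bob___.spark.executor.memory": "8g",
}

kent = resolve_for_user(conf, "kent")
bob = resolve_for_user(conf, "bob")
alice = resolve_for_user(conf, "alice")  # hypothetical user with no custom settings
print(kent, bob, alice, sep="\n")
```

    Passing a url_overrides dictionary models a setting from the JDBC Connection URL taking priority over the user default for the same key.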