Introduction to the Kyuubi Configurations System#
Kyuubi provides several ways to configure the system and corresponding engines.
Environments#
You can configure environment variables in $KYUUBI_HOME/conf/kyuubi-env.sh, e.g., JAVA_HOME; that Java runtime will then be used for both the Kyuubi server instance and the applications it launches. You can also change the variable in the subprocess's env configuration file, e.g., $SPARK_HOME/conf/spark-env.sh, to use a more specific environment for SQL engine applications.
#!/usr/bin/env bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
#
# - JAVA_HOME               Java runtime to use. By default use "java" from PATH.
#
#
# - KYUUBI_CONF_DIR         Directory containing the Kyuubi configurations to use.
#                           (Default: $KYUUBI_HOME/conf)
# - KYUUBI_LOG_DIR          Directory for Kyuubi server-side logs.
#                           (Default: $KYUUBI_HOME/logs)
# - KYUUBI_PID_DIR          Directory that stores the Kyuubi instance pid file.
#                           (Default: $KYUUBI_HOME/pid)
# - KYUUBI_MAX_LOG_FILES    Maximum number of log files the Kyuubi server rotates through.
#                           (Default: 5)
# - KYUUBI_JAVA_OPTS        JVM options for the Kyuubi server itself in the form "-Dx=y".
#                           (Default: none).
# - KYUUBI_CTL_JAVA_OPTS    JVM options for the Kyuubi ctl itself in the form "-Dx=y".
#                           (Default: none).
# - KYUUBI_BEELINE_OPTS     JVM options for the Kyuubi BeeLine in the form "-Dx=y".
#                           (Default: none)
# - KYUUBI_NICENESS         The scheduling priority for the Kyuubi server.
#                           (Default: 0)
# - KYUUBI_WORK_DIR_ROOT    Root directory for launching SQL engine applications.
#                           (Default: $KYUUBI_HOME/work)
# - HADOOP_CONF_DIR         Directory containing the Hadoop / YARN configuration to use.
# - YARN_CONF_DIR           Directory containing the YARN configuration to use.
#
# - SPARK_HOME              Spark distribution which you would like to use in Kyuubi.
# - SPARK_CONF_DIR          Optional directory where the Spark configuration lives.
#                           (Default: $SPARK_HOME/conf)
# - FLINK_HOME              Flink distribution which you would like to use in Kyuubi.
# - FLINK_CONF_DIR          Optional directory where the Flink configuration lives.
#                           (Default: $FLINK_HOME/conf)
# - FLINK_HADOOP_CLASSPATH  Required Hadoop jars when you use the Kyuubi Flink engine.
# - HIVE_HOME               Hive distribution which you would like to use in Kyuubi.
# - HIVE_CONF_DIR           Optional directory where the Hive configuration lives.
#                           (Default: $HIVE_HOME/conf)
# - HIVE_HADOOP_CLASSPATH   Required Hadoop jars when you use the Kyuubi Hive engine.
#
## Examples ##
# export JAVA_HOME=/usr/jdk64/jdk1.8.0_152
# export SPARK_HOME=/opt/spark
# export FLINK_HOME=/opt/flink
# export HIVE_HOME=/opt/hive
# export FLINK_HADOOP_CLASSPATH=/path/to/hadoop-client-runtime-3.3.2.jar:/path/to/hadoop-client-api-3.3.2.jar
# export HIVE_HADOOP_CLASSPATH=${HADOOP_HOME}/share/hadoop/common/lib/commons-collections-3.2.2.jar:${HADOOP_HOME}/share/hadoop/client/hadoop-client-runtime-3.1.0.jar:${HADOOP_HOME}/share/hadoop/client/hadoop-client-api-3.1.0.jar:${HADOOP_HOME}/share/hadoop/common/lib/htrace-core4-4.1.0-incubating.jar
# export HADOOP_CONF_DIR=/usr/ndp/current/mapreduce_client/conf
# export YARN_CONF_DIR=/usr/ndp/current/yarn/conf
# export KYUUBI_JAVA_OPTS="-Xmx10g -XX:+UnlockDiagnosticVMOptions -XX:ParGCCardsPerStrideChunk=4096 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -XX:+CMSClassUnloadingEnabled -XX:+CMSParallelRemarkEnabled -XX:+UseCondCardMark -XX:MaxDirectMemorySize=1024m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=./logs -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -Xloggc:./logs/kyuubi-server-gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=5M -XX:NewRatio=3 -XX:MetaspaceSize=512m"
# export KYUUBI_BEELINE_OPTS="-Xmx2g -XX:+UnlockDiagnosticVMOptions -XX:ParGCCardsPerStrideChunk=4096 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -XX:+CMSClassUnloadingEnabled -XX:+CMSParallelRemarkEnabled -XX:+UseCondCardMark"
For environment variables that only need to be passed to the engine side, you can set them with a Kyuubi configuration item of the form kyuubi.engineEnv.VAR_NAME. For example, with kyuubi.engineEnv.SPARK_DRIVER_MEMORY=4g, the environment variable SPARK_DRIVER_MEMORY with value 4g is passed to the engine side. With kyuubi.engineEnv.SPARK_CONF_DIR=/apache/confs/spark/conf, the value of SPARK_CONF_DIR on the engine side is set to /apache/confs/spark/conf.
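As a rough sketch of the kyuubi.engineEnv.VAR_NAME convention, the snippet below writes the two example entries to a scratch file standing in for kyuubi-defaults.conf and lists the variables an engine would receive. The scratch directory and the engine_env_vars helper are illustrative assumptions, not part of Kyuubi.

```shell
# Scratch directory standing in for $KYUUBI_HOME/conf (assumption for this sketch).
conf_dir="$(mktemp -d)"

cat > "$conf_dir/kyuubi-defaults.conf" <<'EOF'
kyuubi.engineEnv.SPARK_DRIVER_MEMORY=4g
kyuubi.engineEnv.SPARK_CONF_DIR=/apache/confs/spark/conf
EOF

# Hypothetical helper: print VAR=value for every kyuubi.engineEnv.* entry in file $1.
engine_env_vars() {
  sed -n 's/^kyuubi\.engineEnv\.\([A-Za-z_][A-Za-z0-9_]*\)[ =][ =]*\(.*\)$/\1=\2/p' "$1"
}

engine_env_vars "$conf_dir/kyuubi-defaults.conf"
```

The helper prints SPARK_DRIVER_MEMORY=4g and SPARK_CONF_DIR=/apache/confs/spark/conf, i.e. the names and values with the kyuubi.engineEnv. prefix removed.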
Kyuubi Configurations#
You can configure the Kyuubi properties in $KYUUBI_HOME/conf/kyuubi-defaults.conf. For example:
## Kyuubi Configurations
#
# kyuubi.authentication NONE
# kyuubi.frontend.bind.host localhost
# kyuubi.frontend.bind.port 10009
#
# Details in https://kyuubi.apache.org/docs/latest/deployment/settings.html
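The template above ships with every entry commented out. Below is a minimal sketch of an active file using the same three keys and values, together with a hypothetical get_conf helper for checking what a file sets. The helper and the scratch file are assumptions for illustration, and the helper only understands simple "key value" or "key=value" lines.

```shell
# Scratch file standing in for $KYUUBI_HOME/conf/kyuubi-defaults.conf (assumption).
defaults_file="$(mktemp)"

cat > "$defaults_file" <<'EOF'
## Kyuubi Configurations
kyuubi.authentication       NONE
kyuubi.frontend.bind.host   localhost
kyuubi.frontend.bind.port   10009
EOF

# Hypothetical helper: print the value of key $1 in file $2.
# Comment lines are skipped; if a key repeats, the last entry is printed.
get_conf() {
  awk -v k="$1" '
    /^[ \t]*#/ { next }
    {
      line = $0
      sub(/^[ \t]+/, "", line)
      n = match(line, /[ \t=]/)        # first separator after the key
      if (n == 0) next
      key = substr(line, 1, n - 1)
      val = substr(line, n + 1)
      sub(/^[ \t=]+/, "", val)         # drop remaining separator characters
      if (key == k) v = val
    }
    END { if (v != "") print v }
  ' "$2"
}

get_conf kyuubi.frontend.bind.port "$defaults_file"
```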
Authentication#
Key | Default | Meaning | Type | Since |
---|---|---|---|---|
kyuubi.authentication | NONE | A comma-separated list of client authentication types. | seq | 1.0.0 |
kyuubi.authentication.custom.class | <undefined> | User-defined authentication implementation of org.apache.kyuubi.service.authentication.PasswdAuthenticationProvider | string | 1.3.0 |
kyuubi.authentication.jdbc.driver.class | <undefined> | Driver class name for JDBC Authentication Provider. | string | 1.6.0 |
kyuubi.authentication.jdbc.password | <undefined> | Database password for JDBC Authentication Provider. | string | 1.6.0 |
kyuubi.authentication.jdbc.query | <undefined> | Query SQL template with placeholders for JDBC Authentication Provider to execute. Authentication passes if the result set is not empty. The SQL statement must start with the SELECT clause. Available placeholders are ${user} and ${password}. | string | 1.6.0 |
kyuubi.authentication.jdbc.url | <undefined> | JDBC URL for JDBC Authentication Provider. | string | 1.6.0 |
kyuubi.authentication.jdbc.user | <undefined> | Database user for JDBC Authentication Provider. | string | 1.6.0 |
kyuubi.authentication.ldap.base.dn | <undefined> | LDAP base DN. | string | 1.0.0 |
kyuubi.authentication.ldap.domain | <undefined> | LDAP domain. | string | 1.0.0 |
kyuubi.authentication.ldap.guidKey | uid | LDAP attribute name whose values are unique in this LDAP server. For example: uid or cn. | string | 1.2.0 |
kyuubi.authentication.ldap.url | <undefined> | SPACE character separated LDAP connection URL(s). | string | 1.0.0 |
kyuubi.authentication.sasl.qop | auth | SASL QOP enables higher levels of protection for Kyuubi communication with clients. | string | 1.0.0 |
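The kyuubi.authentication.jdbc.* keys are meant to be set together. The sketch below writes a hypothetical set of them to a scratch file. The JDBC authentication type name, the driver class, URL, credentials, and the auth_table schema are all assumptions for illustration. The query starts with SELECT and uses the ${user} and ${password} placeholders described in the table; the quoted heredoc delimiter keeps the shell from expanding them.

```shell
# Scratch file standing in for kyuubi-defaults.conf (assumption).
jdbc_auth_conf="$(mktemp)"

# Quoting 'EOF' keeps ${user} and ${password} literal in the file.
# All values below are made up for this sketch.
cat > "$jdbc_auth_conf" <<'EOF'
kyuubi.authentication=JDBC
kyuubi.authentication.jdbc.driver.class=com.mysql.cj.jdbc.Driver
kyuubi.authentication.jdbc.url=jdbc:mysql://db.example.com:3306/auth_db
kyuubi.authentication.jdbc.user=kyuubi_auth
kyuubi.authentication.jdbc.password=change-me
kyuubi.authentication.jdbc.query=SELECT 1 FROM auth_table WHERE username=${user} AND passwd=${password}
EOF

# Count the JDBC provider settings that were written.
grep -c '^kyuubi\.authentication\.jdbc\.' "$jdbc_auth_conf"
```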
Backend#
Key | Default | Meaning | Type | Since |
---|---|---|---|---|
kyuubi.backend.engine.exec.pool.keepalive.time | PT1M | Time(ms) that an idle async thread of the operation execution thread pool will wait for a new task to arrive before terminating in SQL engine applications | duration | 1.0.0 |
kyuubi.backend.engine.exec.pool.shutdown.timeout | PT10S | Timeout(ms) for the operation execution thread pool to terminate in SQL engine applications | duration | 1.0.0 |
kyuubi.backend.engine.exec.pool.size | 100 | Number of threads in the operation execution thread pool of SQL engine applications | int | 1.0.0 |
kyuubi.backend.engine.exec.pool.wait.queue.size | 100 | Size of the wait queue for the operation execution thread pool in SQL engine applications | int | 1.0.0 |
kyuubi.backend.server.event.json.log.path | file:///tmp/kyuubi/events | The location where server events go for the built-in JSON logger | string | 1.4.0 |
kyuubi.backend.server.event.loggers | | A comma-separated list of server history loggers, where session/operation etc. events go. | seq | 1.4.0 |
kyuubi.backend.server.exec.pool.keepalive.time | PT1M | Time(ms) that an idle async thread of the operation execution thread pool will wait for a new task to arrive before terminating in Kyuubi server | duration | 1.0.0 |
kyuubi.backend.server.exec.pool.shutdown.timeout | PT10S | Timeout(ms) for the operation execution thread pool to terminate in Kyuubi server | duration | 1.0.0 |
kyuubi.backend.server.exec.pool.size | 100 | Number of threads in the operation execution thread pool of Kyuubi server | int | 1.0.0 |
kyuubi.backend.server.exec.pool.wait.queue.size | 100 | Size of the wait queue for the operation execution thread pool of Kyuubi server | int | 1.0.0 |
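Defaults such as PT1M and PT10S in the duration rows appear to follow ISO-8601 duration notation (PT1M is one minute, PT10S is ten seconds). As an aid for reading them, here is a small sketch that converts the simple whole-number PT#H#M#S form to milliseconds. iso_duration_to_ms is a hypothetical helper, not a Kyuubi tool, and fractional values such as PT0.1S are deliberately not handled.

```shell
# Hypothetical helper: convert PT#H#M#S (whole numbers only) to milliseconds.
# The body runs in a subshell so its variables do not leak into the caller.
iso_duration_to_ms() (
  d="${1#PT}"
  total=0
  case "$d" in *H*) total=$(( total + ${d%%H*} * 3600000 )); d="${d#*H}" ;; esac
  case "$d" in *M*) total=$(( total + ${d%%M*} * 60000 ));   d="${d#*M}" ;; esac
  case "$d" in *S*) total=$(( total + ${d%%S*} * 1000 )) ;; esac
  echo "$total"
)

iso_duration_to_ms PT1M    # keepalive default
iso_duration_to_ms PT10S   # shutdown timeout default
```

The two calls print 60000 and 10000.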
Batch#
Key | Default | Meaning | Type | Since |
---|---|---|---|---|
kyuubi.batch.application.check.interval | PT5S | The interval to check batch job application information. | duration | 1.6.0 |
kyuubi.batch.conf.ignore.list | | A comma-separated list of ignored keys for batch conf. If the batch conf contains any of them, the key and the corresponding value will be removed silently during batch job submission. Note that this rule is a server-side protection defined by administrators to prevent essential configs from being tampered with. You can also pre-define some config for batch job submission with the prefix kyuubi.batchConf.[batchType]. For example, you can pre-define spark.master for Spark batch jobs with the key kyuubi.batchConf.spark.spark.master. | seq | 1.6.0 |
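A sketch of the kyuubi.batchConf.[batchType] prefix described above, using the table's own spark.master example. The value yarn, the scratch file, and the batch_defaults helper are assumptions for illustration.

```shell
# Scratch file standing in for kyuubi-defaults.conf (assumption).
batch_conf="$(mktemp)"

cat > "$batch_conf" <<'EOF'
kyuubi.batchConf.spark.spark.master=yarn
EOF

# Hypothetical helper: list the pre-defined configs for batch type $1 in file $2,
# with the kyuubi.batchConf.<batchType>. prefix stripped.
batch_defaults() {
  sed -n "s/^kyuubi\.batchConf\.$1\.//p" "$2"
}

batch_defaults spark "$batch_conf"
```

The helper prints spark.master=yarn, showing how the doubled "spark.spark." in the key splits into the batch type and the Spark property name.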
Credentials#
Key | Default | Meaning | Type | Since |
---|---|---|---|---|
kyuubi.credentials.check.interval | PT5M | The interval to check the expiration of cached credentials. | duration | 1.6.0 |
kyuubi.credentials.hadoopfs.enabled | true | Whether to renew Hadoop filesystem delegation tokens | boolean | 1.4.0 |
kyuubi.credentials.hadoopfs.uris | | Extra Hadoop filesystem URIs for which to request delegation tokens. The filesystem that hosts fs.defaultFS does not need to be listed here. | seq | 1.4.0 |
kyuubi.credentials.hive.enabled | true | Whether to renew Hive metastore delegation token | boolean | 1.4.0 |
kyuubi.credentials.idle.timeout | PT6H | Inactive users' credentials will expire after the configured timeout | duration | 1.6.0 |
kyuubi.credentials.renewal.interval | PT1H | How often Kyuubi renews one user's delegation tokens | duration | 1.4.0 |
kyuubi.credentials.renewal.retry.wait | PT1M | How long to wait before retrying to fetch new credentials after a failure. | duration | 1.4.0 |
kyuubi.credentials.update.wait.timeout | PT1M | How long to wait until credentials are ready. | duration | 1.5.0 |
Ctl#
Key | Default | Meaning | Type | Since |
---|---|---|---|---|
kyuubi.ctl.batch.log.query.interval | PT3S | The interval for fetching batch logs. | duration | 1.6.0 |
kyuubi.ctl.rest.auth.schema | basic | The authentication schema. Valid values are: basic, spnego. | string | 1.6.0 |
kyuubi.ctl.rest.base.url | <undefined> | The REST API base URL, which contains the scheme (http:// or https://), host name, port number | string | 1.6.0 |
kyuubi.ctl.rest.connect.timeout | PT30S | The timeout (ms) for establishing the connection with the Kyuubi server. A timeout value of zero is interpreted as an infinite timeout. | duration | 1.6.0 |
kyuubi.ctl.rest.request.attempt.wait | PT3S | How long to wait between attempts of ctl rest request. | duration | 1.6.0 |
kyuubi.ctl.rest.request.max.attempts | 3 | The max attempts number for ctl rest request. | int | 1.6.0 |
kyuubi.ctl.rest.socket.timeout | PT2M | The timeout (ms) for waiting for data packets after the connection is established. A timeout value of zero is interpreted as an infinite timeout. | duration | 1.6.0 |
kyuubi.ctl.rest.spnego.host | <undefined> | The SPNEGO host, which must be configured when the auth schema is spnego. | string | 1.6.0 |
Delegation#
Key | Default | Meaning | Type | Since |
---|---|---|---|---|
kyuubi.delegation.key.update.interval | PT24H | Not used yet | duration | 1.0.0 |
kyuubi.delegation.token.gc.interval | PT1H | Not used yet | duration | 1.0.0 |
kyuubi.delegation.token.max.lifetime | PT168H | Not used yet | duration | 1.0.0 |
kyuubi.delegation.token.renew.interval | PT168H | Not used yet | duration | 1.0.0 |
Engine#
Key | Default | Meaning | Type | Since |
---|---|---|---|---|
kyuubi.engine.connection.url.use.hostname | true | (deprecated) When true, the engine registers with its hostname in zookeeper. When Spark runs on k8s in cluster mode, set this to false to ensure that the server can connect to the engine | boolean | 1.3.0 |
kyuubi.engine.deregister.exception.classes | | A comma-separated list of exception classes. If any exception is thrown whose class matches the specified classes, the engine will deregister itself. | seq | 1.2.0 |
kyuubi.engine.deregister.exception.messages | | A comma-separated list of exception messages. If any exception is thrown whose message or stacktrace matches the specified message list, the engine will deregister itself. | seq | 1.2.0 |
kyuubi.engine.deregister.exception.ttl | PT30M | Time to live (TTL) for the exception patterns specified in kyuubi.engine.deregister.exception.classes and kyuubi.engine.deregister.exception.messages to deregister engines. Once the total error count hits kyuubi.engine.deregister.job.max.failures within the TTL, an engine will deregister itself and wait to self-terminate. Otherwise, we assume that the engine has recovered from temporary failures. | duration | 1.2.0 |
kyuubi.engine.deregister.job.max.failures | 4 | Number of failures of job before deregistering the engine. | int | 1.2.0 |
kyuubi.engine.event.json.log.path | file:///tmp/kyuubi/events | The location where all the engine events go for the built-in JSON logger. | string | 1.3.0 |
kyuubi.engine.event.loggers | SPARK | A comma-separated list of engine history loggers, where engine/session/operation etc. events go. We use the Spark logger by default. | seq | 1.3.0 |
kyuubi.engine.flink.extra.classpath | <undefined> | The extra classpath for the flink sql engine, for configuring location of hadoop client jars, etc | string | 1.6.0 |
kyuubi.engine.flink.java.options | <undefined> | The extra java options for the flink sql engine | string | 1.6.0 |
kyuubi.engine.flink.memory | 1g | The heap memory for the flink sql engine | string | 1.6.0 |
kyuubi.engine.hive.extra.classpath | <undefined> | The extra classpath for the hive query engine, for configuring location of hadoop client jars, etc | string | 1.6.0 |
kyuubi.engine.hive.java.options | <undefined> | The extra java options for the hive query engine | string | 1.6.0 |
kyuubi.engine.hive.memory | 1g | The heap memory for the hive query engine | string | 1.6.0 |
kyuubi.engine.initialize.sql | SHOW DATABASES | Semicolon-separated list of SQL statements to be run in the newly created engine before queries, e.g., use SHOW DATABASES to eagerly activate HiveClient. This configuration cannot be used in the JDBC URL due to a limitation of the Beeline/JDBC driver. | seq | 1.2.0 |
kyuubi.engine.jdbc.connection.password | <undefined> | The password is used for connecting to server | string | 1.6.0 |
kyuubi.engine.jdbc.connection.properties | | The additional properties used for connecting to the server | seq | 1.6.0 |
kyuubi.engine.jdbc.connection.provider | <undefined> | The connection provider is used for getting a connection from server | string | 1.6.0 |
kyuubi.engine.jdbc.connection.url | <undefined> | The server url that engine will connect to | string | 1.6.0 |
kyuubi.engine.jdbc.connection.user | <undefined> | The user is used for connecting to server | string | 1.6.0 |
kyuubi.engine.jdbc.driver.class | <undefined> | The driver class for jdbc engine connection | string | 1.6.0 |
kyuubi.engine.jdbc.extra.classpath | <undefined> | The extra classpath for the jdbc query engine, for configuring location of jdbc driver, etc | string | 1.6.0 |
kyuubi.engine.jdbc.java.options | <undefined> | The extra java options for the jdbc query engine | string | 1.6.0 |
kyuubi.engine.jdbc.memory | 1g | The heap memory for the jdbc query engine | string | 1.6.0 |
kyuubi.engine.jdbc.type | <undefined> | The short name of jdbc type | string | 1.6.0 |
kyuubi.engine.operation.convert.catalog.database.enabled | true | When set to true, the engine converts the JDBC methods of set/get Catalog and set/get Schema to the implementation of different engines | boolean | 1.6.0 |
kyuubi.engine.operation.log.dir.root | engine_operation_logs | Root directory for query operation log at engine-side. | string | 1.4.0 |
kyuubi.engine.pool.name | engine-pool | The name of engine pool. | string | 1.5.0 |
kyuubi.engine.pool.size | -1 | The size of engine pool. Note that, if the size is less than 1, the engine pool will not be enabled; otherwise, the size of the engine pool will be min(this, kyuubi.engine.pool.size.threshold). | int | 1.4.0 |
kyuubi.engine.pool.size.threshold | 9 | This parameter is introduced as a server-side parameter, and controls the upper limit of the engine pool. | int | 1.4.0 |
kyuubi.engine.session.initialize.sql | | Semicolon-separated list of SQL statements to be run in the newly created engine session before queries. This configuration cannot be used in the JDBC URL due to a limitation of the Beeline/JDBC driver. | seq | 1.3.0 |
kyuubi.engine.share.level | USER | Engines will be shared at different levels. | string | 1.2.0 |
kyuubi.engine.share.level.sub.domain | <undefined> | (deprecated) Use kyuubi.engine.share.level.subdomain instead | string | 1.2.0 |
kyuubi.engine.share.level.subdomain | <undefined> | Allow end-users to create a subdomain for the share level of an engine. A subdomain is a case-insensitive string value that must be a valid zookeeper sub path. For example, for the USER share level, an end-user can share a certain engine within a subdomain, not for all of its clients. End-users are free to create multiple engines in the USER share level. When the engine pool is disabled, 'default' is used if absent. | string | 1.4.0 |
kyuubi.engine.single.spark.session | false | When set to true, this engine is running in a single session mode. All the JDBC/ODBC connections share the temporary views, function registries, SQL configuration and the current database. | boolean | 1.3.0 |
kyuubi.engine.trino.extra.classpath | <undefined> | The extra classpath for the trino query engine, for configuring other libs which may be needed by the trino engine | string | 1.6.0 |
kyuubi.engine.trino.java.options | <undefined> | The extra java options for the trino query engine | string | 1.6.0 |
kyuubi.engine.trino.memory | 1g | The heap memory for the trino query engine | string | 1.6.0 |
kyuubi.engine.type | SPARK_SQL | Specifies the engine type supported by Kyuubi. The engine type is bound to the SESSION scope. This configuration is experimental. | string | 1.4.0 |
kyuubi.engine.ui.retainedSessions | 200 | The number of SQL client sessions kept in the Kyuubi Query Engine web UI. | int | 1.4.0 |
kyuubi.engine.ui.retainedStatements | 200 | The number of statements kept in the Kyuubi Query Engine web UI. | int | 1.4.0 |
kyuubi.engine.ui.stop.enabled | true | When true, allows Kyuubi engine to be killed from the Spark Web UI. | boolean | 1.3.0 |
kyuubi.engine.user.isolated.spark.session | true | When set to false, if the engine is running in a group or server share level, all the JDBC/ODBC connections will be isolated per user, including the temporary views, function registries, SQL configuration and the current database. Note that it has no effect if the share level is connection or user. | boolean | 1.6.0 |
kyuubi.engine.user.isolated.spark.session.idle.interval | PT1M | The interval to check whether the user-isolated Spark session has timed out. | duration | 1.6.0 |
kyuubi.engine.user.isolated.spark.session.idle.timeout | PT6H | If kyuubi.engine.user.isolated.spark.session is false, we will release the Spark session if its corresponding user is inactive after this configured timeout. | duration | 1.6.0 |
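The rule given for kyuubi.engine.pool.size can be sketched as follows: values below 1 leave the pool disabled, and anything else is capped by kyuubi.engine.pool.size.threshold. effective_pool_size is a hypothetical helper that mirrors the table's description rather than Kyuubi's actual code, and it prints 0 to mean "pool disabled".

```shell
# Hypothetical helper: $1 = kyuubi.engine.pool.size, $2 = kyuubi.engine.pool.size.threshold.
# Prints 0 when the pool is disabled, otherwise min(size, threshold).
effective_pool_size() (
  size="$1"
  threshold="$2"
  if [ "$size" -lt 1 ]; then
    echo 0
  elif [ "$size" -lt "$threshold" ]; then
    echo "$size"
  else
    echo "$threshold"
  fi
)

effective_pool_size -1 9   # default size with default threshold
effective_pool_size 4 9    # below the threshold
effective_pool_size 20 9   # capped by the threshold
```

The three calls print 0, 4 and 9 respectively.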
Frontend#
Key | Default | Meaning | Type | Since |
---|---|---|---|---|
kyuubi.frontend.backoff.slot.length | PT0.1S | (deprecated) Time to back off during login to the thrift frontend service. | duration | 1.0.0 |
kyuubi.frontend.bind.host | <undefined> | (deprecated) Hostname or IP of the machine on which to run the thrift frontend service via binary protocol. | string | 1.0.0 |
kyuubi.frontend.bind.port | 10009 | (deprecated) Port of the machine on which to run the thrift frontend service via binary protocol. | int | 1.0.0 |
kyuubi.frontend.connection.url.use.hostname | true | When true, frontend services prefer hostname, otherwise, ip address | boolean | 1.5.0 |
kyuubi.frontend.login.timeout | PT20S | (deprecated) Timeout for Thrift clients during login to the thrift frontend service. | duration | 1.0.0 |
kyuubi.frontend.max.message.size | 104857600 | (deprecated) Maximum message size in bytes a Kyuubi server will accept. | int | 1.0.0 |
kyuubi.frontend.max.worker.threads | 999 | (deprecated) Maximum number of threads in the frontend worker thread pool for the thrift frontend service | int | 1.0.0 |
kyuubi.frontend.min.worker.threads | 9 | (deprecated) Minimum number of threads in the frontend worker thread pool for the thrift frontend service | int | 1.0.0 |
kyuubi.frontend.mysql.bind.host | <undefined> | Hostname or IP of the machine on which to run the MySQL frontend service. | string | 1.4.0 |
kyuubi.frontend.mysql.bind.port | 3309 | Port of the machine on which to run the MySQL frontend service. | int | 1.4.0 |
kyuubi.frontend.mysql.max.worker.threads | 999 | Maximum number of threads in the command execution thread pool for the MySQL frontend service | int | 1.4.0 |
kyuubi.frontend.mysql.min.worker.threads | 9 | Minimum number of threads in the command execution thread pool for the MySQL frontend service | int | 1.4.0 |
kyuubi.frontend.mysql.netty.worker.threads | <undefined> | Number of threads in the netty worker event loop of the MySQL frontend service. Uses min(cpu_cores, 8) by default. | int | 1.4.0 |
kyuubi.frontend.mysql.worker.keepalive.time | PT1M | Time(ms) that an idle async thread of the command execution thread pool will wait for a new task to arrive before terminating in MySQL frontend service | duration | 1.4.0 |
kyuubi.frontend.protocols | THRIFT_BINARY | A comma-separated list of all frontend protocols | seq | 1.4.0 |
kyuubi.frontend.proxy.http.client.ip.header | X-Real-IP | The HTTP header used to record the real client IP address. If your server is behind a load balancer or other proxy, the server sees the load balancer or proxy IP address as the client IP address. To get around this common issue, most load balancers and proxies can record the real remote IP address in an HTTP header that is added to the request for other devices to use. Note that because the header value can be set to any IP address, it is not used for authentication. | string | 1.6.0 |
kyuubi.frontend.rest.bind.host | <undefined> | Hostname or IP of the machine on which to run the REST frontend service. | string | 1.4.0 |
kyuubi.frontend.rest.bind.port | 10099 | Port of the machine on which to run the REST frontend service. | int | 1.4.0 |
kyuubi.frontend.thrift.backoff.slot.length | PT0.1S | Time to back off during login to the thrift frontend service. | duration | 1.4.0 |
kyuubi.frontend.thrift.binary.bind.host | <undefined> | Hostname or IP of the machine on which to run the thrift frontend service via binary protocol. | string | 1.4.0 |
kyuubi.frontend.thrift.binary.bind.port | 10009 | Port of the machine on which to run the thrift frontend service via binary protocol. | int | 1.4.0 |
kyuubi.frontend.thrift.http.allow.user.substitution | true | Allow alternate user to be specified as part of open connection request when using HTTP transport mode. | boolean | 1.6.0 |
kyuubi.frontend.thrift.http.bind.host | <undefined> | Hostname or IP of the machine on which to run the thrift frontend service via http protocol. | string | 1.6.0 |
kyuubi.frontend.thrift.http.bind.port | 10010 | Port of the machine on which to run the thrift frontend service via http protocol. | int | 1.6.0 |
kyuubi.frontend.thrift.http.compression.enabled | true | Enable thrift http compression via Jetty compression support | boolean | 1.6.0 |
kyuubi.frontend.thrift.http.cookie.auth.enabled | true | When true, Kyuubi in HTTP transport mode, will use cookie based authentication mechanism | boolean | 1.6.0 |
kyuubi.frontend.thrift.http.cookie.domain | <undefined> | Domain for the Kyuubi generated cookies | string | 1.6.0 |
kyuubi.frontend.thrift.http.cookie.is.httponly | true | HttpOnly attribute of the Kyuubi generated cookie. | boolean | 1.6.0 |
kyuubi.frontend.thrift.http.cookie.max.age | 86400 | Maximum age in seconds for server side cookie used by Kyuubi in HTTP mode. | int | 1.6.0 |
kyuubi.frontend.thrift.http.cookie.path | <undefined> | Path for the Kyuubi generated cookies | string | 1.6.0 |
kyuubi.frontend.thrift.http.max.idle.time | PT30M | Maximum idle time for a connection on the server when in HTTP mode. | duration | 1.6.0 |
kyuubi.frontend.thrift.http.path | cliservice | Path component of URL endpoint when in HTTP mode. | string | 1.6.0 |
kyuubi.frontend.thrift.http.request.header.size | 6144 | Request header size in bytes, when using HTTP transport mode. Jetty defaults used. | int | 1.6.0 |
kyuubi.frontend.thrift.http.response.header.size | 6144 | Response header size in bytes, when using HTTP transport mode. Jetty defaults used. | int | 1.6.0 |
kyuubi.frontend.thrift.http.ssl.keystore.password | <undefined> | SSL certificate keystore password. | string | 1.6.0 |
kyuubi.frontend.thrift.http.ssl.keystore.path | <undefined> | SSL certificate keystore location. | string | 1.6.0 |
kyuubi.frontend.thrift.http.ssl.protocol.blacklist | SSLv2,SSLv3 | SSL Versions to disable when using HTTP transport mode. | string | 1.6.0 |
kyuubi.frontend.thrift.http.use.SSL | false | Set this to true for using SSL encryption in http mode. | boolean | 1.6.0 |
kyuubi.frontend.thrift.http.xsrf.filter.enabled | false | If enabled, Kyuubi will block any requests made to it over http if an X-XSRF-HEADER header is not present | boolean | 1.6.0 |
kyuubi.frontend.thrift.login.timeout | PT20S | Timeout for Thrift clients during login to the thrift frontend service. | duration | 1.4.0 |
kyuubi.frontend.thrift.max.message.size | 104857600 | Maximum message size in bytes a Kyuubi server will accept. | int | 1.4.0 |
kyuubi.frontend.thrift.max.worker.threads | 999 | Maximum number of threads in the frontend worker thread pool for the thrift frontend service | int | 1.4.0 |
kyuubi.frontend.thrift.min.worker.threads | 9 | Minimum number of threads in the frontend worker thread pool for the thrift frontend service | int | 1.4.0 |
kyuubi.frontend.thrift.worker.keepalive.time | PT1M | Keep-alive time (in milliseconds) for an idle worker thread | duration | 1.4.0 |
kyuubi.frontend.worker.keepalive.time | PT1M | (deprecated) Keep-alive time (in milliseconds) for an idle worker thread | duration | 1.0.0 |
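Several kyuubi.frontend.* keys are marked deprecated and have kyuubi.frontend.thrift.* counterparts with the same description and default. The sketch below rewrites an existing file to the newer names for the binary bind host and port. The one-to-one mapping is an assumption inferred from the matching rows in the table, and the scratch file and the migrate_frontend_keys helper are hypothetical.

```shell
# Scratch file standing in for an existing kyuubi-defaults.conf (assumption).
old_conf="$(mktemp)"

cat > "$old_conf" <<'EOF'
kyuubi.frontend.bind.host localhost
kyuubi.frontend.bind.port 10009
kyuubi.frontend.rest.bind.port 10099
EOF

# Hypothetical helper: rename only the two deprecated keys in file $1;
# every other line passes through unchanged.
migrate_frontend_keys() {
  sed -e 's/^kyuubi\.frontend\.bind\.host\([ =]\)/kyuubi.frontend.thrift.binary.bind.host\1/' \
      -e 's/^kyuubi\.frontend\.bind\.port\([ =]\)/kyuubi.frontend.thrift.binary.bind.port\1/' "$1"
}

migrate_frontend_keys "$old_conf"
```

The REST port line is left as it is because its key does not start with kyuubi.frontend.bind.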
Ha#
Key | Default | Meaning | Type | Since |
---|---|---|---|---|
kyuubi.ha.addresses | | The connection string for the discovery ensemble | string | 1.6.0 |
kyuubi.ha.client.class | org.apache.kyuubi.ha.client.zookeeper.ZookeeperDiscoveryClient | Class name for the service discovery client. | string | 1.6.0 |
kyuubi.ha.etcd.lease.timeout | PT10S | Timeout for the etcd keep-alive lease. The Kyuubi server will learn of the unexpected loss of an engine after up to this many seconds. | duration | 1.6.0 |
kyuubi.ha.etcd.ssl.ca.path | <undefined> | Where the etcd CA certificate file is stored. | string | 1.6.0 |
kyuubi.ha.etcd.ssl.client.certificate.path | <undefined> | Where the etcd SSL certificate file is stored. | string | 1.6.0 |
kyuubi.ha.etcd.ssl.client.key.path | <undefined> | Where the etcd SSL key file is stored. | string | 1.6.0 |
kyuubi.ha.etcd.ssl.enabled | false | When set to true, will build a ssl secured etcd client. | boolean | 1.6.0 |
kyuubi.ha.namespace | kyuubi | The root directory for the service to deploy its instance uri | string | 1.6.0 |
kyuubi.ha.zookeeper.acl.enabled | false | Set to true if the zookeeper ensemble is kerberized | boolean | 1.0.0 |
kyuubi.ha.zookeeper.auth.digest | <undefined> | The digest auth string used for zookeeper authentication, like: username:password. | string | 1.3.2 |
kyuubi.ha.zookeeper.auth.keytab | <undefined> | Location of the Kyuubi server's keytab used for zookeeper authentication. | string | 1.3.2 |
kyuubi.ha.zookeeper.auth.principal | <undefined> | Name of the Kerberos principal used for zookeeper authentication. | string | 1.3.2 |
kyuubi.ha.zookeeper.auth.type | NONE | The type of zookeeper authentication. | string | 1.3.2 |
kyuubi.ha.zookeeper.connection.base.retry.wait | 1000 | Initial amount of time to wait between retries to the zookeeper ensemble | int | 1.0.0 |
kyuubi.ha.zookeeper.connection.max.retries | 3 | Max retry times for connecting to the zookeeper ensemble | int | 1.0.0 |
kyuubi.ha.zookeeper.connection.max.retry.wait | 30000 | The maximum wait time between retries that the BOUNDED_EXPONENTIAL_BACKOFF policy can reach, or the maximum elapsed time for the UNTIL_ELAPSED policy, when connecting to the zookeeper ensemble | int | 1.0.0 |
kyuubi.ha.zookeeper.connection.retry.policy | EXPONENTIAL_BACKOFF | The retry policy for connecting to the zookeeper ensemble. | string | 1.0.0 |
kyuubi.ha.zookeeper.connection.timeout | 15000 | The timeout(ms) of creating the connection to the zookeeper ensemble | int | 1.0.0 |
kyuubi.ha.zookeeper.engine.auth.type | NONE | The type of zookeeper authentication for the engine. | string | 1.3.2 |
kyuubi.ha.zookeeper.namespace | kyuubi | (deprecated) The root directory for the service to deploy its instance uri | string | 1.0.0 |
kyuubi.ha.zookeeper.node.creation.timeout | PT2M | Timeout for creating zookeeper node | duration | 1.2.0 |
kyuubi.ha.zookeeper.publish.configs | false | When set to true, publish Kerberos configs to Zookeeper. Note that the Hive driver needs to be greater than 1.3 or 2.0 or apply HIVE-11581 patch. | boolean | 1.4.0 |
kyuubi.ha.zookeeper.quorum | | (deprecated) The connection string for the zookeeper ensemble | string | 1.0.0 |
kyuubi.ha.zookeeper.session.timeout | 60000 | The timeout (ms) for an idle connected session | int | 1.0.0 |
Kinit#
Key | Default | Meaning | Type | Since |
---|---|---|---|---|
kyuubi.kinit.interval | PT1H | How often the Kyuubi server will run kinit -kt [keytab] [principal] to renew the local Kerberos credentials cache | duration | 1.0.0 |
kyuubi.kinit.keytab | <undefined> | Location of Kyuubi server's keytab. | string | 1.0.0 |
kyuubi.kinit.max.attempts | 10 | How many times the kinit process will retry | int | 1.0.0 |
kyuubi.kinit.principal | <undefined> | Name of the Kerberos principal. | string | 1.0.0 |
Kubernetes#
Key | Default | Meaning | Type | Since |
---|---|---|---|---|
kyuubi.kubernetes.context | <undefined> | The desired context from your kubernetes config file used to configure the K8S client for interacting with the cluster. | string | 1.6.0 |
Metadata#
Key | Default | Meaning | Type | Since |
---|---|---|---|---|
kyuubi.metadata.cleaner.enabled | true | Whether to clean the metadata periodically. If it is enabled, Kyuubi will clean metadata that is in a terminated state and older than the max age limit. | boolean | 1.6.0 |
kyuubi.metadata.cleaner.interval | PT30M | The interval to check and clean expired metadata. | duration | 1.6.0 |
kyuubi.metadata.max.age | PT72H | The maximum age of metadata, the metadata that exceeds the age will be cleaned. | duration | 1.6.0 |
kyuubi.metadata.recovery.threads | 10 | The number of threads for recovery from metadata store when Kyuubi server restarting. | int | 1.6.0 |
kyuubi.metadata.request.retry.interval | PT5S | The interval to check and trigger the metadata request retry tasks. | duration | 1.6.0 |
kyuubi.metadata.request.retry.queue.size | 65536 | The maximum queue size for buffering metadata requests in memory when the external metadata storage is down. Requests will be dropped if the queue exceeds. | int | 1.6.0 |
kyuubi.metadata.request.retry.threads | 10 | Number of threads in the metadata request retry manager thread pool. The metadata store might be unavailable sometimes, causing requests to fail. To tolerate this case and unblock the main thread, failed requests are retried asynchronously. | int | 1.6.0 |
kyuubi.metadata.store.class | org.apache.kyuubi.server.metadata.jdbc.JDBCMetadataStore | Fully qualified class name for server metadata store. | string | 1.6.0 |
kyuubi.metadata.store.jdbc.database.schema.init | true | Whether to init the jdbc metadata store database schema. | boolean | 1.6.0 |
kyuubi.metadata.store.jdbc.database.type | DERBY | The database type for server jdbc metadata store. | string | 1.6.0 |
kyuubi.metadata.store.jdbc.driver | <undefined> | JDBC driver class name for server jdbc metadata store. | string | 1.6.0 |
kyuubi.metadata.store.jdbc.password | | The password for server jdbc metadata store. | string | 1.6.0 |
kyuubi.metadata.store.jdbc.url | jdbc:derby:memory:kyuubi_state_store_db;create=true | The jdbc url for server jdbc metadata store. By default, it is a DERBY in-memory database url, and the state information is not shared across kyuubi instances. To make multiple kyuubi instances highly available, please specify a production jdbc url. | string | 1.6.0 |
kyuubi.metadata.store.jdbc.user | | The username for server jdbc metadata store. | string | 1.6.0 |
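
Values of type `duration` in these tables, such as `PT30M` or `PT72H`, use the ISO-8601 duration notation. The following Python sketch shows how such values can be read and how a max-age check like the one described for `kyuubi.metadata.max.age` could work. It handles only the hours/minutes/seconds subset seen in this document, and the helper names `parse_duration` and `is_expired` are hypothetical, not part of Kyuubi.

```python
import re
from datetime import datetime, timedelta

# Matches the time-only ISO-8601 subset used in the tables, e.g. PT5S, PT30M, PT72H.
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?$")


def parse_duration(text):
    """Convert a value such as 'PT72H' into a timedelta."""
    match = _DURATION.match(text)
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = match.groups()
    return timedelta(
        hours=int(hours or 0),
        minutes=int(minutes or 0),
        seconds=float(seconds or 0),
    )


def is_expired(end_time, now, max_age="PT72H"):
    """True when a finished record is older than the configured max age."""
    return now - end_time > parse_duration(max_age)


print(parse_duration("PT30M"))
print(is_expired(datetime(2022, 1, 1), datetime(2022, 1, 5)))
```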
Metrics#
Key | Default | Meaning | Type | Since |
---|---|---|---|---|
kyuubi.metrics.console.interval | PT5S | How often should report metrics to console | duration | 1.2.0 |
kyuubi.metrics.enabled | true | Set to true to enable kyuubi metrics system | boolean | 1.2.0 |
kyuubi.metrics.json.interval | PT5S | How often should report metrics to json file | duration | 1.2.0 |
kyuubi.metrics.json.location | metrics | Where the json metrics file located | string | 1.2.0 |
kyuubi.metrics.prometheus.path | /metrics | URI context path of prometheus metrics HTTP server | string | 1.2.0 |
kyuubi.metrics.prometheus.port | 10019 | Prometheus metrics HTTP server port | int | 1.2.0 |
kyuubi.metrics.reporters | JSON | A comma separated list for all metrics reporters | seq | 1.2.0 |
kyuubi.metrics.slf4j.interval | PT5S | How often should report metrics to SLF4J logger | duration | 1.2.0 |
Operation#
Key | Default | Meaning | Type | Since |
---|---|---|---|---|
kyuubi.operation.idle.timeout | PT3H | Operation will be closed when it's not accessed for this duration of time | duration | 1.0.0 |
kyuubi.operation.interrupt.on.cancel | true | When true, all running tasks will be interrupted if one cancels a query. When false, all running tasks will remain until finished. | boolean | 1.2.0 |
kyuubi.operation.language | SQL | Choose a programming language for the following inputs | string | 1.5.0 |
kyuubi.operation.log.dir.root | server_operation_logs | Root directory for query operation log at server-side. | string | 1.4.0 |
kyuubi.operation.plan.only.excludes | ResetCommand,SetCommand,SetNamespaceCommand,UseStatement,SetCatalogAndNamespace | Comma-separated list of query plan names, in the form of simple class names, e.g. for set abc=xyz, the value will be SetCommand. Auxiliary plans such as switching databases, setting properties, or creating temporary views, which set up the evaluation environment for analyzing actual queries, can be excluded with this config so that they take effect. See also kyuubi.operation.plan.only.mode. | seq | 1.5.0 |
kyuubi.operation.plan.only.mode | NONE | Whether to perform the statement in a PARSE, ANALYZE, OPTIMIZE, PHYSICAL, EXECUTION only way without executing the query. When it is NONE, the statement will be fully executed | string | 1.4.0 |
kyuubi.operation.progress.enabled | false | Whether to enable the operation progress. When true, the operation progress will be returned in GetOperationStatus. | boolean | 1.6.0 |
kyuubi.operation.query.timeout | <undefined> | Timeout for query executions at server-side, which takes effect together with the client-side timeout (java.sql.Statement.setQueryTimeout); a running query will be cancelled automatically on timeout. It's off by default, which means only the client side controls whether the query should time out. If set, the client-side timeout is capped at this value. To cancel queries right away without waiting for tasks to finish, consider enabling kyuubi.operation.interrupt.on.cancel as well. | duration | 1.2.0 |
kyuubi.operation.result.max.rows | 0 | Max rows of Spark query results. Rows that exceed the limit are ignored. Set this value to 0 to disable the max rows limit. | int | 1.6.0 |
kyuubi.operation.scheduler.pool | <undefined> | The scheduler pool of job. Note that this config should be used after changing the Spark config to spark.scheduler.mode=FAIR. | string | 1.1.1 |
kyuubi.operation.spark.listener.enabled | true | When set to true, Spark engine registers a SQLOperationListener before executing the statement, logs a few summary statistics when each stage completes. | boolean | 1.6.0 |
kyuubi.operation.status.polling.timeout | PT5S | Timeout(ms) for long polling asynchronous running sql query's status | duration | 1.0.0 |
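
The description of `kyuubi.operation.query.timeout` says the server-side value caps the client-side timeout. A minimal Python sketch of that rule follows. It is an illustration under the assumption that "not set" is represented by `None` on the server side and by `0` on the client side; the function name is hypothetical and this is not Kyuubi's own code.

```python
def effective_query_timeout(server_timeout_s=None, client_timeout_s=0):
    """Combine both timeouts (in seconds); a result of 0 means no timeout applies."""
    client_set = client_timeout_s is not None and client_timeout_s > 0
    if server_timeout_s is None:
        # Server-side timeout is off: only the client decides.
        return client_timeout_s if client_set else 0
    if client_set:
        # Both are set: the client value is capped by the server value.
        return min(server_timeout_s, client_timeout_s)
    return server_timeout_s


print(effective_query_timeout(server_timeout_s=60, client_timeout_s=120))
```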
Server#
Key | Default | Meaning | Type | Since |
---|---|---|---|---|
kyuubi.server.limit.connections.per.ipaddress | <undefined> | Maximum kyuubi server connections per ipaddress. Any user exceeding this limit will not be allowed to connect. | int | 1.6.0 |
kyuubi.server.limit.connections.per.user | <undefined> | Maximum kyuubi server connections per user. Any user exceeding this limit will not be allowed to connect. | int | 1.6.0 |
kyuubi.server.limit.connections.per.user.ipaddress | <undefined> | Maximum kyuubi server connections per user:ipaddress combination. Any user-ipaddress exceeding this limit will not be allowed to connect. | int | 1.6.0 |
kyuubi.server.name | <undefined> | The name of Kyuubi Server. | string | 1.5.0 |
kyuubi.server.redaction.regex | <undefined> | Regex to decide which Kyuubi configuration properties contain sensitive information. When this regex matches a property key or value, the value is redacted from the various logs. | | 1.6.0 |
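
As a rough illustration of what `kyuubi.server.redaction.regex` describes, the Python sketch below replaces the value of any entry whose key or value matches a pattern. The pattern, the replacement text and the function name are assumptions chosen for this example; the exact matching semantics and replacement string used by Kyuubi are not specified in this document.

```python
import re


def redact(conf, pattern, replacement="(redacted)"):
    """Return a copy of conf where values of matching entries are hidden."""
    regex = re.compile(pattern)
    return {
        key: replacement if regex.search(key) or regex.search(str(value)) else value
        for key, value in conf.items()
    }


conf = {
    "kyuubi.metadata.store.jdbc.password": "hunter2",  # example value only
    "kyuubi.server.name": "demo",
}
print(redact(conf, r"(?i)secret|password|token"))
```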
Session#
Key | Default | Meaning | Type | Since |
---|---|---|---|---|
kyuubi.session.check.interval | PT5M | The check interval for session timeout. | duration | 1.0.0 |
kyuubi.session.conf.advisor | <undefined> | A config advisor plugin for Kyuubi Server. This plugin can provide some custom configs for different user or session configs and overwrite the session configs before open a new session. This config value should be a class which is a child of 'org.apache.kyuubi.plugin.SessionConfAdvisor' which has zero-arg constructor. | string | 1.5.0 |
kyuubi.session.conf.ignore.list | | A comma separated list of ignored keys. If the client connection contains any of them, the key and the corresponding value will be removed silently during engine bootstrap and connection setup. Note that this rule is for server-side protection defined via administrators to prevent some essential configs from tampering but will not forbid users to set dynamic configurations via SET syntax. | seq | 1.2.0 |
kyuubi.session.conf.restrict.list | | A comma separated list of restricted keys. If the client connection contains any of them, the connection will be rejected explicitly during engine bootstrap and connection setup. Note that this rule is for server-side protection defined via administrators to prevent some essential configs from tampering but will not forbid users to set dynamic configurations via SET syntax. | seq | 1.2.0 |
kyuubi.session.engine.alive.probe.enabled | false | Whether to enable the engine alive probe. If true, we will create a companion thrift client that sends simple requests to check whether the engine is still alive. | boolean | 1.6.0 |
kyuubi.session.engine.alive.probe.interval | PT10S | The interval for engine alive probe. | duration | 1.6.0 |
kyuubi.session.engine.alive.timeout | PT2M | The timeout for engine alive. If no alive probe succeeds within the last timeout window, the engine will be marked as not alive. | duration | 1.6.0 |
kyuubi.session.engine.check.interval | PT1M | The check interval for engine timeout | duration | 1.0.0 |
kyuubi.session.engine.flink.main.resource | <undefined> | The package used to create Flink SQL engine remote job. If it is undefined, Kyuubi will use the default | string | 1.4.0 |
kyuubi.session.engine.flink.max.rows | 1000000 | Max rows of Flink query results. For batch queries, rows that exceeds the limit would be ignored. For streaming queries, the query would be canceled if the limit is reached. | int | 1.5.0 |
kyuubi.session.engine.hive.main.resource | <undefined> | The package used to create Hive engine remote job. If it is undefined, Kyuubi will use the default | string | 1.6.0 |
kyuubi.session.engine.idle.timeout | PT30M | engine timeout, the engine will self-terminate when it's not accessed for this duration. 0 or negative means not to self-terminate. | duration | 1.0.0 |
kyuubi.session.engine.initialize.timeout | PT3M | Timeout for starting the background engine, e.g. SparkSQLEngine. | duration | 1.0.0 |
kyuubi.session.engine.launch.async | true | When opening kyuubi session, whether to launch backend engine asynchronously. When true, the Kyuubi server will set up the connection with the client without delay as the backend engine will be created asynchronously. | boolean | 1.4.0 |
kyuubi.session.engine.log.timeout | PT24H | If we use Spark as the engine then the session submit log is the console output of spark-submit. We will retain the session submit log until over the config value. | duration | 1.1.0 |
kyuubi.session.engine.login.timeout | PT15S | The timeout of creating the connection to remote sql query engine | duration | 1.0.0 |
kyuubi.session.engine.share.level | USER | (deprecated) - Using kyuubi.engine.share.level instead | string | 1.0.0 |
kyuubi.session.engine.spark.main.resource | <undefined> | The package used to create Spark SQL engine remote application. If it is undefined, Kyuubi will use the default | string | 1.0.0 |
kyuubi.session.engine.spark.max.lifetime | PT0S | Max lifetime for spark engine, the engine will self-terminate when it reaches the end of life. 0 or negative means not to self-terminate. | duration | 1.6.0 |
kyuubi.session.engine.spark.progress.timeFormat | yyyy-MM-dd HH:mm:ss.SSS | The time format of the progress bar | string | 1.6.0 |
kyuubi.session.engine.spark.progress.update.interval | PT1S | Update period of progress bar. | duration | 1.6.0 |
kyuubi.session.engine.spark.showProgress | false | When true, show the progress bar in the spark engine log. | boolean | 1.6.0 |
kyuubi.session.engine.startup.error.max.size | 8192 | If an error occurs during engine bootstrapping, this config limits the length of the error message (in characters). | int | 1.1.0 |
kyuubi.session.engine.startup.maxLogLines | 10 | The maximum number of engine log lines when errors occur during engine startup phase. Note that this max lines is for client-side to help track engine startup issue. | int | 1.4.0 |
kyuubi.session.engine.startup.waitCompletion | true | Whether to wait for completion after engine starts. If false, the startup process will be destroyed after the engine is started. Note that only use it when the driver is not running locally, such as yarn-cluster mode; Otherwise, the engine will be killed. | boolean | 1.5.0 |
kyuubi.session.engine.trino.connection.catalog | <undefined> | The default catalog that trino engine will connect to | string | 1.5.0 |
kyuubi.session.engine.trino.connection.url | <undefined> | The server url that trino engine will connect to | string | 1.5.0 |
kyuubi.session.engine.trino.main.resource | <undefined> | The package used to create Trino engine remote job. If it is undefined, Kyuubi will use the default | string | 1.5.0 |
kyuubi.session.engine.trino.showProgress | true | When true, show the progress bar and final info in the trino engine log. | boolean | 1.6.0 |
kyuubi.session.engine.trino.showProgress.debug | false | When true, show the progress debug info in the trino engine log. | boolean | 1.6.0 |
kyuubi.session.idle.timeout | PT6H | session idle timeout, it will be closed when it's not accessed for this duration | duration | 1.2.0 |
kyuubi.session.local.dir.allow.list | | The local dir list that are allowed to access by the kyuubi session application. User might set some parameters such as spark.files and it will upload some local files when launching the kyuubi engine, if the local dir allow list is defined, kyuubi will check whether the path to upload is in the allow list. Note that, if it is empty, there is no limitation for that and please use absolute path list. | seq | 1.6.0 |
kyuubi.session.name | <undefined> | A human readable name of session and we use empty string by default. This name will be recorded in event. Note that, we only apply this value from session conf. | string | 1.4.0 |
kyuubi.session.timeout | PT6H | (deprecated) session timeout, it will be closed when it's not accessed for this duration | duration | 1.0.0 |
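
The difference between `kyuubi.session.conf.ignore.list` and `kyuubi.session.conf.restrict.list` is easy to miss: ignored keys are dropped silently, while restricted keys cause the connection to be rejected. The Python sketch below mirrors that description. The function name and the use of `PermissionError` are assumptions made for illustration, not Kyuubi's implementation.

```python
def apply_session_conf_rules(client_conf, ignore_list=(), restrict_list=()):
    """Filter the configs sent by a client according to the two lists."""
    restricted = sorted(key for key in client_conf if key in restrict_list)
    if restricted:
        # A restricted key rejects the whole connection explicitly.
        raise PermissionError(f"restricted configs: {', '.join(restricted)}")
    # Ignored keys are removed without any error.
    return {k: v for k, v in client_conf.items() if k not in ignore_list}


client_conf = {"spark.master": "local", "spark.sql.shuffle.partitions": "2"}
print(apply_session_conf_rules(client_conf, ignore_list=["spark.master"]))
```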
Spnego#
Key | Default | Meaning | Type | Since |
---|---|---|---|---|
kyuubi.spnego.keytab | <undefined> | Keytab file for SPNego principal | string | 1.6.0 |
kyuubi.spnego.principal | <undefined> | SPNego service principal, typical value would look like HTTP/_HOST@EXAMPLE.COM. SPNego service principal would be used when restful Kerberos security is enabled. This needs to be set only if SPNEGO is to be used in authentication. | string | 1.6.0 |
Zookeeper#
Key | Default | Meaning | Type | Since |
---|---|---|---|---|
kyuubi.zookeeper.embedded.client.port | 2181 | clientPort for the embedded zookeeper server to listen for client connections, a client here could be Kyuubi server, engine and JDBC client | int | 1.2.0 |
kyuubi.zookeeper.embedded.client.port.address | <undefined> | clientPortAddress for the embedded zookeeper server to | string | 1.2.0 |
kyuubi.zookeeper.embedded.data.dir | embedded_zookeeper | dataDir for the embedded zookeeper server, where it stores the in-memory database snapshots and, unless specified otherwise, the transaction log of updates to the database. | string | 1.2.0 |
kyuubi.zookeeper.embedded.data.log.dir | embedded_zookeeper | dataLogDir for the embedded zookeeper server, where it writes the transaction log. | string | 1.2.0 |
kyuubi.zookeeper.embedded.directory | embedded_zookeeper | The temporary directory for the embedded zookeeper server | string | 1.0.0 |
kyuubi.zookeeper.embedded.max.client.connections | 120 | maxClientCnxns for the embedded zookeeper server, which limits the number of concurrent connections of a single client identified by IP address | int | 1.2.0 |
kyuubi.zookeeper.embedded.max.session.timeout | 60000 | maxSessionTimeout in milliseconds that the embedded zookeeper server will allow the client to negotiate. Defaults to 20 times the tickTime | int | 1.2.0 |
kyuubi.zookeeper.embedded.min.session.timeout | 6000 | minSessionTimeout in milliseconds that the embedded zookeeper server will allow the client to negotiate. Defaults to 2 times the tickTime | int | 1.2.0 |
kyuubi.zookeeper.embedded.port | 2181 | The port of the embedded zookeeper server | int | 1.0.0 |
kyuubi.zookeeper.embedded.tick.time | 3000 | tickTime in milliseconds for the embedded zookeeper server | int | 1.2.0 |
Spark Configurations#
Via spark-defaults.conf#
Properties set in $SPARK_HOME/conf/spark-defaults.conf supply default values for the SQL engine application. Available properties are listed in the official Spark online documentation for Spark Configurations.
Via kyuubi-defaults.conf#
Properties set in $KYUUBI_HOME/conf/kyuubi-defaults.conf also supply default values for the SQL engine application. These properties override all settings in $SPARK_HOME/conf/spark-defaults.conf.
Via JDBC Connection URL#
Properties set in the JDBC connection URL supply session-specific settings for each SQL engine. For example: jdbc:hive2://localhost:10009/default;#spark.sql.shuffle.partitions=2;spark.executor.memory=5g
Runtime SQL Configuration
Runtime SQL Configurations take effect every time.
Static SQL and Spark Core Configuration
Static SQL Configurations and other Spark core configs, e.g. spark.executor.memory, take effect only if there is no existing SQL engine application. Otherwise, they are ignored.
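
A connection URL of the shape shown above can be assembled programmatically. The Python sketch below only builds the string: configs go after the `#` and are separated by `;`. The helper name is hypothetical, and values are not escaped, so it assumes keys and values contain no reserved characters.

```python
def build_jdbc_url(host, port, database, engine_confs=None):
    """Build a jdbc:hive2 URL with engine configs appended after '#'."""
    base = f"jdbc:hive2://{host}:{port}/{database}"
    if not engine_confs:
        return base
    conf_part = ";".join(f"{key}={value}" for key, value in engine_confs.items())
    return f"{base};#{conf_part}"


url = build_jdbc_url(
    "localhost",
    10009,
    "default",
    {"spark.sql.shuffle.partitions": 2, "spark.executor.memory": "5g"},
)
print(url)
```

Here `spark.sql.shuffle.partitions` is a runtime SQL configuration and applies on every connection, whereas `spark.executor.memory` only has an effect when this connection causes a new SQL engine application to be launched.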
Via SET Syntax#
Please refer to the Spark official online documentation for SET Command
Flink Configurations#
Via flink-conf.yaml#
Properties set in $FLINK_HOME/conf/flink-conf.yaml supply default values for the SQL engine application. Available properties are listed in the official Flink online documentation for Flink Configurations.
Via kyuubi-defaults.conf#
Properties set in $KYUUBI_HOME/conf/kyuubi-defaults.conf also supply default values for the SQL engine application. You can use properties with the additional prefix flink. to override settings in $FLINK_HOME/conf/flink-conf.yaml.
For example:
flink.parallelism.default 2
flink.taskmanager.memory.process.size 5g
The options above in kyuubi-defaults.conf set parallelism.default: 2 and taskmanager.memory.process.size: 5g in the Flink configuration.
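
The prefix handling described above can be pictured with a short Python sketch: entries starting with `flink.` are selected and the prefix is removed before they are handed to Flink. This is an illustration only; the function name is hypothetical and the real implementation may differ in detail.

```python
def flink_overrides(kyuubi_defaults):
    """Pick the 'flink.'-prefixed entries and strip the prefix."""
    prefix = "flink."
    return {
        key[len(prefix):]: value
        for key, value in kyuubi_defaults.items()
        if key.startswith(prefix)
    }


kyuubi_defaults = {
    "flink.parallelism.default": "2",
    "flink.taskmanager.memory.process.size": "5g",
    "kyuubi.session.idle.timeout": "PT6H",  # not a Flink override, left out
}
print(flink_overrides(kyuubi_defaults))
```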
Via JDBC Connection URL#
Properties set in the JDBC connection URL supply session-specific settings for each SQL engine. For example: jdbc:hive2://localhost:10009/default;#parallelism.default=2;taskmanager.memory.process.size=5g
Via SET Statements#
Please refer to the Flink official online documentation for SET Statements
Logging#
Kyuubi uses log4j for logging. You can configure it using $KYUUBI_HOME/conf/log4j2.xml.
<?xml version="1.0" encoding="UTF-8"?>
<!--
~ Licensed to the Apache Software Foundation (ASF) under one or more
~ contributor license agreements. See the NOTICE file distributed with
~ this work for additional information regarding copyright ownership.
~ The ASF licenses this file to You under the Apache License, Version 2.0
~ (the "License"); you may not use this file except in compliance with
~ the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing, software
~ distributed under the License is distributed on an "AS IS" BASIS,
~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~ See the License for the specific language governing permissions and
~ limitations under the License.
-->
<!-- Provide log4j2.xml.template to fix `ERROR Filters contains invalid attributes "onMatch", "onMismatch"`, see KYUUBI-2247 -->
<!-- Extra logging related to initialization of Log4j.
Set to debug or trace if log4j initialization is failing. -->
<Configuration status="INFO">
<Appenders>
<Console name="stdout" target="SYSTEM_OUT">
<PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss.SSS} %p %c: %m%n"/>
<Filters>
<RegexFilter regex=".*Thrift error occurred during processing of message.*" onMatch="DENY" onMismatch="NEUTRAL"/>
</Filters>
</Console>
</Appenders>
<Loggers>
<Root level="INFO">
<AppenderRef ref="stdout"/>
</Root>
<Logger name="org.apache.kyuubi.ctl.ServiceControlCli" level="error" additivity="false">
<AppenderRef ref="stdout"/>
</Logger>
<!--
<Logger name="org.apache.kyuubi.server.mysql.codec" level="trace" additivity="false">
<AppenderRef ref="stdout"/>
</Logger>
-->
<Logger name="org.apache.hive.beeline.KyuubiBeeLine" level="error" additivity="false">
<AppenderRef ref="stdout"/>
</Logger>
</Loggers>
</Configuration>
Other Configurations#
Hadoop Configurations#
Hadoop configurations can be supplied either by setting HADOOP_CONF_DIR to the directory that contains the Hadoop configuration files, or by treating them as Spark properties with a spark.hadoop. prefix. Please refer to the official Spark online documentation for Inheriting Hadoop Cluster Configuration. Also, please refer to Apache Hadoop's online documentation for an overview of how to configure Hadoop.
Hive Configurations#
These configurations are used by the SQL engine application to talk to the Hive MetaStore and can be configured in a hive-site.xml. Place it in the $SPARK_HOME/conf directory, or treat the settings as Spark properties with a spark.hadoop. prefix.
User Defaults#
In Kyuubi, we can configure user default settings to meet separate needs. These user defaults override system defaults, but are themselves overridden by settings from the JDBC connection URL or the SET command where applicable. They take effect ONLY when the SQL engine application is created.
User default settings are in the form of ___{username}___.{config key}. There are three consecutive underscores (_) on both sides of the username, and a dot (.) separates the prefix from the config key. For example:
# For system defaults
spark.master=local
spark.sql.adaptive.enabled=true
# For a user named kent
___kent___.spark.master=yarn
___kent___.spark.sql.adaptive.enabled=false
# For a user named bob
___bob___.spark.master=spark://master:7077
___bob___.spark.executor.memory=8g
In the above case, if there are no related configurations from the JDBC connection URL, kent will run his SQL engine application on YARN with Spark AQE turned off, while bob will run his SQL engine application on a Spark standalone cluster with 8g of heap memory for each executor and follow the Kyuubi system default for Spark AQE. Users who do not have custom configurations use the system defaults.
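
The precedence described in this section (system defaults, then user defaults, then session settings such as those from the JDBC connection URL) can be sketched in a few lines of Python. The function name and the dictionary representation are assumptions made for illustration; this is not how Kyuubi stores or resolves its configuration internally.

```python
def resolve_defaults(all_confs, user, session_confs=None):
    """Merge system defaults, one user's defaults and session configs, in that order."""
    prefix = f"___{user}___."
    # System defaults: every key that carries no user prefix.
    resolved = {k: v for k, v in all_confs.items() if not k.startswith("___")}
    # User defaults override system defaults.
    resolved.update(
        {k[len(prefix):]: v for k, v in all_confs.items() if k.startswith(prefix)}
    )
    # Session configs (e.g. from the JDBC connection URL) override both.
    resolved.update(session_confs or {})
    return resolved


confs = {
    "spark.master": "local",
    "spark.sql.adaptive.enabled": "true",
    "___kent___.spark.master": "yarn",
    "___kent___.spark.sql.adaptive.enabled": "false",
    "___bob___.spark.master": "spark://master:7077",
    "___bob___.spark.executor.memory": "8g",
}
print(resolve_defaults(confs, "kent"))
print(resolve_defaults(confs, "bob"))
```

A user without any prefixed entries simply receives the system defaults, and a session-level value for the same key would replace the user default.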