Kyuubi AuthZ Plugin For Spark SQL#
Security is one of the fundamental features for enterprise adoption of Kyuubi. When Kyuubi is deployed against secured clusters, storage-based authorization is enabled by default, but it provides only coarse-grained, file-level authorization. When fine-grained row- or column-level access control is required, the Kyuubi Spark AuthZ plugin can enhance the data access model.
The Plugin Itself#
The Kyuubi Spark AuthZ plugin provides general-purpose ACL management for data and metadata in Spark SQL. It does not have to be deployed with the Kyuubi server and engine; it can be used as an extension for any Spark SQL job. However, authorization always requires a robust authentication layer and multi-tenancy support, so Kyuubi is a perfect match.
Restrict security configuration#
End users can bypass or disable the AuthZ plugin through Spark's configuration. For example, they can query files directly or exclude the authorization rule from the optimizer:
select * from parquet.`/path/to/table`
set spark.sql.optimizer.excludedRules=org.apache.kyuubi.plugin.spark.authz.ranger.RuleAuthorization
To enhance the security of production environments, Kyuubi provides a mechanism to ban changes to security-related configurations.
Note
For how to modify the Spark engine configurations, refer to the Spark Configurations documentation.
Restrict session level config#
You can set kyuubi.session.conf.ignore.list and kyuubi.session.conf.restrict.list on the server side to prevent clients from changing session-level configurations. For example:
kyuubi.session.conf.ignore.list spark.driver.memory,spark.sql.optimizer.excludedRules
kyuubi.session.conf.restrict.list spark.driver.memory,spark.sql.optimizer.excludedRules
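The difference between the two lists can be modeled with a small sketch. It assumes the semantics described in the Kyuubi configuration reference as we understand them: keys on the ignore list are removed silently from the client's session configuration, while keys on the restrict list cause the connection to be rejected. The names `filter_session_conf` and `RestrictedConfError` are hypothetical and exist only for illustration; this is not Kyuubi's implementation.

```python
def parse_list(value):
    """Split a comma-separated config value into a set of keys."""
    return {key.strip() for key in value.split(",") if key.strip()}


class RestrictedConfError(Exception):
    """Hypothetical error raised when a client sends a restricted key."""


def filter_session_conf(client_conf, ignore_list, restrict_list):
    """Model server-side handling of session-level configs (assumed semantics).

    A key on the restrict list rejects the whole connection;
    a key on the ignore list is dropped without an error.
    """
    restricted = parse_list(restrict_list)
    ignored = parse_list(ignore_list)

    banned = sorted(key for key in client_conf if key in restricted)
    if banned:
        raise RestrictedConfError("restricted keys: " + ", ".join(banned))

    # Keep only the keys that are not on the ignore list.
    return {k: v for k, v in client_conf.items() if k not in ignored}
```

Under these assumptions, a client that sends spark.driver.memory while it is on the ignore list still connects, but the value never reaches the engine. The same key on the restrict list would fail the connection instead.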
Restrict operation level config#
You can set spark.kyuubi.conf.restricted.list on the engine side to prevent changes to operation-level configurations. Keys in the restricted list cannot be changed dynamically via the SET syntax. For example:
spark.kyuubi.conf.restricted.list spark.sql.adaptive.enabled,spark.sql.adaptive.skewJoin.enabled
Note
The configs spark.sql.runSQLOnFiles and spark.sql.extensions are in the engine's restricted list by default.
A SET statement whose key is spark.sql.optimizer.excludedRules and whose value contains org.apache.kyuubi.plugin.spark.authz.ranger.* is also rejected.
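The engine-side checks described above can be summarized as a decision function. This is a minimal sketch of the stated rules only, assuming a SET statement has already been parsed into a key and a value; `is_set_allowed` and the constant names are hypothetical and do not correspond to Kyuubi's code.

```python
# Keys that the text above says are restricted by default.
DEFAULT_RESTRICTED = {"spark.sql.runSQLOnFiles", "spark.sql.extensions"}

EXCLUDED_RULES_KEY = "spark.sql.optimizer.excludedRules"
AUTHZ_RULE_PREFIX = "org.apache.kyuubi.plugin.spark.authz.ranger."


def is_set_allowed(key, value, restricted_list=""):
    """Return True if `SET key=value` would be permitted under the rules above.

    `restricted_list` models the comma-separated value of
    spark.kyuubi.conf.restricted.list.
    """
    extra = {k.strip() for k in restricted_list.split(",") if k.strip()}
    if key in DEFAULT_RESTRICTED | extra:
        return False
    # Excluding any AuthZ rule from the optimizer is refused,
    # even when the key itself is not on the restricted list.
    if key == EXCLUDED_RULES_KEY and AUTHZ_RULE_PREFIX in value:
        return False
    return True
```

In this model, excluding an unrelated optimizer rule through spark.sql.optimizer.excludedRules stays possible, while any value that names a rule from the AuthZ package is refused.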