spark.executor.instances should default to spark.dynamicAllocation.minExecutors for Spark Dynamic configurations

Description

Spark Static mode uses spark.executor.instances to allocate a fixed number of executors. Dynamic allocation instead uses three configuration properties to control the number of executors: spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors, and spark.dynamicAllocation.maxExecutors.

spark.dynamicAllocation.initialExecutors is overridden whenever spark.executor.instances is set to a larger value (see https://spark.apache.org/docs/latest/configuration.html). This can produce an effective initial executor count that is greater than spark.dynamicAllocation.maxExecutors, which fails the job.
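A minimal sketch of how the override produces an invalid combination, assuming a plain SparkSession setup (the app name and property values are illustrative):

    import org.apache.spark.sql.SparkSession

    // Hypothetical values: a stale spark.executor.instances=3 left over from
    // Static mode overrides spark.dynamicAllocation.initialExecutors, so the
    // effective initial executor count (3) exceeds maxExecutors (1).
    val spark = SparkSession.builder()
      .appName("dynamic-allocation-conflict")
      .config("spark.dynamicAllocation.enabled", "true")
      .config("spark.dynamicAllocation.minExecutors", "1")
      .config("spark.dynamicAllocation.initialExecutors", "1")
      .config("spark.dynamicAllocation.maxExecutors", "1")
      .config("spark.executor.instances", "3") // wins because it is larger
      .getOrCreate()
    // Per the behavior described above, the initial target now exceeds
    // maxExecutors and the application fails at startup.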

Currently, when you configure a job, the JobConfiguration JSON is shared in the background (sparkConfigCtrl.js) across all modes (Experiment, Parallel Experiments, Spark Static, etc.). This means you can set spark.executor.instances=3 in Spark Static mode and then switch to Experiment mode, where spark.dynamicAllocation.maxExecutors is 1. Running with this configuration will fail the job.
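A hypothetical sketch of the merged state the shared configuration can end up holding after such a mode switch (the key names mirror the Spark properties; the actual JobConfiguration schema may differ):

    // Hypothetical merged state after switching from Spark Static to Experiment:
    // the static executor count survives alongside the dynamic allocation bounds.
    val sharedJobConfig: Map[String, String] = Map(
      "spark.executor.instances"             -> "3",    // set in Spark Static mode
      "spark.dynamicAllocation.enabled"      -> "true",
      "spark.dynamicAllocation.minExecutors" -> "1",
      "spark.dynamicAllocation.maxExecutors" -> "1"     // Experiment mode value
    )
    // Submitting with this map fails: the effective initial executor count
    // (3, taken from spark.executor.instances) is above maxExecutors (1).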

To solve this problem, we can set spark.executor.instances to the spark.dynamicAllocation.minExecutors value when running in Spark Dynamic/Experiment/Parallel Experiments/Distributed Training mode, to prevent users from accidentally creating an invalid configuration.
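A minimal sketch of the proposed normalization, assuming the job configuration is available as a key/value map (the actual change would live in sparkConfigCtrl.js; the helper name and map shape here are hypothetical):

    // Hypothetical helper: when a dynamic-allocation mode is selected, pin
    // spark.executor.instances to minExecutors and clamp initialExecutors
    // into the [min, max] range so the configuration stays valid.
    def normalizeDynamicConfig(config: Map[String, String]): Map[String, String] = {
      val min     = config.getOrElse("spark.dynamicAllocation.minExecutors", "1")
      val max     = config.getOrElse("spark.dynamicAllocation.maxExecutors", min)
      val initial = config.getOrElse("spark.dynamicAllocation.initialExecutors", min)
      val clampedInitial =
        math.max(min.toInt, math.min(initial.toInt, max.toInt)).toString
      config ++ Map(
        "spark.executor.instances"                 -> min,
        "spark.dynamicAllocation.initialExecutors" -> clampedInitial
      )
    }

With this in place, switching from Spark Static to a dynamic mode would reset the stale executor count instead of failing the job.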

The spark.dynamicAllocation.initialExecutors field should also be modifiable when configuring a job.

Assignee

Robin

Reporter

Jim Dowling

Labels

None

Fix versions

Affects versions

Priority

Medium