Submit

submit is designed to create a batch on the Livy server. It sends a request to Livy, gets the response, and starts watching the logs (just like Read log does).

In the original use case, this tool is designed for sending local scripts to the server as a test. The local files are of course not accessible to the Livy server, so it comes with a plugin system for automatically uploading the files to somewhere available to the server.
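
For illustration, a minimal submission could look like the following command; the server URL and script name here are placeholders, not real defaults:

    livy submit --api-url http://livy-server:8998 my_task.py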

Usage

usage: livy submit [-h] [--class-name COM.EXAMPLE.FOO]
                   [--jars FOO.JAR [FOO.JAR ...]]
                   [--py-files FOO.ZIP [FOO.ZIP ...]]
                   [--files FOO.TXT [FOO.TXT ...]]
                   [--archives FOO.TAR [FOO.TAR ...]] [--queue-name DEFAULT]
                   [--session-name HELLO] [--on-pre-submit PLUG [PLUG ...]]
                   --api-url API_URL [--driver-memory 10G] [--driver-cores N]
                   [--executor-memory 10G] [--executor-cores N]
                   [--num-executors N]
                   [--spark-conf CONF_NAME=VALUE [CONF_NAME=VALUE ...]]
                   [--watch-log | --no-watch-log]
                   [--on-task-success PLUG [PLUG ...]]
                   [--on-task-failed PLUG [PLUG ...]]
                   [--on-task-ended PLUG [PLUG ...]] [-v | -q]
                   [--highlight-logger NAME [NAME ...]]
                   [--hide-logger NAME [NAME ...]] [--pb | --no-pb]
                   [--log-file [XXXX.log] | --no-log-file]
                   [--log-file-level {DEBUG,INFO,WARNING,ERROR}]
                   script [args ...]

Submit a batch task to livy server.

positional arguments:
  script                Path to the script that contains the application to be
                        executed
  args                  Arguments for the task script

optional arguments:
  -h, --help            show this help message and exit
  --class-name COM.EXAMPLE.FOO
                        Application Java/Spark main class (for Java/Scala
                        task)
  --jars FOO.JAR [FOO.JAR ...]
                        Java dependencies to be used in this batch
  --py-files FOO.ZIP [FOO.ZIP ...]
                        Python dependencies to be used in this batch
  --files FOO.TXT [FOO.TXT ...]
                        Files to be used in this batch
  --archives FOO.TAR [FOO.TAR ...]
                        Archives to be used in this batch
  --queue-name DEFAULT  The name of the YARN queue to which the batch is
                        submitted
  --session-name HELLO  The session name to execute this batch

pre-submit actions:
  --on-pre-submit PLUG [PLUG ...]
                        Run plugin(s) before submit

livy server configuration:
  --api-url API_URL     Base-URL for Livy API server
  --driver-memory 10G   Amount of memory to use for the driver process.
  --driver-cores N      Number of cores to use for the driver process.
  --executor-memory 10G
                        Amount of memory to use for the executor process.
  --executor-cores N    Number of cores to use for each executor.
  --num-executors N     Number of executors to launch for this batch.
  --spark-conf CONF_NAME=VALUE [CONF_NAME=VALUE ...]
                        Spark configuration properties.

post-submit actions:
  --watch-log           Watch the logs until the task is finished
  --no-watch-log        Do not watch the logs; only submit the task and exit.

after-task-finish actions:
  --on-task-success PLUG [PLUG ...]
                        Run plugin(s) when the task finishes successfully
  --on-task-failed PLUG [PLUG ...]
                        Run plugin(s) when the task ends in a failed state
  --on-task-ended PLUG [PLUG ...]
                        Run plugin(s) when the task ends, regardless of its
                        state

console:
  -v, --verbose         Enable debug log.
  -q, --silent          Silent mode. Only show warning and error log.
  --highlight-logger NAME [NAME ...]
                        Highlight logs from the given loggers. This option
                        only takes effect when `colorama` is installed.
  --hide-logger NAME [NAME ...]
                        Do not show logs from the given loggers.
  --pb, --with-progressbar
                        Convert TaskSetManager's `Finished task XX in stage Y`
                        logs into progress bar. This option only takes effect
                        when `tqdm` is installed.
  --no-pb, --without-progressbar
                        Do not convert TaskSetManager's logs into a progress
                        bar.

file logging:
  --log-file [XXXX.log]
                        Output logs into a log file. A temporary file will be
                        created if the path is not specified.
  --no-log-file         Do not output logs into log file.
  --log-file-level {DEBUG,INFO,WARNING,ERROR}
                        Set minimal log level to be written to file. Default:
                        DEBUG.
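
As a fuller sketch (the server URL, resource values, Spark property and script name are all placeholders), a submission that overrides Spark properties and skips log watching might look like:

    livy submit --api-url http://livy-server:8998 \
        --driver-memory 4G --num-executors 10 \
        --spark-conf spark.sql.shuffle.partitions=200 \
        --no-watch-log \
        etl_job.py 2021-01-01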

Configurations

The following configs can be set via the Configure command:

root.api_url

URL to Livy server

submit.pre_submit

List of plugins to be triggered before the task is submitted to the server. The value should be in module1:func1,module2:func2 format, e.g. livy.cli.plugin:upload_s3 would pass the meta to upload_s3() in the livy.cli.plugin module.
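
For illustration, a pre-submit plugin is just an importable function. The sketch below is an assumption-laden example rather than the tool's actual API: it assumes the hook receives the submission meta object and returns it (possibly modified), and the attribute name meta.script, the function name copy_to_shared and the path /mnt/shared are all hypothetical.

    import shutil

    def copy_to_shared(meta):
        """Hypothetical pre-submit hook: copy the local script to a location
        readable by the Livy server and point the submission at that copy."""
        shared_dir = "/mnt/shared"  # assumed path that the server can access
        meta.script = shutil.copy(meta.script, shared_dir)
        return meta

Such a function would then be referenced in module:func format, e.g. myplugins:copy_to_shared, via submit.pre_submit or --on-pre-submit.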

submit.driver_memory

Amount of memory to use for the driver process. The unit needs to be specified, e.g. 12gb or 34mb.

submit.driver_cores

Number of cores to use for the driver process.

submit.executor_memory

Amount of memory to use per executor process. The unit needs to be specified, e.g. 12gb or 34mb.

submit.executor_cores

Number of cores to use for each executor.

submit.num_executors

Number of executors to launch for this batch.

submit.spark_conf

Key value pairs to override spark configuration properties.

submit.watch_log

Watch the logs after the task is submitted. This option shares the same behavior as keep_watch; the only difference is the scope in which it takes effect.
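
For illustration, and assuming the Configure command follows a livy config set <name> <value> form (see its own documentation for the exact syntax), persisting some of these values might look like:

    livy config set root.api_url http://livy-server:8998
    livy config set submit.driver_memory 4gb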