Submit¶
submit is designed to create a batch on the Livy server. It sends a request to Livy, gets the response, and starts watching the logs (like what happens in Read log).
In the original use case, I designed and use this tool for sending local scripts to the server as a test. The local files are of course not accessible to the Livy server, so the command comes with a plugin system for automatically uploading files to somewhere available to the server.
Usage¶
usage: livy submit [-h] [--class-name COM.EXAMPLE.FOO]
                   [--jars FOO.JAR [FOO.JAR ...]]
                   [--py-files FOO.ZIP [FOO.ZIP ...]]
                   [--files FOO.TXT [FOO.TXT ...]]
                   [--archives FOO.TAR [FOO.TAR ...]] [--queue-name DEFAULT]
                   [--session-name HELLO] [--on-pre-submit PLUG [PLUG ...]]
                   --api-url API_URL [--driver-memory 10G] [--driver-cores N]
                   [--executor-memory 10G] [--executor-cores N]
                   [--num-executors N]
                   [--spark-conf CONF_NAME=VALUE [CONF_NAME=VALUE ...]]
                   [--watch-log | --no-watch-log]
                   [--on-task-success PLUG [PLUG ...]]
                   [--on-task-failed PLUG [PLUG ...]]
                   [--on-task-ended PLUG [PLUG ...]] [-v | -q]
                   [--highlight-logger NAME [NAME ...]]
                   [--hide-logger NAME [NAME ...]] [--pb | --no-pb]
                   [--log-file [XXXX.log] | --no-log-file]
                   [--log-file-level {DEBUG,INFO,WARNING,ERROR}]
                   script [args ...]
Submit a batch task to livy server.
positional arguments:
  script                Path to the script that contains the application to
                        be executed
  args                  Arguments for the task script

optional arguments:
  -h, --help            show this help message and exit
  --class-name COM.EXAMPLE.FOO
                        Application Java/Spark main class (for Java/Scala
                        tasks)
  --jars FOO.JAR [FOO.JAR ...]
                        Java dependencies to be used in this batch
  --py-files FOO.ZIP [FOO.ZIP ...]
                        Python dependencies to be used in this batch
  --files FOO.TXT [FOO.TXT ...]
                        Files to be used in this batch
  --archives FOO.TAR [FOO.TAR ...]
                        Archives to be used in this batch
  --queue-name DEFAULT  The name of the YARN queue to which the batch is
                        submitted
  --session-name HELLO  The session name for executing this batch
pre-submit actions:
  --on-pre-submit PLUG [PLUG ...]
                        Run plugin(s) before submission

livy server configuration:
  --api-url API_URL     Base URL of the Livy API server
  --driver-memory 10G   Amount of memory to use for the driver process.
  --driver-cores N      Number of cores to use for the driver process.
  --executor-memory 10G
                        Amount of memory to use per executor process.
  --executor-cores N    Number of cores to use for each executor.
  --num-executors N     Number of executors to launch for this batch.
  --spark-conf CONF_NAME=VALUE [CONF_NAME=VALUE ...]
                        Spark configuration properties.
post-submit actions:
  --watch-log           Watch the logs until the task is finished
  --no-watch-log        Do not watch the logs; only submit the task and quit

after-task-finish actions:
  --on-task-success PLUG [PLUG ...]
                        Run plugin(s) when the task finishes successfully
  --on-task-failed PLUG [PLUG ...]
                        Run plugin(s) when the task ends in failure
  --on-task-ended PLUG [PLUG ...]
                        Run plugin(s) when the task ends, regardless of its
                        state
console:
  -v, --verbose         Enable debug logs.
  -q, --silent          Silent mode. Only show warning and error logs.
  --highlight-logger NAME [NAME ...]
                        Highlight logs from the given loggers. This option
                        only takes effect when `colorama` is installed.
  --hide-logger NAME [NAME ...]
                        Do not show logs from the given loggers.
  --pb, --with-progressbar
                        Convert TaskSetManager's `Finished task XX in stage Y`
                        logs into a progress bar. This option only takes
                        effect when `tqdm` is installed.
  --no-pb, --without-progressbar
                        Do not convert TaskSetManager's logs into a progress
                        bar.

file logging:
  --log-file [XXXX.log]
                        Output logs into a log file. A temporary file is
                        created if the path is not specified.
  --no-log-file         Do not output logs into a log file.
  --log-file-level {DEBUG,INFO,WARNING,ERROR}
                        Set the minimal log level to be written to the file.
                        Default: DEBUG.
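
For example, a typical submission of a local PySpark script could look like this (the server URL, file names, and plugin name are placeholders; `my_plugin:upload_to_shared` refers to the hypothetical plugin sketched at the end of this page):

livy submit --api-url http://livy.example.com:8998 \
            --on-pre-submit my_plugin:upload_to_shared \
            --py-files deps.zip \
            --num-executors 4 \
            my_script.py 2021-01-01

The trailing `2021-01-01` is forwarded to `my_script.py` as a task argument.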
Configurations¶
The following configs can be set via the Configure command:
- root.api_url
  URL to the Livy server.
- submit.pre_submit
  List of plugins to be triggered before the task is submitted to the server. The value should be in `module1:func1,module2:func2` format, e.g. `livy.cli.plugin:upload_s3` would pass the metadata to `upload_s3()` in the `livy.cli.plugin` module. See the sketch after this list.
- submit.driver_memory
  Amount of memory to use for the driver process. The unit must be specified, e.g. `12gb` or `34mb`.
- submit.driver_cores
  Number of cores to use for the driver process.
- submit.executor_memory
  Amount of memory to use per executor process. The unit must be specified, e.g. `12gb` or `34mb`.
- submit.executor_cores
  Number of cores to use for each executor.
- submit.num_executors
  Number of executors to launch for this batch.
- submit.spark_conf
  Key-value pairs to override Spark configuration properties.
- submit.watch_log
  Watch the logs after the task is submitted. This option shares the same behavior as `keep_watch`; the only difference is the scope in which it takes effect.
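
These values are stored with the Configure command; a minimal sketch, assuming a `livy config set NAME VALUE` form (check the Configure command page for the exact syntax):

livy config set submit.driver_memory 12gb
livy config set submit.pre_submit my_plugin:upload_to_shared

For the plugin referenced under submit.pre_submit, the snippet below sketches what such a hook might look like. The contract assumed here, inferred from the description above, is that the hook receives the submission metadata and returns it, possibly modified; the function name, the `script` attribute, and the shared directory are all hypothetical. See `livy.cli.plugin:upload_s3` for a real implementation.

# my_plugin.py -- a hypothetical pre-submit plugin, registered via
# `--on-pre-submit my_plugin:upload_to_shared` or the submit.pre_submit config
import shutil

# Hypothetical directory readable by both this machine and the Livy server
# (e.g. a mounted shared filesystem)
SHARED_DIR = "/mnt/shared"


def upload_to_shared(meta):
    """Copy the local script to a location the Livy server can access,
    then point the submission metadata at the new path.

    ASSUMPTION: the hook receives the batch metadata and must return it;
    the attribute name `script` is illustrative only.
    """
    remote_path = shutil.copy(meta.script, SHARED_DIR)  # returns dest path
    meta.script = remote_path
    return meta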