Multyvac for Python Primer

This document will show you how to get started with the multyvac Python library.

Creating a Job

To use Multyvac in Python, you designate a function to run on Multyvac instead of on your own machine. Here we’ll walk through an example of offloading a simple function to Multyvac.

Open Python interactively and define the add function.

def add(x, y):
    return x + y

Normally, you would just run the function locally by calling it:

>>> add(1, 2)
3

If you want to run it on Multyvac, submit it using multyvac.submit():

>>> import multyvac
>>> jid = multyvac.submit(add, 1, 2)

That’s it! You pass arguments to add() by passing them, in the same order, to submit(). Keyword arguments are just as easy: multyvac.submit(add, x=1, y=2).
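The two forms side by side:

>>> jid1 = multyvac.submit(add, 1, 2)      # positional
>>> jid2 = multyvac.submit(add, x=1, y=2)  # keyword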

submit() is non-blocking; it returns immediately without waiting for add to actually run. To verify, try this:

>>> import time
>>> time.sleep(10) # will sleep for 10 seconds
>>> multyvac.submit(time.sleep, 10) # returns immediately

By returning immediately, multyvac.submit() can’t give you the result of your function. What it returns instead is an integer job identifier, or jid.

>>> print jid
1

For the remainder of the Primer, let’s assume the jid is 1. We’ll show you what you can do with a jid.

Using the Job Id

Job identifiers are unique to your account. Your first job has jid 1, and the jid increments sequentially with each new job. All of Multyvac’s job-related facilities use jids. We’ll explore a few of them now.
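As a quick illustration, on a brand-new account (the actual numbers depend on how many jobs you’ve already run):

>>> print [multyvac.submit(add, i, i) for i in range(3)]
[1, 2, 3]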

Querying a Job’s Status

Below are the possible statuses a job can have once it is created, and the transitions between them. A new job starts out either queued or waiting:

new job    -> queued or waiting
waiting    -> queued, stalled, or killed
queued     -> processing or killed
processing -> done, error, or killed

A job spends a variable amount of time in the transient statuses (waiting, queued, processing) before it finishes in one of the terminal statuses (done, error, stalled, killed), at which point its status becomes permanent. Only then will its result, or its reason for failure, be available. The most common path is queued -> processing -> done. The full definition of each status follows:

Status       Definition
waiting      Job is waiting until its dependencies are satisfied.
queued       Job is in the queue waiting for a free core.
processing   Job is running.
done         Job completed successfully.
error        Job errored (typically due to an uncaught exception).
killed       Job was aborted by the user.
stalled      Job will not run because a dependency errored.

To query a job’s status in Python:

>>> job = multyvac.get(1) # gets job info
>>> job.status # job is still running
'processing'
>>> job.update() # grab latest job info
>>> job.status # job has finished
'done'
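update() also lets you roll your own polling. A minimal sketch that loops until the job reaches a terminal status (the one-second interval is an arbitrary choice; job.wait(), covered below, is the more convenient option):

>>> import time
>>> job = multyvac.get(jid)
>>> while job.status not in ('done', 'error', 'stalled', 'killed'):
...     time.sleep(1)   # poll once per second
...     job.update()
>>> job.status
'done'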

Querying a Job’s Result

To get the result of the function we ran earlier, use job.get_result():

>>> job.get_result()
3

get_result() blocks until the job has finished and its result is ready.
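Because of this blocking behavior, a common pattern is to submit a whole batch of jobs up front and only then collect the results, so the jobs can run concurrently. A minimal sketch:

>>> jids = [multyvac.submit(add, i, i) for i in range(5)]
>>> [multyvac.get(j).get_result() for j in jids]
[0, 2, 4, 6, 8]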

Waiting for a Job

If you just want to wait for a job to finish, you can do:

>>> job.wait()

You can also wait for the job to reach a particular status, such as processing:

>>> job.wait(status='processing')
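One use for this: stdout gives a live snapshot while a job is processing (see the attribute table below), so you can wait for processing to begin and then peek at the output so far:

>>> job.wait(status='processing')
>>> print job.stdout  # whatever the job has printed so far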

Viewing a Job in the Dashboard

The Job Dashboard lists the jobs you create. The leftmost column shows the job’s id, and the rightmost column its status. To see a detailed report for a job, just click on its jid.

SSH into a Job

You may want to SSH into the system that is running a job for debugging and inspection purposes.

>>> job.open_ssh_console()
Welcome to Ubuntu 12.04.3 LTS (GNU/Linux 3.11.0-12-generic x86_64)
multyvac@c:~$

As you can see, you’re dropped into a shell right from your Python terminal. Do note that the SSH session will be closed when the job is finished.

More Attributes

The job object has a lot of other attributes you may find useful:

Attribute          Definition
stdout             The standard output of the job. If the job is processing, this gives a live snapshot.
stderr             The standard error of the job. If the job is processing, this gives a live snapshot.
runtime            The number of seconds the job ran for. If the job is processing, this gives a live snapshot.
cmd                The shell command that was executed.
core               The core that was used.
multicore          The number of cores that were used.
status             The current status of the job.
tags               A dict of all the key-values you’ve assigned the job.
created_at         When the job was created.
started_at         When the job started processing.
finished_at        When the job finished.
queue_delay        The number of seconds the job spent in the queue.
overhead_delay     The number of seconds of overhead that Multyvac introduced.
cputime_user       The number of seconds of user CPU time used. If the job is processing, this gives a live snapshot.
cputime_system     The number of seconds of system CPU time used. If the job is processing, this gives a live snapshot.
memory_failcnt     The number of times memory allocations failed. If the job is processing, this gives a live snapshot.
memory_max_usage   The max amount of memory the job has used thus far. If the job is processing, this gives a live snapshot.
ports              The listening sockets that a job has opened, mapped to the (address, port) combinations that should be used to connect to them.
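For example, once a job has finished you can pull a few of these together into a quick report (the values shown are illustrative):

>>> job.wait()
>>> print job.status, job.runtime, job.queue_delay
done 10.0 0.5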

Killing a Job

If a job is not behaving as expected, or no longer needs to run, you can kill it. Killing immediately terminates a job that is processing or queued.

>>> job.kill()
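For example, to clean up a job we no longer want (time was imported earlier; the printed status is illustrative):

>>> jid = multyvac.submit(time.sleep, 3600)  # an hour-long job we regret
>>> job = multyvac.get(jid)
>>> job.kill()
>>> job.update()
>>> job.status
'killed'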

Giving a Job a Name

Jobs can be given names, which can make it easier to query them later on.

>>> multyvac.submit(add, 1, 2, _name='1+2')
345 # jid
>>> multyvac.get_by_name('1+2')
Job(345, name=u'1+2')

If multiple jobs share the same name, only the most recently created is returned.
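Names are handy for keeping track of parameter sweeps; the 'add-%d' naming scheme here is just our own convention:

>>> for x in range(3):
...     multyvac.submit(add, x, x, _name='add-%d' % x)
>>> multyvac.get_by_name('add-2').get_result()
4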

Layers: Dependencies

Certain functions, such as those that require compilation (C extensions), cannot be automatically transferred from your machine to Multyvac. For example, numpy:

>>> import numpy
>>> jid = multyvac.submit(numpy.add, 1, 2)
>>> job = multyvac.get(jid)
>>> job.get_result()
JobError: Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/lib/python2.7/dist-packages/multyvacinit/pybootstrap.py", line 8, in <module>
    f, args, kwargs = pickle.loads(stdin)
  File "/usr/local/lib/python2.7/dist-packages/multyvac/util/cloudpickle.py", line 961, in _getobject
    mod = __import__(modname)
ImportError: ('No module named numpy', <function _getobject at 0x26b4de8>, ('numpy', 'add'))

To install numpy on Multyvac, use a Layer.

>>> multyvac.layer.create('numpy')
>>> layer = multyvac.layer.get('numpy')
>>> modify_job = layer.modify()
>>> modify_job.open_ssh_console()

multyvac@c:~$ sudo apt-get install python-numpy
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
  binutils cpp cpp-4.6 gcc gcc-4.6 libblas3gf libc-dev-bin libc6-dev libgfortran3 libgmp10 libgomp1 liblapack3gf libmpc2 libmpfr4 libquadmath0 linux-libc-dev manpages manpages-dev python-numpy
0 upgraded, 19 newly installed, 0 to remove and 0 not upgraded.
Need to get 28.8 MB of archives.
After this operation, 75.6 MB of additional disk space will be used.
Do you want to continue [Y/n]? Y
...
multyvac@c:~$ exit

>>> modify_job.snapshot()

Now that we’ve snapshotted the new layer, we can run numpy.add as long as we specify the layer:

>>> jid = multyvac.submit(numpy.add, 1, 2, _layer='numpy')
>>> job = multyvac.get(jid)
>>> job.get_result()
3
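The layer works for any function that needs numpy, not just numpy.add. A sketch, where column_means is our own example function:

>>> def column_means(rows):
...     return numpy.array(rows).mean(axis=0).tolist()
>>> jid = multyvac.submit(column_means, [[1, 2], [3, 4]], _layer='numpy')
>>> multyvac.get(jid).get_result()
[2.0, 3.0]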

To see what else you can do, please consult the Layer API.

Volumes: Data Storage

If you want your job to be able to access data on the filesystem, you’ll want to use a Volume. Volumes are folders that can be mounted at any path in the Multyvac filesystem.

Here we’ll create a volume called dataset and mount it at /data.

>>> multyvac.volume.create('dataset', '/data')

Now let’s put some data (hello, world) in the volume in a file called msg.

>>> vol = multyvac.volume.get('dataset')
>>> vol.put_contents('hello, world', 'msg')

We can now run a job that can read this file:

>>> def dump_file():
...     return open('/data/msg').read()
>>> jid = multyvac.submit(dump_file)
>>> job = multyvac.get(jid)
>>> job.get_result()
JobError: Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/lib/python2.7/dist-packages/multyvacinit/pybootstrap.py", line 10, in <module>
    res = f(*args, **kwargs)
  File "<ipython-input-4-c3c19ef9b651>", line 2, in dump_file
IOError: [Errno 2] No such file or directory: '/data/msg'

We got this error because we didn’t specify that the job should use the volume. Let’s try again:

>>> jid = multyvac.submit(dump_file, _vol='dataset')
>>> job = multyvac.get(jid)
>>> job.get_result()
'hello, world'
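Jobs can also write into the volume, assuming the volume is mounted read-write (an assumption on our part; consult the Volume API for details). The file the job writes can then be fetched with get_file():

>>> def shout():
...     msg = open('/data/msg').read()
...     with open('/data/out', 'w') as f:  # assumes /data is writable
...         f.write(msg.upper())
>>> jid = multyvac.submit(shout, _vol='dataset')
>>> multyvac.get(jid).wait()
>>> vol.get_file('out', '/tmp/out')  # fetch the file the job wrote
>>> open('/tmp/out').read()
'HELLO, WORLD'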

Working with your Local Filesystem

Volumes give you the ability to synchronize a local path with Multyvac. We say “synchronize” because if you synchronize a path multiple times, our client will only send up what has changed.

>>> vol.sync_up('/path/to/big/data', 'stuff') # takes a while
>>> vol.sync_up('/path/to/big/data', 'stuff') # takes seconds because nothing changed

Likewise, you can synchronize data from Multyvac to your local filesystem efficiently.

>>> vol.sync_down('stuff', '/path/on/your/machine')

The efficiency of synchronization comes at the cost of extra time and processing overhead for change detection. If you’re transferring small, new files, you may prefer get_file() and put_file().

>>> vol.put_file('local_path', 'remote_path')
>>> vol.get_file('remote_path', 'local_path')

To see what else you can do, please consult the Volume API.