Tuesday, May 26, 2015

Djangui: empower your Python scripts with Django!

Djangui is a Django app that converts scripts built with argparse into a web-based interface.

In short, it converts this:





Into this:




If you just want to go out and play with it now, you can do so here:

https://djangui.herokuapp.com

Note: this is on a free Heroku plan, so it may take ~20 seconds to spin up, and you may have to refresh if the site is initially unresponsive.

And if you like what you see, you can see how to install it here:


So, what is it?

I quite often have to do things that are only minimally related to my day-to-day work, usually involving handling other people's data. In biology, datasets are growing exponentially, and most biologists are limited by what can be opened in Excel. Even then, they are limited to small datasets: even if you can open 1 million rows in the latest Excel, doing useful analysis on them is another story (such as waiting 5 hours to group things).

So basic tasks, like:

  • I want to make a heatmap
  • I want to do a GO analysis
  • I want to do a PCA analysis
  • I want to group by X, and make a histogram
and other similar tasks, are out of reach. Traditionally, you sought out your resident bioinformatician, or someone you had 'seen use the command line before', or tried to find something online.

Because I didn't want to make heatmaps for people (which also makes them realize that making a heatmap of 80,000 rows x 50 columns is pointless unless you are displaying your graph on an IMAX screen), I wanted to make a system that lets others use my scripts without my involvement. There are existing tools that do this with biologists in mind, such as Galaxy or Taverna. However, my fundamental issue with these programs is that they make me write a wrapper around my scripts in their own markup language. I don't want to learn a platform-specific language that has no relevant role in my day-to-day life. Learning it would be purely altruistic on my end, and wouldn't speed up my workday at all.

Enter Djangui

So, there is a wonderful library for building command-line scripts in Python: argparse. Given its fairly structured nature, I thought it would be fairly straightforward to map these structures to a web form equivalent (it wasn't too hard). This project was initially inspired by how much sandman can do with a SQL database in a single command. Guided by this 'it should be simple' ideology, Djangui will fully bootstrap a Django project with Djangui pre-installed for you in a single command:


djanguify.py -p ProjectName

From here, Djangui leverages the awesome Django admin to control adding scripts as well as managing their organization. Scripts are classified into groups, such as plotting tools/etc., and user access to scripts can be controlled at the script group as well as the script level.
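To give a feel for the argparse-to-web-form mapping mentioned above, here is a minimal sketch. It is not Djangui's actual code: the `build_parser` script, the `describe_fields` helper, and the widget names are all hypothetical, and it relies on argparse's private `_actions` list and private action classes, which could change between Python versions.

```python
import argparse


def build_parser():
    """A hypothetical script definition, standing in for a user's argparse script."""
    parser = argparse.ArgumentParser(description="Find cats (example only).")
    parser.add_argument("--count", type=int, default=1, help="How many cats to find")
    parser.add_argument("--breed", choices=["tabby", "siamese"], default="tabby",
                        help="Breed to look for")
    parser.add_argument("--verbose", action="store_true", help="Print extra output")
    return parser


def describe_fields(parser):
    """Turn each argparse action into a plain dict describing a form field.

    Assumption: parser._actions is a private attribute of ArgumentParser.
    """
    fields = []
    for action in parser._actions:
        if isinstance(action, argparse._HelpAction):
            continue  # the built-in -h/--help flag is not a form field
        if action.choices is not None:
            widget = "select"      # fixed choices map naturally to a dropdown
        elif isinstance(action, argparse._StoreTrueAction):
            widget = "checkbox"    # boolean flags map to a checkbox
        elif action.type is int:
            widget = "number"
        else:
            widget = "text"
        fields.append({
            "name": action.dest,
            "widget": widget,
            "default": action.default,
            "help": action.help,
            "required": action.required,
        })
    return fields
```

Calling `describe_fields(build_parser())` gives one dict per argument, which a web framework could then render as form inputs. A real implementation would need to handle many more cases (file arguments, `nargs`, subparsers, and so on).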




A Sample Script

After adding a script, the basic interface looks like this:



There are three aspects of the front-end:
  • Your script groups
  • Your currently selected script: We are currently using the 'find cats' script, which will find you cats.
  • A summary of past, ongoing, and queued jobs. Tasks can be viewed in more detail by clicking on their task name (it's also a DataTables table, so it's nicely sortable and searchable).


Task Viewing

The task viewer has several useful features. Here they are, using the aforementioned 'find cats' script (also: how to cheaply get undeserved attention from the internet):

File Previews:

Currently there are three types of files we attempt to identify and embed. I'm considering ways to make this more of a plug-and-play style from the admin.
  • Images
    • From our find cats script
    • An incomprehensible heatmap: (images naturally have a lightbox)


  • Fasta Files: (a format for DNA/RNA/protein sequences)


  • Tabular Files:
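As a rough illustration of how the three preview types above could be told apart, here is a minimal sketch. It is not how Djangui actually identifies files: the `guess_preview_type` function, the extension list, and the 2048-character sample size are all assumptions made for this example.

```python
import os

# Assumption: a small set of extensions treated as embeddable images.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif"}
SAMPLE_SIZE = 2048  # only inspect the start of the file


def guess_preview_type(path):
    """Return 'image', 'fasta', 'tabular', or None if nothing matches."""
    ext = os.path.splitext(path)[1].lower()
    if ext in IMAGE_EXTENSIONS:
        return "image"

    with open(path, "r", errors="replace") as handle:
        sample = handle.read(SAMPLE_SIZE)

    # FASTA records start with a '>' header line.
    if sample.lstrip().startswith(">"):
        return "fasta"

    lines = [line for line in sample.splitlines() if line.strip()]
    if len(sample) == SAMPLE_SIZE and len(lines) > 1:
        lines = lines[:-1]  # the last line may have been cut off mid-row

    # Treat the file as tabular if every sampled line has the same,
    # non-zero number of tab or comma delimiters.
    for delimiter in ("\t", ","):
        counts = {line.count(delimiter) for line in lines}
        if len(counts) == 1 and counts.pop() > 0:
            return "tabular"
    return None
```

The check order matters here: extension first (cheap, no file read), then content sniffing. Quoted CSV fields containing commas would defeat the delimiter count, so this is only a starting point.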



Job Cloning:


Complex scripts may have several options to set. A job can be cloned, and its existing values will be pre-filled into the script's form for the user to submit.

There are also other obvious options, such as job deletion, resubmission, and downloading the contents of the job as a zip or tar.gz file.
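For the curious, bundling a job's output directory as a zip or tar.gz takes only a few lines with Python's standard library. This is a sketch of the general technique, not Djangui's implementation; `archive_job_output` and its arguments are hypothetical names.

```python
import shutil

# Map the user-facing names to the format names shutil.make_archive expects.
ARCHIVE_FORMATS = {"zip": "zip", "tar.gz": "gztar"}


def archive_job_output(job_dir, archive_base, fmt="zip"):
    """Archive everything under job_dir and return the archive's path.

    archive_base is the output path without an extension; it should live
    outside job_dir so the archive does not try to include itself.
    """
    return shutil.make_archive(archive_base, ARCHIVE_FORMATS[fmt], root_dir=job_dir)
```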

Job Resubmission:

Made a stupid mistake in a script you uploaded? Fix the script, then you can simply resubmit the job.

Anything else?

Djangui is a work in progress. It supports ephemeral file systems, so you can use platforms such as Heroku with an Amazon S3 data store. This also allows you to host your worker on a separate node from your main web server. It also supports customization of the interface: if you know a bit of Django, you can override the templates with your own (or extend certain elements of them). Finally, it is a pluggable app, meaning you can easily attach it to an existing Django project.

A Flask Alternative

While I was making this, another project emerged called Wooey, which is based on Flask (ironically enough, Martin, its author, also works in the scientific realm, so it seems we deal with similar issues). It has a slick interface, but makes a few choices I decided against, such as storing the scripts as local JSON files instead of in the database. Thankfully, he chose a different avenue for loading scripts (using AST), which was a great boon for handling cases my approach to parsing argparse scripts would not cover. We have plans to release a library focused solely on converting libraries such as argparse/clint to a standard markup that projects such as Djangui and Wooey can use.
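To illustrate the general idea of inspecting a script through its AST rather than running it, here is a minimal sketch using Python's built-in `ast` module. It is not Wooey's or Djangui's actual loader; `find_add_argument_calls` is a hypothetical helper, and it assumes Python 3.8+ (for `ast.Constant`).

```python
import ast


def find_add_argument_calls(source):
    """Return the string flags passed to every `.add_argument(...)` call.

    The script is only parsed, never executed, so this works even when the
    script has side effects or imports that are not installed.
    """
    tree = ast.parse(source)
    found = []
    for node in ast.walk(tree):
        is_add_argument = (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "add_argument"
        )
        if is_add_argument:
            # Keep only literal string positionals, e.g. "--count" or "-c".
            flags = [
                arg.value
                for arg in node.args
                if isinstance(arg, ast.Constant) and isinstance(arg.value, str)
            ]
            found.append(flags)
    return found
```

Matching on the method name alone is deliberately loose: it would also pick up `add_argument` calls on objects that are not argparse parsers, so a real loader would need to be stricter.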

Pull requests and comments are welcome!

1 comment:

  1. This is a great concept. I work in the bioinformatics field and can definitely see value in this. Appreciate your posts!
