DFAST API - Help

DFAST Web API

DFAST Helper for easy job management

About

DFASThelper.py is a Python script for managing jobs easily from the command line.
An access token is required to submit jobs with this script.

To submit jobs via the web API, please request an access token by e-mail: dfast [at] nig.ac.jp

Download

Script: DFASThelper.py (version 1.0.4, released 2021.2.22; requires Python 3.7 or later)
Metadata file example (complete genome): metadata_example_complete.txt
Metadata file example (draft genome): metadata_example_draft.txt
Excel file example: genome_list_example.xlsx
Metadata description: metadata_description.xlsx

Usage

Getting ready
DFASThelper.py runs on Python 3.7 or later.
Download the script (see Download above), open it in a text editor, and set your access token in the header section of the script.
The 'pandas' and 'openpyxl' modules are required for batch job submission. Install them with '(sudo) pip install pandas openpyxl' or 'conda install pandas openpyxl'.
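Before running the script, it can help to confirm that the interpreter version and the optional modules are in place. The following is a minimal sketch and not part of DFASThelper.py; the function name `missing_modules` is hypothetical.

```python
import importlib.util
import sys

# Modules named in the requirements above for batch job submission.
BATCH_MODULES = ("pandas", "openpyxl")


def missing_modules(names=BATCH_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if sys.version_info < (3, 7):
        sys.exit("DFASThelper.py requires Python 3.7 or later.")
    missing = missing_modules()
    if missing:
        print("Missing modules for batch submission:", ", ".join(missing))
    else:
        print("Environment looks ready.")
```

Single job submission does not need these modules according to the requirements above, so a non-empty result only matters for batch use.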
Help
You can display the help with the -h option.
```
$ python DFASThelper.py -h
```
You can also check the help for subcommands.
```
$ python DFASThelper.py submit -h
```
Single job submission
Use -f (--fasta) to specify the path to a genomic FASTA file.
Optionally, you can specify an additional reference database with the -d (--dataset) option.
Available additional reference databases: lab, cyanobase, ecoli, bifidobacterium, hpylori
```
$ python DFASThelper.py submit --fasta genome.fa --dataset lab 
```
You can upload metadata with the -m (--metadata) option.
Please see the example metadata files for more detail.
```
$ python DFASThelper.py submit -f genome.fa -m metadata.txt -d lab 
```
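If you drive the script from another Python program, the command line can be assembled and checked before it is run. This is a hedged sketch: the helper names `build_submit_command` and `submit` are hypothetical, the script path is assumed to be `DFASThelper.py` in the current directory, and only the options documented above are used.

```python
import subprocess
import sys

# Additional reference databases listed above.
DATASETS = {"lab", "cyanobase", "ecoli", "bifidobacterium", "hpylori"}


def build_submit_command(fasta, metadata=None, dataset=None,
                         script="DFASThelper.py"):
    """Build the argument list for a single-job 'submit' call."""
    if dataset is not None and dataset not in DATASETS:
        raise ValueError(f"Unknown dataset: {dataset!r}")
    command = [sys.executable, script, "submit", "--fasta", fasta]
    if metadata is not None:
        command += ["--metadata", metadata]
    if dataset is not None:
        command += ["--dataset", dataset]
    return command


def submit(fasta, metadata=None, dataset=None):
    """Run the submission; requires the access token to be set in the script."""
    return subprocess.run(build_submit_command(fasta, metadata, dataset),
                          check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is submitted here.
    print(" ".join(build_submit_command("genome.fa", "metadata.txt", "lab")))
```

Checking the dataset name locally avoids sending a request that would presumably be rejected for an unknown database.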
Batch job submission for multiple genomes
Prepare an Excel worksheet in which each row contains the path to a FASTA file and its metadata. Please download the example and use it as a template.
The values in the 'strain' column are used as the file names of the results, so we recommend giving each genome a unique name.
Use the -r option to write the job submission result (job IDs and status) to a file. This file is required later to download the result files.
If you want to create DDBJ submission files, use the --strict option to enable strict metadata validation (recommended).
If you do not need DDBJ submission files, leave the metadata cells blank.
```
$ python DFASThelper.py submit -l genome_list.xlsx -r result_list.txt
```
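Because the 'strain' values become result file names, duplicated names are worth catching before submission. The sketch below assumes the 'strain' column has already been read into a list; the function name `duplicated_strains` is hypothetical and the check is not something DFASThelper.py is known to perform.

```python
from collections import Counter


def duplicated_strains(strains):
    """Return strain names that occur more than once, in first-seen order."""
    # Compare names after trimming whitespace, which is easy to miss in Excel.
    counts = Counter(str(name).strip() for name in strains)
    return [name for name, n in counts.items() if n > 1]


if __name__ == "__main__":
    example = ["StrainA", "StrainB", "StrainA ", "StrainC"]
    print(duplicated_strains(example))
```

With pandas installed, the list could come from something like `pandas.read_excel("genome_list.xlsx")["strain"].tolist()`, assuming the worksheet follows the example template.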
Job status / delete jobs
Use -j (--jobids) to specify job ID(s) or -l (--list) to specify the job submission result.
```
$ python DFASThelper.py status -j XXXXX YYYYY ZZZZZ 
$ python DFASThelper.py delete -l result_list.txt 
```
Download result
Use -f (--format) to specify the file format.
Acceptable file formats: genbank, gff, genome, protein, cds, rna, tabulated, stats, zip, annt, seq
    zip: Zip archive that contains all files
    annt: Annotation file for DDBJ submission
    seq: Sequence file for DDBJ submission

Use -o (--outfile) to save the result to a file; otherwise, the content is printed to the screen.
```
$ python DFASThelper.py download -j XXXXX -f genbank
```
To download the results of multiple jobs, use -O (--outdir) to specify the directory where the files will be written.
```
$ python DFASThelper.py download -l result_list.txt -f annt -O result_dir
```
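To fetch several formats for the same job list, one download call per format can be prepared in a loop. This is a sketch under assumptions: the helper name `build_download_commands` is hypothetical, the script is assumed to sit in the current directory, and each command uses only the -l, -f and -O options shown above.

```python
import sys

# File formats listed above as acceptable values for -f (--format).
FORMATS = {"genbank", "gff", "genome", "protein", "cds", "rna",
           "tabulated", "stats", "zip", "annt", "seq"}


def build_download_commands(result_list, formats, outdir,
                            script="DFASThelper.py"):
    """Build one 'download' argument list per requested format."""
    unknown = [fmt for fmt in formats if fmt not in FORMATS]
    if unknown:
        raise ValueError(f"Unknown format(s): {', '.join(unknown)}")
    return [
        [sys.executable, script, "download",
         "-l", result_list, "-f", fmt, "-O", outdir]
        for fmt in formats
    ]


if __name__ == "__main__":
    # Print the commands; pass each list to subprocess.run() to execute it.
    for command in build_download_commands("result_list.txt",
                                           ["annt", "seq"], "result_dir"):
        print(" ".join(command))
```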