DFAST Web API

DFAST Helper for easy job management


About

DFASThelper.py is a Python script for managing DFAST jobs from the command line.
An access token is required to submit jobs using this script.

To obtain an access token for submitting jobs via the web API, send a request by e-mail to dfast [at] nig.ac.jp


Download

Script
DFASThelper.py (version 1.0.4, released 2021.2.22) (Python 3.7 or later)
Metadata file example (complete genome)
metadata_example_complete.txt
Metadata file example (draft genome)
metadata_example_draft.txt
Excel file example
genome_list_example.xlsx
Metadata description
metadata_description.xlsx

Usage

  1. Getting ready

    DFASThelper.py requires Python 3.7 or later.
    Download the script (see the Download section above), open it in a text editor, and set your access token in the header part of the script.
    The 'pandas' and 'openpyxl' modules are required for batch job submission. Install them with '(sudo) pip install pandas openpyxl' or 'conda install pandas openpyxl'.
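
    Before running a batch submission, it can help to confirm that the environment meets the requirements listed above. The following is a minimal sketch, not part of DFASThelper.py; the function names missing_modules and python_is_supported are hypothetical, and only the Python standard library is used.

```python
import importlib.util
import sys

# Modules named in the requirements above (needed for batch submission only).
REQUIRED_MODULES = ("pandas", "openpyxl")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_supported(version_info=sys.version_info, minimum=(3, 7)):
    """Return True if the interpreter version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not python_is_supported():
        print("Python 3.7 or later is required.")
    missing = missing_modules()
    if missing:
        print("Missing modules for batch submission:", ", ".join(missing))
    else:
        print("pandas and openpyxl are available.")
```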

  2. Help

    You can see the help with the -h option.

    $ python DFASThelper.py -h
    

    You can also check the help for subcommands.

    $ python DFASThelper.py submit -h
    
  3. Single job submission

    Use -f (--fasta) to specify the path to a genomic FASTA file.
    Optionally, you can specify an additional reference database with the -d (--dataset) option.
    Available additional reference databases: lab, cyanobase, ecoli, bifidobacterium, hpylori

    $ python DFASThelper.py submit --fasta genome.fa --dataset lab 
    

    You can upload metadata with the -m (--metadata) option.
    See the example metadata files for details.

    $ python DFASThelper.py submit -f genome.fa -m metadata.txt -d lab 
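
    If you call the script from your own Python code, the command line shown above can be assembled programmatically. The sketch below is illustrative only: build_submit_command is a hypothetical helper, not part of DFASThelper.py. It uses only the options described in this section and checks the dataset name against the list given above before anything is sent.

```python
# Additional reference databases listed in this section.
DATASETS = ("lab", "cyanobase", "ecoli", "bifidobacterium", "hpylori")


def build_submit_command(fasta, dataset=None, metadata=None,
                         script="DFASThelper.py"):
    """Build the argument list for a single job submission.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    if dataset is not None and dataset not in DATASETS:
        raise ValueError(
            "unknown dataset %r; expected one of: %s"
            % (dataset, ", ".join(DATASETS))
        )
    cmd = ["python", script, "submit", "--fasta", fasta]
    if metadata is not None:
        cmd += ["--metadata", metadata]
    if dataset is not None:
        cmd += ["--dataset", dataset]
    return cmd


if __name__ == "__main__":
    print(" ".join(build_submit_command("genome.fa", dataset="lab",
                                        metadata="metadata.txt")))
```

    The returned list can be passed to subprocess.run(cmd, check=True) to execute the submission.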
    
  4. Batch job submission for multiple genomes

    Prepare an Excel worksheet in which each row contains the path to a FASTA file and its metadata. Please download the example and use it as a template.
    The values in the 'strain' column are used as the file names of the results, so use a name that is unique to each genome.
    Use the -r option to write the job submission result (job IDs and status) to a file, which will be required to download the result files.
    If you want to create DDBJ submission files, use the --strict option to enable strict metadata validation (recommended).
    If you don't need to create DDBJ submission files, leave the metadata cells blank.

    $ python DFASThelper.py submit -l genome_list.xlsx -r result_list.txt
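
    Because the 'strain' values become result file names, duplicated names are worth catching before submission. The following is a minimal sketch of such a check, not a feature of DFASThelper.py. It assumes the worksheet rows have already been loaded as a list of dictionaries keyed by column name; duplicate_strains is a hypothetical name, and 'strain' is the only column name taken from the text above.

```python
from collections import Counter


def duplicate_strains(rows):
    """Return strain names that occur more than once, in first-seen order.

    `rows` is assumed to be a list of dicts, one per worksheet row.
    Surrounding whitespace is ignored when comparing names.
    """
    counts = Counter(str(row["strain"]).strip() for row in rows)
    return [name for name, n in counts.items() if n > 1]


if __name__ == "__main__":
    # Hypothetical rows for illustration only.
    example = [{"strain": "strainA"}, {"strain": "strainB"}, {"strain": "strainA"}]
    print(duplicate_strains(example))
```

    With pandas and openpyxl installed, the rows could be obtained with pandas.read_excel("genome_list.xlsx").to_dict("records").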
    
  5. Job status / delete jobs

    Use -j (--jobids) to specify job ID(s), or -l (--list) to specify the job submission result file.

    $ python DFASThelper.py status -j XXXXX YYYYY ZZZZZ 
    $ python DFASThelper.py delete -l result_list.txt 
    
  6. Download result

    Use -f (--format) to specify the file format.
    Acceptable file formats: genbank, gff, genome, protein, cds, rna, tabulated, stats, zip, annt, seq
        zip: Zip archive that contains all files
        annt: Annotation file for DDBJ submission
        seq: Sequence file for DDBJ submission

    Use -o (--outfile) to save the result to a file; otherwise it will be displayed on the screen.

    $ python DFASThelper.py download -j XXXXX -f genbank
    

    To download multiple job results, use -O (--outdir) to specify the directory where the files will be written.

    $ python DFASThelper.py download -l result_list.txt -f annt -O result_dir
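
    The download options described above can be combined in a small wrapper that rejects unsupported formats and conflicting options before the script is called. This is a sketch under stated assumptions: build_download_command is a hypothetical helper, not part of DFASThelper.py, and passing several job IDs to the download subcommand is assumed from the "job ID(s)" wording in section 5.

```python
# File formats listed in this section.
FORMATS = ("genbank", "gff", "genome", "protein", "cds", "rna",
           "tabulated", "stats", "zip", "annt", "seq")


def build_download_command(file_format, job_ids=None, result_list=None,
                           outfile=None, outdir=None,
                           script="DFASThelper.py"):
    """Build the argument list for a download; the command is not executed."""
    if file_format not in FORMATS:
        raise ValueError("unsupported format: %r" % (file_format,))
    # Exactly one job source is expected: job IDs or a submission result file.
    if bool(job_ids) == bool(result_list):
        raise ValueError("give either job_ids or result_list, not both or neither")
    if outfile and outdir:
        raise ValueError("use either outfile or outdir, not both")

    cmd = ["python", script, "download"]
    if job_ids:
        cmd += ["--jobids", *job_ids]  # assumption: several IDs are accepted
    else:
        cmd += ["--list", result_list]
    cmd += ["--format", file_format]
    if outfile:
        cmd += ["--outfile", outfile]
    if outdir:
        cmd += ["--outdir", outdir]
    return cmd


if __name__ == "__main__":
    print(" ".join(build_download_command("annt",
                                          result_list="result_list.txt",
                                          outdir="result_dir")))
```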