SLiMSuite REST Server



rje_archive V0.7.3

KDM Archive Manager

Module: rje_archive
Description: KDM Archive Manager
Version: 0.7.3
Last Edit: 28/04/20

Copyright © 2017 Richard J. Edwards - See source code for GNU License Notice


Imported modules: rje rje_db rje_obj


See SLiMSuite Blog for further documentation. See rje for general commands.

Function

This module is for backing up data to the UNSW Research Data Store (RDS) and reporting on the status of backups. Details to be added.

Development notes

This module is designed to run in the following modes:

1. Backup. This archives a specific directory (or list) to a given data archive project.

2. Remove. This checks that directories are in the archive and then deletes them according to some criteria.

du testing -> parse '^(\d+)\s+(\S.+)$' -> size, directory
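The du parsing step noted above could be sketched as follows. This is a minimal illustration, not the module's own code: the pattern (a numeric size, whitespace, then the directory path) is an assumption about the intended regular expression, and `parse_du` and the sample text are hypothetical.

```python
import re

# Assumed pattern: numeric size, whitespace, then the directory path.
DU_LINE = re.compile(r'^(\d+)\s+(\S.+)$')

def parse_du(text):
    """Return a {directory: size} dict parsed from du output text."""
    sizes = {}
    for line in text.splitlines():
        match = DU_LINE.match(line)
        if match:
            # Group 1 is the size reported by du; group 2 is the path.
            sizes[match.group(2)] = int(match.group(1))
    return sizes

# Hypothetical du output, for illustration only.
sample = "4096\t./tools\n61440\t./dev/data\n"
du_sizes = parse_du(sample)
```

Lines that do not fit the pattern are skipped silently, so any summary or warning text mixed into the output is ignored rather than raising an error.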


ls -lt | head -2
total 60
drwxr-xr-x. 3 z3452659 unsw 4096 Mar 30 16:40 tools

-> Store this for each directory: it should not change between upload checks.

ls -1p dev | grep "/$" -cv -> Number of files
ls -1p dev | grep "/$" -c -> Number of directories
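A rough Python equivalent of the two counts above is sketched here, assuming only the immediate contents of the directory matter and that hidden entries are skipped, as ls does without -a. The function name `count_entries` is hypothetical. One known difference is that os.path.isdir follows symbolic links, which ls -p does not.

```python
import os
import tempfile

def count_entries(folder, include_hidden=False):
    """Return (n_files, n_dirs) for the immediate contents of folder.

    Hidden entries are skipped by default, as ls does without -a.
    """
    n_files = n_dirs = 0
    for name in os.listdir(folder):
        if name.startswith('.') and not include_hidden:
            continue
        if os.path.isdir(os.path.join(folder, name)):
            n_dirs += 1
        else:
            n_files += 1
    return n_files, n_dirs

# Small demonstration on a throwaway directory (hypothetical contents).
demo = tempfile.mkdtemp()
os.mkdir(os.path.join(demo, 'tools'))
for fname in ('a.fas', 'b.log', '.hidden'):
    open(os.path.join(demo, fname), 'w').close()
visible_counts = count_entries(demo)
all_counts = count_entries(demo, include_hidden=True)
```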


module add unswdataarchive/2015-09-10
upload.sh /home/z3452659/bioinf/redwards/projects/ManefieldPacBio-Sep15 "/UNSW_RDS/D0234444/"

NOTE: Will need to get the system set up so that it does not ask for a password. This can be achieved by requesting
a token from IT and then updating the config file.

The first time it runs, it will report X imported file(s). If the files are already uploaded, it will report 0 files
imported. All files are listed in "Consume" lines, though. This should (presumably) match the number of files listed
by ls with the -a flag, ignoring '.' and '..'? Or could try matching against:
rje.listDir(callobj=None,folder=os.getcwd(),subfolders=True,folders=True,files=True,summary=True,asksub=False,dircut=0,dirdepth=-1)

=> Make checking file numbers a toggle. (checknum=T/F)

$ upload.sh /srv/scratch/z3452659/CaneToad-May15/analysis/2016-11-22.Tyr "/UNSW_RDS/D0234445/CaneToad-May15/analysis"
Picked up _JAVA_OPTIONS: -Xmx1g
Password:
Consume: 2016-11-22.Tyr/BLASTFAS/F7CL37.fas [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/canetoad.20161122A.tyr_XENTR.vs.canetoad.20161122A.est_hits.fas [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/fiesta.ini [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/fiesta.log [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/gablam.log [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/tyr_XENTR.vs.canetoad.20161122A.fas [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/tyr_XENTR.vs.canetoad.20161122A.fas.nhr [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/tyr_XENTR.vs.canetoad.20161122A.fas.nin [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/tyr_XENTR.vs.canetoad.20161122A.fas.nog [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/tyr_XENTR.vs.canetoad.20161122A.fas.nsd [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/tyr_XENTR.vs.canetoad.20161122A.fas.nsi [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/tyr_XENTR.vs.canetoad.20161122A.fas.nsq [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/tyr_XENTR.vs.canetoad.20161122A.gablam.tdt [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/tyr_XENTR.vs.canetoad.20161122A.hitsum.tdt [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/tyr_XENTR.vs.canetoad.20161122A.local.tdt [construct: null, logical: null, encapsulation: null]
live: imported 15 file(s)

$ upload.sh /srv/scratch/z3452659/CaneToad-May15/analysis/2016-11-22.Tyr "/UNSW_RDS/D0234445/CaneToad-May15/analysis"
Picked up _JAVA_OPTIONS: -Xmx1g
Password:
Consume: 2016-11-22.Tyr/canetoad.20161122A.tyr_XENTR.vs.canetoad.20161122A.est_hits.fas [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/BLASTFAS/F7CL37.fas [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/fiesta.ini [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/fiesta.log [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/gablam.log [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/tyr_XENTR.vs.canetoad.20161122A.fas [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/tyr_XENTR.vs.canetoad.20161122A.fas.nhr [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/tyr_XENTR.vs.canetoad.20161122A.fas.nin [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/tyr_XENTR.vs.canetoad.20161122A.fas.nog [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/tyr_XENTR.vs.canetoad.20161122A.fas.nsd [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/tyr_XENTR.vs.canetoad.20161122A.fas.nsi [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/tyr_XENTR.vs.canetoad.20161122A.fas.nsq [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/tyr_XENTR.vs.canetoad.20161122A.gablam.tdt [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/tyr_XENTR.vs.canetoad.20161122A.hitsum.tdt [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/tyr_XENTR.vs.canetoad.20161122A.local.tdt [construct: null, logical: null, encapsulation: null]
live: imported 0 file(s)

#!# Should generate a log file that has the time, total number of files and total number imported.
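The numbers wanted for such a log could be pulled from the upload.sh output along these lines. This is a sketch that assumes the output always follows the "Consume:" and "imported N file(s)" format shown in the two runs above; `summarise_upload` is a hypothetical helper, not part of rje_archive.

```python
import re

# Patterns assumed from the example upload.sh output shown above.
CONSUME = re.compile(r'^Consume: (\S.*?) \[construct:')
IMPORTED = re.compile(r'imported (\d+) file\(s\)')

def summarise_upload(output):
    """Return (consumed_paths, imported_count) parsed from upload.sh output.

    imported_count is None if no 'imported N file(s)' line is found.
    """
    consumed = []
    imported = None
    for line in output.splitlines():
        match = CONSUME.match(line)
        if match:
            consumed.append(match.group(1))
            continue
        match = IMPORTED.search(line)
        if match:
            imported = int(match.group(1))
    return consumed, imported

# Abbreviated copy of the second run shown above.
sample = """Picked up _JAVA_OPTIONS: -Xmx1g
Consume: 2016-11-22.Tyr/fiesta.ini [construct: null, logical: null, encapsulation: null]
Consume: 2016-11-22.Tyr/fiesta.log [construct: null, logical: null, encapsulation: null]
live: imported 0 file(s)
"""
consumed, imported = summarise_upload(sample)
```

Comparing len(consumed) with a directory file count is the kind of check that checknum=T/F would toggle. A run that fails before the "imported" line is reached leaves the count as None, which distinguishes it from a run that imported 0 files.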

.name' failed: The namespace '/UNSW_RDS/D0234445/CaneToad-May15/delete' does not exist or is not accessible

Output

Add details of backups.tdt and archived.tdt here: (dir, project, rds, date, files, imports)
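As an illustration of the field list above, a tab-delimited summary row could be written as in this sketch. The column order follows the list in the note, but the exact file layout used by the module is not documented here, so treat it as an assumption. The date value is a made-up placeholder; the other values are taken from the example upload run earlier on this page.

```python
import csv
import io

# Field order taken from the note above; the real file layout is an assumption.
FIELDS = ['dir', 'project', 'rds', 'date', 'files', 'imports']

def backup_row(record):
    """Return the record's values in FIELDS order."""
    return [record[field] for field in FIELDS]

# Write to an in-memory buffer so the sketch has no side effects.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter='\t', lineterminator='\n')
writer.writerow(FIELDS)
writer.writerow(backup_row({
    'dir': '/srv/scratch/z3452659/CaneToad-May15/analysis/2016-11-22.Tyr',
    'project': 'CaneToad-May15',
    'rds': 'D0234445',
    'date': '2016-11-22',  # placeholder date, for illustration only
    'files': 15,
    'imports': 15,
}))
table_text = buffer.getvalue()
```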

Commandline

Main Archive Options

rds=X : UNSW_RDS ResData ID project code to use (e.g. D0234444) []
uploadsh=X : Full path for running upload.sh script ['/home/z3452659/unswdataarchive/upload.sh']
homedir=PATH : Home directory from which the archive script will be run ['~']
projects=FILE : Delimited file of Project and RDS code. If provided, will not use rds=X ['projects.tdt']
strict=T/F : Restrict processing to projects found in Projects file and add no new ones. [False]
backupdirs=LIST : List of directories to backup (should be project subdirectory full paths) []
archivedirs=LIST: List of directories to check archive and tar/delete (should be project subdirectory full paths) []
rmdirs=T/F : Delete archived directories. (Will ask if i>0) [False]
targz=T/F : Whether to tar and zip directories to be deleted [True]
checknum=T/F : Whether to check numbers of files consumed by upload.sh versus directory contents [True]
tryparent=T/F : Whether to try to run backup parent directory in case of failure [False]
basefile=FILE : This will set the 'root' filename for output files (FILE.*), including the log ['rds']
backupdb=FILE : File to output backup summaries into ['BASEFILE.backups.tdt']
archived=FILE : File to output archive summaries into ['BASEFILE.archived.tdt']
cleanup=T/F : Whether to perform post-upload cleanup of backups and archived files [True]
quiet=X : Min number of days of inactivity before a directory gets rated as quiet [1]
skipquiet=T/F : Whether to skip uploads for quiet directories [True]
checkarchive=T/F: Whether to run upload.sh on directories that have been uploaded in last run [False]
dormancy=X : Min number of days of inactivity before a directory gets rated as dormant (0=no dormancy) [30]
skipdormant=T/F : Whether to skip uploads for dormant directories [True]
dormant=FILE : File to output dormant directories into ['BASEFILE.dormant.tdt']
maxfiles=INT : Maximum number of files in a directory to generate backups (0 = no limit) [10000]
maxdirsize=INT : Maximum directory size in bytes to generate backup (0 = no limit) [1e11 (~100Gb)]
archivetgz=T/F : Whether to tar and zip then backup directories exceeding maxfiles=INT cutoff [False]
useqsub=T/F : Whether to use QSub for tarballing and then archiving the tarball [False]
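One way the quiet=X, dormancy=X, maxfiles=INT and maxdirsize=INT thresholds might be applied is sketched below. The function names, the way the two inactivity thresholds combine, and the inclusive comparisons are all assumptions made for illustration; the module's actual logic may differ.

```python
def activity_status(days_inactive, quiet=1, dormancy=30):
    """Classify a directory by its days of inactivity.

    Assumes dormancy takes precedence over quiet, and that
    dormancy=0 switches the dormant rating off.
    """
    if dormancy > 0 and days_inactive >= dormancy:
        return 'dormant'
    if days_inactive >= quiet:
        return 'quiet'
    return 'active'

def within_limits(n_files, dir_bytes, maxfiles=10000, maxdirsize=1e11):
    """Return True if a directory is under both caps (0 = no limit)."""
    files_ok = maxfiles == 0 or n_files <= maxfiles
    size_ok = maxdirsize == 0 or dir_bytes <= maxdirsize
    return files_ok and size_ok
```

Under these assumptions, a directory untouched for 45 days is rated dormant with the defaults and would be skipped when skipdormant=T, while the same directory with dormancy=0 is only rated quiet.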


History

Module Version History

    # 0.0.0 - Initial Compilation.
    # 0.1.0 - Initial (partially) functional version.
    # 0.2.0 - Updated functions with additional status measures and backups/archives division.
    # 0.2.1 - Fixed float division error.
    # 0.3.0 - Modified default homepath. Renamed backups to backupdb.
    # 0.3.1 - Fixed backups/backupdb bug.
    # 0.4.0 - Replaced module with uploadsh=X : Full path for running upload.sh script ['/share/apps/unswdataarchive/2015-09-10/']
    # 0.4.1 - Added tryparent=T/F : Whether to try to run backup parent directory in case of failure [True]
    # 0.5.0 - Updated to run on Mac with OSX=T.
    # 0.6.0 - Added toggle to skip quiet created/updated/modified.
    # 0.7.0 - Added maxfiles cap and option to targz directories exceeding threshold.
    # 0.7.1 - Set checkarchive=F tryparent=F by default to make standard running quicker.
    # 0.7.2 - Added maxdirsize=INT  : Maximum directory size in bytes to generate backup [1e11 (~100Gb)]
    # 0.7.3 - Python 2.6 compatibility.

© 2015 RJ Edwards. Contact: richard.edwards@unsw.edu.au.