SLiMSuite REST Server


rje_xml.py V0.2

XML (Text) Parsing Module

Module: rje_xml.py
Description: XML (Text) Parsing Module
Version: 0.2
Last Edit: 06/08/13

Copyright © 2006 Richard J. Edwards - See source code for GNU License Notice


Imported modules: rje rje_zen


See SLiMSuite Blog for further documentation. See rje for general commands.

Function

This module contains the XML class for parsing XML files. These are parsed into a generic set of nested dictionaries and XML objects stored in the root XML object. The main attributes of interest in the XML object for retrieving data are:

  • info['Name'] = element name
  • info['Content'] = text content of element (if any)
  • dict['Attributes'] = dictionary of element attributes (if any)
  • list['XML'] = list of XML objects containing nested elements (if any)

For example, given the following short XML:

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <database name="EnsEMBL" ftproot="ftp://ftp.ensembl.org/pub/" outdir="EnsEMBL">
      <file path="current_aedes_aegypti/data/fasta/pep/*.gz">Yellow Fever Mosquito</file>
      <file path="current_anopheles_gambiae/data/fasta/pep/*.gz">Malaria Mosquito</file>
    </database>

The following XML objects would be created:

  • XML root object:
      XML.info['Name'] = filename
      XML.list['XML'] = [XML1]
    • XML1:
        XML1.info['Name'] = 'database'
        XML1.info['Content'] = ''
        XML1.dict['Attributes'] = {'name':"EnsEMBL",'ftproot':"ftp://ftp.ensembl.org/pub/",'outdir':"EnsEMBL"}
        XML1.list['XML'] = [XML2,XML3]
      • XML2:
          XML2.info['Name'] = 'file'
          XML2.info['Content'] = 'Yellow Fever Mosquito'
          XML2.dict['Attributes'] = {'path':"current_aedes_aegypti/data/fasta/pep/*.gz"}
          XML2.list['XML'] = []
      • XML3:
          XML3.info['Name'] = 'file'
          XML3.info['Content'] = 'Malaria Mosquito'
          XML3.dict['Attributes'] = {'path':"current_anopheles_gambiae/data/fasta/pep/*.gz"}
          XML3.list['XML'] = []

The top level XML list (XML.list['XML']) is returned by the parseXML() method of the class, which populates all the objects.
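
The nested layout described above can be mimicked with a small stand-in class, which may help when writing code that walks the parsed objects. This is only a sketch: the `Node` class and the `build()` and `parse_xml()` helpers are hypothetical names, and the parsing is done with Python's standard `xml.etree.ElementTree` rather than with rje_xml itself. Only the `info`, `dict` and `list` keys are taken from the description above.

```python
import xml.etree.ElementTree as ET

# The example XML from above, as bytes so the ISO-8859-1 declaration is honoured.
SAMPLE = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<database name="EnsEMBL" ftproot="ftp://ftp.ensembl.org/pub/" outdir="EnsEMBL">
  <file path="current_aedes_aegypti/data/fasta/pep/*.gz">Yellow Fever Mosquito</file>
  <file path="current_anopheles_gambiae/data/fasta/pep/*.gz">Malaria Mosquito</file>
</database>
"""


class Node:
    """Hypothetical stand-in for an XML object (not the rje_xml class)."""

    def __init__(self, name, content="", attributes=None):
        self.info = {"Name": name, "Content": content}
        self.dict = {"Attributes": dict(attributes or {})}
        self.list = {"XML": []}


def build(element):
    """Convert one ElementTree element, and its children, into Node objects."""
    node = Node(element.tag, (element.text or "").strip(), element.attrib)
    node.list["XML"] = [build(child) for child in element]
    return node


def parse_xml(data, name="example.xml"):
    """Return the top-level list of nodes, as parseXML() is described to do."""
    root = Node(name)  # the root object is named after the source file
    root.list["XML"] = [build(ET.fromstring(data))]
    return root.list["XML"]


top_level = parse_xml(SAMPLE)
database = top_level[0]
for child in database.list["XML"]:
    # Print the text content and 'path' attribute of each nested <file> element.
    print(child.info["Content"], child.dict["Attributes"]["path"])
```

Walking the real objects should follow the same pattern: read `info['Name']` and `info['Content']` for the element itself, `dict['Attributes']` for its attributes, and recurse through `list['XML']` for nested elements.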

Commandline

- parse=FILE : Source XML file to parse [None]
- attributes=LIST : List of Attributes to exclusively extract []
- elements=LIST : List of Elements to exclusively extract []
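
How rje_xml applies the attributes=LIST and elements=LIST options internally is not documented here. One plausible reading, sketched below, is that an empty list keeps everything and a non-empty list keeps only the named items. The `extract()` function is a hypothetical illustration built on the standard `xml.etree.ElementTree` module, not part of rje_xml.

```python
import xml.etree.ElementTree as ET

DATA = b"""<database name="EnsEMBL" outdir="EnsEMBL">
  <file path="current_aedes_aegypti/data/fasta/pep/*.gz">Yellow Fever Mosquito</file>
</database>
"""


def extract(data, elements=(), attributes=()):
    """Return (tag, text, attributes) tuples for the selected elements.

    Assumption: an empty selection means "no restriction", mirroring the
    [] defaults listed above. This is a guess at the semantics.
    """
    rows = []
    for element in ET.fromstring(data).iter():
        if elements and element.tag not in elements:
            continue  # element not in the exclusive list
        kept = {
            key: value
            for key, value in element.attrib.items()
            if not attributes or key in attributes
        }
        rows.append((element.tag, (element.text or "").strip(), kept))
    return rows


only_files = extract(DATA, elements=["file"])
only_names = extract(DATA, attributes=["name"])
```

Here `only_files` holds just the `file` element, while `only_names` holds both elements with every attribute other than `name` dropped.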

History

Module Version History:

    # 0.0 - Initial Compilation.
    # 0.1 - Added xml.sax functions.
    # 0.2 - Added parsing from URL.

© 2015 RJ Edwards. Contact: richard.edwards@unsw.edu.au.