Module: rje_xml.py
Description: XML (Text) Parsing Module
Version: 0.2
Last Edit: 06/08/13
Copyright © 2006 Richard J. Edwards - See source code for GNU License Notice
Imported modules:
rje
rje_zen
See SLiMSuite Blog for further documentation. See rje for general commands.
Function
This module contains the XML class for parsing XML files. These are parsed into a generic set of nested dictionaries
and XML objects stored in the root XML object. The main attributes of interest in the XML object for retrieving data
are:
info['Name']
= element name
info['Content']
= text content of element (if any)
dict['Attributes']
= dictionary of element attributes (if any)
list['XML']
= list of XML objects containing nested elements (if any)
For example, the following short XML:
<?xml version="1.0" encoding="ISO-8859-1"?>
<database name="EnsEMBL" ftproot="ftp://ftp.ensembl.org/pub/" outdir="EnsEMBL">
  <file path="current_aedes_aegypti/data/fasta/pep/*.gz">Yellow Fever Mosquito</file>
  <file path="current_anopheles_gambiae/data/fasta/pep/*.gz">Malaria Mosquito</file>
</database>
The following XML objects would be created:
- XML root object:
XML.info['Name'] = filename
XML.list['XML'] = [XML1]
- XML1:
XML1.info['Name'] = 'database'
XML1.info['Content'] = ''
XML1.dict['Attributes'] = {'name':"EnsEMBL",'ftproot':"ftp://ftp.ensembl.org/pub/",'outdir':"EnsEMBL"}
XML1.list['XML'] = [XML2,XML3]
- XML2:
XML2.info['Name'] = 'file'
XML2.info['Content'] = 'Yellow Fever Mosquito'
XML2.dict['Attributes'] = {'path':"current_aedes_aegypti/data/fasta/pep/*.gz"}
XML2.list['XML'] = []
- XML3:
XML3.info['Name'] = 'file'
XML3.info['Content'] = 'Malaria Mosquito'
XML3.dict['Attributes'] = {'path':"current_anopheles_gambiae/data/fasta/pep/*.gz"}
XML3.list['XML'] = []
The parseXML() method of the class populates all of these objects and returns the top-level XML list
(XML.list['XML']).
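The nested structure described above can be sketched in a few lines of standalone Python. This is only an illustration of the data layout, not the module's own code: the XMLNode class and the build() and parse_xml_sketch() functions are hypothetical names, and the parsing is done with the standard library's xml.etree.ElementTree rather than rje_xml itself.

```python
import xml.etree.ElementTree as ET

# The example XML from above, as ISO-8859-1 encoded bytes.
EXAMPLE = b'''<?xml version="1.0" encoding="ISO-8859-1"?>
<database name="EnsEMBL" ftproot="ftp://ftp.ensembl.org/pub/" outdir="EnsEMBL">
  <file path="current_aedes_aegypti/data/fasta/pep/*.gz">Yellow Fever Mosquito</file>
  <file path="current_anopheles_gambiae/data/fasta/pep/*.gz">Malaria Mosquito</file>
</database>
'''

class XMLNode(object):
    """Hypothetical stand-in for the XML object (not the real rje_xml class)."""
    def __init__(self, name):
        self.info = {'Name': name, 'Content': ''}   # element name and text content
        self.dict = {'Attributes': {}}              # element attributes
        self.list = {'XML': []}                     # nested elements

def build(element):
    """Recursively convert an ElementTree element into an XMLNode."""
    node = XMLNode(element.tag)
    node.info['Content'] = (element.text or '').strip()
    node.dict['Attributes'] = dict(element.attrib)
    node.list['XML'] = [build(child) for child in element]
    return node

def parse_xml_sketch(data, name):
    """Build a root node named after the source and return its top-level list."""
    root = XMLNode(name)
    root.list['XML'] = [build(ET.fromstring(data))]
    return root.list['XML']

if __name__ == '__main__':
    xml1 = parse_xml_sketch(EXAMPLE, 'example.xml')[0]
    print(xml1.info['Name'], xml1.dict['Attributes']['ftproot'])
    for sub in xml1.list['XML']:
        print(sub.info['Content'], sub.dict['Attributes']['path'])
```

Retrieving data then amounts to walking list['XML'] at each level and reading info['Name'], info['Content'] and dict['Attributes'] from each object, as the loop at the end does for the two file elements.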
History
Module Version History
# 0.0 - Initial Compilation.
# 0.1 - Added xml.sax functions.
# 0.2 - Added parsing from URL.