My company would like to perform a complete overhaul of their metadata. They would like to use the ISO standard.
How can we go about this?
I have been researching and have done a lot of reading about metadata but I need to form a plan of action.
Tags: iso-19115, iso-19139, metadata
This post got me started on creating the XML file I needed with Python: https://stackoverflow.com/questions/3605680/creating-a-simple-xml-file-using-python
In my case, this code:
##############################
#Create the necessary XML file
##############################
import xml.etree.ElementTree as ET

# AMSRcsv (path to the source CSV), AMSRcsv_shortname (layer name) and
# AMSRcsv_vrt (output path) are assumed to be defined earlier in the script.
root = ET.Element("OGRVRTDataSource")
OGRVRTLayer = ET.SubElement(root, "OGRVRTLayer")
OGRVRTLayer.set("name", AMSRcsv_shortname)
SrcDataSource = ET.SubElement(OGRVRTLayer, "SrcDataSource")
SrcDataSource.text = AMSRcsv
GeometryType = ET.SubElement(OGRVRTLayer, "GeometryType")
GeometryType.text = "wkbPoint"
# Point geometry is built from the lon/lat/brightness columns of the CSV
GeometryField = ET.SubElement(OGRVRTLayer, "GeometryField")
GeometryField.set("encoding", "PointFromColumns")
GeometryField.set("x", "lon")
GeometryField.set("y", "lat")
GeometryField.set("z", "brightness")
tree = ET.ElementTree(root)
tree.write(AMSRcsv_vrt)
creates files like this:
<OGRVRTDataSource>
  <OGRVRTLayer name="GW1AM2_201301010834_032D_L1SGRTBR_1110110_channel89H">
    <SrcDataSource>G:\AMSR\GW1AM2_201301010834_032D_L1SGRTBR_1110110_channel89H.csv</SrcDataSource>
    <GeometryType>wkbPoint</GeometryType>
    <GeometryField encoding="PointFromColumns" x="lon" y="lat" z="brightness" />
  </OGRVRTLayer>
</OGRVRTDataSource>
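If you need one VRT per CSV file, the same steps can be wrapped in a small function. This is only a sketch: the function name build_vrt and the example file name are made up for illustration, the default column names lon, lat and brightness are the ones used in the code above, and ET.indent needs Python 3.9 or later.

```python
import xml.etree.ElementTree as ET


def build_vrt(csv_path, layer_name, x="lon", y="lat", z="brightness"):
    """Return an OGR VRT document (as a string) describing one point CSV."""
    root = ET.Element("OGRVRTDataSource")
    layer = ET.SubElement(root, "OGRVRTLayer", name=layer_name)
    ET.SubElement(layer, "SrcDataSource").text = csv_path
    ET.SubElement(layer, "GeometryType").text = "wkbPoint"
    ET.SubElement(layer, "GeometryField",
                  encoding="PointFromColumns", x=x, y=y, z=z)
    ET.indent(root)  # pretty-print in place; available from Python 3.9
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical file name, for illustration only
    print(build_vrt("example.csv", "example"))
```

Returning a string rather than writing the file directly makes it easy to inspect the result before saving it next to the CSV.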
One way of doing it is to generate a really simple XML tree for each record in your spreadsheet, write an XSLT stylesheet to translate from your simple XML to ISO 19139, then use lxml.etree.XSLT to transform.
The major advantage of this is that it keeps the complex ISO 19139 XML out of your Python code, which makes it much easier to debug. XSLT tutorials are easy to find online if you have not used it before.
Here's some example code:
from lxml import etree
import os.path


def list_to_xml(datalist, rootname):
    '''Transform a metadata record to a flat XML element'''
    root = etree.Element(rootname)
    for fld in datalist:
        col = fld[0]
        # may need to be careful of unicode encoding issues
        # when reading data from Excel
        dat = str(fld[1])
        child = etree.SubElement(root, col)
        child.text = dat
    return root


def transform(inxml, xslfile):
    # XSLT doesn't like backslashes in absolute paths...
    xslfile = os.path.abspath(xslfile).replace('\\', '/')
    xsl = etree.parse(xslfile)
    xslt = etree.XSLT(xsl)
    return xslt(inxml)


xslfile = 'test.xslt'

# Column headers from the Excel spreadsheet
header = ['column1', 'xmin', 'ymin', 'xmax', 'ymax', 'thelastcolumn']

# Data pulled out of the spreadsheet
# You would loop over all rows and do a transform on each row
row = ['A value', 1.2345, 5.4321, 2.3456, 6.5432, 'last value']

inxml = list_to_xml(zip(header, row), "ExcelMetadata")
print(etree.tostring(inxml, encoding='unicode'))
# The above prints - <ExcelMetadata><column1>A value</column1><xmin>1.2345</xmin><ymin>5.4321</ymin><xmax>2.3456</xmax><ymax>6.5432</ymax><thelastcolumn>last value</thelastcolumn></ExcelMetadata>

iso19139 = transform(inxml, xslfile)
print(str(iso19139))
The 'test.xslt' stylesheet referred to in the above code is:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:fn="http://www.w3.org/2005/xpath-functions"
    xmlns:xdt="http://www.w3.org/2005/xpath-datatypes"
    xmlns:gco="http://www.isotc211.org/2005/gco"
    xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns:gts="http://www.isotc211.org/2005/gts"
    xmlns:gsr="http://www.isotc211.org/2005/gsr"
    xmlns:gss="http://www.isotc211.org/2005/gss"
    xmlns:gmx="http://www.isotc211.org/2005/gmx"
    xmlns:gml="http://www.opengis.net/gml"
    xsi:schemaLocation="http://www.isotc211.org/2005/gmd http://www.isotc211.org/2005/gmd/gmd.xsd">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:template match="/ExcelMetadata">
    <gmd:fileIdentifier>
      <gco:CharacterString><xsl:value-of select="column1"/></gco:CharacterString>
    </gmd:fileIdentifier>
    <!-- Lots of other xml stuff -->
    <xsl:if test="normalize-space(thelastcolumn)">
      <gmd:someelement>
        <gco:CharacterString><xsl:value-of select="thelastcolumn"/></gco:CharacterString>
      </gmd:someelement>
    </xsl:if>
    <gmd:identificationInfo>
      <gmd:MD_DataIdentification>
        <gmd:extent>
          <gmd:EX_Extent>
            <gmd:geographicElement>
              <gmd:EX_GeographicBoundingBox>
                <gmd:extentTypeCode>
                  <gco:Boolean>1</gco:Boolean>
                </gmd:extentTypeCode>
                <gmd:westBoundLongitude>
                  <gco:Decimal><xsl:value-of select="xmin"/></gco:Decimal>
                </gmd:westBoundLongitude>
                <gmd:eastBoundLongitude>
                  <gco:Decimal><xsl:value-of select="xmax"/></gco:Decimal>
                </gmd:eastBoundLongitude>
                <gmd:southBoundLatitude>
                  <gco:Decimal><xsl:value-of select="ymin"/></gco:Decimal>
                </gmd:southBoundLatitude>
                <gmd:northBoundLatitude>
                  <gco:Decimal><xsl:value-of select="ymax"/></gco:Decimal>
                </gmd:northBoundLatitude>
              </gmd:EX_GeographicBoundingBox>
            </gmd:geographicElement>
          </gmd:EX_Extent>
        </gmd:extent>
      </gmd:MD_DataIdentification>
    </gmd:identificationInfo>
  </xsl:template>
</xsl:stylesheet>
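To run this over a whole spreadsheet rather than one hard-coded row, the flat XML can be built once per row in a loop. The sketch below assumes the spreadsheet has been exported to CSV with the same column headers as in the example; it uses the standard-library xml.etree.ElementTree instead of lxml for the row-to-XML step only, and the helper names and sample data are invented for illustration. Column headers must be valid XML element names for this to work.

```python
import csv
import io
import xml.etree.ElementTree as ET


def row_to_xml(row, rootname="ExcelMetadata"):
    """Turn one {column: value} mapping into a flat XML element."""
    root = ET.Element(rootname)
    for col, value in row.items():
        ET.SubElement(root, col).text = str(value)
    return root


def rows_to_xml(csv_text):
    """Yield one flat XML string per data row of a CSV export."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        yield ET.tostring(row_to_xml(row), encoding="unicode")


# Made-up sample data using the column names from the example
SAMPLE = (
    "column1,xmin,ymin,xmax,ymax,thelastcolumn\n"
    "A value,1.2345,5.4321,2.3456,6.5432,last value\n"
    "Another value,10.0,20.0,11.0,21.0,\n"
)

if __name__ == "__main__":
    for record in rows_to_xml(SAMPLE):
        print(record)
```

Each string produced this way could then be parsed with lxml (etree.fromstring) and passed to the transform function from the example. The second sample row leaves thelastcolumn empty, which is the case the xsl:if test with normalize-space in the stylesheet is there to skip.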
Best Answer
ISO 19139 is an XML encoding of ISO 19115 with additions from ISO 19119 to cover metadata for services that provide geospatial data. Whilst it is true that ISO 19115 (and thus also ISO 19139) allows a party to define a very comprehensive set of metadata, it is also true that these ISO standards allow you to define a very minimal set of metadata, based on the core metadata required by the ISO 19115 standard.
To put that another way, conformance to ISO 19115/ISO 19139 requires only a minimal set of metadata. The obligations used below are:
Mandatory (M): The metadata entity or metadata element shall be documented.
Conditional (C): The metadata entity or metadata element shall be documented if another entity or element has been documented, or if a condition is or isn’t met elsewhere.
The core elements are:
Dataset title (M): A unique title (within your metadata records) for your data.
Dataset reference date (M)
Geographic location of the dataset (by four coordinates or by geographic identifier) (C): If the metadata applies to a data set which is spatially referenced (such as a WMS), this is required.
Dataset language (M): Language(s) used within the dataset. Required even if the resource does not include any textual information; defaults to the Metadata language.
Dataset character set (C): Full name of the character encoding used for the data set. You must supply this character set if you are not using the ISO/IEC 10646-1 character set and if your character set is not defined by the document encoding.
Dataset topic category (M): Main theme(s) of the data set, described using the most appropriate term defined in the standard, such as ‘geoscientificInformation’, ‘economy’, or ‘imageryBaseMapsEarthCover’.
Metadata language (C): Language used to document the metadata. You must supply the metadata language if it is not defined by the document encoding.
Abstract defining the dataset (M): Brief narrative summary of the content of the resource.
Metadata character set (C): Full name of the character encoding used for the metadata set. You must supply this character set in your metadata if you are not using the ISO/IEC 10646-1 character set AND if your character set is not defined by the document encoding. Note that as most XML and HTML pages provide a character set as part of their own metadata, it is likely that you will not need to state this explicitly for your own layer metadata.
Metadata point of contact (M): Party responsible for the metadata information.
Metadata date stamp (M)
So what you probably need to do is work out which metadata you need to supply for your own benefit (the benefit of your organization), map it to the metadata defined by ISO 19115, and then map that to the relevant elements and attributes of ISO 19139. You may wish to put extra constraints on the content of an ISO 19139 instance document so that it conforms to your own standard, for example by restricting possible terms to a subset of those allowed by the standard. In such cases you can define those rules using Schematron.
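The mapping exercise can start life as a simple lookup table. The sketch below is an illustration only: the in-house field names (name, created, summary and so on) are hypothetical, the obligations are the M/C values listed in this answer, and the check only looks for the presence of mandatory elements. It does not evaluate the conditional rules, and it is no substitute for schema or Schematron validation of the final XML.

```python
# Core elements and their obligation as listed in this answer
# (M = mandatory, C = conditional)
CORE_ELEMENTS = {
    "Dataset title": "M",
    "Dataset reference date": "M",
    "Geographic location of the dataset": "C",
    "Dataset language": "M",
    "Dataset character set": "C",
    "Dataset topic category": "M",
    "Metadata language": "C",
    "Abstract defining the dataset": "M",
    "Metadata character set": "C",
    "Metadata point of contact": "M",
    "Metadata date stamp": "M",
}

# Hypothetical mapping from in-house field names to core elements
FIELD_MAP = {
    "name": "Dataset title",
    "created": "Dataset reference date",
    "lang": "Dataset language",
    "theme": "Dataset topic category",
    "summary": "Abstract defining the dataset",
    "contact": "Metadata point of contact",
    "modified": "Metadata date stamp",
}


def missing_mandatory(record, field_map=FIELD_MAP):
    """Return the mandatory core elements that `record` does not supply."""
    supplied = {
        field_map[field]
        for field, value in record.items()
        if field in field_map and str(value).strip()
    }
    return sorted(
        name for name, obligation in CORE_ELEMENTS.items()
        if obligation == "M" and name not in supplied
    )


if __name__ == "__main__":
    # Made-up record with an empty abstract and several fields absent
    record = {"name": "Roads 2014", "created": "2014-03-01",
              "summary": "", "lang": "eng"}
    for element in missing_mandatory(record):
        print("Missing:", element)
```

Running a check like this over existing records gives a quick picture of how much work the overhaul involves before any ISO 19139 XML is generated.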