TransWikia.com

Parse XML into Pandas Dataframe, Python 3.8, ElementTree

Stack Overflow Asked by a11 on February 22, 2021

Using ElementTree with Python 3.8, how can I convert the XML data below into a Pandas dataframe?

Example XML:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<SystemSourceSet id="-1" name="UC33brAvg_FM31" weight="0.5">
  <!-- This model is an example and for review purposes only -->
  <!-- Reference: UC33brAvg_FM31 -->
  <!-- Description: UCERF 3.3 Branch Averaged Solution (FM31)-->
  <Settings>
    <DefaultMfds>
      <IncrementalMfd floats="false" m="6.5" rate="0.0" type="SINGLE" weight="1.0"/>
    </DefaultMfds>
  </Settings>
  <Source>
    <IncrementalMfd m="6.449" rate="3.3631184e-05" type="SINGLE"/>
    <Geometry depth="1.3" dip="50.0" indices="0:1" rake="-90.0" width="15.273"/>
  </Source>
  <Source>
    <IncrementalMfd m="6.638" rate="1.5340160e-05" type="SINGLE"/>
    <Geometry depth="1.3" dip="50.0" indices="0:2" rake="-90.0" width="15.273"/>
  </Source>
  <Source>
    <IncrementalMfd m="6.78" rate="1.0903030e-05" type="SINGLE"/>
    <Geometry depth="1.3" dip="50.0" indices="0:3" rake="-90.0" width="15.273"/>
  </Source>
  <Source>
    <IncrementalMfd m="6.893" rate="7.3397665e-06" type="SINGLE"/>
    <Geometry depth="1.3" dip="50.0" indices="0:4" rake="-90.0" width="15.273"/>
  </Source>
</SystemSourceSet>

Expected Dataframe:

[image: expected dataframe]

One Answer

Navigate the tree manually and collect the data points you want to keep:

import pandas as pd
from xml.etree import ElementTree

root = ElementTree.parse('data.xml').getroot()
data = []
# Only the <Source> children of the root hold row data; <Settings> is skipped
for node in root.findall('Source'):
    mfd = node.find('IncrementalMfd')
    geometry = node.find('Geometry')
    # Element.get() returns each attribute value as a string
    data.append({
        'indices': geometry.get('indices'),
        'IncrementalMfd m': mfd.get('m'),
        'rate': mfd.get('rate'),
        'type': mfd.get('type'),
        'Geometry depth': geometry.get('depth'),
        'dip': geometry.get('dip'),
        'rake': geometry.get('rake'),
        'width': geometry.get('width')
    })

df = pd.DataFrame(data)
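
ElementTree returns every attribute as a string, so the columns built above end up as text rather than numbers. Below is a minimal sketch of one way to convert while collecting the rows. It assumes that `m`, `rate`, `depth`, `dip`, `rake` and `width` should be floats; the helper name `source_rows`, the `NUMERIC` set and the inlined XML string are illustrative and not part of the original answer. It also uses the raw attribute names as keys instead of the renamed columns above.

```python
from xml.etree import ElementTree

# Trimmed copy of the question's XML, inlined so the sketch runs without data.xml
XML = """<SystemSourceSet id="-1" name="UC33brAvg_FM31" weight="0.5">
  <Source>
    <IncrementalMfd m="6.449" rate="3.3631184e-05" type="SINGLE"/>
    <Geometry depth="1.3" dip="50.0" indices="0:1" rake="-90.0" width="15.273"/>
  </Source>
  <Source>
    <IncrementalMfd m="6.638" rate="1.5340160e-05" type="SINGLE"/>
    <Geometry depth="1.3" dip="50.0" indices="0:2" rake="-90.0" width="15.273"/>
  </Source>
</SystemSourceSet>"""

# Assumption: these attributes are numeric; 'type' and 'indices' stay as strings
NUMERIC = {'m', 'rate', 'depth', 'dip', 'rake', 'width'}


def source_rows(root):
    """Return one dict per <Source>, with numeric attributes cast to float."""
    rows = []
    for node in root.findall('Source'):
        row = {}
        for child_tag in ('IncrementalMfd', 'Geometry'):
            child = node.find(child_tag)
            if child is None:
                continue  # tolerate a <Source> that lacks this child
            for key, value in child.attrib.items():
                row[key] = float(value) if key in NUMERIC else value
        rows.append(row)
    return rows


rows = source_rows(ElementTree.fromstring(XML))
print(rows[0]['rate'], rows[1]['indices'])
```

Passing `rows` to `pd.DataFrame(rows)` should then give float columns for the numeric attributes instead of string (object) columns.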

Correct answer by Code Different on February 22, 2021
