How to extract data and attribute data from an XMLElement

Mathematica Asked by user74549 on December 25, 2020

I am trying to write a function to extract both the data and the attribute data from an XMLElement in the same pass. I have followed the examples from Mathematica’s Transforming XML tutorial and the Mathematica StackExchange solution here:

Extract attribute data from an XMLElement

However, the function I wrote returns the empty list. I believe the Cases function traverses the XML string once and, consequently, does not parse the XMLElement for the second time as I thought it would.

My MWE starts by reading fleetXMLString from a file with Import[fName]:

XMLObject["Document"][
 {XMLObject["Declaration"]["Version" -> "1.0", "Encoding" -> "utf-8"]},
 XMLElement["Fleet", {}, {
   XMLElement["SomeVehicle", {}, {
     XMLElement["Name", {}, {"BJ#00"}],
     XMLElement["Bus", {}, {
       XMLElement["Shape", {}, {"parallelepiped"}],
       XMLElement["Length", {"unit" -> "Distance"}, {"0.5"}],
       XMLElement["Width", {"unit" -> "Distance"}, {"0.4"}],
       XMLElement["Height", {"unit" -> "Distance"}, {"0.3"}],
       XMLElement["Density", {"unit" -> "Density"}, {"500.0"}]}]}],
   XMLElement["SomeVehicle", {}, {
     XMLElement["Name", {}, {"BJ#01"}],
     XMLElement["Bus", {}, {
       XMLElement["Shape", {}, {"parallelepiped"}],
       XMLElement["Length", {"unit" -> "Distance"}, {"0.5"}],
       XMLElement["Width", {"unit" -> "Distance"}, {"0.4"}],
       XMLElement["Height", {"unit" -> "Distance"}, {"0.3"}],
       XMLElement["Density", {"unit" -> "Density"}, {"500.0"}]}]}]}],
 {}]

The two functions below give the expected results

BusPhysParam[xmlString_, name_, pName_] :=
 Cases[
  Cases[xmlString,
   XMLElement["SomeVehicle", _, {___,
     XMLElement["Name", _, {name}], ___}], Infinity],
  XMLElement["Bus", _, {___,
     XMLElement[pName, _, {dim_}], ___}] :> ToExpression[dim], 
  Infinity]
BusPhysParam[fleetXMLString, "BJ#00", "Width"]

{0.4}

and

BusPhysParamUnit[xmlString_, name_, pName_] :=
 Cases[
  Cases[xmlString,
   XMLElement["SomeVehicle", _, {___,
     XMLElement["Name", _, {name}], ___}], Infinity],
  XMLElement["Bus", _, {___,
     XMLElement[pName, {___, "unit" -> unit_}, ___], ___}] :> unit, 
  Infinity]
BusPhysParamUnit[fleetXMLString, "BJ#00", "Width"]

{Distance}

However, this function returns the empty list

BusPhysParamMod[xmlString_, name_, pName_] :=
 Cases[
  Cases[xmlString,
   XMLElement["SomeVehicle", _, {___,
     XMLElement["Name", _, {name}], ___}], Infinity],
  XMLElement["Bus", _, {___,
     XMLElement[pName, _, {dim_}], ___,
     XMLElement[pName, {___, "unit" -> unit_}, ___]}] :> {ToExpression[dim], 
    unit}, Infinity]
BusPhysParamMod[fleetXMLString, "BJ#00", "Width"]

Is there a way to extract both the value and attribute at the same time?
Thank you!
B

2 Answers

BusPhysParamMod[xmlString_, name_, pName_] :=
 Cases[
  Cases[xmlString,
   XMLElement["SomeVehicle", _, {___,
     XMLElement["Name", _, {name}], ___}], Infinity],
  XMLElement["Bus", _, {___,
     XMLElement[pName, {___, "unit" -> unit_}, {dim_}], ___}] :>
   {ToExpression[dim], unit},
  Infinity]
BusPhysParamMod[fleetXMLString, "BJ#00", "Width"]
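
For anyone wondering why this version works where the one in the question returned the empty list, here is a minimal sketch on a cut-down Bus element. The names bus, twoChildren and oneChild are made up for this illustration and are not part of either post. The pattern from the question asks for two separate child elements named pName, one holding the value and a later one holding the attribute. The data has only one such child, carrying both, so that pattern can never match.

```mathematica
(* A single Bus element, trimmed from the question's data *)
bus = XMLElement["Bus", {}, {
    XMLElement["Shape", {}, {"parallelepiped"}],
    XMLElement["Width", {"unit" -> "Distance"}, {"0.4"}]}];

(* Shape of the question's pattern: TWO children named "Width", in sequence *)
twoChildren = XMLElement["Bus", _, {___,
    XMLElement["Width", _, {_}], ___,
    XMLElement["Width", {___, "unit" -> _}, ___]}];

(* Shape of the answer's pattern: ONE child with both attribute and value *)
oneChild = XMLElement["Bus", _, {___,
    XMLElement["Width", {___, "unit" -> _}, {_}], ___}];

MatchQ[bus, twoChildren]  (* False *)
MatchQ[bus, oneChild]     (* True *)
```

On the fleet data from the question, the call above should therefore return {{0.4, "Distance"}}.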

Correct answer by Jean-Pierre on December 25, 2020

I think that using Cases to break apart XML is probably not the right way to go. When working with a nested structure like this, the natural approach is recursion. The following code could use polishing, but it suggests a general way to deal with XML. The basic idea is to find the head of the structure, strip it off, and then recurse on what's left. Given the myriad forms that XML can take, this lets you customize the code to whatever kind of nesting XML throws at you, and the parsing is pretty mechanical once you understand the XML nesting (which unfortunately you must do if you want to work with XML!).

trf[XMLObject["Document"][decl_, rest_, {}], bS_] := trf[rest, bS];
trf[XMLElement["Fleet", {}, fleetL_], bS_] := trf[#, bS] & /@ fleetL;
(* === (SameQ) always gives True or False; == can stay unevaluated here *)
trf[XMLElement["SomeVehicle", {}, vehL_], bS_] := 
  If[vehL[[1]] === XMLElement["Name", {}, {bS}],
   trf[Rest[vehL]]];
trf[{XMLElement["Bus", {}, busAttL_]}] := busAttGL = busAttL;

(* called with *)
trf[fleetXMLString, "BJ#00"];

  1. The first function strips off the document header and the declaration.

  2. The second one pulls off the "Fleet" and maps across the list of vehicles in the fleet, because there are potentially many vehicles.

  3. The third function picks out the specific vehicle you are looking for (say "BJ#00") with the If statement and then continues the recursion, stripping off the "SomeVehicle" XMLElement.

  4. The final step takes the list of attributes and stuffs it into a global list for further processing (maybe not the most elegant part), but I think this is the general solution to the problem you are posing, which at a high level is "how can I extract the traits of a specific vehicle?" This leaves all the traits together in a form that can be post-processed as you need.

  5. This looks like {XMLElement["Shape", {}, {"parallelepiped"}], XMLElement["Length", {"unit" -> "Distance"}, {"0.5"}], XMLElement["Width", {"unit" -> "Distance"}, {"0.4"}], XMLElement["Height", {"unit" -> "Distance"}, {"0.3"}], XMLElement["Density", {"unit" -> "Density"}, {"500.0"}]}

  6. From here you can extract the attributes by manipulating the list directly, or if you prefer you can use Cases like this:

selectedAttr=Cases[busAttGL, XMLElement["Width", ___, ___]][[1]]

Then you can pull the unit out of the attribute rule, using Cases again, and pair it with the value:

{"unit" /. First[Cases[selectedAttr, {"unit" -> _}]], selectedAttr[[3]][[1]]}

I would probably just go into the list itself:

{selectedAttr[[2]][[1]][[2]], selectedAttr[[3]][[1]]}

Both produce {"Distance", "0.4"} which is what you are after. This makes it fairly straightforward to do any permutation of bus attributes etc.
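
A possibly tidier variant, offered only as a sketch: a single Cases rule can bind the unit and the value at once, and the same rule can build a lookup table for every parameter that carries a unit. The names busAttL and busParams below are assumptions for this example; busAttL stands in for the list stored in busAttGL above.

```mathematica
(* Stand-in for the attribute list that trf stores in busAttGL *)
busAttL = {
   XMLElement["Shape", {}, {"parallelepiped"}],
   XMLElement["Length", {"unit" -> "Distance"}, {"0.5"}],
   XMLElement["Width", {"unit" -> "Distance"}, {"0.4"}],
   XMLElement["Height", {"unit" -> "Distance"}, {"0.3"}],
   XMLElement["Density", {"unit" -> "Density"}, {"500.0"}]};

(* One pattern binds both the unit attribute and the string value *)
Cases[busAttL,
 XMLElement["Width", {___, "unit" -> u_, ___}, {v_}] :> {u, v}]
(* {{"Distance", "0.4"}} *)

(* Same idea for all parameters with a unit, keyed by element name *)
busParams = Association @ Cases[busAttL,
    XMLElement[p_, {___, "unit" -> u_, ___}, {v_}] :> (p -> {u, v})];

busParams["Width"]  (* {"Distance", "0.4"} *)
```

Elements without a "unit" attribute, such as "Shape", are skipped by this pattern, so they would need a separate rule if you want them too.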

Both of these are kind of unpretty and I'm sure somebody on the forum will suggest a cleaner way to go from the XMLElement form to what you want.

Answered by Mike Colacino on December 25, 2020
