Tommi Prami

Trouble with (very) simple XML-parsing


(I use Delphi 10.3.1)

 

I've got this code:

 

uses
  Winapi.msxml, Winapi.MSXMLIntf;

procedure TForm10.Button1Click(Sender: TObject);
var
  LDocument: IXMLDomDocument3;
  LCurrrentValueNode: IXMLDOMNode;
  LNodeList: IXMLDOMNodeList;
  I: Integer;
begin
  LDocument := CoDOMDocument60.Create;
  LDocument.async := False;
  LDocument.validateOnParse := False;
  LDocument.ResolveExternals := False;
  LDocument.PreserveWhiteSpace := True;

  LDocument.loadXML(Memo1.Lines.Text);
  if LDocument.parseError.errorCode <> 0 then
    raise Exception.CreateFmt('XML parsing error : %s at row %d position %d',
      [LDocument.parseError.reason, LDocument.parseError.Line, LDocument.parseError.linepos]);

  if Assigned(LDocument.documentElement) then
    LNodeList := LDocument.documentElement.getElementsByTagName('IBAN')
  else
    LNodeList := LDocument.getElementsByTagName('IBAN');

  for I := 0 to LNodeList.Length - 1 do
  begin
    LCurrrentValueNode := LNodeList.Item[I];
    Memo2.Lines.Add(LCurrrentValueNode.nodeValue);
  end;

  repeat
    LCurrrentValueNode := LNodeList.nextNode;
    if Assigned(LCurrrentValueNode) then
      Memo2.Lines.Add(LCurrrentValueNode.nodeValue)
    else
      Break;
  until False; // leave only via Break, once nextNode returns nil
end;

and this XML:

 

<?xml version="1.0" encoding="UTF-8"?>
<Document xmlns="urn:iso:std:iso:20022:tech:xsd:camt.053.001.02">
  <BkToCstmrStmt>
    <Stmt>
      <Acct>
        <Id>
          <IBAN>FI0434270410003403</IBAN>
        </Id>
      </Acct>
    </Stmt>
    <Stmt>
      <Acct>
        <Id>
          <IBAN>FI7542316072000169</IBAN>
        </Id>
      </Acct>
    </Stmt>
  </BkToCstmrStmt>
</Document>

Any idea why getElementsByTagName does not return the tags it should (I think)?

 

Once again I set out to do something pretty simple, and nothing works. Now I remember why I don't enjoy M$Xml too much.

 

Also, if you can recommend a better replacement XML library, feel free to do so.

Edited by Tommi Prami


With the OmniXML DOM I got slightly closer:

 

procedure TForm10.Button2Click(Sender: TObject);
var
  LXMlDocument: IXMLDocument;
  LDomList: IDOMNodeList;
  LValueNode: IDOMNode;
  I: Integer;
begin
  DefaultDOMVendor := sOmniXmlVendor;
  LXMlDocument := LoadXMLData(Memo1.Lines.Text);
  LDomList := LXMlDocument.DOMDocument.documentElement.getElementsByTagName(TAG_IBAN);

  for I := 0 to LDomList.length - 1 do
  begin
    LValueNode := LDomList[I];
    Memo2.Lines.Add(LValueNode.nodeName + ': ' + LValueNode.nodeValue);
  end;
end;

It finds the tags, but nodeValue returns an empty string.
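
One possible explanation (an assumption, nobody in this thread confirms it): in the W3C DOM, nodeValue is defined as null for element nodes, and the text lives in a child text node. A minimal sketch of reading that text through the Xml.xmldom interfaces follows. ElementText is a hypothetical helper name, and whether a given DOM vendor implements IDOMNodeEx is also an assumption, which is why the sketch falls back to walking the child nodes.

```delphi
uses
  System.SysUtils, Xml.xmldom;

// Hypothetical helper (not from the thread): returns the text of an element.
function ElementText(const ANode: IDOMNode): string;
var
  LNodeEx: IDOMNodeEx;
  LChild: IDOMNode;
begin
  Result := '';
  if not Assigned(ANode) then
    Exit;

  // Vendors that implement IDOMNodeEx expose the node's text directly.
  if Supports(ANode, IDOMNodeEx, LNodeEx) then
    Exit(LNodeEx.text);

  // Fallback: concatenate the direct text and CDATA children.
  LChild := ANode.firstChild;
  while Assigned(LChild) do
  begin
    if (LChild.nodeType = TEXT_NODE) or (LChild.nodeType = CDATA_SECTION_NODE) then
      Result := Result + LChild.nodeValue;
    LChild := LChild.nextSibling;
  end;
end;
```

In the loop above this would be used as Memo2.Lines.Add(LValueNode.nodeName + ': ' + ElementText(LValueNode)).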

 

-Tee-


Doesn't getElementsByTagName(..) only list direct descendants? You will either have to search recursively, or use something like an XPath expression to get all the nodes you are interested in.


At least all the documentation I've found says that it should find the nested ones as well, both the M$Xml docs and the more generic DOM ones.

 

Whatever the implementation actually does is a totally different story.

 

-Tee-

1 hour ago, Tommi Prami said:

if Assigned(LDocument.documentElement) then
  LNodeList := LDocument.documentElement.getElementsByTagName('IBAN')
else
  LNodeList := LDocument.getElementsByTagName('IBAN');

 

Why do you have this test? I would think that the second variant (LDocument.getElementsByTagName) would be good enough...

 

I also think it would be safer if you used XPath or specified the path in your search criteria:

LDocument.getElementsByTagName('/BkToCstmrStmt/Stmt/Acct/Id/IBAN');

 


The first test was there just to see whether there is a document element and which branch it would use. It seems to depend on something. I've done so many tests that I can't really give specifics on that.

 

At least that XPath won't work, because the structure of the XML is not always the same. It has to be more structure-agnostic.
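
For what it's worth, XPath itself can be structure-agnostic: the // axis matches an element at any depth. The sketch below uses MSXML 6 under two assumptions that are not from this thread: the ListIbans name and the c: prefix are made up for illustration, and the default namespace of the sample document has to be registered via the SelectionNamespaces property before a prefixed XPath can match anything. Values are read through the text property, since nodeValue is null for element nodes.

```delphi
uses
  System.SysUtils, System.Classes, Winapi.msxml, Winapi.MSXMLIntf;

// Hypothetical helper: adds the text of every IBAN element, at any depth,
// to AOutput. The namespace URI is the one from the sample document.
procedure ListIbans(const AXml: string; const AOutput: TStrings);
var
  LDocument: IXMLDOMDocument3;
  LNodes: IXMLDOMNodeList;
  I: Integer;
begin
  LDocument := CoDOMDocument60.Create;
  LDocument.async := False;
  if not LDocument.loadXML(AXml) then
    raise Exception.Create('XML parsing error: ' + LDocument.parseError.reason);

  // Bind the (assumed) prefix "c" to the document's default namespace.
  LDocument.setProperty('SelectionNamespaces',
    'xmlns:c="urn:iso:std:iso:20022:tech:xsd:camt.053.001.02"');

  // "//" selects matching elements wherever they sit in the tree.
  // A namespace-independent alternative is '//*[local-name()="IBAN"]'.
  LNodes := LDocument.selectNodes('//c:IBAN');

  for I := 0 to LNodes.length - 1 do
    AOutput.Add(LNodes.item[I].text); // text, not nodeValue
end;
```

Called from the form as ListIbans(Memo1.Lines.Text, Memo2.Lines).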

 

 


I also tested with IXMLDocument from Xml.XMLIntf.pas, both with and without OmniXML as the vendor.

 

I made my own recursive routine, based on the OmniXML implementation. It finds the nodes just fine, but not the node values, which seems super weird.

 

function GetElementsByTagName(const ADocumentElement: IDOMElement; const AElementName: string): TList<IDOMNode>;

  procedure InternalGetElementsByTagName(const ANode: IDOMNode; const AElementName: string;
    const ANodeList: TList<IDOMNode>);
  var
    I: Integer;
    LChildNode: IDOMNode;
  begin
    if ANode.HasChildNodes then
      for I := 0 to ANode.ChildNodes.Length - 1 do
      begin
        LChildNode := ANode.ChildNodes.Item[I];
        if (LChildNode.NodeType = ELEMENT_NODE) and ((LChildNode as IDOMElement).NodeName = AElementName) then
          ANodeList.Add(LChildNode);
        InternalGetElementsByTagName(LChildNode, AElementName, ANodeList);
      end;
  end;

begin
  Result := TList<IDOMNode>.Create;

  InternalGetElementsByTagName(ADocumentElement, AElementName, Result);
end;

 

Edited by Tommi Prami

