Tommi Prami

How to compare msXML nodes


I have this code:

function FindParentNodeByName(const AChildNode: IXMLDOMNode; const AParentNodeName: string): IXMLDOMNode;
var
  LParentNode: IXMLDOMNode;
begin
  Result := nil;

  if not Assigned(AChildNode) then
    Exit;

  LParentNode := AChildNode;
  repeat
    LParentNode := LParentNode.parentNode;

    if Assigned(LParentNode) and (LParentNode.nodeName = AParentNodeName) then
      Exit(LParentNode);
  until not Assigned(LParentNode)
end;

 

It finds the node OK, but if I call it twice with the same parameters, the results are not the same, meaning their pointers are not equal.

 

So if I have two arbitrary IXMLDOMNodes from the same XML document, how can I be sure that they are actually the same node if I can't compare the pointers?

 

So if I have something like:

LParent1 := FindParentNodeByName(LNode1, 'somenode');

LParent2 := FindParentNodeByName(LNode1, 'somenode');

 

if LParent1 = LParent2 then ShowMessage('Same'); // Never works

So I need some way to check that they are actually the same node, like

if LParent1.IsExactlySameNode(LParent2) then ShowMessage('Same');

 

-Tee-

 

PS. Not sure if this is actually the right group; move it if it's in the wrong one...

 

Guest

If I understand your task, I'd walk upwards and check that I get exactly the same "path" up to the root. But I might be mistaken.

Guest

I do a similar comparison by copying one of them into a temp (an IXMLDOMNode in this case), then walking the second one node by node (children first) while deleting the matching nodes from the temp. Either the process stops in the middle because a delete failed, or you are left with an empty temp, indicating they are the same.

6 hours ago, Tommi Prami said:

So if I have something like:


LParent1 := FindParentNodeByName(LNode1);

LParent2 := FindParentNodeByName(LNode1);

 

if LParent1 = LParent2 then ShowMessage('Same'); // Won't work ever

That makes no sense.  If you pass the SAME AChildNode node into both calls, with the same AParentNodeName value (which you didn't show), then they will return the same node on output.  Perhaps you meant LNode2 instead of LNode1 in the second call?

 

If you pass in DIFFERENT node pointers on input, then obviously you are going to get DIFFERENT node pointers in output, unless they have a COMMON parent node, in which case you WILL then get the same node pointer on output.  Unless, for some reason, Microsoft decided to make the IXMLDOMNode.parentNode property getter allocate a new implementation object every time it is called, which is unlikely (but easy to test).

6 hours ago, Tommi Prami said:

So I should have some way to compare they are actually same node like

If you can't compare the node pointers directly, then you have to compare the content they represent.  There is no getting around that.

 

Can you show an example XML and code demonstrating exactly what you are trying to accomplish?  How are you retrieving the IXMLDOMNode pointers that you are trying to compare?

 


The problem is this: I have two or more nodes. There are one or more nodes in the tree with the same level and name, so there might be more than one.

 

I'll try to illustrate.

 

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <note>
    <to>Tove</to>
    <from>Jani</from>
    <heading>Reminder</heading>
    <body>Beer this week end, please</body>
    <footer>and some other beverages</footer>
  </note>
  <note>
    <to>Big Lebowski</to>
    <from>Stephen King</from>
    <heading>FYI</heading>
    <body>Not on my rug</body>
    <footer>man</footer>
  </note>
</root>

 

Say I've got a list of <from> nodes in my hands, for the sake of example 7 of them from that XML, with lots of duplicates. Sadly, this is all I have at this point (for the sake of example). I need to know whether they are children of the same node or not. Let's say I would like to throw away the duplicates, or whatever...

 

To my surprise, even though my code returns the same node, the pointer is not the same. I've debugged this and also have a unit test that proves it.

 

Is there any way in msXML to tell whether two nodes are the same when the variables/pointers are not equal? Is this because it's an interface? I don't know.

 

But the nodes returned don't have the same value in the LNode1 and LNode2 variables in that case, even though I am sure the node is the same... The XML I was working on had only one parent node, so it made me really question my mental health. But I've got code to prove it. I'll try to make a runnable demo on the weekend if I have time.

 

A path is not enough, because the path can be the same all the way up to the root.

I am not fully awake, so I just can't manage to compress my thoughts into fewer words.

-Tee-
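
One thing that may be worth testing here, offered as a sketch rather than a confirmed fix: under COM's identity rule, calling QueryInterface for IUnknown on any interface of one object is required to return the same pointer every time, so two IXMLDOMNode pointers can differ while their IUnknown pointers still match. The helper name SameComObject below is made up for illustration, and whether MSXML's node wrappers really honour that rule for the same underlying node is an assumption that a unit test should confirm.

```delphi
// Hypothetical helper, not part of msxml: compares COM object identity.
// Assumes the objects follow the COM rule that QueryInterface(IUnknown)
// returns the same pointer for one object, whichever interface is asked.
function SameComObject(const A, B: IInterface): Boolean;
var
  LUnkA, LUnkB: IInterface;
begin
  // Two nil references count as the same; nil versus non-nil does not.
  if (A = nil) or (B = nil) then
    Exit((A = nil) and (B = nil));

  // IInterface and IUnknown share one GUID, so this asks for IUnknown.
  Result := (A.QueryInterface(IInterface, LUnkA) = S_OK) and
            (B.QueryInterface(IInterface, LUnkB) = S_OK) and
            (Pointer(LUnkA) = Pointer(LUnkB));
end;
```

With that in place, the check from the first post would read `if SameComObject(LParent1, LParent2) then ShowMessage('Same');`.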


Do you have to stick to msxml? From what I see it's pretty awkward and in Delphi you have tons of native implementations

40 minutes ago, Fr0sT.Brutal said:

Do you have to stick to msxml? From what I see it's pretty awkward and in Delphi you have tons of native implementations

My first suggestion was not to use msXML, and guess what answer I got...

 

-Tee-

Guest

Would comparing the paths (attributes, values, ...) do?

Like this:

function CompareIXMLDOMNode(const Node1, Node2: IXMLDOMNode; IncludeAttribute:
  Boolean = True; IgnoreOrder: boolean = False; IgnoreDuplicated: Boolean =
  False): Boolean;

  function GetIXMLDOMNodeAttributes(NodeMap: IXMLDOMNamedNodeMap; const
    Separator: string = '&'; AttributeCount: Integer = 100): string;
  var
    i,Len: Integer;
  begin
    Result := ''; // a string Result is not guaranteed to start out empty
    if Assigned(NodeMap) and (NodeMap.length > 0) then
    begin
      Len := NodeMap.length;
      if Len > AttributeCount then
        Len := AttributeCount;
      Dec(Len);
      for i := 0 to Len do
        if not VarIsNull(NodeMap.item[i].nodeValue) then
          Result := Result + Separator + NodeMap.item[i].nodeName + '=' +
            VarToStr(NodeMap.item[i].nodeValue);
    end;
  end;

  function GetIXMLDOMNodePath(Node: IXMLDOMNode; IncludeAttribute: Boolean =
    True; const Separator: string = ''; PathDepth: Integer = 100): string;
  var
    stAttribute: string;
  begin
    Result := ''; // ends the recursion with an empty path above the top node
    if (PathDepth > 0) and Assigned(Node) then
    begin
      stAttribute := '';
      if IncludeAttribute then // the flag was previously ignored here
        stAttribute := GetIXMLDOMNodeAttributes(Node.attributes);
      if stAttribute <> '' then
        stAttribute := '{' + stAttribute + '}';
      Result := GetIXMLDOMNodePath(Node.parentNode, IncludeAttribute, Separator,
        PathDepth - 1) + stAttribute + Node.nodeName + Separator;
    end;
  end;

  procedure GetIXMLDOMNodeAsText(Node: IXMLDOMNode; XMLTextList: TStringList;
    IncludeAttribute: Boolean = True; const Separator: string = ''; PathDepth:
    Integer = 100);
  var
    I: Integer;
  begin
    if Node.nodeValue = Null then
      XMLTextList.Add(GetIXMLDOMNodePath(Node, IncludeAttribute, Separator,
        PathDepth - 1))
    else
      XMLTextList.Add(GetIXMLDOMNodePath(Node, IncludeAttribute, Separator,
        PathDepth - 1) + Node.nodeValue);

    if (PathDepth > 0) and (Node.childNodes.length > 0) then
      for I := 0 to Node.childNodes.length - 1 do
        GetIXMLDOMNodeAsText(Node.childNodes.item[I], XMLTextList,
          IncludeAttribute, Separator, PathDepth - 1);
  end;

var
  slNode1, slNode2: TStringList;
begin
  Result := False;
  if GetIXMLDOMNodePath(Node1) = GetIXMLDOMNodePath(Node2) then
  begin
    slNode1 := TStringList.Create;
    try
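      // dupIgnore only takes effect on a sorted list, so Sorted is set too
      // (which means both lists are then compared in sorted order).
      if IgnoreDuplicated then
      begin
        slNode1.Sorted := True;
        slNode1.Duplicates := dupIgnore;
      end;
      GetIXMLDOMNodeAsText(Node1, slNode1, IncludeAttribute, '/');
      slNode2 := TStringList.Create;
      try
        if IgnoreDuplicated then
        begin
          slNode2.Sorted := True;
          slNode2.Duplicates := dupIgnore;
        end;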
        GetIXMLDOMNodeAsText(Node2, slNode2, IncludeAttribute, '/');

        if IgnoreOrder then
        begin
          slNode1.Sort;
          slNode2.Sort;
        end;

        if slNode1.Text = slNode2.Text then
        begin
          Result := True;
          Exit;
        end;

      finally
        slNode2.Free;
      end;
    finally
      slNode1.Free;
    end;
  end;
end;

 


Well, if you are dealing with a document that doesn't change, you can calculate paths using indexes; this will uniquely identify a node. For example, Base[3].Child[2].GrandChild[2] for this node:

<Base>
<Base>
<Base> - 3
  <Child>
  <Child> - 2
    <GrandChild>
    <GrandChild> - 2 <== that's it

Or, if you can modify the document internally, you can add an ID attribute to each node you need.

But from what you said, you seem to want to detect duplicates? That's a different task!
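
The index-path idea above could be sketched roughly as follows. It assumes the document is not modified between calls; IndexedNodePath is a hypothetical name, and the path format (element name plus 1-based position among same-named siblings) is just one possible choice.

```delphi
// Requires System.SysUtils and Winapi.msxml (plain msxml in older Delphi
// versions) in the uses clause.

// Hypothetical helper: builds a path such as /root[1]/note[2]/from[1].
// The index counts only preceding sibling elements with the same name.
function IndexedNodePath(const ANode: IXMLDOMNode): string;
var
  LNode, LSibling: IXMLDOMNode;
  LIndex: Integer;
begin
  Result := '';
  LNode := ANode;
  // Stop at the document node (or at any non-element node).
  while Assigned(LNode) and (LNode.nodeType = NODE_ELEMENT) do
  begin
    LIndex := 1;
    LSibling := LNode.previousSibling;
    while Assigned(LSibling) do
    begin
      if (LSibling.nodeType = NODE_ELEMENT) and
         (LSibling.nodeName = LNode.nodeName) then
        Inc(LIndex);
      LSibling := LSibling.previousSibling;
    end;
    // Prepend this level so the finished path reads root-first.
    Result := Format('/%s[%d]', [string(LNode.nodeName), LIndex]) + Result;
    LNode := LNode.parentNode;
  end;
end;
```

Two nodes from the same unchanged document would then be treated as the same node when `IndexedNodePath(Node1) = IndexedNodePath(Node2)`. Unlike a plain name path, the two <from> elements in the sample XML get different strings because their parent <note> elements carry different indexes.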


Hello,

 

Thanks for workaround ideas.

 

I've got a workaround/fix idea; I just really, really would like to avoid it. It's too much work at this point in time.

If someone has any idea how to compare arbitrary nodes and tell whether they are the exact same node or not, that would be really cool!

 

-Tee-


First could you be more precise what nodes would you consider "equal" and what you're able to do with the loaded document.  In the very general case, without preliminary marking all nodes with unique ID you can't tell if two random pointers point to the same node object.

On 11/11/2019 at 3:55 PM, Fr0sT.Brutal said:

First could you be more precise what nodes would you consider "equal" and what you're able to do with the loaded document.  In the very general case, without preliminary marking all nodes with unique ID you can't tell if two random pointers point to the same node object.

I'm not talking about equal (maybe lost in translation); I am talking about the SAME node.

Basically, I've got two separate processes, and I need to compare their results in a third to see whether they are actually the very same node or not.

I just assumed that the DOM would somehow know, or have some comparison to tell whether two random nodes are actually the same node. Apparently not, because there has not been any answer, and I didn't find anything by googling around.

I think I am not allowed to change the structure of the XML. If there were a "tag" property on the nodes, I could most likely walk through the DOM at the beginning and initialize some counter ID. Or actually I would only need to mark the nodes I am interested in while searching for them.

 

-Tee-


Hmm, I'm a bit confused. Do you have two XML's written by two processes? Then I don't get how you could check if a node is "the same". For comparing results you'll have to compare node contents

2 hours ago, Tommi Prami said:

I think I am not allowed to change the structure of the XML

You could assign ID's in runtime only.

On 11/14/2019 at 9:29 AM, Fr0sT.Brutal said:

Hmm, I'm a bit confused. Do you have two XML's written by two processes? Then I don't get how you could check if a node is "the same". For comparing results you'll have to compare node contents

You could assign ID's in runtime only.

How? Or where?

But the actual DOM must not change, because I need to save it after processing.

 

-Tee-

1 hour ago, Tommi Prami said:

How? Or where?

But the actual DOM must not change, because I need to save it after processing.

My idea was:

- you have your XML loaded

- you loop through nodes and number them with unique ID's

- now you can easily identify the nodes returned by FindNode

- when it's time to save XML, first clean these temporary ID's

 

but now when you mention two processes, I got lost and can't realize what exactly you want.

Maybe if you could describe your task in brief but fully it'll shed some light.
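
The steps listed above might look something like this. It is only a sketch: TagAllElements, UntagAllElements and the attribute name passed in are made up for illustration, the attribute name is assumed not to clash with anything already in the document, and no elements are assumed to be added between tagging and cleanup.

```delphi
// Requires Winapi.msxml (plain msxml in older Delphi versions) in the
// uses clause.

// Gives every element a temporary attribute holding a unique number.
procedure TagAllElements(const ADoc: IXMLDOMDocument; const AAttrName: string);
var
  LNodes: IXMLDOMNodeList;
  I: Integer;
begin
  LNodes := ADoc.selectNodes('//*'); // every element in the document
  for I := 0 to LNodes.length - 1 do
    (LNodes.item[I] as IXMLDOMElement).setAttribute(AAttrName, I);
end;

// Removes the temporary attribute again; call this before saving.
procedure UntagAllElements(const ADoc: IXMLDOMDocument; const AAttrName: string);
var
  LNodes: IXMLDOMNodeList;
  I: Integer;
begin
  LNodes := ADoc.selectNodes('//*');
  for I := 0 to LNodes.length - 1 do
    (LNodes.item[I] as IXMLDOMElement).removeAttribute(AAttrName);
end;
```

Between the two calls, two element nodes would be the same node when `getAttribute` returns the same value for both. Since the attributes are removed before saving, the saved document should not contain them.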
