RaelB

Understanding UniqueFilter stage from WebSpider demo

Hello,

I am looking at the code from the WebSpider demo from the HL-III presentation (the project is just called pipeline).

 

It has code like this:

procedure TfrmWebSpider.UniqueFilter(const input, output: IOmniBlockingCollection);
var
  uniqueUrls: TStringList;
  url       : string;
begin
  uniqueUrls := TStringList.Create;
  try
    uniqueUrls.Sorted := true;
    for url in input do begin
      if uniqueUrls.IndexOf(url) < 0 then begin
        uniqueUrls.Add(url);
        output.TryAdd(url);
      end
      else if FURLCount.Decrement = 0 then
        FSpider.Input.CompleteAdding;
    end;
  finally FreeAndNil(uniqueUrls); end;
end;

When I run the demo, I see that the uniqueUrls TStringList is working on all inputs received. This is what we want, but how does it work? Since uniqueUrls is a local variable, one would expect it to go out of scope each time UniqueFilter is "called".

 

Thanks

Rael


The uniqueUrls variable does not go out of scope because UniqueFilter is not called once per URL. If a stage is defined in this format (with input, output: IOmniBlockingCollection parameters), the pipeline calls it only once.

 

The 'for url in input' loop is the one that processes all elements that arrive via the input pipeline.
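
To illustrate the difference, here is a minimal sketch contrasting the two stage forms. It assumes OmniThreadLibrary 3.x (units OtlCommon, OtlCollections, OtlParallel) and a Delphi version with unit scope names. Doubler, Summer and the program name are hypothetical and not part of the WebSpider demo.

```delphi
program StageFormsDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils,
  OtlCommon,
  OtlCollections,
  OtlParallel;

// Simple form: the pipeline calls this once for every item,
// so locals declared here would be reset on each call.
procedure Doubler(const input: TOmniValue; var output: TOmniValue);
begin
  output := input.AsInteger * 2;
end;

// Collection form (same shape as UniqueFilter): called once.
// Local state such as 'sum' lives for the whole run of the stage.
procedure Summer(const input, output: IOmniBlockingCollection);
var
  value: TOmniValue;
  sum  : integer;
begin
  sum := 0;
  for value in input do        // waits until input is completed
    Inc(sum, value.AsInteger);
  output.TryAdd(sum);
end;

var
  pipeline: IOmniPipeline;
  value   : TOmniValue;
  i       : integer;
begin
  pipeline := Parallel.Pipeline.Stage(Doubler).Stage(Summer).Run;
  for i := 1 to 5 do
    pipeline.Input.Add(i);
  pipeline.Input.CompleteAdding; // lets the stage loops finish
  for value in pipeline.Output do
    Writeln(value.AsInteger);
end.
```

In this sketch the running total in Summer plays the same role as uniqueUrls in UniqueFilter: it is created once, used for every item, and released only after the loop ends.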


Thanks, that is quite amazing. So the loop is actually pausing and resuming depending on queue activity?


Indeed. IOmniBlockingCollection implements an enumerator which waits for the next available value. The only way to terminate such a for loop is to call input.CompleteAdding, which signals the enumerator that no new values can ever be produced. The loop then exits once the values already in the collection have been consumed.
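
The waiting behaviour can be seen without a pipeline at all. The following is a minimal sketch, assuming OmniThreadLibrary's OtlCommon and OtlCollections units and a Delphi version that supports TThread.CreateAnonymousThread; the program name, the variable names and the 500 ms delay are made up for illustration.

```delphi
program BlockingEnumDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils,
  System.Classes,
  OtlCommon,
  OtlCollections;

var
  queue   : IOmniBlockingCollection;
  value   : TOmniValue;
  producer: TThread;
begin
  queue := TOmniBlockingCollection.Create;

  // Background producer: adds three values slowly, then
  // signals that nothing more will arrive.
  producer := TThread.CreateAnonymousThread(
    procedure
    var
      i: integer;
    begin
      for i := 1 to 3 do begin
        Sleep(500);
        queue.Add(i);
      end;
      queue.CompleteAdding;
    end);
  producer.Start;

  // The enumerator blocks while the collection is empty and
  // not yet completed, so this loop idles between values.
  for value in queue do
    Writeln(value.AsInteger);

  // Reached only after CompleteAdding was called and the
  // remaining values were consumed.
  Writeln('done');
end.
```

If the CompleteAdding call is removed, the for loop in the main thread never exits, which is why UniqueFilter calls FSpider.Input.CompleteAdding once FURLCount reaches zero.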
