
Understanding UniqueFilter stage from WebSpider demo



I am looking at the code of the WebSpider demo from the HL-III presentation. (The project is just called pipeline.)


It has code like this:

procedure TfrmWebSpider.UniqueFilter(const input, output: IOmniBlockingCollection);
var
  uniqueUrls: TStringList;
  url       : string;
begin
  uniqueUrls := TStringList.Create;
  try
    uniqueUrls.Sorted := true;
    for url in input do begin
      if uniqueUrls.IndexOf(url) < 0 then begin
        // ... (handling of a URL not seen before; lines not shown in the post)
      end
      else if FURLCount.Decrement = 0 then
        // ... (statement not shown in the post)
    end;
  finally FreeAndNil(uniqueUrls); end;
end;

When I run the demo, I see that the uniqueUrls TStringList keeps working across all inputs received. This is what we want, but how does it work? Since it is a local variable, one would expect it to go out of scope each time UniqueFilter is "called".





UniqueFilter does not go out of scope. If a stage is defined in this format (with input, output: IOmniBlockingCollection parameters), the pipeline calls it only once.


The 'for url in input' loop is the one that processes all elements that arrive via the input pipeline.
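A minimal sketch of the same idea, assuming a recent OmniThreadLibrary with Parallel.Pipeline available. The RunningTotal stage and the input values are made up for illustration and are not part of the WebSpider demo. The local variable total plays the role of uniqueUrls: it is initialized once and survives across all items, because the stage procedure is entered only once.

```pascal
program StageCalledOnce;

{$APPTYPE CONSOLE}

uses
  System.SysUtils,
  OtlCommon, OtlCollections, OtlParallel;

// Hypothetical stage: entered once by the pipeline; the loop then
// handles every value that arrives on the input collection.
procedure RunningTotal(const input, output: IOmniBlockingCollection);
var
  total: integer;      // local state, alive for the whole stage
  value: TOmniValue;
begin
  total := 0;
  for value in input do begin    // waits while the input is empty
    total := total + value.AsInteger;
    output.Add(total);
  end;
  // execution gets here only after the input has been completed
end;

var
  pipeline: IOmniPipeline;
  value   : TOmniValue;
  i       : integer;
begin
  pipeline := Parallel.Pipeline.Stage(RunningTotal).Run;
  for i := 1 to 3 do
    pipeline.Input.Add(i);
  pipeline.Input.CompleteAdding;   // lets the stage's for loop finish
  for value in pipeline.Output do
    Writeln(value.AsInteger);
end.
```

With a single stage task and the inputs 1, 2, 3, this should print the accumulated sums 1, 3, 6, which would not be possible if total were reset for each item.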


Thanks, that is quite amazing. You're saying the loop is actually pausing/resuming depending on queue activity.


Indeed. IOmniBlockingCollection implements an enumerator which waits for the next available value. The only way to terminate such a for loop is to call input.CompleteAdding, which signals the enumerator that no new values can ever be produced.
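The waiting enumerator can also be seen without a pipeline. The following is only a sketch, assuming the OtlCollections unit from OmniThreadLibrary; the producer thread, the delay, and the values are invented for the example. The consumer loop in the main thread pauses while the collection is empty and ends only after CompleteAdding is called.

```pascal
program BlockingEnumeratorDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes,
  OtlCommon, OtlCollections;

var
  queue: IOmniBlockingCollection;
  value: TOmniValue;
begin
  queue := TOmniBlockingCollection.Create;

  // Hypothetical producer: adds a value every half second,
  // then marks the collection as completed.
  TThread.CreateAnonymousThread(
    procedure
    var
      i: integer;
    begin
      for i := 1 to 3 do begin
        Sleep(500);
        queue.Add(i);
      end;
      queue.CompleteAdding;   // without this the loop below never ends
    end).Start;

  // Consumer: the enumerator waits for each next value.
  for value in queue do
    Writeln(value.AsInteger);

  Writeln('done');   // reached only after CompleteAdding
end.
```

If the CompleteAdding call is removed, the program prints the three values and then hangs in the for loop, waiting for a fourth value that never arrives.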

