JGMS

Module "sentence_transformers" import in P4D not possible


I found some nice Python code on WWW.SBERT.NET that perfectly suits my need: finding the best-matching images across two file sets.
The file sets are related: they originate from the same celluloids, but were scanned with equipment of different quality. In addition, the images may have been cropped and tilted, in different ways in each set.
I modified the example code so that it writes a file listing, for each low-quality image, the best-fitting high-quality version together with the three top scores.

Here is the code that runs perfectly in Python, version 3.10. I am using PyScripter: great!
You can easily test it with your own files, though it may require a series of Python modules to be installed before it runs.

 

 

from sentence_transformers import SentenceTransformer, util
from PIL import Image

model = SentenceTransformer("clip-ViT-B-32")
f = open("E:/Fotos/testmap/ListMatchingFotos.txt", "a")

lijst = ["E:/Fotos/File_nr1.JPG","E:/Fotos/File_nr2.JPG"] # this is the list of low quality images.
image_names = ["M:/Fotos/negatieven 001.jpg","M:/Fotos/negatieven 002.jpg","M:/Fotos/negatieven 003.jpg","M:/Fotos/negatieven 004.jpg","M:/Fotos/negatieven 005.jpg","M:/Fotos/negatieven 006.jpg","M:/Fotos/negatieven 007.jpg","M:/Fotos/negatieven 008.jpg"]
image_names += lijst
encoded_image = model.encode([Image.open(filepath) for filepath in image_names], batch_size=128, convert_to_tensor=True, show_progress_bar=False)

processed_images = util.paraphrase_mining_embeddings(encoded_image)
threshold = 0.99
near_duplicates = [image for image in processed_images if image[0] < threshold]
L = len(near_duplicates)
for j in range(len(lijst)):  # narrow the list of pairs to those involving the files in "lijst"
    idf = image_names.index(lijst[j])  # index of the current low-quality image
    searchresults = []
    for i in range(0, L):
        score, image_id1, image_id2 = near_duplicates[i]
        if (( (image_names[image_id1] == image_names[idf] ) and (image_id2 != idf) ) and (not (image_names[image_id2] in lijst))) or  (( (image_names[image_id2] == image_names[idf] ) and (image_id1 != idf) ) and (not (image_names[image_id1] in lijst))):
            searchresults.append( near_duplicates[i] )
    ls = len(searchresults)
    if ls == 0:
        continue  # no candidate pair found for this image
    score1 = 0
    score2 = 0
    score, image_id1, image_id2 = searchresults[0]
    if ls > 1:
        score1, image_id11, image_id21 = searchresults[1]
    if ls > 2:
        score2, image_id12, image_id22 = searchresults[2]
    if image_id1 != idf: image_id2 = image_id1  # make image_id2 the matched high-quality image
    if score < 85/100:
        f.write( image_names[idf] + " " + image_names[image_id2] + " Score1: {:.3f}%".format(score * 100) + " Score2: {:.3f}%".format(score1 * 100) + " Score3: {:.3f}%".format(score2 * 100) + str(" NO MATCH OR VERY POOR \n"))
    else:
        f.write( image_names[idf] + " " + image_names[image_id2] + " Score1: {:.3f}%".format(score * 100) + " Score2: {:.3f}%".format(score1 * 100) + " Score3: {:.3f}%\n".format(score2 * 100))

f.close()
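
The ranking step in the script above (keep the three best-scoring high-quality candidates for one low-quality image) can also be sketched without the pair-mining detour. This is only an illustration with plain Python lists standing in for the CLIP embeddings; the names `cosine` and `top_matches` and the tiny vectors are hypothetical, not part of sentence_transformers.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_matches(query, candidates, k=3):
    """Return up to k (score, candidate_index) pairs, best score first."""
    scored = [(cosine(query, c), i) for i, c in enumerate(candidates)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Hypothetical 3-dimensional "embeddings"; real CLIP embeddings are much longer.
high_quality = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
low_quality = [0.9, 0.1, 0.0]

best = top_matches(low_quality, high_quality)
for score, index in best:
    print("candidate {} score {:.3f}%".format(index, score * 100))
```

Comparing each low-quality image only against the high-quality set makes the long filter condition on the index pairs unnecessary, at the cost of computing the similarities yourself.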

However, the very same code doesn't run in Python4Delphi, although it uses the same PythonEngine.dll and path and libraries.  
I get the error message: Project .... raised exception class EPyAttributeError with message 'AttributeError: 'NoneType' object has no attribute 'flush''.
The error is raised by the very first line, "from sentence_transformers import...". It makes no difference whether the two names are imported together or on separate lines.

 

Here is my Delphi version of the code above.

Function TPyForm.Picture_Matching_using_Python(Foto_bestanden : String; VAR Zoeklijst :TArray<string>; TekstBestand :String; TargetScore : integer) : Boolean;

 VAR
   Mem :TStringList;
   Lijst, BeterLijst: String;
   Fotos : Tarray<String>;


   begin
    Result := False; // make sure the result is defined on the early exit below
    If Foto_bestanden = '' then exit;

    Fotos := Foto_bestanden.Split([',']);
    Lijst      := '[';
    BeterLijst := '[';

    for Var Bestand : String in Fotos DO Lijst := Lijst + '"' + B2F(Bestand) + '"'  + ',';
    Lijst := copy(Lijst,1,length(Lijst)-1)+ ']';

    for Var Bestand : String in Zoeklijst DO BeterLijst := BeterLijst + '"' + B2F(Bestand) + '"'  + ',';
    BeterLijst := copy(BeterLijst,1,length(BeterLijst)-1)+']';


    Mem := TStringList.Create; // create before TRY, so FINALLY never frees an unassigned object
    TRY

        With Mem DO
        begin

          Add('import os');
          Add('from PIL import Image');
          Add('from sentence_transformers import SentenceTransformer, util'); // no leading spaces: top-level Python code must not be indented
          Add('model = SentenceTransformer("clip-ViT-B-32") ');

          Add('f = open("' + B2F(TekstBestand) + '", "a")');
          
          Add('lijst = '+ Lijst );
          Add('image_names = '+ BeterLijst );
          Add('image_names += lijst');
          Add('encoded_image = model.encode([Image.open(filepath) for filepath in image_names], batch_size=128, convert_to_tensor=True, show_progress_bar=False)');
          Add('processed_images = util.paraphrase_mining_embeddings(encoded_image)');
          Add('threshold = 99/100');
          Add('near_duplicates = [image for image in processed_images if image[0] < threshold] ');
          Add('l = len(near_duplicates) ');

          Add('for j in range(len(lijst)): ');
          Add('   searchresults = [] ');
          Add('   for i in range(0,l): ');
          Add('      score, image_id1, image_id2 = near_duplicates[i] ');
          Add('      idf = image_names.index(lijst[j]) ');
          Add('      if (( (image_names[image_id1] == image_names[idf] ) and (image_id2 != idf) ) and (not (image_names[image_id2] in lijst))) or ' +
                      ' (( (image_names[image_id2] == image_names[idf] ) and (image_id1 != idf) ) and (not (image_names[image_id1] in lijst))): ');
          Add('         searchresults.append( near_duplicates[i] )  ');

          Add('   ls = len(searchresults) ');
          Add('   score1 = 0' );
          Add('   score2 = 0' );
          Add('   score, image_id1, image_id2 = searchresults[0]');
          Add('   if ls > 1: score1, image_id11, image_id21 = searchresults[1] ');
          Add('   if ls > 2: score2, image_id12, image_id22 = searchresults[2] ');
          Add('   if image_id1 != idf: image_id2 = image_id1');

          Add('   if score < ' + TargetScore.tostring + '/100: ');
          Add('      f.write( image_names[idf] + " " + image_names[image_id2] + " Score1: {:.3f}%".format(score * 100) + " Score2: {:.3f}%".format(score1 * 100) + " Score3: {:.3f}%".format(score2 * 100) + str(" GEEN OF TWIJFELACHTIGE MATCH \n"))');
          Add('   else:');
          Add('      f.write( image_names[idf] + " " + image_names[image_id2] + " Score1: {:.3f}%".format(score * 100) + " Score2: {:.3f}%".format(score1 * 100) + " Score3: {:.3f}%\n".format(score2 * 100))' );

          Add('f.close()  ');

        end;

     TRY
       Result         := True;
       PythonEngine1.ExecString( ansiString( Mem.text ) );
     
     Except
         Result := False;
     END;

     FINALLY
       Mem.Free;
     END;
   end;


I have no idea how to proceed, and do hope that anyone does.

I would very much appreciate any help.

 

Jan


Thank you, and yes indeed.

I am not sure I understand what you mean by "sys.version" and "sys.path". Where can I find these?

I use the following FormCreate procedure successfully in all my P4D projects. The only new line is the one with "SetPythonHome", which I added just now.

 

The added line does not make any difference: the same error emerges.

 

procedure TPyForm.FormCreate(Sender: TObject);
begin
  MaskFPUExceptions(True);
  PythonEngine1              := TPythonEngine.Create(PyForm);
  PythonEngine1.RegVersion   := '3.10';
  PythonEngine1.DllName      := 'python310.dll';
  PythonEngine1.DllPath      :=  'C:\Users\myInlogName\AppData\Local\Programs\Python\Python310'; 
  PythonEngine1.AutoLoad     := false;
  PythonEngine1.AutoFinalize := true;
  PythonEngine1.AutoUnload   := true;
  PythonEngine1.UseLastKnownVersion := false;
  PythonEngine1.RedirectIO   := false;
  PythonDelphiVar1.Engine    := PythonEngine1;
  PythonDelphiVar2.Engine    := PythonEngine1;
  PythonDelphiVar3.Engine    := PythonEngine1;
  
  PythonEngine1.SetPythonHome('C:\Users\myInlogName\AppData\Local\Programs\Python\Python310'); // I added this suggested code line here, in absence of a BeforeLoad event.
  
  PythonEngine1.loadDLL; 

  PyEmbeddedResEnvironment3101.pythonEngine := PythonEngine1;
  PyEmbeddedResEnvironment3101.Autoload     := True;

end;
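
Regarding the question about "sys.version" and "sys.path" above: both are attributes of Python's standard `sys` module, so they can be read from any script. A minimal sketch, assuming the idea is to run it once in PyScripter and once through ExecString and then compare the two reports. It writes to a file because console output may not be visible in a GUI application; the function name and the file name are hypothetical.

```python
import os
import sys
import tempfile

def interpreter_report():
    """Describe the running interpreter: its version and module search path."""
    lines = ["version: " + sys.version,
             "executable: " + str(sys.executable),
             "path:"]
    lines += ["  " + entry for entry in sys.path]
    return "\n".join(lines)

# Hypothetical output location: the temp folder of the current user.
report_file = os.path.join(tempfile.gettempdir(), "p4d_interpreter_report.txt")
with open(report_file, "w", encoding="utf-8") as out:
    out.write(interpreter_report())
```

If the two reports differ in version or in the listed folders, the embedded interpreter is not using the same installation or packages as the stand-alone one.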

 

On 7/31/2023 at 6:41 PM, JGMS said:

'AttributeError: 'NoneType' object has no attribute 'flush'"

Looks like the python module is trying to write to the console.

Do you use TPythonGUIInputOutput linked to the PythonEngine?   If not try setting UseWindowsConsole to True.
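
Some background on that message: in a Windows GUI process that is not attached to a console, Python can start with `sys.stdout` and `sys.stderr` set to `None`, so any library code that calls `sys.stdout.flush()` fails in exactly this way. The sketch below shows a Python-side guard; it is an assumed workaround for illustration, not the P4D mechanism, and the function name `ensure_std_streams` is hypothetical.

```python
import io
import sys

def ensure_std_streams():
    """Replace missing standard output streams with in-memory buffers."""
    for name in ("stdout", "stderr"):
        if getattr(sys, name) is None:
            setattr(sys, name, io.StringIO())

# Call this before importing modules that write to the console.
ensure_std_streams()
sys.stdout.flush()  # safe now, even when the process has no console
```

Anything written to such a buffer is simply kept in memory, so this only silences the error; it does not make the output visible.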


I tried your suggestion, by adding "PythonEngine1.UseWindowsConsole:= True;" in the formcreate function, just before the LoadDLL command.

The only effect was that a black command screen appeared for about half a second. The error message remained unchanged.

 

4 hours ago, JGMS said:

The only effect was that a black command screen appeared for about half a second

Then try the following before loading the engine:

- Create a TPythonInputOutput

- Set the PythonEngine IO property to the TPythonInputOutput

- Set the RedirectIO property to True.

 

 


Thank you so much, @PyScripter.

The error message indeed disappeared, and the code functioned as designed.

However, the modifications introduced a new problem: it no longer appears possible to run the P4D routines in the background.

If I try to do so, I get the error "...raised exception class $C0000005 with message 'c0000005 ACCESS_VIOLATION'".

 

This occurs not only with the routine "Picture_Matching_using_Python" shown above, but with all of my P4D routines when they run in the background.

I use the following calling code:

  TTask.Run( Procedure
   begin

      PyForm.Picture_Matching_using_Python(StringWithFilenames_CSV,MatchedList_Tarray, TextFileNameContainingMatchResults, some_info_Str, minimumscore_int ) ;

      TThread.Synchronize(nil, procedure
        begin
          ForceDirectories(NameCopyToFolder);
          for var j := 0 to length(MatchedList_Tarray) -1 do
            if MatchedList_Tarray[j] <> OriginalsList[j] then
              TFile.Copy(trim(MatchedList_Tarray[j]), NameCopyToFolder + trim(ExtractFileName(MatchedList_Tarray[j])), TRUE ) ;
        end);

   end);

What could be the cause of this?
I hope you have suggestions on how to solve the issue.


Running python in threads is not straightforward.  Have a look at TPythonThread,  Demo 33, and search this forum for relevant discussions.


Thank you again @pyscripter,

I just stared at Demo 33 code for quite a while. Indeed, it does not look trivial. It needs a deep dive.

