Mark Williams 14 Posted October 10, 2020

I am using late binding to open Word/Excel documents etc. to try to read the built-in and custom properties of documents/workbooks.

    var
      MSApp, Props, item: Variant;
      i, int: integer;
      d: TDateTime;
      f: Double;
      s: string;
      b: Boolean;
    begin
      MSApp := CreateOLEObject('Word.Application');
      MSApp.Documents.OpenNoRepairDialog(filename, false, true, false, password, password,
        False, password, password, emptyParam, emptyParam, false, true);
      MSApp.ActiveDocument.Repaginate;
      Props := MSAPP.ActiveDocument.BuiltInDocumentProperties;
      for i := 0 to Props.Count do
      begin
        item := Props.item[i];
        memo1.Lines.Add('Name='+item.Name);
        case varType(item.value) of
          varSmallInt, varShortInt, VarInteger, varSingle, varByte, varWord, varLongWord, varInt64:
            begin
              int := item.value;
              Memo1.Lines.Add('value as int='+int.ToString);
            end;
          varDate:
            begin
              d := item.value;
              Memo1.Lines.Add('value as date='+DateTimeToStr(d));
            end;
          varDouble, varCurrency:
            begin
              f := item.value;
              Memo1.Lines.Add('value as float='+FloatToStr(f));
            end;
          varBoolean:
            begin
              b := item.value;
              Memo1.Lines.Add('value as boolean='+ord(b).ToString);
            end;
          varOleStr, varStrArg, varString:
            begin
              s := item.value;
              Memo1.Lines.Add('value as string='+s);
            end
        else
          try
            s := item.value;
            Memo1.Lines.Add('value as something else='+s);
          except
          end;
        end;
      end;

I am getting errors and other issues with a number of the properties:

DateTime

There are three date-time properties in the built-in properties:

- Last print date
- Creation date
- Last save time

My function fails with the print date and the last save time. I get an EOleException "unspecified error" when querying the VarType of the item value (case varType(item.value)).

I don't get an error for Creation date. However, the creation date is always read as the current date and time, regardless of what the properties of the document actually say.

The format of all three dates appears identical in the document properties.

Pages, word count, number of characters etc.

Pages always returns 1 regardless of the number of pages. I thought calling Repaginate first might resolve this, but it didn't. Word and character counts always return 0 (which is incorrect).

Any help with the above appreciated. Thanks.
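
Editorial sketch (not from the thread, and not a confirmed fix): one way the loop in the question might be made more defensive. Microsoft's documentation for DocumentProperty.Value notes that reading Value raises an error when the host application does not define a value for that built-in property, which would be consistent with the failure on Last print date for a document that has never been printed. The sketch therefore wraps each read in try/except, indexes the collection from 1 (Office collections are 1-based), and asks Word for the page, word and character counts through Document.ComputeStatistics rather than through the properties. The names `SafePropValue` and `DumpProps` are hypothetical, the file is opened with plain Documents.Open instead of OpenNoRepairDialog for brevity, and nothing here addresses the creation-date symptom.

```delphi
program DumpWordProps;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Variants, System.Win.ComObj, Winapi.ActiveX;

const
  // Values from Word's WdStatistic and WdSaveOptions enumerations
  wdStatisticWords      = 0;
  wdStatisticPages      = 2;
  wdStatisticCharacters = 3;
  wdDoNotSaveChanges    = 0;

// Hypothetical helper: returns the property value as text, or a
// placeholder when Word raises an OLE error for a property without a value.
function SafePropValue(const Item: Variant): string;
begin
  try
    Result := VarToStr(Item.Value);
  except
    on E: EOleError do
      Result := '<no value: ' + E.Message + '>';
  end;
end;

// Hypothetical helper: lists all built-in properties, then the computed counts.
procedure DumpProps(const FileName: string);
var
  WordApp, Doc, Props, Item: Variant;
  i, Pages, Words, Chars: Integer;
begin
  WordApp := CreateOleObject('Word.Application');
  try
    // Arguments: FileName, ConfirmConversions, ReadOnly
    Doc := WordApp.Documents.Open(FileName, False, True);
    Props := Doc.BuiltInDocumentProperties;
    for i := 1 to Props.Count do   // 1-based, not 0-based
    begin
      Item := Props.Item[i];
      Writeln(string(Item.Name), '=', SafePropValue(Item));
    end;

    // Ask Word to calculate the statistics instead of reading the properties
    Pages := Doc.ComputeStatistics(wdStatisticPages);
    Words := Doc.ComputeStatistics(wdStatisticWords);
    Chars := Doc.ComputeStatistics(wdStatisticCharacters);
    Writeln('Pages=', Pages, ' Words=', Words, ' Characters=', Chars);

    Doc.Close(wdDoNotSaveChanges);
  finally
    WordApp.Quit;
  end;
end;

begin
  CoInitialize(nil);
  try
    DumpProps(ParamStr(1));   // path of the document to inspect
  finally
    CoUninitialize;
  end;
end.
```

Run it with the full path of a document as the only argument. Properties that Word cannot supply show up as a placeholder line instead of aborting the loop.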

Guest Posted October 10, 2020 (edited)

Since you do not state what you are trying to achieve... it's a bit difficult.

*If* you are using OLE to "process" or just read a docx document (no embedding, no VB scripts), then I would recommend you look at OOXML. Microsoft has an SDK for OOXML. One could use it from Delphi through a .NET layer. There's also a GUI tool that lets you dig into Word documents, their structure and whatnot: https://docs.microsoft.com/en-us/office/open-xml/open-xml-sdk. From there you can find whitepapers on the OOXML format, and there are third-party components out there too that do not depend on Word.exe.

I am doing document processing on the server, so starting an application (Word) is a really bad decision. Word.exe is an application, not an engine.

I decided a couple of years ago to roll my own OOXML-processing unit (units). It is not trivial, but it is not rocket science either. I ended up using Kluug.net Oxml because Office is really sensitive to an extra #10 or #13 in the XML. The monster is basically:

1. Unzip the docx, xlsx or pptx file to memory; after changes, zip it up and write it back to disk.
2. A set of functions and types that mainly iterate recursively through the XML parts, with recursive sub-functions reading, parsing and modifying the XML "as needed".

This way I can provide my clients with advanced docx manipulation using only native Delphi code. Also, since there is no macro/VB/scripting engine whatsoever, I do not have to worry about viruses, nor about process-stopping notifications of a macro/virus threat.

Presently, I supply a file and two dictionaries, one for replacing bookmarks and one for replacing the text in tags (click or touch here to edit...), with function arguments for specific processing and logic for deciding whether or not to replace text/information. I can read and extract anything; the properties structs are the easiest ones. I can replace headers, footers and resources.
Also, it's not difficult to weed out or collect "review" information. Unlike the binary doc model, OOXML is not all bad. M$ actually does some stuff properly, not that I'm in love or something. Hold the horses!

HTH

Edited October 10, 2020 by Guest: Sounded too excited about M$ work, had to take it down a bit.
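
Editorial sketch (not the poster's own units): the "unzip and read the XML parts" idea above, reduced to dumping the two property parts of a docx, xlsx or pptx file. It uses `TZipFile` from Delphi's `System.Zip` unit rather than the poster's code or Kluug.net Oxml, and it assumes the package keeps its properties at the conventional part names `docProps/core.xml` (creation, last-modified and last-printed dates) and `docProps/app.xml` (pages, words, characters). `ReadPart` is a hypothetical helper name. The XML is printed raw; parsing it is left out.

```delphi
program DumpDocxProps;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Zip;

// Hypothetical helper: reads one part of the package and decodes it as UTF-8.
// TZipFile.Read raises an exception if the part name is not in the archive.
function ReadPart(Zip: TZipFile; const PartName: string): string;
var
  Bytes: TBytes;
begin
  Zip.Read(PartName, Bytes);
  Result := TEncoding.UTF8.GetString(Bytes);
end;

var
  Zip: TZipFile;
begin
  Zip := TZipFile.Create;
  try
    // An OOXML document is an ordinary zip archive; Word is not needed
    Zip.Open(ParamStr(1), zmRead);
    Writeln(ReadPart(Zip, 'docProps/core.xml'));
    Writeln(ReadPart(Zip, 'docProps/app.xml'));
  finally
    Zip.Free;
  end;
end.
```

Run it with the path of an OOXML file as the only argument. Because nothing is launched, this approach suits the server-side scenario described in the post, but it does not apply to the older binary .doc/.xls formats.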

Mark Williams 14 Posted October 11, 2020

17 hours ago, Dany Marmur said:
> Since you do not state what you are trying to achieve... it's a bit difficult.

I thought I stated that I was trying to read the BuiltInDocumentProperties of Office documents and that I was having difficulties reading certain of those properties. I also provided the code and stated the problems I was experiencing. I'm not sure what further info was relevant.

I wish to proceed, if at all possible (and I don't see why it shouldn't be), with OLE late binding to open the Office documents (I need to save them in PDF format) and then read their BuiltInDocumentProperties whilst I have the document open.

I appreciate your recommendation of OOXML, but I don't really have time to delve into this at the moment and would prefer to use BuiltInDocumentProperties, particularly as I have to open the document via its relevant MS app in any event.

Guest Posted October 11, 2020

@Mark Williams, I see. You do state your needs quite appropriately. I must apologise; I read "OLE" and did not look at your code. I left/chucked/disposed of any OLE and late binding 15 years ago, so I cannot contribute. Good luck!

Mark Williams 14 Posted October 11, 2020

5 minutes ago, Dany Marmur said:
> i see. You do state your needs quite appropriately. I must apologise

Thanks, but no need for apology!