Mark Williams

How to detect if TIDMessage is unable to read an email


I can call TIdMessage.LoadFromFile on an MSG file, but it has no return value to indicate whether the email loaded successfully. You only seem to find out that it failed when you read certain properties, e.g. Date, which is returned as 0.

 

Is it therefore safe to assume that an email is in a format TIDMessage cannot read if it returns a 0 value for Date or is there a more reliable way of ascertaining if TIDMessage can read the particular email format?

2 hours ago, Mark Williams said:

Is it therefore safe to assume that an email is in a format TIDMessage cannot read if it returns a 0 value for Date or is there a more reliable way of ascertaining if TIDMessage can read the particular email format?

I don't know but don't you have the source code?

It should be fairly easy to find out exactly what TIDMessage does if you just trace into LoadFromFile in the debugger.

9 minutes ago, Anders Melander said:

if you just trace into LoadFromFile in the debugger.

I have worked through the code. Basically, it loads the file (whatever format it may be in) and attempts to parse the headers. If it cannot find a header, it moves on.

So even with an Outlook MSG file (which is in a proprietary format that TIdMessage cannot read), it gives it a go and outputs something, albeit not terribly useful.

 

For the Date header, it calls a separate function that tries to parse the header value. If the value is not in the right format, the function raises and catches an exception and then exits, returning a zero date-time value.
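
To illustrate that pattern (parse, swallow the failure, return an empty value), here is a rough Python analogue. It is not the Indy implementation: `parse_date_or_none` is a hypothetical name, and Python's `None` stands in for the zero date-time value.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_date_or_none(header_value: str) -> Optional[datetime]:
    """Parse an RFC 822-style Date header value; return None if it is malformed.

    Hypothetical helper for illustration only. The caller cannot tell from
    the result alone whether the header was missing, malformed, or the whole
    file was not an email in the first place.
    """
    try:
        return parsedate_to_datetime(header_value)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError on unparseable input,
        # newer ones raise ValueError; either way the failure is swallowed.
        return None
```

The point of the sketch is the last comment: an empty date only says the Date value could not be parsed, not why.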

 

I thought that might be the safest thing to use, working on the assumption that ALL emails will have a Date header formatted in compliance with the standards, and that if it cannot be read, TIdMessage cannot read the format of that particular email. However, I am also aware that the standards are not strictly adhered to in all cases, so my assumption is possibly unsafe.

 

If it is unsafe then I am hoping there might be someone who has come up with a more reliable solution.

6 minutes ago, Mark Williams said:

I have worked through the code.

Then you are already able to answer your own question.

 

8 minutes ago, Mark Williams said:

I thought that might be the safest thing to use working on the assumption that ALL emails will have a Date header formatted in compliance with the standards [...] However, I am also aware that the standards are not strictly adhered to in all cases and so my assumption is possibly unsafe.

If your assumption is "possibly unsafe" then working on that assumption is not "the safest thing". You're contradicting yourself.

 

You are never going to be able to handle every possible scenario. I suggest you create a set of test files, both valid and invalid. Make sure you can handle those but code defensively under the assumption that there are cases you don't know of yet.

1 hour ago, Anders Melander said:

If your assumption is "possibly unsafe" then working on that assumption is not "the safest thing".

It may possibly be unsafe, but it may be the only option and therefore the safest one, albeit not 100% safe. However, I'm asking for alternative and hopefully better solutions from users familiar with TIDMessage who may have encountered this issue. 

 

I have tried reading a large number of different emails, including MSG. I've also submitted pdf, xml, avi, jpg, pas etc. It will have a crack at anything without complaint. Consistently (and unsurprisingly) it returns 0 for Date for files in an unreadable format (or perhaps I should say an incorrect format). For every email I have submitted in the expected format, it has returned a non-zero Date value. However, it is possible (though I suspect unlikely, or at least very rare) that there are emails otherwise in the expected format but with an incorrect Date header. I would rather my procedure did not report such emails as unread.
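
One way to avoid rejecting an otherwise valid email over a single bad Date header would be to treat a file as unread only if none of the usual header fields came back. The sketch below shows that idea using Python's standard `email` parser as a stand-in. The function name and the field list are hypothetical, and whether TIdMessage exposes its parsed headers in a comparable way would need checking against the Indy source.

```python
from email import message_from_bytes

# Header fields that almost every real email carries at least one of.
# The exact list is an assumption made for this example.
COMMON_FIELDS = ("From", "To", "Subject", "Date", "Message-ID", "Received")


def parsed_any_common_header(raw: bytes) -> bool:
    """Return True if the parser recognised at least one common header field.

    A malformed Date value does not matter here: the check is only whether
    a field with that name was found, not whether its value is valid.
    """
    msg = message_from_bytes(raw)
    return any(msg.get(name) is not None for name in COMMON_FIELDS)
```

With this kind of check, an email whose only fault is a non-standard Date value still counts as read, while a binary file that yields no recognisable header fields does not.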

 

Unless someone can suggest a more accurate solution, I will go with what I have, but I would just like to hear from someone who has been here before me.


TIdMessage is designed to parse RFC 822-style emails only (EML files, etc.). Attempting to load anything else is basically undefined behavior: there is no guarantee whether or how errors will be reported back to user code, and there is no single point of query where you can discover the result of a failed parse. So your best option is to filter out non-RFC 822 files before you try to load them into TIdMessage. Basically, analyze a handful of bytes at the beginning of a given file, and if they don't resemble RFC 822-style headers, simply don't load the file at all (all of the non-EML file formats you have mentioned are well documented and easily identifiable). This is something I may consider adding to TIdMessage itself in a future release, but that is not going to happen anytime soon, so I suggest you add it to your own code.
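
A minimal sketch of such a pre-filter, written in Python purely for illustration (it is not Indy code and would need porting to Delphi). The function name `looks_like_rfc822`, the signature list, and the choice to inspect only the first line are assumptions made for this example, not anything TIdMessage provides.

```python
import re

# Leading byte signatures of some of the non-email formats mentioned in
# this thread. The list is illustrative, not exhaustive.
KNOWN_NON_EMAIL_MAGIC = (
    b"\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1",  # OLE compound file (Outlook MSG)
    b"%PDF-",                             # PDF
    b"\xFF\xD8\xFF",                      # JPEG
    b"RIFF",                              # AVI and other RIFF containers
    b"<?xml",                             # XML declaration
)

# A header line starts with a field name made of printable ASCII
# characters other than ':', followed by a colon.
HEADER_LINE = re.compile(rb"^[\x21-\x39\x3B-\x7E]+[ \t]*:")


def looks_like_rfc822(head: bytes) -> bool:
    """Heuristically decide whether the first bytes of a file look like
    the start of an RFC 822-style header block."""
    if head.startswith(KNOWN_NON_EMAIL_MAGIC):
        return False
    lines = head.splitlines()
    if not lines:
        return False
    # Only the first line is examined; a stricter filter could require
    # several consecutive header-like lines.
    return HEADER_LINE.match(lines[0]) is not None
```

In use, you would read only the first few hundred bytes of the file, e.g. `open(path, "rb").read(512)`, and skip the load when the function returns `False`. It is a heuristic, so a text file that happens to start with `word:` would still get through.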

