Mark Williams, posted December 11, 2020

I'm extracting the Date and Received headers from raw email headers. I want to save the date as the user's local datetime, and I am also trying to extract the UTC offset stored in the header, which I am finding difficult to do. For example, I extract the following string from a header:

Quote: Sun, 8 Nov 2020 06:55:38 -0800 (PST)

As far as I can see, the following are the relevant functions in IdGlobalProtocols:

StrInternetToDateTime gives you the original datetime as in the header, i.e. "8 Nov 2020 06:55:38".

GmtOffsetStrToDateTime looks just the job at first glance, but unfortunately it expects only the last part of the header string containing the offset info, i.e. "-0800 (PST)". If it gets more, it won't work. Looking at the code that extracts the UTC portion, it seems to be an awful lot of work to get to what the function needs. It is not simply a case of calling StringPos for +/-. The two functions above would be enough for my purposes if it were simple to extract the relevant part of the header to submit to GmtOffsetStrToDateTime.

GMTToLocalDateTime returns the local datetime from the header. I had an idea that I could:

1. Call TTimeZone GetUTCOffset on the value returned by GMTToLocalDateTime.
2. Deduct the offset from the GMTToLocalDateTime value to give me the UTC datetime.
3. Deduct the original header date from the UTC datetime (or vice versa) to give me the UTC offset figure as in the header.

However, there is a problem (bug?) with GMTToLocalDateTime and how it handles daylight saving. It seems to apply daylight saving according to the date on the PC clock, not the date in the header. If I call GMTToLocalDateTime on the above example date header with my PC clock correctly set to today's date (the UK is aligned with UTC at the moment), it correctly returns the time as 14:55.

If I change the date submitted to GMTToLocalDateTime from 8 Nov to 8 Jul, you would expect it to return the time as 15:55 (in July the UK is one hour ahead of UTC). But it doesn't. It returns 14:55. If I change the PC date to 8 July and submit the 8 Nov date to GMTToLocalDateTime, it returns 15:55, i.e. one hour ahead of UTC. This is correct for 8 July, but not for 8 Nov.

RawStrInternetToDateTime is used internally by both StrInternetToDateTime and GMTToLocalDateTime. RawStrInternetToDateTime takes the date string as a var parameter. GMTToLocalDateTime then uses the returned value (formatted as required) for submission to GmtOffsetStrToDateTime to get the UTC offset. Unfortunately, the date string parameter for both StrInternetToDateTime and GMTToLocalDateTime is a constant, not a var. If it were a var, I could simply submit the returned value to GmtOffsetStrToDateTime.

But the long and short of it is, I cannot figure out how best to get the UTC offset.

Remy Lebeau, posted December 12, 2020 (edited)

1 hour ago, Mark Williams said: StrInternetToDateTime gives you the original datetime as in the header ie "8 Nov 2020 06:55:38".

Yes. It strips off the UTC offset portion, and converts the remaining string as-is to TDateTime form.

Quote: GmtOffsetStrToDateTime - looks just the job on first glance, but unfortunately it expects just the last part of the header string containing the offset info ie "-0800 (PST)." If it gets more it won't work.

Yes. It is intended to convert UTC offset strings into TDateTime form, which are then added to or subtracted from the TDateTime values returned by Raw/StrInternetToDateTime().

Quote: Looking at the code which deals with the extraction of the UTC portion, it seems to be an awful lot of work to get to what the function needs. It is not simply a case of calling StringPos for +/-.

You can thank popular Internet protocols for that, with their long history of using non-standardized date/time formats, much of which went away with standardized ISO 8601 formats.

In your case, I would probably suggest simply parsing the date/time string backwards, token by token. If a token is in "(<timezone>)" format, you would have to manually convert it to the corresponding UTC offset (which sadly changes frequently in many timezones). Indy has a TimeZoneToGmtOffsetStr() function that does this, but it is private to the IdGlobalProtocols unit and is used internally by GmtOffsetStrToDateTime(). If a token is in "±HHMM" format, then it is trivial to convert it as-is to a UTC offset.

Quote: The two functions above would be enough for my purposes if it were simple to extract the relevant part of the header to submit to GmtOffsetStrToDateTime.

Sounds like it may be useful to add a new function to Indy, which takes in a date/time string as input and just splits it up into its various components as output. Maybe as a helper for RawStrInternetToDateTime() to call internally.
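
The backwards, token-by-token parse suggested above could look roughly like the following. This is a language-neutral sketch written in Python for illustration only; it is not Indy code and does not reproduce what TimeZoneToGmtOffsetStr() does. The function name `utc_offset_minutes` and the `ZONE_ABBREVIATIONS` table are assumptions made up for this example, and the table is deliberately tiny: abbreviations are ambiguous in general and would need a maintained mapping.

```python
import re

# Hypothetical, incomplete table for illustration (offsets in minutes).
# Real-world abbreviations are ambiguous and their rules change over time.
ZONE_ABBREVIATIONS = {
    "UT": 0, "UTC": 0, "GMT": 0,
    "EST": -300, "EDT": -240,
    "CST": -360, "CDT": -300,
    "MST": -420, "MDT": -360,
    "PST": -480, "PDT": -420,
}

# A numeric offset token such as "-0800" or "+0530".
NUMERIC_OFFSET = re.compile(r"^([+-])(\d{2})(\d{2})$")


def utc_offset_minutes(header_value):
    """Return the UTC offset (in minutes) found at the end of a date header.

    Tokens are examined from the end of the string. A numeric "±HHMM" token
    wins; a recognised zone name, with or without parentheses, is used only
    as a fallback. Returns None if no offset information is found.
    """
    fallback = None
    for token in reversed(header_value.split()):
        match = NUMERIC_OFFSET.match(token)
        if match:
            sign = -1 if match.group(1) == "-" else 1
            return sign * (int(match.group(2)) * 60 + int(match.group(3)))
        name = token.strip("()").upper()
        if name in ZONE_ABBREVIATIONS and fallback is None:
            fallback = ZONE_ABBREVIATIONS[name]
            continue
        break  # reached the time/date portion; stop scanning
    return fallback


print(utc_offset_minutes("Sun, 8 Nov 2020 06:55:38 -0800 (PST)"))
```

Preferring the numeric token over the zone name is a design choice in this sketch: the numeric form is unambiguous, whereas the name has to be looked up.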

Quote: I had an idea that I could: Call TTimeZone GetUTCOffset on the value returned by GMTToLocalDateTime. Deduct the offset from the GMTToLocalDateTime value to give me UTC datetime. Deduct the original header date from the UTC datetime (or vice versa) to give me the UTC offset figure as in the header.

Indy has TimeZoneBias() and OffsetFromUTC() functions, which return the current UTC offset in TDateTime form. Indy also has functions

Quote: However, there is a problem (bug?) with GMTToLocalDateTime and how it handles daylightsavings.

It is not a bug.

Quote: It seems to apply daylight savings dependent on the date on the pc clock not the date in the header.

As it should be.

Quote: If I change the date submitted to GMTToLocalDateTime from 8 Nov to 8 Jul, you would expect it to return the time as 15:55 (in July UK is one hour ahead of UTC). But it doesn't. It returns 14:55.

Operating systems are not very good about keeping an accurate history of past time zones. And in any case, date/time strings in this format are supposed to be self-contained, not relying on PC clock data at all. A string like "Sun, 8 Jul 2020 06:55:38 -0800 (PST)" literally claims to represent July 8 2020 at 6:55:38 AM in a timezone that is -8 hours from UTC, period, regardless of the fact that on July 8 2020 the Pacific time zone was actually operating in PDT and not in PST. So any competent software producing this date/time string at that time would have produced "Sun, 8 Jul 2020 06:55:38 -0700 (PDT)" instead, which is the correct string for that date/time. By lying to the parser, you are not going to get the result you want. And it is not fair to ask Indy to go digging through the OS or the Internet trying to verify whether the date/time string is accurate or not. The string is taken at face value.

Quote: If I change the pc date to 8 July and submit the 8 Nov date to GMTToLocalDateTime it returns 15:55 ie one hour ahead of UTC. This is correct for 8 July, but not for 8 Nov.

Right, because your PC would be operating in the PDT time zone needed for July 8 2020, not in the PST time zone needed for Nov 8 2020. You would have needed to adjust the timezone info in the date/time string accordingly, not just the date.

Quote: RawStrInternetToDateTime takes the date string as a var parameter. GMTToLocalDateTime then uses the returned value (formatted as required) for submission to GmtOffsetStrToDateTime to get the UTC offset.

Yes, RawStrInternetToDateTime() converts and strips off the date/time portion of the string, modifying the string to leave the UTC offset portion, which can then be passed to GmtOffsetStrToDateTime().

Quote: But the long and short of it is, I cannot figure out how best to get the UTC offset.

You would have to parse the date/time string manually; none of Indy's existing public functions are designed for the specific task you are trying to accomplish. The private functions that would accomplish it are, well, private. And for one simple reason: nobody ever asked for them to be made public before.

Edited December 12, 2020 by Remy Lebeau
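
To make the "taken at face value" point concrete, here is a small worked example of the arithmetic, again as a Python sketch rather than Indy code. The helper name `header_to_utc` is hypothetical, and the offsets are passed in as plain minutes instead of being parsed from the string. It only shows that the UTC instant is determined by the wall-clock time and the offset written in the header, with no reference to the PC clock or to DST rules.

```python
from datetime import datetime, timedelta, timezone


def header_to_utc(wall_time, offset_minutes):
    """Treat wall_time as being offset_minutes from UTC; return the UTC instant."""
    zone = timezone(timedelta(minutes=offset_minutes))
    return wall_time.replace(tzinfo=zone).astimezone(timezone.utc)


# "Sun, 8 Nov 2020 06:55:38 -0800 (PST)"
nov = header_to_utc(datetime(2020, 11, 8, 6, 55, 38), -480)

# Same string with only the date changed to 8 Jul: the offset still says -0800.
jul_as_given = header_to_utc(datetime(2020, 7, 8, 6, 55, 38), -480)

# What a sender actually on Pacific time in July would have written: -0700 (PDT).
jul_pdt = header_to_utc(datetime(2020, 7, 8, 6, 55, 38), -420)

print(nov.strftime("%H:%M"))           # 14:55
print(jul_as_given.strftime("%H:%M"))  # 14:55
print(jul_pdt.strftime("%H:%M"))       # 13:55
```

Changing only the date leaves the UTC time at 14:55, because the "-0800" in the string is what fixes the conversion; only a different offset in the string moves it.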

timfrost, posted December 12, 2020

If you need to handle time zones all around the world, for users and senders in different time zones, I recommend TZDB from https://github.com/pavkam/tzdb. It is essentially a single PAS file which has all the TZ and DST data and functions you need. There are also other tools, including one to update the source code from the IANA database, which is updated a few times a year. Of course you also need to follow the advice above about how to parse the input!

As I have said here before, I suggest tweaking the database extraction tool to exclude all the historical data, which it seems unlikely you will need.

Mark Williams, posted December 12, 2020

21 minutes ago, timfrost said: If you need to handle time zones all around the world, for users and senders in different time zones, I recommend TZDB from https://github.com/pavkam/tzdb. It is essentially a single PAS file which has all the TZ and DST data and functions you need. There are also other tools, including one to update the source code from the IANA database, which is updated a few times a year. Of course you also need to follow the advice above about how to parse the input! As I have said here before, I suggest tweaking the database extraction tool to exclude all the historical data, which it seems unlikely you will need.

TTimeZone is probably fine for my purposes. All I need to be able to do is get the UTC offset from the end of the email header.

Mark Williams, posted December 12, 2020

13 hours ago, Remy Lebeau said: The private functions that would accomplish it are, well, private. And for one simple reason - because nobody ever asked for them to be made public before.

The simple solution to my specific problem would be to change the declaration of either or both of StrInternetToDateTime and GMTToLocalDateTime so that they take a var parameter rather than a constant. Users could then submit the returned value to GmtOffsetStrToDateTime without having to work out or duplicate all the work Indy already does in parsing the header to extract the relevant portion.

I appreciate that I am the only person who seems to have asked for this functionality to date, but it is such a simple change. Would it be possible to incorporate it into an upcoming update?

Remy Lebeau, posted December 13, 2020

On 12/12/2020 at 5:54 AM, Mark Williams said: The simple solution to my specific problem would be to change the declaration of either or both StrInternetToDateTime and GMTToLocalDateTime

That is not likely to happen. Maybe add some overloads, but it would make more sense to simply make the declaration of RawStrInternetToDateTime() itself public, and then you can call that. Or to add a new function that parses a string and just returns the GMT portion rather than the date portion.

On 12/12/2020 at 5:54 AM, Mark Williams said: it is such a simple change would it be possible to incorporate into an upcoming update?

It is not as simple as you make it out to be. You are not just changing the declaration, you are changing the semantics of how the functions are to be called. That affects any code that calls the functions, not just in Indy itself, but in 3rd party code, too. And for that reason, I am not inclined to incorporate the change, at least in the way you have described it. If you submit a ticket to Indy's issue tracker, alternatives may be considered.

Mark Williams, posted December 14, 2020

12 hours ago, Remy Lebeau said: but it would make more sense to simply make the declaration of RawStrInternetToDateTime() itself be public instead

That would certainly solve my problem.

Remy Lebeau, posted December 14, 2020

7 hours ago, Mark Williams said: That would certainly solve my problem.

I have added a new GetGMTOffsetStr() function to the IdGlobalProtocols unit.