ertank

String to Date conversion (yet another one)


Hello,

 

I am using Delphi 10.3.2.

 

I used to parse the date string "Tue 17 Sep 2019" in an e-mail message using VarToDateTime(), which served its purpose well. Now the format has changed to "Sunday, 22 September 2019" and that function is no longer as helpful as before.

 

I would rather not parse the string manually, as the format is likely to change again in the future.

 

My questions are:

1- I may simply be overlooking a simple solution. I would appreciate it if you could point me to an existing Delphi solution. I could not get StrToDate() to work for me, even with a TFormatSettings provided and even with several different input strings such as "22 September 2019", "22/September/2019", etc.

 

2- It would be great if anybody has a unit or a function to share that takes a date format like 'dddd, dd mmmm yyyy' and converts the input string to a TDateTime using that format. I do not know C#, but I saw several examples of DateTime.ParseExact(), which seems like what I am searching for. I might be completely wrong about that though.

 

Thanks & regards,

Ertan

 

 


Yes, .NET can do the job:

using System;
					
public class Program
{
	const string dateToParse = "Sunday, 22 September 2019";
	const string dateFormat = "dddd, dd MMMM yyyy";
	const string dateOutFormat = "yyyy-MM-dd";
	
	public static void Main()
	{
		var date = DateTime.ParseExact(dateToParse, dateFormat, System.Globalization.CultureInfo.InvariantCulture);
		Console.WriteLine(date.ToString(dateOutFormat, System.Globalization.CultureInfo.InvariantCulture));
	}
}

See it in action on .NET Fiddle.

 

The C# source code is open source and published on GitHub, so you may port it to Delphi.

function TForm7.StrToDateFrmt(const iFormat, iDateStr: string): TDateTime;
var
  AYear, AMonth, ADay, AHour, AMinute, ASecond, AMilliSecond: Word;
  aPos: Integer;

  procedure InitVars;
  begin
    AYear := 1;
    AMonth := 1;
    ADay := 1;
    AHour := 0;
    AMinute := 0;
    ASecond := 0;
    AMilliSecond := 0;
  end;

  function GetPart(const iPart: Char): Word;
  var
    aYCnt: Integer;
  begin
    Result := 0;
    aYCnt := 0;

    // Chars[] is zero-based, so bound the scan position itself by Length
    while (aPos + aYCnt < iFormat.Length) and (iFormat.Chars[aPos + aYCnt] = iPart) do
      inc(aYCnt);

    Result := StrToInt(iDateStr.Substring(aPos, aYCnt));

    aPos := aPos + aYCnt;
  end;

begin
  InitVars;
  aPos := 0;

  // zero-based index: the last valid position is Length - 1
  while aPos < iFormat.Length do
  begin
    case iFormat.Chars[aPos] of
      'Y':
        AYear := GetPart('Y');
      'M':
        AMonth := GetPart('M');
      'D':
        ADay := GetPart('D');
      'H':
        AHour := GetPart('H');
      'N':
        AMinute := GetPart('N');
      'S':
        ASecond := GetPart('S');
      'Z':
        AMilliSecond := GetPart('Z');
    else
      inc(aPos);
    end;
  end;

  Result := EncodeDateTime(AYear, AMonth, ADay, AHour, AMinute, ASecond,
    AMilliSecond);
end;

 


@zinpub Nope

const dateToParse = 'Sunday, 22 September 2019';
const dateFormat = 'DDDD, DD MMMM YYYY';
const dateOutFormat = 'yyyy-MM-dd';

var
 date : TDateTime;

begin
  try

    date := StrToDateFrmt(dateFormat,dateToParse);
    Writeln( FormatDateTime(dateOutFormat,date) );

  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
  Readln;
end.

results in

EConvertError: 'Sund' is not a valid integer value

 

6 minutes ago, zinpub said:

requires minimal refinement

Yeah, minimal... like dealing with every possible language for the weekdays the mail could contain, regardless of the local language of the system the software is running on...


Well, converting "any possible" date is not a task for a single function 🙂


Porting the shared C# code is out of my league. I would vote for such a function to be implemented in Delphi by Embarcadero though.

 

For now, I have written the following. Both of my date formats are parsed OK with it.

 

I might improve this code later to handle the time part and any cases that turn out to fail. I am not after very fast code at the moment, so for now I will keep this one.

unit uUtils.ParseExact;

interface

uses
  System.SysUtils;

type
  TDateTimeHelper = record helper for TDateTime
  public
    class function ParseExact(const Value, Format: string; AFormatSettings: TFormatSettings): TDateTime; static;
  end;

implementation


// Converts the leading run of digits in InValue to an integer
procedure GetNumber(const InValue: string; out OutValue: Integer);
var
  Finish: Integer;
begin
  // Chars[] is zero-based; stop at the end of the string or the first non-digit
  Finish := 0;
  while (Finish < InValue.Length) and CharInSet(InValue.Chars[Finish], ['0'..'9']) do
    Inc(Finish);

  if not TryStrToInt(InValue.Substring(0, Finish), OutValue) then
    raise Exception.Create('Cannot convert to number');
end;

class function TDateTimeHelper.ParseExact(const Value, Format: string; AFormatSettings: TFormatSettings): TDateTime;
label
  Again;
var
  Day: Integer;
  Month: Integer;
  Year: Integer;
  TempString: string;
  TempFormat: string;
  I: Integer;
begin
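  Result := 0;

  // locals are not zero-initialized; the "= 0" checks below rely on this
  Day := 0;
  Month := 0;
  Year := 0;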

  TempString := Value.ToLower();
  TempFormat := Format.ToLower();

  if TempFormat.Contains('mmm') or TempFormat.Contains('mmmm') then
  begin
    // month string literals converted to numbers
    for I := Low(AFormatSettings.LongMonthNames) to High(AFormatSettings.LongMonthNames) do
    begin
      TempString := TempString.Replace(AFormatSettings.LongMonthNames[I].ToLower(), I.ToString());
      TempString := TempString.Replace(AFormatSettings.ShortMonthNames[I].ToLower(), I.ToString());
    end;

    TempFormat := TempFormat.Replace('mmmm', 'mm');
    TempFormat := TempFormat.Replace('mmm', 'mm');
  end;

  if TempFormat.Contains('ddd') or TempFormat.Contains('dddd') then
  begin
    // day string literals are simply removed
    for I := Low(AFormatSettings.LongDayNames) to High(AFormatSettings.LongDayNames) do
    begin
      TempString := TempString.Replace(AFormatSettings.LongDayNames[I].ToLower(), EmptyStr);
      TempString := TempString.Replace(AFormatSettings.ShortDayNames[I].ToLower(), EmptyStr);
    end;

    TempFormat := TempFormat.Replace('dddd', EmptyStr);
    TempFormat := TempFormat.Replace('ddd', EmptyStr);
  end;

  TempFormat := TempFormat.Trim();
  TempString := TempString.Trim();

Again:
  // remove non relevant chars at beginning
  while (TempFormat.Length > 0) and not CharInSet(TempFormat.Chars[0], ['a'..'z']) do
  begin
    TempFormat := TempFormat.Substring(1, MaxInt);
    TempString := TempString.Substring(1, MaxInt);
  end;

  if (TempString.Length > 0) and (TempFormat.Length > 0) then
  begin
    case TempFormat.Chars[0] of
      'd':
      begin
        if Day = 0 then GetNumber(TempString, Day);
        I := 0;
        while (I < TempString.Length) and CharInSet(TempString.Chars[I], ['0'..'9', AFormatSettings.DateSeparator]) do
          Inc(I);
        TempString := TempString.Substring(I, MaxInt);
        // drop the whole leading run of 'd' so that a single 'd' also works
        TempFormat := TempFormat.TrimLeft(['d']);
        goto Again;
      end;

      'm':
      begin
        if Month = 0 then GetNumber(TempString, Month);
        I := 0;
        while (I < TempString.Length) and CharInSet(TempString.Chars[I], ['0'..'9']) do
          Inc(I);
        TempString := TempString.Substring(I, MaxInt);
        TempFormat := TempFormat.TrimLeft(['m']);
        goto Again;
      end;

      'y':
      begin
        if Year = 0 then GetNumber(TempString, Year);
        I := 0;
        while (I < TempString.Length) and CharInSet(TempString.Chars[I], ['0'..'9']) do
          Inc(I);
        TempString := TempString.Substring(I, MaxInt);
        // consume the year token too; otherwise a format that starts with
        // the year (e.g. 'yyyy-mm-dd') would loop here forever
        TempFormat := TempFormat.TrimLeft(['y']);
        goto Again;
      end;
    end;
  end;

  if (Day > 0) and (Month > 0) and (Year > 0) then
  begin
    try
      Result := EncodeDate(Year, Month, Day);
    except
      raise Exception.Create('uUtils.ParseExact(): Cannot encode.' + sLineBreak + sLineBreak +
        'Year: ' + Year.ToString() + sLineBreak +
        'Month: ' + Month.ToString() + sLineBreak +
        'Day: ' + Day.ToString());
    end;
  end
  else
  begin
    raise Exception.Create('uUtils.ParseExact(): Cannot parse all day, month and year');
  end;
end;

end.

 

Usage is as follows:

uses
  uUtils.ParseExact;

var
  ADate: TDateTime;
  AFormatSettings: TFormatSettings;
begin
  AFormatSettings := TFormatSettings.Create('en-US');

  ADate := TDateTime.ParseExact('Sunday, 22 September 2019', 'dddd, dd mmmm yyyy', AFormatSettings);
  ShowMessage(DateToStr(ADate));

  ADate := TDateTime.ParseExact('Sun 15 Sep 2019', 'ddd dd mmm yyyy', AFormatSettings);
  ShowMessage(DateToStr(ADate));
end;

 


Just a sketch

 

It should be possible to translate the format string into a Regular Expression to check the format and extract the values.

 

The current format string

dddd, dd mmmm yyyy

could be translated to the following regular expression (using the en-US format settings):

(Sunday|Monday|Tuesday|Wednesday|Thursday|Friday|Saturday), (\d{2}) (January|February|March|April|May|June|July|August|September|October|November|December) (\d{4})

Here is a small PoC:

uses
  System.SysUtils, System.RegularExpressions;

const
  dateToParseString = 'Sunday, 22 September 2019';

procedure Test();
var
  fmtset: TFormatSettings;
  pattern: string;
  match: TMatch;
  day: word;
  month: word;
  year: word;
begin
  fmtset := TFormatSettings.Create('en-US');
  pattern := //
    '(' + string.Join('|', fmtset.LongDayNames) + ')' + // "dddd"
    ', ' + // ", "
    '(\d{2})' + // "dd"
    ' ' + // " "
    '(' + string.Join('|', fmtset.LongMonthNames) + ')' + // "mmmm"
    ' ' + // " "
    '(\d{4})'; // "yyyy"

  match := TRegEx.match(dateToParseString, pattern);

  if not match.Success then
    raise Exception.Create('Invalid data');

  day := word.Parse(match.Groups.Item[2].Value);
  month := 1;
  while (month <= 12) and (fmtset.LongMonthNames[month] <> match.Groups.Item[3].Value) do
    inc(month);
  year := word.Parse(match.Groups.Item[4].Value);

  Writeln(FormatDateTime('yyyy-mm-dd', EncodeDate(year, month, day), fmtset));
end;

Well there is a lot to improve, but you should get the idea.

1 hour ago, Schokohase said:

 


  pattern := //
    '(' + string.Join('|', fmtset.LongDayNames) + ')' + // "dddd"
    ', ' + // ", "
    '(\d{2})' + // "dd"
    ' ' + // " "
    '(' + string.Join('|', fmtset.LongMonthNames) + ')' + // "mmmm"
    ' ' + // " "
    '(\d{4})'; // "yyyy"

Well there is a lot to improve, but you should get the idea.

The only thing that may be missing above is "ddd" and "mmm", i.e. ShortDayNames and ShortMonthNames. I find RegEx powerful. Unfortunately, I am not familiar with it at all.

 

Would you add these two possible patterns to your sample code, please?


Oh that is very complicated ...

'(' + string.Join('|', fmtset.ShortDayNames) + ')' + // "ddd"
'(' + string.Join('|', fmtset.ShortMonthNames) + ')' + // "mmm"

... not
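
A rough sketch of how the whole translation from format string to pattern might be automated, so that long and short names are both covered. FormatToPattern and the group names day, month, monthname and year are made up for this illustration and are not part of the RTL. It only understands d, m and y tokens, treats everything else as literal text, and assumes the locale's names contain no regex metacharacters (true for en-US, not for every locale).

```pascal
program FormatToPatternDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils,
  System.RegularExpressions;

// Hypothetical helper: translates a date format such as 'ddd dd mmm yyyy'
// into a regular expression built from the names in the given settings.
function FormatToPattern(const Format: string; const fmtset: TFormatSettings): string;
var
  i, n, k: Integer;
  c: Char;
begin
  Result := '^';
  i := 0;
  while i < Format.Length do
  begin
    c := Format.Chars[i];
    // n = length of the run of identical characters starting at index i
    n := 1;
    while (i + n < Format.Length) and (Format.Chars[i + n] = c) do
      Inc(n);

    case c of
      'd', 'D':
        begin
          if n >= 4 then
            Result := Result + '(?:' + string.Join('|', fmtset.LongDayNames) + ')'
          else if n = 3 then
            Result := Result + '(?:' + string.Join('|', fmtset.ShortDayNames) + ')'
          else
            Result := Result + '(?<day>\d{1,2})';
        end;
      'm', 'M':
        begin
          if n >= 4 then
            Result := Result + '(?<monthname>' + string.Join('|', fmtset.LongMonthNames) + ')'
          else if n = 3 then
            Result := Result + '(?<monthname>' + string.Join('|', fmtset.ShortMonthNames) + ')'
          else
            Result := Result + '(?<month>\d{1,2})';
        end;
      'y', 'Y':
        Result := Result + '(?<year>\d{' + IntToStr(n) + '})';
    else
      // any other character is literal text; escape non-alphanumerics
      for k := 1 to n do
        if CharInSet(c, ['a'..'z', 'A'..'Z', '0'..'9']) then
          Result := Result + c
        else
          Result := Result + '\' + c;
    end;

    Inc(i, n);
  end;
  Result := Result + '$';
end;

var
  fmtset: TFormatSettings;
  m: TMatch;
begin
  fmtset := TFormatSettings.Create('en-US');
  m := TRegEx.Match('Sun 15 Sep 2019',
    FormatToPattern('ddd dd mmm yyyy', fmtset), [roIgnoreCase]);
  if m.Success then
    Writeln(m.Groups['day'].Value, ' / ', m.Groups['monthname'].Value, ' / ',
      m.Groups['year'].Value)
  else
    Writeln('no match');
end.
```

The month name still has to be mapped back to its number, as in the PoC above, and the weekday is matched but deliberately not captured.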


It seems the code I shared initially is faster than the RegEx version after the first call. Moreover, the RegEx version currently fails on certain formats, and it will probably be slower in all cases once all features are added.

 

Timing test code:

program Project2;

{$APPTYPE CONSOLE}

{$R *.res}

uses
  System.SysUtils,
  uUtils.ParseExact,
  System.RegularExpressions,
  System.Diagnostics;



function Test(const Value: string; const fmtset: TFormatSettings): TDateTime;
var
  pattern: string;
  match: TMatch;
  day: word;
  month: word;
  year: word;
begin
  pattern := //
    '(' + string.Join('|', fmtset.LongDayNames) + ')' + // "dddd"
    ', ' + // ", "
    '(\d{2})' + // "dd"
    ' ' + // " "
    '(' + string.Join('|', fmtset.LongMonthNames) + ')' + // "mmmm"
    ' ' + // " "
    '(\d{4})'; // "yyyy"

  match := TRegEx.match(Value, pattern);

  if not match.Success then
    raise Exception.Create('Invalid data');

  day := word.Parse(match.Groups.Item[2].Value);
  month := 1;
  while (month <= 12) and (fmtset.LongMonthNames[month] <> match.Groups.Item[3].Value) do
    inc(month);
  year := word.Parse(match.Groups.Item[4].Value);

  try
    Result := EncodeDate(year, month, day);
  except
    Result := 0;
  end;
end;



const
  Date1 = 'Sunday, 22 September 2019';
  Date2 = 'Monday, 20 January 2018';
  Date3 = 'Sun 15 Sep 2019';

var
  ADate: TDateTime;
  AFormatSettings: TFormatSettings;
  Timing: TStopwatch;
begin
  try
    AFormatSettings := TFormatSettings.Create('en-US');

    WriteLn('Long code timings');
    Timing := TStopwatch.StartNew();
    ADate := TDateTime.ParseExact(Date1, 'dddd, dd mmmm yyyy', AFormatSettings);
    Timing.Stop();
    WriteLn(DateToStr(ADate), 'Time: ' + Timing.Elapsed);

    Timing := TStopwatch.StartNew();
    ADate := TDateTime.ParseExact(Date2, 'dddd, dd mmmm yyyy', AFormatSettings);
    Timing.Stop();
    WriteLn(DateToStr(ADate), 'Time: ' + Timing.Elapsed);

    Timing := TStopwatch.StartNew();
    ADate := TDateTime.ParseExact(Date2, 'dddd, dd mmmm yyyy', AFormatSettings);
    Timing.Stop();
    WriteLn(DateToStr(ADate), 'Time: ' + Timing.Elapsed);

    Timing := TStopwatch.StartNew();
    ADate := TDateTime.ParseExact(Date3, 'ddd dd mmm yyyy', AFormatSettings);
    Timing.Stop();
    WriteLn(DateToStr(ADate), 'Time: ' + Timing.Elapsed);


    WriteLn('RegEx code timings');
    Timing := TStopwatch.StartNew();
    ADate := Test(Date1, AFormatSettings);
    Timing.Stop();
    WriteLn(DateToStr(ADate), 'Time: ' + Timing.Elapsed);

    Timing := TStopwatch.StartNew();
    ADate := Test(Date2, AFormatSettings);
    Timing.Stop();
    WriteLn(DateToStr(ADate), 'Time: ' + Timing.Elapsed);

    Timing := TStopwatch.StartNew();
    ADate := Test(Date2, AFormatSettings);
    Timing.Stop();
    WriteLn(DateToStr(ADate), 'Time: ' + Timing.Elapsed);

    Timing := TStopwatch.StartNew();
    ADate := Test(Date3, AFormatSettings);
    Timing.Stop();
    WriteLn(DateToStr(ADate), 'Time: ' + Timing.Elapsed);

  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;

  ReadLn;
end.

 

My system output:

Long code timings
22.09.2019Time: 00:00:00.0001696
20.01.2018Time: 00:00:00.0000132
20.01.2018Time: 00:00:00.0000157
15.09.2019Time: 00:00:00.0000134
RegEx code timings
22.09.2019Time: 00:00:00.0001281
20.01.2018Time: 00:00:00.0000404
20.01.2018Time: 00:00:00.0000255
Exception: Invalid data

 

On 7/30/2019 at 1:49 AM, ertank said:

I used to parse the date string "Tue 17 Sep 2019" in an e-mail message using VarToDateTime(), which served its purpose well. Now the format has changed to "Sunday, 22 September 2019" and that function is no longer as helpful as before.

Indy has StrInternetToDateTime() and GMTToLocalDateTime() functions in its IdGlobalProtocols unit which support both of those formats.
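
A minimal sketch of what that might look like in a console program, assuming the Indy units that ship with Delphi are on the search path. The program name and output format are only for illustration. As far as I know StrInternetToDateTime() returns 0 rather than raising when it cannot parse the input, so it is worth checking the result.

```pascal
program IndyDateDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils,
  IdGlobalProtocols; // Indy unit that declares StrInternetToDateTime

// Parses one date string with Indy and prints the result in ISO form
procedure ShowParsed(const S: string);
var
  D: TDateTime;
begin
  D := StrInternetToDateTime(S);
  if D = 0 then
    Writeln(S, ' -> could not be parsed')
  else
    Writeln(S, ' -> ', FormatDateTime('yyyy-mm-dd', D));
end;

begin
  ShowParsed('Tue 17 Sep 2019');
  ShowParsed('Sunday, 22 September 2019');
end.
```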
