Refactoring Enum to string using enum helpers

Mike Torrettinni · July 21, 2021

EDIT: The 3 options I'm showing here are what I use now in my code, and other even less clean and less maintainable examples of code to get enum names. But I narrowed down to these 3 and now I'm deciding which one will be the one I choose to refactor into.

I'm refactoring some enum usage and was testing 3 ways to convert enum to string:

TEnum = (enOne, enTwo, enThree, enFour, enFive, enSix);

1. (ToString) Using Magic strings:

case Self of
    enOne   : Result := 'One';

2. (ToString2) Using string consts:

const cOne: strig = 'One';
...
case Self of
    enOne   : Result := cOne;

3. (ToString3) Using array of enum of strings:

const cEnumStr: array[TEnum] of string = ('One', 'Two', 'Three', 'Four', 'Five', 'Six');
...
Result := cEnumStr[Self];

I prefer the 3rd way, but wanted to test to see results.

Looking at the asm, looks like 3rd option could be fastest, but the results show only negligible

ToString ToString2 ToString3

The results are pretty much same, negligible difference:

6 runs:

image.png.9b469e7f76e397aaa35fe985d89b2a8a.png

And code:

program Project1;

{$APPTYPE CONSOLE}

{$R *.res}

uses
  System.SysUtils,
  System.Diagnostics;

type
  TEnum = (enOne, enTwo, enThree, enFour, enFive, enSix);
  TEnumHelper = record helper for TEnum
    function ToString : string;   // use Magic strings
    function ToString2 : string;  // use const strings
    function ToString3 : string;  // use array[TEnum] cost of strings
  end;

const cEnumStr: array[TEnum] of string = ('One', 'Two', 'Three', 'Four', 'Five', 'Six');

  cOne   : string = 'One';
  cTwo   : string = 'Two';
  cThree : string = 'Three';
  cFour  : string = 'Four';
  cFive  : string = 'Five';
  cSix   : string = 'Six';

const
  cLoop: integer = 1000000;
  cNoOfTests: integer = 6;
  cNoOfEnums: integer = 6;

function TEnumHelper.ToString: string;
begin
  case Self of
    enOne   : Result := 'One';
    enTwo   : Result := 'Two';
    enThree : Result := 'Three';
    enFour  : Result := 'Four';
    enFive  : Result := 'Five';
    enSix   : Result := 'Six';
  end;
end;

function TEnumHelper.ToString2: string;
begin
  case Self of
    enOne   : Result := cOne;
    enTwo   : Result := cTwo;
    enThree : Result := cThree;
    enFour  : Result := cFour;
    enFive  : Result := cFive;
    enSix   : Result := cSix;
  end;
end;

function TEnumHelper.ToString3: string;
begin
  Result := cEnumStr[Self];
end;

procedure Run1;
var i, j: integer;
    vStr: string;
begin
  for i := 1 to cLoop do
    for j := 0 to Pred(cNoOfEnums) do
    begin
      vStr := '';
      vStr := TEnum(j).ToString;
    end;
end;

procedure Run2;
var i, j: integer;
    vStr: string;
begin
  for i := 1 to cLoop do
    for j := 0 to Pred(cNoOfEnums) do
    begin
      vStr := '';
      vStr := TEnum(j).ToString2;
    end;
end;

procedure Run3;
var i, j: integer;
    vStr: string;
begin
  for i := 1 to cLoop do
    for j := 0 to Pred(cNoOfEnums) do
    begin
      vStr := '';
      vStr := TEnum(j).ToString3;
    end;
end;

procedure RunTests;
var vSW: TStopWatch;
    i: Integer;
begin
  for i := 1 to cNoOfTests do
  begin
    vSW := TStopWatch.StartNew;
    Run1;
    vSW.Stop;
    Writeln('ToString: ' + vSW.ElapsedMilliseconds.ToString);

    vSW := TStopWatch.StartNew;
    Run2;
    vSW.Stop;
    Writeln('ToString2:    ' + vSW.ElapsedMilliseconds.ToString);

    vSW := TStopWatch.StartNew;
    Run3;
    vSW.Stop;
    Writeln('ToString3:       ' + vSW.ElapsedMilliseconds.ToString);
  end;
end;

begin

  RunTests;
  Readln;
end.

Anybody knows of any faster options?

Edited July 21, 2021 by Mike Torrettinni
edit for clarification

Bill Meyer · July 21, 2021

Why not use RTTI? No need for maintenance of string constants. It's not clear from your three cases whether you want these to be display values, and if you do, then you will likely need an array of strings, possibly with a dimension for the selected language.

David Heffernan · July 21, 2021

Speed isn't the issue. The mistake that you make again and again. Maintainability is the only concern here.

Mike Torrettinni · July 21, 2021

4 minutes ago, Bill Meyer said:

Why not use RTTI? No need for maintenance of string constants. It's not clear from your three cases whether you want these to be display values, and if you do, then you will likely need an array of strings, possibly with a dimension for the selected language.

I know I can use System.TypInfo.GetEnumName but then I'm limited to the actual enum names instead of custom strings. Right?

Bill Meyer · July 21, 2021

1 minute ago, Mike Torrettinni said:

I know I can use System.TypInfo.GetEnumName but then I'm limited to the actual enum names instead of custom strings. Right?

That's correct. The question is what problem are you trying to solve? What is your purpose? I use a record helper to implement a two way conversion, string to enum, enum to string. The string forms are used in persisting values to a database.

If you need custom strings, then you will incur a maintenance issue, one way or another. As I mentioned earlier, if different languages are a factor, then you may need a multi-dimensional array of string constants.

Fr0sT.Brutal · July 21, 2021

No surprise that speed is the same... #1 and #2 are virtually the same, and they have one bitwise operation+one if+jump. #3 is just read from address.

I always prefer #3 because compiler will control if number of elements change. Sometimes, if needed, I use combined #3&#2 approach (array[TEnum] = (cS1, cS2...)).

Mike Torrettinni · July 21, 2021

Just now, Bill Meyer said:

If you need custom strings, then you will incur a maintenance issue, one way or another.

Yes, custom strings, but not for different languages. Imagine countries or states, usually US/CA for United states/Canada or CA for California, some values just can't be 1:1 with enum names. Or in very rare cases this could be OK, but in 99% cases my enums have prefixes (to avoid forcing scoped enums) and don't match display strings.

So, I'm refactoring from above 3 examples in my code to single example, so I have 1 type of implementation.

The maintenance seems to be easiest with array[TEnum] of , most annoying is with Magic strings, of course.

Mike Torrettinni · July 21, 2021

3 minutes ago, Fr0sT.Brutal said:

No surprise that speed is the same... #1 and #2 are virtually the same, and they have one bitwise operation+one if+jump. #3 is just read from address.

I always prefer #3 because compiler will control if number of elements change. Sometimes, if needed, I use combined #3&#2 approach (array[TEnum] = (cS1, cS2...)).

Before I looked into the asm, I thought that literal/magic strings get 'allocated' with every call. I see it's not the case.

Angus Robertson · July 21, 2021

Result := GetEnumName (TypeInfo (TEnum), Ord (FEnum)));

Angus

Mike Torrettinni · July 21, 2021

14 minutes ago, Angus Robertson said:

Result := GetEnumName (TypeInfo (TEnum), Ord (FEnum)));

Angus

I actually thought GetEnumName would be much slower, but is only about 4x slower:

Not bad at all for RTTI in Delphi 10.2.3! Perhaps new Delphi 11 RTTI will be even faster!

Edited July 21, 2021 by Mike Torrettinni

Tom F · July 21, 2021

How did you decide that this was code that you wanted to optimize?

Have you profiled your application to see if this conversion is a significant bottleneck?

These aren't rhetorical questions. I am genuinely interested in your answers. I hope you'll answer them here in a posting.

Mike Torrettinni · July 21, 2021

13 minutes ago, Tom F said:

How did you decide that this was code that you wanted to optimize?

Have you profiled your application to see if this conversion is a significant bottleneck?

These aren't rhetorical questions. I am genuinely interested in your answers. I hope you'll answer them here in a posting.

Was this posted in wrong thread? If there is anything missing in first post, let me know will try to give more details.

Edit: hm.. trying to see where you got the optimize and bottleneck from.. perhaps my definition of word 'refactoring' is different, but I used the word refactoring to point out I'm trying to come up with 1 method of how I define and retrieve enum names, for display. From 3 most common different versions I use, I want to consolidate all into 1 implementation type. Of course I didn't want to choose the option that is significantly slower. So, perhaps 'consolidation of various code into single implementation' is more precise than 'refactoring'. So, if this is not refactoring, it wouldn't be the first time I use the wrong word.

Edited July 21, 2021 by Mike Torrettinni

Tom F · July 21, 2021

2 minutes ago, Mike Torrettinni said:

Was this posted in wrong thread? If there is anything missing in first post, let me know will try to give more details.

Hi, Mike,

No, this wasn't posted in the wrong thread.

You said you were refactoring.

There are two reasons to refactor code that I'm aware of:

1. Make it easier to maintain.

2. Significantly improve the speed of the software.

To answer your question about which approach is better, we need to know of the above two are you doing? Since you've provided speed benchmarks, I assume that it's #2 -- you're trying to significantly speed up your software.

I suspect that you are prematurely optimizing your code for speed before you know where the bottlenecks are. Have you used a profiler to determine where to best focus on speeding up your application?

Even if you made this code infinitely fast, would you or your users be able to notice the difference? If your answer is "yes" – then we'd love to hear about this novel application. If your answer is "no" – then you're wasting your time.

Stano · July 21, 2021

The reason for your action is not stated in the topic of the topic. I'm convinced that a lot of people here think you're dealing with stupidity.
I needed to pull out the topic names for TMS. I used Copy ().

Mike Torrettinni · July 21, 2021

Just now, Tom F said:

Hi, Mike,

No, this wasn't posted in the wrong thread.

I think we have a little misunderstanding here, see my Edit in the post above.

Mike Torrettinni · July 21, 2021

4 minutes ago, Stano said:

The reason for your action is not stated in the topic of the topic. I'm convinced that a lot of people here think you're dealing with stupidity.
I needed to pull out the topic names for TMS. I used Copy ().

Aha, I think I use the word refactoring wrongly.

@Stano see my Edit in a few posts above, I tried to explain the purpose of the topic.

Edited July 21, 2021 by Mike Torrettinni

Stefan Glienke · July 21, 2021

Version3 is the best because its maintainable and gives compile error if anyone decides to add a new value - while cases don't and at best depending on the Compiler Version might give a warning about a non handled case.

Also the benchmark is pointless because like 90% of the workload you test is string assignments - here is a screenshot from samplingprofiler:

If you care for speed then get rid of the unnecessary function call. Unfortunately also depending on compiler version the inlining is not an option and rather makes things worse for functions returning managed types such as string You might be better by just directly indexing into the const string array.

Edited July 21, 2021 by Stefan Glienke

Stano · July 21, 2021

Quote

@Stano see my Edit in a few posts above, I tried to explain the purpose of the topic.

I was faster

Mike Torrettinni · July 21, 2021

6 minutes ago, Stefan Glienke said:

Version3 is the best because its maintainable and gives compile error if anyone decides to add a new value - while cases don't and at best depending on the Compiler Version might give a warning about a non handled case.

Also the benchmark is pointless because like 90% of the workload you test is string assignments - here is a screenshot from samplingprofiler:

Thanks, this make sense. I like option 3 because, like you said, is easier to maintain. Good to see others agree.

Tom F · July 21, 2021

24 minutes ago, Mike Torrettinni said:

Of course I didn't want to choose the option that is significantly slower.

If you haven't benchmarked your application to see how much time this code consumes in the context of your application, you have no way of knowing what "significantly slower" means.

I'd bet that you could slow down this function by a factor of 1000 and it would NOT noticeably impact your users' experience or computer resources.

You're wasting your time. Why have you chosen to waste your time working on something that doesn't matter?

As Glienke says, "Version3 is the best because it's maintainable and gives compile error if anyone decides to add a new value..."

Mike Torrettinni · July 21, 2021

5 minutes ago, Tom F said:

You're wasting your time

I really wish I didn't need to refactor the code, but in some cases the maintainance was taking too much time so this change was neccessary. Years ago I didn't know that there are better ways than to hardcode string values. So i really can't see this change being a waste of time.

Tom F · July 21, 2021

3 minutes ago, Mike Torrettinni said:

I really wish I didn't need to refactor the code, but in some cases the maintainance was taking too much time so this change was neccessary. Years ago I didn't know that there are better ways than to hardcode string values. So i really can't see this change being a waste of time.

If you're refactoring for maintainability, then I suggest you don't waste your time digging too deeply into speed benchmarks and certainly not into the assembly language.

Mike Torrettinni · July 21, 2021

3 minutes ago, Tom F said:

If you're refactoring for maintainability, then I suggest you don't waste your time digging too deeply into speed benchmarks and certainly not into the assembly language.

'Too deeply' is quite vague, can you be more precise in your suggestion how to refactor my code?

Mike Torrettinni · July 21, 2021

37 minutes ago, Stefan Glienke said:

Unfortunately also depending on compiler version the inlining is not an option and rather makes things worse for functions returning managed types such as string You might be better by just directly indexing into the const string array

I have a few benchmarks ready for Delphi 11. I know 10.2 has certain limitations with inlined functions,, even compared to 10.3, 10.4. I hope 11 will not regress.

Stano · July 21, 2021

Just to be sure, I quote:
Refactoring teaches us how to modify existing not very happily designed programs so that we get programs that will do the same, but their new design will allow easier maintenance and modifiability, thus significantly reducing costs ...
Power-hunting does not fall within this definition. Other reasons given by you yes.

Sign In

Refactoring Enum to string using enum helpers

Recommended Posts

Mike Torrettinni 199

Share this post

Link to post

Bill Meyer 339

Share this post

Link to post

David Heffernan 2494

Share this post

Link to post

Mike Torrettinni 199

Share this post

Link to post

Bill Meyer 339

Share this post

Link to post

Fr0sT.Brutal 904

Share this post

Link to post

Mike Torrettinni 199

Share this post

Link to post

Mike Torrettinni 199

Share this post

Link to post

Angus Robertson 696

Share this post

Link to post

Mike Torrettinni 199

Share this post

Link to post

Tom F 88

Share this post

Link to post

Mike Torrettinni 199

Share this post

Link to post

Tom F 88

Share this post

Link to post

Stano 146

Share this post

Link to post

Mike Torrettinni 199

Share this post

Link to post

Mike Torrettinni 199

Share this post

Link to post

Stefan Glienke 2191

Share this post

Link to post

Stano 146

Share this post

Link to post

Mike Torrettinni 199

Share this post

Link to post

Tom F 88

Share this post

Link to post

Mike Torrettinni 199

Share this post

Link to post

Tom F 88

Share this post

Link to post

Mike Torrettinni 199

Share this post

Link to post

Mike Torrettinni 199

Share this post

Link to post

Stano 146

Share this post

Link to post

Create an account or sign in to comment

Create an account