Jump to content
Mike Torrettinni

Refactoring Enum to string using enum helpers

Recommended Posts

Posted (edited)

EDIT: The 3 options I'm showing here are what I use now in my code, and other even less clean and less maintainable examples of code to get enum names. But I narrowed down to these 3 and now I'm deciding which one will be the one I choose to refactor into.

 

I'm refactoring some enum usage and was testing 3 ways to convert enum to string:

 

TEnum = (enOne, enTwo, enThree, enFour, enFive, enSix);

 

1. (ToString) Using Magic strings:

case Self of
    enOne   : Result := 'One';

2. (ToString2) Using string consts:

const cOne: strig = 'One';
...
case Self of
    enOne   : Result := cOne;

3. (ToString3) Using array of enum of strings:

const cEnumStr: array[TEnum] of string = ('One', 'Two', 'Three', 'Four', 'Five', 'Six');
...
Result := cEnumStr[Self];

 

I prefer the 3rd way, but wanted to test to see results.

 

Looking at the asm, looks like 3rd option could be fastest, but the results show only negligible

 

ToString                                                                                                        ToString2                                                                                                       ToString3

image.thumb.png.7cc1691af676bdfda4e78315c65cf24e.pngimage.thumb.png.39282cc1edf06e47293de4ef522e9a88.pngimage.thumb.png.ced835e4c7ecddb5b9ad5ad6071ed7d3.png

 

 

The results are pretty much same, negligible difference:

6 runs:

image.png.9b469e7f76e397aaa35fe985d89b2a8a.png

 

And code:

 

program Project1;

{$APPTYPE CONSOLE}

{$R *.res}

uses
  System.SysUtils,
  System.Diagnostics;

type
  TEnum = (enOne, enTwo, enThree, enFour, enFive, enSix);
  TEnumHelper = record helper for TEnum
    function ToString : string;   // use Magic strings
    function ToString2 : string;  // use const strings
    function ToString3 : string;  // use array[TEnum] cost of strings
  end;

const cEnumStr: array[TEnum] of string = ('One', 'Two', 'Three', 'Four', 'Five', 'Six');

  cOne   : string = 'One';
  cTwo   : string = 'Two';
  cThree : string = 'Three';
  cFour  : string = 'Four';
  cFive  : string = 'Five';
  cSix   : string = 'Six';

const
  cLoop: integer = 1000000;
  cNoOfTests: integer = 6;
  cNoOfEnums: integer = 6;

function TEnumHelper.ToString: string;
begin
  case Self of
    enOne   : Result := 'One';
    enTwo   : Result := 'Two';
    enThree : Result := 'Three';
    enFour  : Result := 'Four';
    enFive  : Result := 'Five';
    enSix   : Result := 'Six';
  end;
end;

function TEnumHelper.ToString2: string;
begin
  case Self of
    enOne   : Result := cOne;
    enTwo   : Result := cTwo;
    enThree : Result := cThree;
    enFour  : Result := cFour;
    enFive  : Result := cFive;
    enSix   : Result := cSix;
  end;
end;

function TEnumHelper.ToString3: string;
begin
  Result := cEnumStr[Self];
end;

procedure Run1;
var i, j: integer;
    vStr: string;
begin
  for i := 1 to cLoop do
    for j := 0 to Pred(cNoOfEnums) do
    begin
      vStr := '';
      vStr := TEnum(j).ToString;
    end;
end;

procedure Run2;
var i, j: integer;
    vStr: string;
begin
  for i := 1 to cLoop do
    for j := 0 to Pred(cNoOfEnums) do
    begin
      vStr := '';
      vStr := TEnum(j).ToString2;
    end;
end;

procedure Run3;
var i, j: integer;
    vStr: string;
begin
  for i := 1 to cLoop do
    for j := 0 to Pred(cNoOfEnums) do
    begin
      vStr := '';
      vStr := TEnum(j).ToString3;
    end;
end;

procedure RunTests;
var vSW: TStopWatch;
    i: Integer;
begin
  for i := 1 to cNoOfTests do
  begin
    vSW := TStopWatch.StartNew;
    Run1;
    vSW.Stop;
    Writeln('ToString: ' + vSW.ElapsedMilliseconds.ToString);

    vSW := TStopWatch.StartNew;
    Run2;
    vSW.Stop;
    Writeln('ToString2:    ' + vSW.ElapsedMilliseconds.ToString);

    vSW := TStopWatch.StartNew;
    Run3;
    vSW.Stop;
    Writeln('ToString3:       ' + vSW.ElapsedMilliseconds.ToString);
  end;
end;

begin

  RunTests;
  Readln;
end.

 

Anybody knows of any faster options?

 

 

Edited by Mike Torrettinni
edit for clarification

Share this post


Link to post

Why not use RTTI? No need for maintenance of string constants. It's not clear from your three cases whether you want these to be display values, and if you do, then you will likely need an array of strings, possibly with a dimension for the selected language.

Share this post


Link to post

Speed isn't the issue. The mistake that you make again and again. Maintainability is the only concern here. 

  • Like 1

Share this post


Link to post
4 minutes ago, Bill Meyer said:

Why not use RTTI? No need for maintenance of string constants. It's not clear from your three cases whether you want these to be display values, and if you do, then you will likely need an array of strings, possibly with a dimension for the selected language.

I know I can use System.TypInfo.GetEnumName but then I'm limited to the actual enum names instead of custom strings. Right?

Share this post


Link to post
1 minute ago, Mike Torrettinni said:

I know I can use System.TypInfo.GetEnumName but then I'm limited to the actual enum names instead of custom strings. Right?

That's correct. The question is what problem are you trying to solve? What is your purpose? I use a record helper to implement a two way conversion, string to enum, enum to string. The string forms are used in persisting values to a database. 

If you need custom strings, then you will incur a maintenance issue, one way or another. As I mentioned earlier, if different languages are a factor, then you may need a multi-dimensional array of string constants.

  • Thanks 1

Share this post


Link to post

No surprise that speed is the same... #1 and #2 are virtually the same, and they have one bitwise operation+one if+jump. #3 is just read from address.

I always prefer #3 because compiler will control if number of elements change. Sometimes, if needed, I use combined #3&#2 approach (array[TEnum] = (cS1, cS2...)).

  • Thanks 1

Share this post


Link to post
Just now, Bill Meyer said:

If you need custom strings, then you will incur a maintenance issue, one way or another.

Yes, custom strings, but not for different languages. Imagine countries or states, usually US/CA for United states/Canada or CA for California, some values just can't be 1:1 with enum names. Or in very rare cases this could be OK, but in 99% cases my enums  have prefixes (to avoid forcing scoped enums) and don't match display strings.

 

So, I'm refactoring from above 3 examples in my code to single example, so I have 1 type of implementation.

The maintenance seems to be easiest with array[TEnum] of , most annoying is with Magic strings, of course.

Share this post


Link to post
3 minutes ago, Fr0sT.Brutal said:

No surprise that speed is the same... #1 and #2 are virtually the same, and they have one bitwise operation+one if+jump. #3 is just read from address.

I always prefer #3 because compiler will control if number of elements change. Sometimes, if needed, I use combined #3&#2 approach (array[TEnum] = (cS1, cS2...)).

Before I looked into the asm, I thought that literal/magic strings get 'allocated' with every call. I see it's not the case.

Share this post


Link to post
Posted (edited)
14 minutes ago, Angus Robertson said:

Result := GetEnumName (TypeInfo (TEnum), Ord (FEnum))); 

 

Angus

I actually thought GetEnumName would be much slower, but is only about 4x slower:

 

image.thumb.png.dd86a245fc0ca185879e987fcaea53af.png

 

Not bad at all for RTTI in Delphi 10.2.3! Perhaps new Delphi 11 RTTI  will be even faster!

 

Edited by Mike Torrettinni

Share this post


Link to post

How did you decide that this was code that you wanted to optimize?

Have you profiled your application to see if this conversion is a significant bottleneck?   

These aren't rhetorical questions.  I am genuinely interested in your answers. I hope you'll answer them here in a posting.

Share this post


Link to post
Posted (edited)
13 minutes ago, Tom F said:

How did you decide that this was code that you wanted to optimize?

Have you profiled your application to see if this conversion is a significant bottleneck?   

These aren't rhetorical questions.  I am genuinely interested in your answers. I hope you'll answer them here in a posting.

Was this posted in wrong thread? If there is anything missing in first post, let me know will try to give more details.

 

Edit: hm.. trying to see where you got the optimize and bottleneck from.. perhaps my definition of word 'refactoring' is different, but I used the word refactoring to point out I'm trying to come up with 1 method of how I define and retrieve enum names, for display. From 3 most common different versions I use, I want to consolidate all into 1 implementation type. Of course I didn't want to choose the option that is significantly slower. So, perhaps 'consolidation of various code into single implementation' is more precise than 'refactoring'. So, if this is not refactoring, it wouldn't be the first time I use the wrong word.

Edited by Mike Torrettinni

Share this post


Link to post
2 minutes ago, Mike Torrettinni said:

Was this posted in wrong thread? If there is anything missing in first post, let me know will try to give more details.

Hi, Mike,

No, this wasn't posted in the wrong thread. 

You said you were refactoring.

There are two reasons to refactor code that I'm aware of: 

 

1. Make it easier to maintain.
 

2. Significantly improve the speed of the software. 

To answer your question about which approach is better, we need to know of the above two are you doing?  Since you've provided speed benchmarks, I assume that it's #2 -- you're trying to significantly speed up your software.

I suspect that you are prematurely optimizing your code for speed before you know where the bottlenecks are. Have you used a profiler to determine where to best focus on speeding up your application?

Even if you made this code infinitely fast, would you or your users be able to notice the difference?  If your answer is "yes" –  then we'd love to hear about this novel application. If your answer is "no" – then you're wasting your time. 

 

  • Like 1

Share this post


Link to post

The reason for your action is not stated in the topic of the topic. I'm convinced that a lot of people here think you're dealing with stupidity.
I needed to pull out the topic names for TMS. I used Copy ().

Share this post


Link to post
Just now, Tom F said:

Hi, Mike,

No, this wasn't posted in the wrong thread. 

 

I think we have a little misunderstanding here, see my Edit in the post above.

Share this post


Link to post
Posted (edited)
4 minutes ago, Stano said:

The reason for your action is not stated in the topic of the topic. I'm convinced that a lot of people here think you're dealing with stupidity.
I needed to pull out the topic names for TMS. I used Copy ().

Aha, I think I use the word refactoring wrongly.

@Stano see my Edit in a few posts above, I tried to explain the purpose of the topic.

Edited by Mike Torrettinni

Share this post


Link to post
Posted (edited)

Version3 is the best because its maintainable and gives compile error if anyone decides to add a new value - while cases don't and at best depending on the Compiler Version might give a warning about a non handled case.

Also the benchmark is pointless because like 90% of the workload you test is string assignments - here is a screenshot from samplingprofiler:

 

image.thumb.png.961661e847dc75f25892f3fc857985cb.png

 

If you care for speed then get rid of the unnecessary function call. Unfortunately also depending on compiler version the inlining is not an option and rather makes things worse for functions returning managed types such as string You might be better by just directly indexing into the const string array.

 

Edited by Stefan Glienke
  • Like 1
  • Thanks 1

Share this post


Link to post
Quote

@Stano see my Edit in a few posts above, I tried to explain the purpose of the topic.

I was faster:classic_smile:

Share this post


Link to post
6 minutes ago, Stefan Glienke said:

Version3 is the best because its maintainable and gives compile error if anyone decides to add a new value - while cases don't and at best depending on the Compiler Version might give a warning about a non handled case.

Also the benchmark is pointless because like 90% of the workload you test is string assignments - here is a screenshot from samplingprofiler:

 

image.thumb.png.961661e847dc75f25892f3fc857985cb.png

Thanks, this make sense. I like option 3 because, like you said, is easier to maintain. Good to see others agree.

Share this post


Link to post
24 minutes ago, Mike Torrettinni said:

Of course I didn't want to choose the option that is significantly slower. 

If you haven't benchmarked your application to see how much time this code consumes in the context of your application, you have no way of knowing what "significantly slower" means.

I'd bet that you could slow down this function by a factor of 1000 and it would NOT noticeably impact your users' experience or computer resources.

You're wasting your time.  Why have you chosen to waste your time working on something that doesn't matter?  
 

As Glienke says, "Version3 is the best because it's maintainable and gives compile error if anyone decides to add a new value..."

Share this post


Link to post
5 minutes ago, Tom F said:

You're wasting your time

I really wish I didn't need to refactor the code, but in some cases the maintainance was taking too much time  so this change was neccessary. Years ago I didn't know that there are better ways than to hardcode string values. So i really can't see this change being a waste of time.

Share this post


Link to post
3 minutes ago, Mike Torrettinni said:

I really wish I didn't need to refactor the code, but in some cases the maintainance was taking too much time  so this change was neccessary. Years ago I didn't know that there are better ways than to hardcode string values. So i really can't see this change being a waste of time.

If you're refactoring for maintainability, then I suggest you don't waste your time digging too deeply into speed benchmarks and certainly not into the assembly language.  

Share this post


Link to post
3 minutes ago, Tom F said:

If you're refactoring for maintainability, then I suggest you don't waste your time digging too deeply into speed benchmarks and certainly not into the assembly language.  

  'Too deeply' is quite vague, can you be more precise in your suggestion how to refactor my code?

Share this post


Link to post
37 minutes ago, Stefan Glienke said:

Unfortunately also depending on compiler version the inlining is not an option and rather makes things worse for functions returning managed types such as string You might be better by just directly indexing into the const string array

I have a few benchmarks ready for Delphi 11. I know 10.2 has certain limitations with inlined functions,, even compared to 10.3, 10.4. I hope 11 will not regress.

Share this post


Link to post

Just to be sure, I quote:
Refactoring teaches us how to modify existing not very happily designed programs so that we get programs that will do the same, but their new design will allow easier maintenance and modifiability, thus significantly reducing costs ...
Power-hunting does not fall within this definition. Other reasons given by you yes.

Share this post


Link to post

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

×