bravesofts

Why can't ShortString in Delphi be used in our programs with the new IDE?

I carefully read this Embarcadero DocWiki page here

And this sentence caught my eye:

Note: ShortString is used by the Delphi desktop compilers, but is not supported by the Delphi mobile compilers. For more information, see Migrating Delphi Code to Mobile from Desktop.

Does this mean a desktop program can work with ShortString without any future problems, while a mobile program can't?

Or is ShortString not suitable for any platform (that is, only maintained for backward compatibility, like when coding a library for example)?

-----

Also :

If my first understanding is true, why is there no ShortString on mobile platforms? (I think the mobile platforms are the first and most deserving of this Short TYPE, haha.)

Also:

Why is ShortString old and eliminated, or dead, while Byte is still alive?

-------------

Note: My question is related to this question here.

Edited by bravesofts

Link to post

Avoid using short strings. ShortString exists for backwards compatibility only (typically for older desktop apps migrated to newer Delphi versions).

They are ANSI based and do not support Unicode natively, and have been deprecated since 2009 when the Unicode string format was made king of the hill.

 

Byte is still a useful number format when you don't need a large range.

Link to post
18 minutes ago, Lars Fosdal said:

Avoid using short strings. ShortString exists for backwards compatibility only (typically for older desktop apps migrated to newer Delphi versions).

They are ANSI based and do not support Unicode natively, and have been deprecated since 2009 when the Unicode string format was made king of the hill.

Byte is still a useful number format when you don't need a large range.

Thank you so much ...

Link to post
1 hour ago, Lars Fosdal said:

Byte is still a useful number format when you don't need a large range.

That's not really why byte is useful. Byte is primarily useful to store binary data, typically as TArray<Byte>. It's rare to use Byte to store a count for example.

Link to post

Sure, that and array[0..n] of Byte; are pretty common.

Still, Byte sized variables and constants are used a lot in various APIs. 
Even nuggets like SizeOf(Byte); can be found sprinkled around the RTL and VCL, although I believe the universal result of that is 1.
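
To make the two uses of Byte mentioned above concrete, here is a minimal sketch. The program name, the sample bytes and the "checksum" are made up for illustration; only TArray<Byte>, array[0..n] of Byte and SizeOf(Byte) come from the posts.

```pascal
program ByteDemo;
{$APPTYPE CONSOLE}

var
  Data: TArray<Byte>;           // dynamic buffer, the usual home for binary data
  Header: array[0..3] of Byte;  // fixed-size buffer, common in API records
  B: Byte;
  Sum: Integer;
begin
  // Illustrative payload only; real data would come from a file or socket.
  Data := TArray<Byte>.Create($DE, $AD, $BE, $EF);

  // A naive additive checksum, just to show Byte used as raw data.
  Sum := 0;
  for B in Data do
    Inc(Sum, B);
  Writeln(Length(Data), ' bytes, sum = ', Sum);   // 4 bytes, sum = 824

  // SizeOf(Byte) is 1, so the fixed buffer occupies exactly 4 bytes.
  Writeln(SizeOf(Byte), ' ', SizeOf(Header));     // 1 4
end.
```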

Link to post

ShortStrings have a big advantage: they can be allocated on the stack. So they are perfect for small ASCII text, e.g. number-to-text conversion, or when logging information in plain English.
Using Str() on a ShortString is, for instance, faster than using IntToStr() and a temporary string (or AnsiString) in a multi-threaded program, because the heap is not involved.

Of course, we could use a static array of AnsiChar, but ShortStrings have an advantage because they have their length encoded, so they are safer and faster than #0-terminated strings.
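
A minimal sketch of that Str() pattern, assuming a desktop compiler (or 10.4+ on mobile). PrintNumber and the string[15] size are hypothetical choices for the example, not something taken from the RTL.

```pascal
program ShortStringDemo;
{$APPTYPE CONSOLE}

// Converts a number to text without touching the heap: the buffer is a
// local ShortString, so it lives in the procedure's stack frame.
procedure PrintNumber(Value: Integer);
var
  Buf: string[15];   // 16 bytes: 1 length byte + room for 15 characters
begin
  Str(Value, Buf);   // writes the digits and sets the length byte
  Writeln(Buf, ' (length ', Length(Buf), ', size ', SizeOf(Buf), ' bytes)');
end;

begin
  PrintNumber(12345);             // 12345 (length 5, size 16 bytes)
  Writeln(SizeOf(ShortString));   // 256: length byte + 255 characters
end.
```

Length() here just reads the stored length byte, which is why no scan for a #0 terminator is needed.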

 

So on mobile platforms, we could end up creating a new record type, re-inventing the wheel, whereas the ShortString type is in fact still supported and generated by the compiler, and even used by the RTL at its lowest system level.

ShortStrings have been deprecated... and hidden. They could even be restored/unhidden by some tricks like https://www.idefixpack.de/blog/2016/05/system-bytestrings-for-10-1-berlin

 

Why? Because some people at Embarcadero thought it was confusing, and that the language should be "cleaned up" - I translate by "more C# / Java like", with a single string type.

This was the very same reason they did hide RawByteString and AnsiString...
More a marketing strategy than a technical decision IMHO.

 

I prefer the FPC more "conservative" way of preserving backward compatibility.
It is worth noting that the FPC compiler source code itself uses a lot of shortstring internally, so it would never be deprecated on FPC for sure. 😉

Edited by Arnaud Bouchez
Link to post
2 hours ago, Arnaud Bouchez said:

Why? Because some people at Embarcadero thought it was confusing, and that the language should be "cleaned up" - I translate by "more C# / Java like", with a single string type.

This was the very same reason they did hide RawByteString and AnsiString...
More a marketing strategy than a technical decision IMHO.

Marketing strategy? Really?

 

Just because you disagree with the decision to deprecate it doesn't make it a stupid decision, driven by "marketing". The deprecation of "object" was also controversial but that wasn't driven by marketing either. I'm pretty sure management, marketing and sales couldn't care less about these things.

Link to post
5 hours ago, bravesofts said:

I carefully read this Embarcadero DocWiki page here

And this sentence caught my eye:


Note: ShortString is used by the Delphi desktop compilers, but is not supported by the Delphi mobile compilers. For more information, see Migrating Delphi Code to Mobile from Desktop.

That notice has been in the documentation since XE4.  All ANSI types (P/AnsiChar, AnsiString/(N), ShortString) were disabled in the NEXTGEN mobile compilers in XE3.  Under NEXTGEN, UTF8String and RawByteString were re-enabled in 10.1 Berlin, and then the rest of the ANSI types were re-enabled in 10.4 Sydney when NEXTGEN was dropped completely so mobile platforms now match desktop platforms in terms of language features.  Looks like the ShortString documentation has not been updated yet to reflect that.

5 hours ago, bravesofts said:

Does this mean a desktop program can work with ShortString without any future problems

Yes.

5 hours ago, bravesofts said:

while a mobile program can't?

Only in XE3 through 10.3.  In 10.4 onward, ShortString can be used on mobile.

5 hours ago, bravesofts said:

Or is ShortString not suitable for any platform (that is, only maintained for backward compatibility, like when coding a library for example)?

No.

5 hours ago, bravesofts said:

If my first understanding is true, why is there no ShortString on mobile platforms?

Because ALL ANSI types were initially eliminated, as mobile platforms are Unicode-based.  Then the types slowly started being re-introduced as needs arose, until eventually Embarcadero decided that having separate compiler architectures just wasn't working out.

5 hours ago, bravesofts said:

I think the mobile platforms are the first and most deserving of this Short TYPE

Hardly.
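
For code that still has to build with the XE4 through 10.3 mobile compilers, the version history above could be handled with a conditional along these lines. This is only a sketch: NEXTGEN is the conditional symbol those compilers defined, but TSmallText and the fallback to plain string are assumptions made up for the example.

```pascal
program AnsiTypesCheck;
{$APPTYPE CONSOLE}

type
{$IFDEF NEXTGEN}
  // Old mobile compilers (XE4..10.3): no ShortString, fall back to string.
  TSmallText = string;
{$ELSE}
  // Desktop compilers, and mobile from 10.4 Sydney onward.
  TSmallText = ShortString;
{$ENDIF}

var
  T: TSmallText;
begin
  T := 'hello';
  // Without NEXTGEN this is a 256-byte ShortString; with NEXTGEN,
  // SizeOf reports the size of a string reference instead.
  Writeln(T, ' ', SizeOf(T));
end.
```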

Link to post

Many people complained "Why so many string types and what are all these for??" so Emb decided to cut them off in favor of TBytes, but they didn't provide convenient string features at that time (Pos, concat, inline init etc.), so many more people started complaining that they absolutely need all of those string types, and Emb had to bring them back 🙂

 

Link to post
3 minutes ago, Fr0sT.Brutal said:

Many people complained "Why so many string types and what are all these for??" so Emb decided to cut them off in favor of TBytes, but they didn't provide convenient string features at that time (Pos, concat, inline init etc.), so many more people started complaining that they absolutely need all of those string types, and Emb had to bring them back 🙂

I don't remember anyone complaining 😁

 

The thought process was a bit different. Mobile compilers were supposed to attract a new generation of developers. To do that, Delphi needed to have some competing features. One was automatic memory management (hence ARC), another was streamlined string and array indexing, hence zero-based strings. Short strings store their length at index zero, so they were completely incompatible with that goal. AnsiStrings with ANSI encoding are not something that exists on mobile platforms, so they were deemed unnecessary, too. So 8-bit strings were removed.

 

Unfortunately, what they didn't take into account was the impact on existing codebases, the same ones that were supposed to run on all other platforms. Yes, you couldn't reuse the GUI, but there is no reason to throw away non-GUI code.

 

Zero-based strings wreak havoc in string handling, causing subtle bugs all over the place. Most people, after they learned about it, would just turn the damn thing off. Removing all other 8-bit strings was also a mistake, because 8-bit strings make a whole lot of sense in cross-platform code, especially on Linux where UTF-8 encoding rules. So while throwing out AnsiString as such made sense, the rest did not. This is also why 8-bit strings were reintroduced, even before ARC was removed - which happened for completely different and unrelated reasons.

 

And last, but not least: TBytes are a completely different beast from strings. It is not just that handling functions were missing. TBytes don't have copy-on-write (COW), debugger support is extremely limited, and you cannot easily inspect textual data stored within.
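
The COW difference can be seen in a small sketch like the following. It assumes the default 1-based string indexing, and the variable names are made up for illustration.

```pascal
program CowDemo;
{$APPTYPE CONSOLE}

var
  S1, S2: string;
  B1, B2: TArray<Byte>;   // same underlying type as TBytes
begin
  // Strings: assignment shares the buffer, a write makes a private copy.
  S1 := 'abc';
  S2 := S1;
  S2[1] := 'X';           // copy-on-write kicks in here
  Writeln(S1, ' ', S2);   // abc Xbc

  // Dynamic arrays: assignment shares the buffer, and a write through
  // either variable changes the data seen by both.
  B1 := TArray<Byte>.Create(1, 2, 3);
  B2 := B1;
  B2[0] := 99;            // no copy is made
  Writeln(B1[0], ' ', B2[0]);   // 99 99
end.
```

To get an independent array you would have to call Copy(B1) explicitly, which is one more thing the string types do for you.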

 

And we all know how many new customers they attracted to Delphi because it was suddenly cross-platform. 

Link to post
3 hours ago, Dalija Prasnikar said:

I don't remember anyone complaining

I've seen some articles with yells like "why Shortstring, Ansistring, Rawbytestring, WideString, Unicodestring, Utf8string, Whateverstring when you only need one Unicode string?!"

Indeed, AFAIK C# only has one string type and so does JS. OTOH, any TBytes/Buffer representation lacks the ability to do clean operations ("if bytes = 'foo'" etc.). However, most string features could be implemented as helpers, overloaded operators and custom IDE viewers. If only Emba had introduced all that stuff together with removing the string types, this change would probably have been accepted more gladly.

Link to post
18 minutes ago, Fr0sT.Brutal said:

I've seen some articles with yells like "why Shortstring, Ansistring, Rawbytestring, WideString, Unicodestring, Utf8string, Whateverstring when you only need one Unicode string?!"

Indeed, AFAIK C# only has one string type and so does JS. OTOH, any TBytes/Buffer representation lacks the ability to do clean operations ("if bytes = 'foo'" etc.). However, most string features could be implemented as helpers, overloaded operators and custom IDE viewers. If only Emba had introduced all that stuff together with removing the string types, this change would probably have been accepted more gladly.

Delphi is much more low level than C# or Java or JS.  It has all those string types for a reason and for interacting with particular OS and other APIs.

 

When you need to handle such low level things in Java or any other language the whole thing usually ends up being a memory and performance hog, just like it happened in initial mobile compilers, because you cannot deal directly with particular string representation and you have to juggle data back and forth. And handling encoding and other issues is never easy there, too. 

 

Some features can be implemented through helpers, but you cannot accomplish everything in satisfactory manner. 

 

For instance, to get a pointer to a UTF-8 string for use with an OS API, you needed an additional TMarshaller variable. And not only do you need the variable, the code behind it is also slower.

 

Without 8-bit string compiler support

procedure Output(const aMsg: string);
var
  M: TMarshaller;
begin
  LOGD(M.AsUtf8(aMsg).ToPointer);
end;

With 8-bit string compiler support

procedure Output(const aMsg: string);
begin
  LOGD(PUtf8Char(Utf8String(aMsg)));
end;

 

As far as new users are concerned, they only need to know and use the generic string type. If and when they get in touch with other string types, it is pretty simple to explain what the purpose of each type is and how it is used. We have different integer types, we have different string types.

Link to post
4 hours ago, Dalija Prasnikar said:

Zero based strings wreak havoc in string handling

And even without them we're now stuck with the zero based string helpers. What a turd.
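
For anyone who hasn't run into this yet, a small sketch of the mismatch between the classic 1-based routines and the zero-based TStringHelper methods (assuming default 1-based string indexing; the sample string is arbitrary):

```pascal
program HelperIndexDemo;
{$APPTYPE CONSOLE}

uses
  System.SysUtils;   // brings in TStringHelper

var
  S: string;
begin
  S := 'Delphi';

  // Classic indexing and routines are 1-based.
  Writeln(S[1]);            // D
  Writeln(Pos('l', S));     // 3
  Writeln(Copy(S, 1, 3));   // Del

  // The helper methods are always 0-based, whatever the string setting.
  Writeln(S.Chars[0]);        // D
  Writeln(S.IndexOf('l'));    // 2
  Writeln(S.Substring(0, 3)); // Del
end.
```

Mixing the two styles in one routine is where the off-by-one bugs tend to creep in.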

Link to post
1 hour ago, Dalija Prasnikar said:

Delphi is much more low level than C#

Moreover, C# came along in 2000, and as a .NET language, arguably had no prior existence in any form. Delphi built on the tradition of Turbo Pascal, which appeared in 1983. And C, the root ancestor to C#, had no string type at all.

Link to post
12 minutes ago, Bill Meyer said:

Moreover, C# came along in 2000, and as a .NET language, arguably had no prior existence in any form. Delphi built on the tradition of Turbo Pascal, which appeared in 1983. And C, the root ancestor to C#, had no string type at all.

I always thought Delphi was the root ancestor to C# 😁 The only commonality C# has with C is the braces.

Link to post
1 minute ago, Dalija Prasnikar said:

I always thought Delphi was the root ancestor to C# 😁 The only commonality C# has with C is the braces.

And yet, it is considered a member of the C family. And probably for no better reason than the use of those braces. And the braces are probably my least favorite aspect of that language.

Link to post

I don't mind the braces, TBH - but some of the expression operators still irk me, probably because I don't use them all day.

Link to post
4 minutes ago, Lars Fosdal said:

I don't mind the braces, TBH - but some of the expression operators still irk me, probably because I don't use them all day.

My dislike for braces dates from a time when I frequently used printouts, and back in the day, it was all too easy for some of them to either disappear or be sufficiently unclear as to be missed. 

 

I am also not keen on some of the expression operators, but I seem to adapt more easily to those than to the cursed braces.

Link to post
32 minutes ago, Dalija Prasnikar said:

I always thought Delphi was the root ancestor to C#

It's Java. Very early on many C# programs were also valid Java programs.

Link to post
2 hours ago, Dalija Prasnikar said:

Some features can be implemented through helpers, but you cannot accomplish everything in satisfactory manner. 

 

For instance, to get a pointer to a UTF-8 string for use with an OS API, you needed an additional TMarshaller variable. And not only do you need the variable, the code behind it is also slower.

As an old-school Delphier I understand why all these string types exist, but honestly I'd cut some of them down. AnsiStrings with codepages could be removed - RawByteString and UTF8String seem sufficient. Even UTF-8 as a string type could be avoided in some cases. Your example:

Without 8-bit string compiler support

procedure Output(const aMsg: string);
begin
  LOGD(TUTF8Encoding.ToBytes(aMsg));
end;

 

Edited by Fr0sT.Brutal
Link to post
3 hours ago, Fr0sT.Brutal said:

As an old-school Delphier I understand why all these string types exist, but honestly I'd cut some of them down. AnsiStrings with codepages could be removed - RawByteString and UTF8String seem sufficient. Even UTF-8 as a string type could be avoided in some cases.

AnsiString as such makes sense only on Windows, but I don't think removing it is necessary. More problematic is the plethora of "compatibility" functions named AnsiXXX that work on string and not on AnsiString.

3 hours ago, Fr0sT.Brutal said:

Your example :


Without 8-bit string compiler support

procedure Output(const aMsg: string);
begin
  LOGD(TUTF8Encoding.ToBytes(aMsg));
end;

 

Not really, because LOGD expects a pointer to a string, not TBytes, so that code does not compile. Not everything can be squeezed into TBytes, and not everything makes sense as bytes. LOGD logs text messages; the fact that there is a conversion to UTF-8 involved is just an example. You might as well have an API that directly uses a UTF-16 encoded string, or you may already be working with UTF-8 encoded strings.

Link to post
3 hours ago, David Heffernan said:

It's Java. Very early on many C# programs were also valid Java programs.

Some syntactical elements, yes and the fact both have GC. But C# also shares plenty of language concepts with Delphi that don't exist in Java. For instance, properties, records, 

 

But my comment was partially a joke, so I am not going to argue about this. Certainly C# and Java share some common ground, way, way more than they both share with C.

Link to post
4 hours ago, Bill Meyer said:

And yet, it is considered a member of the C family. And probably for no better reason than the use of those braces. And the braces are probably my least favorite aspect of that language.

I think that family has too many outliers. It is just a bunch of languages grouped together because they share a few syntactical elements - but they are not roots, they are just letters of the alphabet.

Link to post
On 9/22/2021 at 10:03 AM, Arnaud Bouchez said:

ShortStrings have a big advantage: they can be allocated on the stack.

 

Is this ever going to introduce a human-perceivable performance increase?

 

Quote

 

Why? Because some people at Embarcadero thought it was confusing, and that the language should be "cleaned up" - I translate by "more C# / Java like", with a single string type.

...More a marketing strategy than a technical decision IMHO.

 

Basically everything has a single string type nowadays. Again, I can't conceive of human-perceptible results by fiddling with different string types, but it does slow down human coding and make the language more difficult to learn. It's also more code for Embarcadero to maintain. And as the saying goes, "Every line of code that doesn't need to be written is a line of code guaranteed to be bug-free."  I don't think "We have less string types!" is really a winning marketing slogan. Although they did completely run out of ideas for promoting Interbase and came up with the below ad....

 

Quote

I prefer the FPC more "conservative" way of preserving backward compatibility.

Fair enough; I prefer the "break everything at random" approach to keep people on their toes, constantly refactoring and adopting language changes. 😁

 

 

[Attached image: InterBase ad, zqnppa0zy7l01.jpg]

Link to post
14 hours ago, Fr0sT.Brutal said:

I've seen some articles with yells like "why Shortstring, Ansistring, Rawbytestring, WideString, Unicodestring, Utf8string, Whateverstring when you only need one Unicode string?!"

That was probably me. They even came up with a string type to address the problem of too many string types. 🙄 Python first transitioned to Unicode in 2000 and ended up with two string types. It was a disaster, so they broke backward compatibility with 3.0 in 2008 to have a single string type again; the 2.x line had to be supported concurrently for a decade to get people to finally convert their codebases, and it was all a big mess. Delphi had the luxury of not switching to Unicode until nearly ten years after Python, so they got to see all the problems with two string types. Then they decided to do a "hold my beer" and come up with about 256 different string types... and then get rid of them... and then bring them back on platforms there wasn't even legacy code for.

 

14 hours ago, Fr0sT.Brutal said:

Indeed, AFAIK C# only has one string type and so does JS.

C#, JavaScript, Python, Ruby, Swift, Go, R, PHP (but still no native Unicode support!), Kotlin... you'd think if 29 string types were useful, more languages would have at least two...

Link to post
