Leaderboard


Popular Content

Showing content with the highest reputation on 05/12/21 in Posts

  1. David Heffernan

    Micro optimization: Math.InRange

    I'm betting that improving the performance of InRange has no impact on the performance of these reports.
  2. Remy Lebeau

    Binary data in String?

    Yes, quite lucky. Most ANSI locales use 1 byte per character, and UTF-16 uses 1 code unit per character for most Western languages. So you usually end up with a 1 byte -> 2 bytes -> 1 byte conversion, which is why the final size was the same byte size, but the result may or may not be the same bytes as the original.

    There is more involved than just nul-padding, which typically only applies for bytes in the $00..$7F (ASCII) range. For non-ASCII characters, it is not a matter of simply padding '<HH>' to '<HH>#0'; there is an actual mapping process involved. For example, if Windows-1252 were the locale used for the conversion, and byte $80 (Euro) were encountered, it would be converted to the Unicode character U+20AC, which is bytes $AC $20 in UTF-16LE, not $80 $00 like you are thinking.

    But yes, the individual bytes of the EXE data would mostly get doubled when converted to Unicode, and then truncated to single bytes when converted back to ANSI. But that does not necessarily mean that you will end up with the same bytes that you started with. For example, using Windows-1252 again, byte $81 (amongst a few others) would end up converted to either Unicode character U+FFFD (Replacement Character) or U+003F (ASCII '?') depending on the converter's implementation, which would thus be bytes $FD $FF or $3F $00 in UTF-16LE respectively, and then converted back to ANSI as byte $3F, which is clearly not the same as the original.

    If you absolutely need a charset that ensures no data loss when round-tripping bytes from ANSI to Unicode to ANSI, you can use codepage 437 for that; see "Is there a code page that matches ASCII and can round trip arbitrary bytes through Unicode?" The Unicode won't have the same character values as the original bytes in the ranges of $00..$1F and $7F..$FF, but the result of converting the Unicode back to codepage 437 will produce the original bytes.

    Nul-padding is not guaranteed, but yes, the String data inside the stream can get messed up.

    Do not use a TStringStream for binary data. Use TMemoryStream or TBytesStream instead.
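A minimal sketch for checking the round trip yourself, assuming a Unicode Delphi with `TEncoding.GetEncoding` available. The function name `SurvivesRoundTrip` is made up for illustration. It decodes all 256 byte values in a given code page, encodes the text back, and compares the bytes. What it reports for Windows-1252 depends on the converter of the platform it runs on, so no particular result is claimed here.

```pascal
program RoundTripDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: decode Data as text in the given code page, encode the
// text back, and report whether the bytes came back unchanged.
function SurvivesRoundTrip(const Data: TBytes; CodePage: Integer): Boolean;
var
  Enc: TEncoding;
  Text: string;
  Back: TBytes;
begin
  Enc := TEncoding.GetEncoding(CodePage);
  try
    Text := Enc.GetString(Data);   // bytes -> UTF-16
    Back := Enc.GetBytes(Text);    // UTF-16 -> bytes
    Result := (Length(Back) = Length(Data)) and
      ((Length(Data) = 0) or CompareMem(@Back[0], @Data[0], Length(Data)));
  finally
    Enc.Free; // encodings returned by GetEncoding are owned by the caller
  end;
end;

var
  Data: TBytes;
  I: Integer;
begin
  // One of every possible byte value, as binary data would contain.
  SetLength(Data, 256);
  for I := 0 to 255 do
    Data[I] := Byte(I);

  Writeln('Windows-1252 round trip intact: ', SurvivesRoundTrip(Data, 1252));
  Writeln('Code page 437 round trip intact: ', SurvivesRoundTrip(Data, 437));
end.
```

Even where a code page happens to survive this test, the check only shows that the conversion is reversible on that machine; it is not a reason to keep binary data in a string.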
  3. LSP simply does not work well yet in any sort of complex scenario. I wrote to EMB support about certain things that do not work in LSP mode but do work in Classic mode, and they wrote:

    Although several issues related to the LSP server were included in the Sydney 10.4.2 update, several additional issues remain but are slated to be addressed in a future release or update.

    Examples of open defects affecting LSP:

    - New LSP does not recognize newly build classes. https://quality.embarcadero.com/browse/RSP-31922
    - LSP not recognising subtype alias https://quality.embarcadero.com/browse/RSP-33546
    - LSP doesnt show errors https://quality.embarcadero.com/browse/RSP-33060
    - The Code Insight using LSP work only for small "hello world" applications. https://quality.embarcadero.com/browse/RSP-33403

    Here are a few details which may affect Code Insight/Find Declaration:

    1) It only works effectively on the first project in a project group, so if you have a project group with several projects, it is best to open just one project at a time.
    2) When you see [Calculating...], it means that the LSP server is working and indexing; it's best to click elsewhere and then try again.
    3) If you see an error 'cannot find file XXXXX' even though you are in the file in the IDE, it means the file cannot be found in the index database, so the LSP server still needs time to index it.
  4. Remy Lebeau

    Binary data in String?

    Most likely, that code predated the shift to Unicode in Delphi 2009.

    No, it doesn't. It has the potential to corrupt the data. This is exactly why you SHOULD NOT put binary data into a UnicodeString.

    Doesn't work. It fills only 1/2 of the UnicodeString's memory with the non-textual binary (because SizeOf(WideChar) is 2, so the SetLength() is allocating twice the number of bytes as were read in), then converts the entire UnicodeString (including the unused bytes) from UTF-16 to ANSI producing complete garbage, and then writes that garbage as-is to file. So yes, the same number of bytes MAY be written as were read in (but that is not guaranteed), but those bytes are useless.

    That code is copying the binary as-is into an AnsiString of equal byte size, converting that AnsiString to a UTF-16 UnicodeString using the user's default locale, then converting that UnicodeString from UTF-16 back to ANSI using the same locale. Depending on the particular locale used, that MAY be a lossy conversion: you MIGHT end up with the same bytes that you started with, or you MIGHT NOT.

    This has nothing to do with pointers. You are simply performing 2 separate data conversions (binary/ANSI -> UTF-16 -> binary/ANSI) that just HAPPEN to produce the same results as the input IN YOUR ENVIRONMENT.
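For contrast, a minimal sketch of moving binary data without any string type involved, using TMemoryStream as recommended elsewhere in this thread. The program and procedure names are made up for illustration, and the file names come from the command line.

```pascal
program CopyBinaryDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

// Hypothetical helper: copies a file byte for byte. The data is never
// interpreted as text, so no ANSI/UTF-16 conversion can alter it.
procedure CopyBinaryFile(const SourceName, TargetName: string);
var
  Buffer: TMemoryStream;
begin
  Buffer := TMemoryStream.Create;
  try
    Buffer.LoadFromFile(SourceName);
    Buffer.SaveToFile(TargetName);
  finally
    Buffer.Free;
  end;
end;

begin
  if ParamCount = 2 then
    CopyBinaryFile(ParamStr(1), ParamStr(2))
  else
    Writeln('Usage: CopyBinaryDemo <source> <target>');
end.
```

If the bytes are needed in memory as an array rather than a stream, a TBytes value serves the same purpose.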
  5. Anders Melander

    Micro optimization: Math.InRange

    ... can cause overflows. Consider the extremes.
  6. I love these topics.

    Under 64 bits:

    Math.InRange: 859
    IsInRange: 858
    If: 860
    IsInRangeEx: 858
    IsInRangeEx2: 858

    Under 32 bits:

    Math.InRange: 1288
    IsInRange: 906
    If: 906
    IsInRangeEx: 858
    IsInRangeEx2: 858

    The code for IsInRangeEx and IsInRangeEx2:

    function IsInRangeEx(const AValue, AMin, AMax: Cardinal): Boolean; inline;
    begin
      Result := (AValue - AMin) <= (AMax - AMin);
    end;

    function IsInRangeEx2(const AValue, AMin, AMax: Integer): Boolean; inline;
    begin
      Result := ((AValue - AMax) * (AValue - AMin) <= 0);
    end;
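The overflow concern raised in this thread can be made concrete with a small sketch. It assumes overflow checking is switched off, so the 32-bit product wraps instead of raising an exception. IsInRangePlain is a name made up here for the ordinary two-comparison form. With AValue = 50000, AMin = 0 and AMax = 1, the product 49999 * 50000 = 2,499,950,000 exceeds High(Integer) = 2,147,483,647, so the sign of the stored result can no longer be trusted.

```pascal
program InRangeOverflowDemo;

{$APPTYPE CONSOLE}
{$Q-} // assumption: overflow checking off, so the product wraps silently

uses
  System.SysUtils;

// Product form as posted in the thread (inline directive dropped for the demo).
function IsInRangeEx2(const AValue, AMin, AMax: Integer): Boolean;
begin
  Result := ((AValue - AMax) * (AValue - AMin) <= 0);
end;

// Hypothetical reference implementation: two plain comparisons, no arithmetic.
function IsInRangePlain(const AValue, AMin, AMax: Integer): Boolean;
begin
  Result := (AValue >= AMin) and (AValue <= AMax);
end;

var
  Value, MinValue, MaxValue: Integer;
begin
  // Variables rather than literals, so the compiler cannot fold the expression.
  Value := 50000;
  MinValue := 0;
  MaxValue := 1;

  Writeln('Product form: ', IsInRangeEx2(Value, MinValue, MaxValue));
  Writeln('Plain form:   ', IsInRangePlain(Value, MinValue, MaxValue));
end.
```

With overflow checking enabled ({$Q+}), the same call is expected to raise an overflow exception instead of returning a wrong answer. The subtractions alone can also overflow at the extremes, for example when AMin is Low(Integer).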
  7. Rollo62

    Overloocked Format( ) options

    That's true. This is why I work with predefined constants for the most common settings, to avoid typos.
  8. dummzeuch

    Overloocked Format( ) options

    The second syntax is meant to be used with variables rather than constants.
  9. Stefan Glienke

    Micro optimization: Math.InRange

    While your assessment of Math.InRange is true (it is coded in a bad way, plus the compiler produces way too many conditional jumps, see https://quality.embarcadero.com/browse/RSP-21955), you certainly need to read some material on how to properly microbenchmark and how to read assembly.

    First of all, even though the Delphi compiler is pretty terrible at optimizing away dead code, it might omit the if statement if there is nothing to do after it.

    Second, be careful if one of your loops spans multiple cache lines while the others don't: this usually affects the outcome only slightly, but in a case like this it can affect the result in a noticeable way.

    Third, with a static test like this you prove nothing, because the branch predictor will do its job. If you want to benchmark the raw performance of one vs the other, you need to give it random data that does not follow the "not in range for a while, in range for a while, not in range until the end" pattern.
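A rough sketch of a benchmark that follows two of these points: the input is random, so the branch predictor cannot learn a pattern, and each loop counts its hits and prints them, so the comparison cannot be dropped as dead code. The array size, value range, bounds and seed are arbitrary choices made for illustration. Code alignment across cache lines is not controlled here, so small differences between the two timings should not be over-interpreted.

```pascal
program InRangeBench;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Math, System.Diagnostics;

const
  Count = 10000000; // arbitrary sample size

var
  Data: TArray<Integer>;
  I, Hits: Integer;
  Watch: TStopwatch;
begin
  // Fixed seed so that repeated runs see the same pseudo-random input.
  RandSeed := 12345;
  SetLength(Data, Count);
  for I := 0 to Count - 1 do
    Data[I] := Random(2000);

  // Variant 1: Math.InRange
  Hits := 0;
  Watch := TStopwatch.StartNew;
  for I := 0 to Count - 1 do
    if InRange(Data[I], 500, 1500) then
      Inc(Hits);
  Watch.Stop;
  Writeln('Math.InRange: ', Watch.ElapsedMilliseconds, ' ms, hits = ', Hits);

  // Variant 2: plain comparisons
  Hits := 0;
  Watch := TStopwatch.StartNew;
  for I := 0 to Count - 1 do
    if (Data[I] >= 500) and (Data[I] <= 1500) then
      Inc(Hits);
  Watch.Stop;
  Writeln('Plain if:     ', Watch.ElapsedMilliseconds, ' ms, hits = ', Hits);
end.
```

Both loops should report the same hit count; if they do not, the variants are not computing the same thing and the timings are meaningless.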
  10. Uwe Raabe

    Overloocked Format( ) options

    The documentation found in System.SysUtils.pas is pretty comprehensive. A thorough read is recommended.

    { The Format routine formats the argument list given by the Args parameter
      using the format string given by the Format parameter.

      Format strings contain two types of objects--plain characters and format
      specifiers. Plain characters are copied verbatim to the resulting string.
      Format specifiers fetch arguments from the argument list and apply
      formatting to them.

      Format specifiers have the following form:

        "%" [index ":"] ["-"] [width] ["." prec] type

      A format specifier begins with a % character. After the % come the
      following, in this order:

      -  an optional argument index specifier, [index ":"]
      -  an optional left-justification indicator, ["-"]
      -  an optional width specifier, [width]
      -  an optional precision specifier, ["." prec]
      -  the conversion type character, type

      The following conversion characters are supported:

      d  Decimal. The argument must be an integer value. The value is converted
         to a string of decimal digits. If the format string contains a precision
         specifier, it indicates that the resulting string must contain at least
         the specified number of digits; if the value has less digits, the
         resulting string is left-padded with zeros.

      u  Unsigned decimal. Similar to 'd' but no sign is output.

      e  Scientific. The argument must be a floating-point value. The value is
         converted to a string of the form "-d.ddd...E+ddd". The resulting
         string starts with a minus sign if the number is negative, and one digit
         always precedes the decimal point. The total number of digits in the
         resulting string (including the one before the decimal point) is given
         by the precision specifier in the format string--a default precision of
         15 is assumed if no precision specifier is present. The "E" exponent
         character in the resulting string is always followed by a plus or minus
         sign and at least three digits.

      f  Fixed. The argument must be a floating-point value. The value is
         converted to a string of the form "-ddd.ddd...". The resulting string
         starts with a minus sign if the number is negative. The number of digits
         after the decimal point is given by the precision specifier in the
         format string--a default of 2 decimal digits is assumed if no precision
         specifier is present.

      g  General. The argument must be a floating-point value. The value is
         converted to the shortest possible decimal string using fixed or
         scientific format. The number of significant digits in the resulting
         string is given by the precision specifier in the format string--a
         default precision of 15 is assumed if no precision specifier is present.
         Trailing zeros are removed from the resulting string, and a decimal
         point appears only if necessary. The resulting string uses fixed point
         format if the number of digits to the left of the decimal point in the
         value is less than or equal to the specified precision, and if the
         value is greater than or equal to 0.00001. Otherwise the resulting
         string uses scientific format.

      n  Number. The argument must be a floating-point value. The value is
         converted to a string of the form "-d,ddd,ddd.ddd...". The "n" format
         corresponds to the "f" format, except that the resulting string
         contains thousand separators.

      m  Money. The argument must be a floating-point value. The value is
         converted to a string that represents a currency amount. The conversion
         is controlled by the CurrencyString, CurrencyFormat, NegCurrFormat,
         ThousandSeparator, DecimalSeparator, and CurrencyDecimals global
         variables, all of which are initialized from locale settings provided
         by the operating system. For example, Currency Format preferences can be
         set in the International section of the Windows Control Panel. If the
         format string contains a precision specifier, it overrides the value
         given by the CurrencyDecimals global variable.

      p  Pointer. The argument must be a pointer value. The value is converted
         to a string of the form "XXXX:YYYY" where XXXX and YYYY are the
         segment and offset parts of the pointer expressed as four hexadecimal
         digits.

      s  String. The argument must be a character, a string, or a PChar value.
         The string or character is inserted in place of the format specifier.
         The precision specifier, if present in the format string, specifies the
         maximum length of the resulting string. If the argument is a string
         that is longer than this maximum, the string is truncated.

      x  Hexadecimal. The argument must be an integer value. The value is
         converted to a string of hexadecimal digits. If the format string
         contains a precision specifier, it indicates that the resulting string
         must contain at least the specified number of digits; if the value has
         less digits, the resulting string is left-padded with zeros.

      Conversion characters may be specified in upper case as well as in lower
      case--both produce the same results.

      For all floating-point formats, the actual characters used as decimal and
      thousand separators are obtained from the DecimalSeparator and
      ThousandSeparator global variables.

      Index, width, and precision specifiers can be specified directly using
      decimal digit string (for example "%10d"), or indirectly using an asterisk
      character (for example "%*.*f"). When using an asterisk, the next argument
      in the argument list (which must be an integer value) becomes the value
      that is actually used. For example "Format('%*.*f', [8, 2, 123.456])" is
      the same as "Format('%8.2f', [123.456])".

      A width specifier sets the minimum field width for a conversion. If the
      resulting string is shorter than the minimum field width, it is padded
      with blanks to increase the field width. The default is to right-justify
      the result by adding blanks in front of the value, but if the format
      specifier contains a left-justification indicator (a "-" character
      preceding the width specifier), the result is left-justified by adding
      blanks after the value.

      An index specifier sets the current argument list index to the specified
      value. The index of the first argument in the argument list is 0. Using
      index specifiers, it is possible to format the same argument multiple
      times. For example "Format('%d %d %0:d %d', [10, 20])" produces the string
      '10 20 10 20'.

      The Format function can be combined with other formatting functions. For
      example

        S := Format('Your total was %s on %s', [
          FormatFloat('$#,##0.00;;zero', Total),
          FormatDateTime('mm/dd/yy', Date)]);

      which uses the FormatFloat and FormatDateTime functions to customize the
      format beyond what is possible with Format.

      Each of the string formatting routines that uses global variables for
      formatting (separators, decimals, date/time formats etc.), has an
      overloaded equivalent requiring a parameter of type TFormatSettings. This
      additional parameter provides the formatting information rather than the
      global variables. For more information see the notes at TFormatSettings. }
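A few of the less commonly used specifiers in action, as a small sketch. The values and the choice of `TFormatSettings.Invariant` for the floating-point lines are illustrative assumptions; the invariant settings are used only so that the decimal separator does not depend on the machine's locale.

```pascal
program FormatDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

var
  Settings: TFormatSettings;
begin
  // Width, left-justification and precision on integers.
  Writeln(Format('[%5d]', [42]));   // [   42]
  Writeln(Format('[%-5d]', [42]));  // [42   ]
  Writeln(Format('[%.4d]', [42]));  // [0042]

  // An index specifier resets the argument position, so arguments can repeat.
  Writeln(Format('%d %d %0:d %d', [10, 20]));  // 10 20 10 20

  // Width and precision taken from the argument list via asterisks.
  // Per the documentation, both lines produce the same text.
  Settings := TFormatSettings.Invariant;
  Writeln(Format('[%*.*f]', [8, 2, 123.456], Settings));
  Writeln(Format('[%8.2f]', [123.456], Settings));
end.
```

The asterisk form is the one suited to widths and precisions held in variables, since nothing has to be spliced into the format string at run time.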
  11. Vandrovnik

    Micro optimization: Math.InRange

    I used "inc(x);" for 64bit (global variable). You can have nop in a procedure:

    procedure Nop; assembler;
    asm
      nop
    end;

    But as far as I know, this procedure cannot be inlined, so there will be a call.
  12. David Heffernan

    TArray<T> helper

    It's a helper for TArray, not for TArray<T>, by necessity. I think spring4d has something similar.
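To illustrate the distinction, here is a minimal sketch of a class helper attached to the non-generic TArray class from System.Generics.Collections, with the type parameter on the method instead of on the helper. The helper name TArrayCountHelper and the method CountOf are hypothetical and are not taken from the thread or from spring4d.

```pascal
program ArrayHelperDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Generics.Collections, System.Generics.Defaults;

type
  // Hypothetical helper: it extends the class TArray, and each method
  // carries its own type parameter.
  TArrayCountHelper = class helper for TArray
  public
    class function CountOf<T>(const Values: array of T; const Item: T): Integer; static;
  end;

class function TArrayCountHelper.CountOf<T>(const Values: array of T;
  const Item: T): Integer;
var
  Comparer: IEqualityComparer<T>;
  I: Integer;
begin
  // Default equality comparer for T; counts the elements equal to Item.
  Comparer := TEqualityComparer<T>.Default;
  Result := 0;
  for I := Low(Values) to High(Values) do
    if Comparer.Equals(Values[I], Item) then
      Inc(Result);
end;

begin
  // Called through the class, not through an array value.
  Writeln(TArray.CountOf<Integer>([1, 2, 2, 3], 2));
end.
```

The consequence is visible at the call site: the method is invoked as TArray.CountOf<Integer>(...) rather than on an array variable.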