David Heffernan

October 2, 2020

1 hour ago, Rollo62 said:

Maybe I should say that it contains maybe 95% string and 5% binary data, as it may come from various sources.

It kinda makes no sense then that you also say that the data is ASCII (0..127). I'm very confused.

October 2, 2020

55 minutes ago, Dany Marmur said:

Visual Code + a select extension can provide a very good UI to both Git and SVN.

Just so long as VS Code doesn't auto uninstall every fortnight like it does with me......

October 2, 2020

7 minutes ago, Rollo62 said:

I tend to see the right candidate would be AnsiString, as it supports codepages for maybe future use,
but the support in Delphi of AnsiString I also have in question.

Shall it stay, or shall it go ? ( according to a well known song )

At the start of this thread you said that the data was binary. Now you say it is ASCII. Hard to give advice on this basis.

October 2, 2020

Would be perverse to use 16 bit Char to store 8 bit data. In terms of performance byte strings and byte arrays are similar, but if anything byte arrays will be faster, precisely because they don't have copy on write. No idea why you think strings perform better.

My guess is that your antipathy to byte arrays is a hangover from the legacy Delphi anti-pattern in which byte arrays are handled as strings.
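
To illustrate the copy on write difference, here is a minimal sketch (the variable names are made up for the example, and it assumes a Delphi version where TBytes lives in System.SysUtils). Assigning a dynamic array only shares a reference, so a later write goes straight to the shared data. Assigning a string also shares the data, but the first write has to make the string unique.

```pascal
program CopyOnWriteDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

var
  A, B: TBytes;
  S, T: AnsiString;
begin
  A := TBytes.Create(1, 2, 3);
  B := A;         // B refers to the same array, nothing is copied
  B[0] := 99;     // no copy on write: the change is visible through A too
  Writeln(A[0]);  // 99

  S := 'abc';
  T := S;         // T shares the string data with S for now
  T[1] := 'x';    // the write makes T unique first, so S is untouched
  Writeln(S);     // abc
end.
```

The flip side is that if you want an independent copy of a byte array you have to ask for one explicitly, for example with Copy(A).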

October 1, 2020

Flashing can be quite visually aggressive. It is often preferable to highlight a control in some other way.

October 1, 2020

In the long run it will be less efficient to continue using these bridges.

September 29, 2020

3 hours ago, Steve Maughan said:

Thanks — I wasn't aware of System.ZIP2. I'll take a look

I can't understand why you would. Aren't you likely just to end up changing your code for no reason, given that the defect is almost certainly not in your compression library?

September 28, 2020

1 minute ago, aehimself said:

Ummm... I did not? I just offered a free to use alternative.

It's pretty bad advice. Changing algorithm and implementation without any justification or rationale. Seems like you are advocating trying libraries at random. If every time you encounter an issue you replace the library, after a while you'll have run out of libraries.

September 28, 2020

2 hours ago, Steve Maughan said:

I thought these were just a wrapper around standard ZLib routines

They are.

2 hours ago, Steve Maughan said:

It could have been corrupted after being saved.

That's a very plausible explanation. File corruption is something that does happen.

You'll want to reproduce the issue before trying to solve the problem. And if it is file corruption then the solution is somebody else's problem.

September 28, 2020

Just now, aehimself said:

If you don't need anything fancy, you can use System.ZIP (or the updated System.ZIP2, which is a drop-in replacement offering some extras). I'm using Zip2 with smaller modifications, works like a charm.

How did you diagnose that the defect was in ZCompressStream or ZDecompressStream?

September 28, 2020

As usual the simplest explanation was the answer. If a string variable is empty then the obvious explanation is that you either assigned it to be empty, or didn't assign it at all.

September 28, 2020

Kind of odd that you wouldn't just use a byte array, TBytes.

September 28, 2020

Make a minimal reproduction.

FastMM reports leaks only in the Delphi heap. madExcept reports those leaks and also leaks in many other system resources.

September 28, 2020

Websearch took me here

https://delphi.fandom.com/wiki/Delphi_Release_Dates

September 27, 2020

Imagine if you have users with names that don't begin with one of the 26 letters used in the English language?

What you should do is abandon this UI approach and let the user type.

September 27, 2020

http://docwiki.embarcadero.com/RADStudio/Rio/en/Breakpoint_List_Window

September 26, 2020

13 minutes ago, timfrost said:

With the normal options, any invalid UTF8 sequences should be returned as U+FFFD in the returned Unicode string, and you can walk the result and drop them.

Asker seems to want to remove certain valid UTF8 sequences..... So this won't help.

Nobody can help with no clear spec.

September 26, 2020

1 hour ago, aehimself said:

I didn't know that there is an actual expression for this. Sounds familiar though, I guess I did it lots of times as well.

We've all done it. It never works out.

September 26, 2020

There's a lot of noise in here. It seems you don't really understand where these characters are coming from and are in trial and error programming mode.

The advice from the wise heads here is to understand what is going on, and then work out how to tackle it.

You don't seem to want to heed that advice. That's fine, it's your choice. But we don't need a blow by blow account of your trial and error coding. That's only meaningful to you.

September 24, 2020

REMOVED, sorry, was dupe

September 24, 2020

1 hour ago, Mike Torrettinni said:

I was referring to variable iterating For..in loop - in this case full array record content is copied into variable. No? At least that's what I see from the simple example above.

Yes that is correct. My argument stands.

Copying the full record into a local is only expensive (compared to reading a value from the record in-situ in the array) if the record is large. For a small record, copying a handful of bytes costs no more than reading even a single byte.

I'm arguing against your claim that there would be a performance hit using a for in loop even for small records. For large records there will be a hit. Not for small records.

September 24, 2020

40 minutes ago, Mike Torrettinni said:

Iterating a few large records (390MB) or lots of small records (10000x 0.039MB), could have similar performance effects.

No. Because when you iterate over an array, you typically don't read all of the content of each item.

Imagine that all you do is look inside each record for an integer ID. If the record is huge, then you can just read a single integer, and move on, if using a classic for loop with an array of record. But if you use a for in loop you have to copy the entire record before reading the single integer ID. That's wasteful.

In the case of a smaller record, let's say small enough to fit into a cache line, then reading an integer from the record has essentially the same cost as copying the entire record and then picking out the integer.

So the trade off depends hugely on the size of the record, and what proportion of it you actually need to access.
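
A minimal sketch of the two loop styles, for illustration only. TLargeRec, its Payload field and the function names are invented for this example, and no timings are claimed. Both functions return the same sum; the difference is that the for in version copies all of each record into Item before reading the ID.

```pascal
program ForInCopyDemo;

{$APPTYPE CONSOLE}

type
  // Hypothetical record: one small field we need, plus a large payload we don't
  TLargeRec = record
    ID: Integer;
    Payload: array[0..4095] of Byte;
  end;
  TLargeRecArray = array of TLargeRec;

// Classic for loop: reads just the ID, in place in the array
function SumIDsIndexed(const Items: TLargeRecArray): Int64;
var
  I: Integer;
begin
  Result := 0;
  for I := 0 to High(Items) do
    Result := Result + Items[I].ID;
end;

// for in loop: every iteration copies the whole record into Item first
function SumIDsForIn(const Items: TLargeRecArray): Int64;
var
  Item: TLargeRec;
begin
  Result := 0;
  for Item in Items do
    Result := Result + Item.ID;
end;

var
  Data: TLargeRecArray;
  I: Integer;
begin
  SetLength(Data, 1000);
  for I := 0 to High(Data) do
    Data[I].ID := I;
  Writeln(SumIDsIndexed(Data)); // 499500
  Writeln(SumIDsForIn(Data));   // 499500
end.
```

Shrink Payload to a few bytes and the copy becomes negligible, which is the small record case.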

September 24, 2020

25 minutes ago, borni69 said:

I agree with you...

And after this back and forward discussion in this thread I think I have a solution.

I will only send out characters and commands that are handled by TjsonString.create()

So I think my code will be something like this

PS: there is 0b 0b before E-K in rawText

  rawText := 'E-K ble æøå  Test // 98';
  ajson := TJSONObject.Create;
  try
    ajson.AddPair('text', TJSONString.Create(rawText));
    RawtextOut := ajson.ToString;
  finally
    ajson.Free;
  end;

  textOut := '';
  for ch in RawtextOut do
  begin
    if (ch >= #32) then
      textOut := textOut + ch
  end;

  memo1.Lines.Text := textOut;

result

{"text":"E-K ble æøå Test \/\/ 98"} characters I dont want are removed...

also line break tabs will be handled correct .. \r\n

Thanks for all your help..

B

Better hope that there are no line breaks .....

September 24, 2020

14 minutes ago, Mike Torrettinni said:

or smaller records but you have a large number of them, so iteration copies smaller memory but a lot of times

Why would there be an issue for small records? Presumably you are iterating over the array because you want to look at the content of the record. For a small enough record, there won't be any difference in perf between reading a field and copying the entire record.

September 24, 2020

50 minutes ago, borni69 said:

Sometimes they copy text from Word / email etc, and then we get characters we don't want.. I am not sure what they are, we like to keep line breaks, tabs etc, but not this character showing as a ? or a as seen in above image.

Your problem is not how to remove characters, it is to work out what characters are to be removed. As is so often the case, the hardest part of any programming task is determining the correct specification.
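
Once a spec does exist, the removal itself is short. A sketch under one assumed spec, which is an invention for this example and not something the asker has confirmed: drop control characters below #32 and #127, but keep tab, line feed and carriage return. The function name is hypothetical too.

```pascal
program StripControlsDemo;

{$APPTYPE CONSOLE}

// Assumed spec (hypothetical): remove control characters below #32 and #127,
// except tab (#9), line feed (#10) and carriage return (#13).
function StripUnwantedControls(const S: string): string;
var
  Ch: Char;
  Keep: Boolean;
begin
  Result := '';
  for Ch in S do
  begin
    Keep := ((Ch >= #32) and (Ch <> #127)) or (Ch = #9) or (Ch = #10) or (Ch = #13);
    if Keep then
      Result := Result + Ch;
  end;
end;

begin
  // Two vertical tabs (#$0B) are dropped, the line break is kept
  Writeln(Length(StripUnwantedControls(#$0B#$0B'E-K'#13#10'Test'))); // 9
end.
```

This filters the raw text before it goes anywhere near JSON encoding, so the line breaks are judged as the characters they are and not as escape sequences. A different spec means a different Keep condition, which is the whole point.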

Posts posted by David Heffernan

Best type for data buffer: TBytes, RawByteString, String, AnsiString, ...

Contributing to projects on GitHub with Subversion

Best type for data buffer: TBytes, RawByteString, String, AnsiString, ...

Best type for data buffer: TBytes, RawByteString, String, AnsiString, ...

Tbutton Flashing

Contributing to projects on GitHub with Subversion

Any Known Issues with ZCompressStream?

Any Known Issues with ZCompressStream?

Any Known Issues with ZCompressStream?

Any Known Issues with ZCompressStream?

Local string variable value is not assigned for 2nd and following calls

Workaround for binary data in strings ...

Local string variable value is not assigned for 2nd and following calls

Delphi Version Numbers

RadioGroup layout

Removing breakpoints

Remove non-utf8 characters from a utf8 string

Remove non-utf8 characters from a utf8 string

Remove non-utf8 characters from a utf8 string

Is variable value kept after For.. in ... do loop?

Is variable value kept after For.. in ... do loop?

Is variable value kept after For.. in ... do loop?

Remove non-utf8 characters from a utf8 string

Is variable value kept after For.. in ... do loop?

Remove non-utf8 characters from a utf8 string
