bazzer747

Replacing Apostrophe


Hi,

I'm reading in a large file, which I'm scraping from a website into a text file, and am having a problem with surnames like O'Donnell and O'Brian. Names like these show up in the input text as 'O&#39;Donnell', with the characters &#39; in place of the apostrophe. These surnames need to match an existing table of usernames but won't unless I replace &#39; with an apostrophe.

 

I'm trying this code to do this:

if AnsiContainsText( cLast, '&#39;' )   // cLast holds surname 'O&#39;Donnell'

  then AnsiReplaceStr( cLast, '&#39;', '' );

 

So I'm replacing &#39; with an apostrophe. This isn't working. When I debug, the first line recognises the characters &#39; but the second line replaces nothing.

 

Any thoughts on why this isn't working (or a better way to do this) would be appreciated.

 

AnsiReplaceStr( cLast, '&#39;', '' );

Is that an empty string or is it the forum software that plays tricks on us?
It should look like

AnsiReplaceStr( cLast, '&#39;', '''' );

 


Also everyone failed to notice a small detail:
 

function AnsiReplaceStr(const AText, AFromText, AToText: string): string;

It's a function not a procedure. You can try:

cLast := AnsiReplaceStr( cLast, '&#39;', '''' );
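For anyone who wants to try this in isolation, here is a minimal console sketch. It assumes the scraped text carries the entity &#39; and that the System.SysUtils and System.StrUtils units are available; the program name is made up for illustration.

```pascal
program ReplaceDemo; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.StrUtils;

var
  cLast: string;
begin
  cLast := 'O&#39;Donnell';

  // AnsiReplaceStr takes its arguments as const and returns a new string,
  // so calling it without using the result leaves cLast unchanged.
  AnsiReplaceStr(cLast, '&#39;', '''');
  Writeln(cLast);

  // Assigning the result back is what actually changes cLast.
  // '''' is a string literal holding a single apostrophe.
  cLast := AnsiReplaceStr(cLast, '&#39;', '''');
  Writeln(cLast);
end.
```

The AnsiContainsText check in front of it is optional: when the search text is absent, AnsiReplaceStr simply returns the input unchanged.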

 

 


Also note that "&#39;" is bad style (magic number); "&apos;" is more correct, and webmasters could switch to the latter at some point.
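If the site might switch between the two spellings, one option is to handle them in a single place. The sketch below is only an illustration: FixApostrophes is a made-up helper name, and it also covers the hexadecimal form &#x27;, which is the same character (0x27 = 39).

```pascal
program FixAposDemo; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper (not from the thread): maps the common encodings
// of the apostrophe to a literal apostrophe character.
function FixApostrophes(const S: string): string;
begin
  // Decimal numeric reference
  Result := StringReplace(S, '&#39;', '''', [rfReplaceAll]);
  // Hexadecimal numeric reference; rfIgnoreCase also matches '&#X27;'
  Result := StringReplace(Result, '&#x27;', '''', [rfReplaceAll, rfIgnoreCase]);
  // Named entity
  Result := StringReplace(Result, '&apos;', '''', [rfReplaceAll]);
end;

begin
  Writeln(FixApostrophes('O&#39;Donnell'));
  Writeln(FixApostrophes('O&apos;Brian'));
end.
```

Keeping the entity strings inside one function means the calling code does not need to change if the page markup does.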


Thank you all. Yes, I missed that it was a function, so this code:

 

        if AnsiContainsText( cLast, '&#39;' )   then
            cLast:= AnsiReplaceStr( cLast, '&#39;', chr(39) );

 

Now works as expected and returns O'Donovan which is what matches the name in the existing table. (and of course, will manage other similar names).

Lars - that was just an empty string there, I just wanted to remove the characters at that stage.


There also is a built-in function to decode HTML text:

uses
  System.NetEncoding;
...
  var S := TNetEncoding.HTML.Decode(sHtml);

 

Edited by Uwe Raabe

11 hours ago, Uwe Raabe said:

There also is a built-in function to decode HTML text:


uses
  System.NetEncoding;
...
  var S := TNetEncoding.HTML.Decode(sHtml);

 

However, it only supports decoding numeric entities and references to reserved characters. Since apos is not a reserved character, it will not decode '&apos;'. The documentation even says so:

 

https://docwiki.embarcadero.com/Libraries/en/System.NetEncoding.THTMLEncoding

Quote

THTMLEncoding only encodes reserved HTML characters: "&<>. THTMLEncoding supports decoding any HTML numeric character reference, such as &#169; or &#254;, as well as the character entity references of reserved HTML characters: &quot;, &amp;, &lt;, &gt;.

 

Warning: Decoding character entity references of non-reserved characters, such as &apos; or &copy;, is not supported. The input data must not contain any other character entity references. Otherwise, the output data may be corrupted.
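Given that limitation, one possible workaround is to rewrite &apos; as the numeric reference &#39; before calling Decode, since numeric references are documented as supported. This is only a sketch under that assumption: DecodeHtmlWithApos is a made-up name, behaviour may differ between Delphi versions, and any other unsupported named entities in the input would need the same treatment.

```pascal
program DecodeDemo; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.NetEncoding;

// Hypothetical helper (not from the thread): turns the unsupported named
// entity into a numeric reference first, then lets TNetEncoding decode it.
function DecodeHtmlWithApos(const sHtml: string): string;
begin
  Result := TNetEncoding.HTML.Decode(
    StringReplace(sHtml, '&apos;', '&#39;', [rfReplaceAll]));
end;

begin
  Writeln(DecodeHtmlWithApos('O&apos;Donnell &amp; Sons'));
end.
```

One caveat: text that was double-encoded as &amp;apos; is also caught by the replace and ends up as the literal text &#39; rather than &apos;, so this suits input that is encoded exactly once.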

 

Edited by Remy Lebeau

2 minutes ago, Lars Fosdal said:

Nope

Nope on your nope 😉 That was more or less relevant when the question was asked (12 years ago), but now, according to your estimations, it's 3x legacy. IE8 in 22 is hardly something to consider.

27 minutes ago, Fr0sT.Brutal said:

Nope on your nope 😉 That was more or less relevant when the question was asked (12 years ago), but now, according to your estimations, it's 3x legacy. IE8 in 22 is hardly something to consider.

So, TNetEncoding.HTML.Decode needs to be updated to support HTML5...

Edit: Looks like a significant expansion of named entities.

HTML5: https://www.w3.org/TR/2011/WD-html5-20110525/named-character-references.html
HTML4: https://www.w3.org/TR/html4/sgml/entities.html

 

 
