karl Jonson

Hex2Binary

2 hours ago, karl Jonson said:

Hi,

What's the best method to convert hexadecimal (e.g. F0) to binary (e.g. 11110000)?

Thank you.

This is what I use:

 

// aHex is expected hex string of chars: 0..9, A..F
function Hex2Bin(const aHex: string): string;
const
  // Array of [hex, binary] pairs
  cBinArray: Array[0..15, 0..1] of string =
    (('0', '0000'), ('1', '0001'), ('2', '0010'), ('3', '0011'), ('4', '0100'), ('5', '0101'), ('6', '0110'), ('7', '0111'),
     ('8', '1000'), ('9', '1001'), ('A', '1010'), ('B', '1011'), ('C', '1100'), ('D', '1101'), ('E', '1110'), ('F', '1111'));
var
  i: integer;
  x: string;
begin
  Result:='';

  // Iterate hex string
  for x in aHex do
    // For each hex char find binary result in cBinArray
    for i := Low(cBinArray) to High(cBinArray) do
      if cBinArray[i, 0] = x then
      begin
        // Concatenate binary results
        Result := Result + cBinArray[i, 1];
        Break;
      end;
end;

Note: it expects a valid hex string as input (chars 0..9 and A..F). If you need to validate the input, or upper-case it first (a..f -> A..F), add the necessary checks.
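One way to do that check up front is a small normalizing helper. This is only a sketch: NormalizeHex is a made-up name, not something from the post, and raising EConvertError on bad input is just one possible policy.

```pascal
program NormalizeHexDemo;
{$APPTYPE CONSOLE}
uses
  System.SysUtils;

// Hypothetical helper (not from the original post): upper-cases the input
// and raises EConvertError on the first character outside 0..9, A..F.
function NormalizeHex(const aHex: string): string;
var
  c: Char;
begin
  Result := UpperCase(aHex);
  for c in Result do
    if not CharInSet(c, ['0'..'9', 'A'..'F']) then
      raise EConvertError.CreateFmt('Invalid hex digit ''%s'' in ''%s''', [c, aHex]);
end;

begin
  // Valid input comes back upper-cased.
  Writeln(NormalizeHex('f0'));
  // Invalid input raises, so the caller decides what to do with it.
  try
    NormalizeHex('zz');
  except
    on E: EConvertError do
      Writeln(E.Message);
  end;
end.
```

The result can then be passed to Hex2Bin, which keeps the conversion loop itself free of checks.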

Guest

NOTE: RAD Studio 10.3.3 already has a HexToBin function in the "System.Classes.pas" unit:

  • function HexToBin(Text: PWideChar; Buffer: PAnsiChar; BufSize: Integer): Integer; overload;
  • function HexToBin(Text: PAnsiChar; Buffer: PAnsiChar; BufSize: Integer): Integer; overload;
  • function HexToBin(Text: PWideChar; var Buffer; BufSize: Integer): Integer; overload; inline;
  • function HexToBin(Text: PAnsiChar; var Buffer; BufSize: Integer): Integer; overload; inline;
  • function HexToBin(Text: PWideChar; Buffer: Pointer; BufSize: Integer): Integer; overload; inline;
  • function HexToBin(Text: PAnsiChar; Buffer: Pointer; BufSize: Integer): Integer; overload; inline;
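Worth noting that these overloads turn hex text into raw bytes in a buffer, not into a string of '0'/'1' characters. A minimal sketch of calling the "var Buffer" overload from the list above, assuming a Unicode Delphi where PChar is PWideChar (the variable names are made up):

```pascal
program RtlHexToBinDemo;
{$APPTYPE CONSOLE}
uses
  System.SysUtils, System.Classes;

procedure Demo;
var
  Hex: string;
  Buf: TBytes;
  Converted: Integer;
begin
  Hex := 'F0A1';
  // Two hex digits make one byte, so the buffer is half the text length.
  SetLength(Buf, Length(Hex) div 2);
  // Resolves to HexToBin(Text: PWideChar; var Buffer; BufSize: Integer).
  Converted := HexToBin(PChar(Hex), Buf[0], Length(Buf));
  Writeln('bytes converted: ', Converted);
  Writeln('first byte     : ', Buf[0]);
end;

begin
  Demo;
end.
```

So the RTL function answers a different question than the one asked here, which is why the hand-written converters in this thread are still needed.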

 

Maybe something like this, using Mike's concept:

 

function fncMyHexToBin(const lHexValue: string): string;
const
  lHexChars: array [0 .. 15] of char        = ('0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F');
  lBinValues: array [0 .. 15] of Ansistring = ('0000', '0001', '0010', '0011', '0100', '0101', '0110', '0111', '1000', '1001', '1010', '1011', '1100', '1101', '1110', '1111');
var
  lEachHexChar: char;
begin
  Result := '';
  //
  for lEachHexChar in lHexValue do
    try
      Result := Result + lBinValues[Pos(UpperCase(lEachHexChar), lHexChars) - 1];
    except // case the "char" is not found, we have a "AV"! then.... doesnt matter for us!
    end;
end;

procedure TForm1.Button1Click(Sender: TObject);
begin
  Memo1.Lines.Add('Hex2Binxxxxxx = ' + Hex2Bin('zFF0r0ABu11')); // chars that not allow to Hex values, will be = ''
  Memo1.Lines.Add('fncMyHexToBin = ' + fncMyHexToBin('zFF0r0ABu11'));
  //
  Memo1.Lines.Add('Hex2Binxxxxxx = ' + Hex2Bin('z'));
  Memo1.Lines.Add('fncMyHexToBin = ' + fncMyHexToBin('z'));
end;

 

[screenshot]

 

hug

3 hours ago, emailx45 said:

  for lEachHexChar in lHexValue do
    try
      Result := Result + lBinValues[Pos(UpperCase(lEachHexChar), lHexChars) - 1];
    except // case the "char" is not found, we have a "AV"! then.... doesnt matter for us!
    end;

Ugh. You can't rely on getting an AV.

 

Don't ever write code like this.


I'd probably write it something like this:

 

function HexToBin(const HexValue: string): string;
const
  BinaryValues: array [0..15] of string = (
    '0000', '0001', '0010', '0011',
    '0100', '0101', '0110', '0111',
    '1000', '1001', '1010', '1011',
    '1100', '1101', '1110', '1111'
  );
var
  HexDigit: Char;
  HexDigitValue: Integer;
  Ptr: PChar;
begin
  SetLength(Result, Length(HexValue) * 4);
  Ptr := Pointer(Result);
  for HexDigit in HexValue do
  begin
    case HexDigit of
    '0'..'9':
      HexDigitValue := Ord(HexDigit) - Ord('0');
    'a'..'f':
      HexDigitValue := 10 + Ord(HexDigit) - Ord('a');
    'A'..'F':
      HexDigitValue := 10 + Ord(HexDigit) - Ord('A');
    else
      raise EConvertError.CreateFmt('Invalid hex digit ''%s'' found in ''%s''', [HexDigit, HexValue]);
    end;
    Move(Pointer(BinaryValues[HexDigitValue])^, Ptr^, 4 * SizeOf(Char));
    Inc(Ptr, 4);
  end;
end;

 

Some notes:

 

  1. A case statement makes this quite readable in my view.
  2. You really don't want to be wasting time using Pos to search within a string. You can get the value directly with arithmetic.
  3. I prefer to perform just a single allocation, rather than use repeated allocations with concatenation.
  4. You might want to consider how to treat leading zeros. For instance how should you treat 0F, should that be 00001111 or 1111? I'd expect that both would be desirable in different situations, so an option in an extra argument to the function would be needed.
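For note 4, one possible shape is a post-processing step applied to the converter's output. This is only a sketch: StripLeadingZeros is a made-up name, and it assumes 1-based string indexing (desktop compilers).

```pascal
program LeadingZerosDemo;
{$APPTYPE CONSOLE}

// Hypothetical helper: removes leading '0' characters from a binary-digit
// string, but always keeps at least one digit of a non-empty input.
function StripLeadingZeros(const Bin: string): string;
var
  First: Integer;
begin
  First := 1;
  // Stop one short of the end so that '0000' becomes '0', not ''.
  while (First < Length(Bin)) and (Bin[First] = '0') do
    Inc(First);
  Result := Copy(Bin, First, MaxInt);
end;

begin
  Writeln(StripLeadingZeros('00001111'));
  Writeln(StripLeadingZeros('0000'));
end.
```

A Boolean parameter on the converter could call this at the end, so both the fixed-width and the trimmed form are available from one function.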
3 minutes ago, Alexander Elagin said:

Copies four characters from the BinaryValues constant array item at index HexDigitValue to the location pointed by Ptr.

Aha, pretty neat trick. Thanks!

Guest

You can also optimize David's Move by replacing it with this

Quote

PUInt64(Ptr)^ := PUInt64(BinaryValues[HexDigitValue])^; // move 4 chars in Unicode

 


@David Heffernan A few remarks about your code, if you don't mind:

1- It's pointless to use string when the values are fixed in size. Simply use a static array of X chars.

2- It's also pointless to calculate the index when you already use a case. Simply declare your arrays with a char range as the index. In your case, the compiler generates additional instructions to compute the index.


function HexToBin2(const HexValue: string): string;
type
  TChar4 = array [0 .. 3] of Char;
  PChar4 = ^TChar4;
const
  Table1: array ['0' .. '9'] of TChar4 = ('0000', '0001', '0010', '0011', '0100', '0101', '0110', '0111', '1000', '1001');
  Table2: array ['a' .. 'f'] of TChar4 = ('1010', '1011', '1100', '1101', '1110', '1111');
var
  HexDigit: Char;
  P: PChar4;
begin
  SetLength(Result, Length(HexValue) * 4);
  P := PChar4(Result);
  for HexDigit in HexValue do
  begin
    case HexDigit of
      '0' .. '9':
        P^ := Table1[HexDigit];
      'a' .. 'f':
        P^ := Table2[HexDigit];
      'A' .. 'F':
        P^ := Table2[Chr(Ord(HexDigit) xor $20)];
    else
      raise EConvertError.CreateFmt('Invalid hex digit ''%s'' found in ''%s''', [HexDigit, HexValue]);
    end;
    Inc(P);
  end;
end;

 

Guest
9 hours ago, David Heffernan said:

You might want to consider how to treat leading zeros. For instance how should you treat 0F, should that be 00001111 or 1111?

Where is the problem with "0F" or just "F" in my function or in Mike's?

 

[screenshot]

 

9 hours ago, David Heffernan said:

You really don't want to be wasting time using Pos to search within a string. You can get the value directly with arithmetic.

Can you measure the time lost?

 

Look at the size (in code) of Embarcadero's function in RAD 10.3.3 Arch! Is this readable?

function HexToBin(Text: PWideChar; Buffer: PAnsiChar; BufSize: Integer): Integer;
var
  I: Integer;
  b1, b2: Byte;
begin
  I := BufSize;
  while I > 0 do
  begin
    if (Ord(Text[0]) > 255) or (Ord(Text[1]) > 255) then
      Break;
    b1 := H2BConvert[Ord(Text[0])];
    b2 := H2BConvert[Ord(Text[1])];
    if (b1 = $FF) or (b2 = $FF) then
      Break;
    Buffer[0] := AnsiChar((b1 shl 4) + b2);
    Inc(Buffer);
    Inc(Text, 2);
    Dec(I);
  end;
  Result := BufSize - I;
end;

 

Guest
9 hours ago, David Heffernan said:

Don't ever write code like this.

Sorry! Don't ever answer like this!

The world is round, which is why the sun lights only one side at a time!

That lets others see how beautiful the moonlight is!

 

Exception handled:

function fncMyHexToBin(const lHexValue: string): string;
const
  lHexChars: array [0 .. 15] of char        = ('0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F');
  lBinValues: array [0 .. 15] of Ansistring = ('0000', '0001', '0010', '0011', '0100', '0101', '0110', '0111', '1000', '1001', '1010', '1011', '1100', '1101', '1110', '1111');
var
  lEachHexChar: char;
begin
  Result := '';
  //
  for lEachHexChar in lHexValue do
    try
      Result := Result + lBinValues[Pos(UpperCase(lEachHexChar), lHexChars) - 1];
    except // case the "char" is not found, we have a "AV"! then.... doesnt matter for us!
      // If Embarcadero use... I can too!
    end;
end;

procedure TForm1.Button1Click(Sender: TObject);
begin
  try
    Memo1.Lines.Add('Hex2Binxxxxxx = ' + Hex2Bin('zFF0r0ABu11')); // chars that not allow to Hex values, will be = ''
    Memo1.Lines.Add('fncMyHexToBin = ' + fncMyHexToBin('zFF0r0ABu11'));
    //
    Memo1.Lines.Add('Hex2Binxxxxxx = ' + Hex2Bin('0F'));
    Memo1.Lines.Add('fncMyHexToBin = ' + fncMyHexToBin('0F'));
    //
    Memo1.Lines.Add('Hex2Binxxxxxx = ' + Hex2Bin('F'));
    Memo1.Lines.Add('fncMyHexToBin = ' + fncMyHexToBin('F'));
  except
    on E: Exception do
      showMessage('Exception dont treated: ' + sLineBreak + E.ClassName + sLineBreak + E.Message)
  end;
end;

Exception not handled:

[screenshots]

 

hug


There is no guarantee that an out-of-bounds array access leads to an exception. You have just been unlucky that you've seen one every time you ran your code.

 

Once again, nobody should ever write code like that.

Guest

@emailx45 I am a fan of "do whatever you like", so here is a better version of yours, without try..except, and it is safe:

Quote

  lBinValues: array[0..16] of Ansistring = ('', '0000', '0001', '0010', '0011', '0100', '0101', '0110', '0111', '1000', '1001', '1010', '1011', '1100', '1101', '1110', '1111');

  for lEachHexChar in lHexValue do
    Result := Result + lBinValues[Pos(UpperCase(lEachHexChar), lHexChars)];

You can also remove UpperCase by adding the lowercase letters to the table.
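Put together, the Pos-based version with both cases in the table could look roughly like this. It is a sketch, not the poster's exact code: the names are made up, and a string constant is used for the character table instead of an array of char.

```pascal
program PosTableDemo;
{$APPTYPE CONSOLE}

function HexToBinPos(const aHex: string): string;
const
  // Both cases are listed, so no UpperCase call is needed.
  cHexChars = '0123456789ABCDEFabcdef';
  // Index 0 is the "not found" slot: Pos returns 0 for an invalid character.
  cBinValues: array [0 .. 22] of string = ('',
    '0000', '0001', '0010', '0011', '0100', '0101', '0110', '0111',
    '1000', '1001', '1010', '1011', '1100', '1101', '1110', '1111',
    '1010', '1011', '1100', '1101', '1110', '1111');
var
  c: Char;
begin
  Result := '';
  for c in aHex do
    // Never out of bounds: Pos yields 0..22, which is exactly the array range.
    Result := Result + cBinValues[Pos(c, cHexChars)];
end;

begin
  Writeln(HexToBinPos('f0'));
  Writeln(HexToBinPos('zF')); // 'z' contributes nothing to the output
end.
```

Invalid characters are silently skipped here rather than reported, so this fixes the out-of-bounds access but not the validation question.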


@Kas Ob. @emailx45 Relying on an AV is potentially dangerous!

Result := Result + lBinValues[Pos(UpperCase(lEachHexChar), lHexChars) - 1];
{
Result  = Result + Content
Content = Address^
Address = @lBinValues[Pos(UpperCase(lEachHexChar), lHexChars) - 1]
If pos fails      => Address =  lBinValues - 1 
Address^          => if Address points to a valid location that has a read access then no AV ! Otherwise an AV.
Result + Content  => An exception may occur if content does not point to a valid location / invalid AnsiString ... otherwise no exception (HAZARD) !
}

So far you have just been lucky, because the location (lBinValues - 1) does not point to a valid location/AnsiString. Why? Because you declared an array of char before lBinValues. But remember, compilers in general can optimize, insert, remove, align, or reorder things!

Here is what happens when I just simulate what I explained :

const
  Boom: AnsiString = 'Boooom!!!'; // lBinValues - 1

function fncMyHexToBin(const lHexValue: string): string;
// I just reordered constants
const
  lBinValues: array [0 .. 15] of AnsiString = ('0000', '0001', '0010', '0011', '0100', '0101', '0110', '0111', '1000', '1001', '1010', '1011', '1100', '1101',
    '1110', '1111');
  lHexChars: array [0 .. 15] of char = ('0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F');

var
  lEachHexChar: char;
begin
  Result := '';

  for lEachHexChar in lHexValue do
    try
      Result := Result + lBinValues[Pos(UpperCase(lEachHexChar), lHexChars) - 1];
    except // case the "char" is not found, we have a "AV"! then.... doesnt matter for us!
      // If Embarcadero use... I can too!
    end;
end;

procedure test;
var
  s: string;
begin
  Writeln(Boom); // Just to prevent compiler from omitting Boom.
  s := fncMyHexToBin('123x2');
  Writeln(s); // <------ Booooooommmmmm
end;

begin
  test();
  readln;
end.

 

Guest

Thank you Mahdi. I couldn't agree more about the danger and insecure behaviour of letting exceptions loose. That is why I fixed it (for him!), and I think you missed that I removed the "-1" and added an empty string '' for the failed Pos (= 0), hence making it safe.

5 minutes ago, Kas Ob. said:

Thank you Mahdi. I couldn't agree more about the danger and insecure behaviour of letting exceptions loose. That is why I fixed it (for him!), and I think you missed that I removed the "-1" and added an empty string '' for the failed Pos (= 0), hence making it safe.

Yep I missed that ... my bad 🙂 

But still doesn't handle invalid chars.

Guest
1 hour ago, Mahdi Safsafi said:

procedure test;
var
  s: string;
begin
  Writeln(Boom); // Just to prevent compiler from omitting Boom.
  s := fncMyHexToBin('123x2');
  Writeln(s); // <------ Booooooommmmmm
end;

begin
  test();
  readln;
end.

[screenshot]

 


const
  Boom: Ansistring = 'Boooom!!!'; // lBinValues - 1

procedure test;
var
  s: string;
begin
  Form1.Memo1.Lines.Add('Boom: ' + Boom); // Just to prevent compiler from omitting Boom.
  s := fncMyHexToBin('123x2');
  Form1.Memo1.Lines.Add('s: ' + s); // <------ Booooooommmmmm
end;

procedure TForm1.btnTestMahdiClick(Sender: TObject);
begin
  test;
end;

initialization

ReportMemoryLeaksOnShutdown := true;

finalization

end.

XX


@Mahdi Safsafi you are the winner! 🙂

 

I benchmarked 3 methods and results are like this:

 

(time in ms)

fncMyHexToBin (Emailx45) = 15722 = 100%
Hex2Bin (Mike)           =  6170 =  39%
HexToBin2 (Mahdi)        =   925 =   5%

 

I assume that using Pos, the inner for loop, and string concatenation is what kills the performance, @emailx45, while Mahdi's function doesn't use any of them.
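For anyone who wants to repeat this kind of measurement, a timing loop could look roughly like the sketch below. It is not the harness used for the numbers above: the iteration count, the input string and the stand-in converter HexToBinConcat are all assumptions made for illustration.

```pascal
program BenchSketch;
{$APPTYPE CONSOLE}
uses
  System.SysUtils, System.Diagnostics;

// Stand-in for whichever converter is being measured (concatenation style).
// Invalid characters are simply skipped in this stand-in.
function HexToBinConcat(const aHex: string): string;
const
  cBin: array [0 .. 15] of string = (
    '0000', '0001', '0010', '0011', '0100', '0101', '0110', '0111',
    '1000', '1001', '1010', '1011', '1100', '1101', '1110', '1111');
var
  c: Char;
begin
  Result := '';
  for c in aHex do
    case c of
      '0'..'9': Result := Result + cBin[Ord(c) - Ord('0')];
      'A'..'F': Result := Result + cBin[10 + Ord(c) - Ord('A')];
      'a'..'f': Result := Result + cBin[10 + Ord(c) - Ord('a')];
    end;
end;

procedure RunBench;
const
  cIterations = 1000000; // arbitrary; adjust to get a stable reading
var
  sw: TStopwatch;
  i: Integer;
  s: string;
begin
  sw := TStopwatch.StartNew;
  for i := 1 to cIterations do
    s := HexToBinConcat('0123456789ABCDEF');
  sw.Stop;
  // Printing the last result keeps the calls from looking unused.
  Writeln('HexToBinConcat: ', sw.ElapsedMilliseconds, ' ms, last = ', s);
end;

begin
  RunBench;
end.
```

Swapping each candidate function into the loop, with the same input and iteration count, gives numbers that are comparable with each other on one machine.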

 

Of course, credit also goes to @David Heffernan, because Mahdi's function is an evolution of David's example.

 

Thanks! 🙂

Edited by Mike Torrettinni

I wonder how it would look in assembly if you filled the out buffer with zeros, then swapped out the 1's by going shr/shl on a 64bit register.

I guess the potential gain would be eaten by the time required for stuffing the hex data into the register.
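Leaving assembly aside, the same idea (pre-fill with zeros, then flip the ones by shifting) can be sketched in plain Pascal. This is a stand-in for illustration only, not a 64-bit register version: HexToBinShift is a made-up name, and it assumes 1-based string indexing.

```pascal
program ShiftSketch;
{$APPTYPE CONSOLE}
uses
  System.SysUtils;

function HexToBinShift(const HexValue: string): string;
var
  i, Bit, Nibble: Integer;
  c: Char;
begin
  // Pre-fill the whole output with '0' characters.
  Result := StringOfChar('0', Length(HexValue) * 4);
  for i := 1 to Length(HexValue) do
  begin
    c := HexValue[i];
    case c of
      '0'..'9': Nibble := Ord(c) - Ord('0');
      'a'..'f': Nibble := 10 + Ord(c) - Ord('a');
      'A'..'F': Nibble := 10 + Ord(c) - Ord('A');
    else
      raise EConvertError.CreateFmt('Invalid hex digit ''%s'' found in ''%s''', [c, HexValue]);
    end;
    // Bit 3 is the leftmost character of each group of four; only set
    // bits are written, the rest keep their pre-filled '0'.
    for Bit := 3 downto 0 do
      if (Nibble shr Bit) and 1 = 1 then
        Result[(i - 1) * 4 + (4 - Bit)] := '1';
  end;
end;

begin
  Writeln(HexToBinShift('F0'));
  Writeln(HexToBinShift('5a'));
end.
```

Whether this beats a lookup table is a separate question; it does four tests per digit where the table does one copy.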

Guest

@Lars Fosdal Special For You !

 

I did it differently from what you suggested. The following handles a single hex char, to show the assembly, and, as you wanted, uses no lookup table. The code I used to get the decimal value from the hex char is the only trick I had in mind, from old habit I think, but searching the internet turns up many tricks that could be used, a few of them branch-free too.

// Convert one Hex char into 4 chars representing 4 bit Binary string of HexChar
// HexBuffer must point to 4 Char (8 bytes) allocated space
procedure CharToBin_ASM32(HexChar: Char; HexBuffer: PChar);
asm
        push edi
        mov edi,edx
        //      Get the decimal value of one Hex Char (= half byte)
        movzx   eax, HexChar
        mov     ecx, 57
        sub     ecx, eax
        sar     ecx, 31
        and     ecx, 39
        neg     ecx
        add     eax, ecx
        add     eax,  - 48
        //      Produce 4 Chars presenting 4 bits of HexChar
        xor     ecx,ecx
        mov     dx,$1
        test    al,4
        cmovne  cx,dx
        shl     ecx,16
        test    al,8
        cmovne  cx,dx
        add     ecx,$00300030
        mov     [edi],ecx
        xor     ecx,ecx
        test    al,1
        cmovne  cx,dx
        shl     ecx,16
        test    al,2
        cmovne  cx,dx
        add     ecx,$00300030
        mov     [edi+4],ecx
        pop edi
end;

Branch-free! I also tried an approach using MMX instructions:

// Convert one Hex char into 4 chars representing 4 bit Binary string of HexChar
// HexBuffer must point to 4 Char (8 bytes) allocated space
procedure CharToBin_MMX(HexChar: Char; HexBuffer: PChar);
const
  DEC_TO_BIN_WORD_MASK: array[0..3] of UInt16 = ($01, $02, $04, $08);
  DEC_TO_BIN_FF_TO_CHARONE_DISTANCE: array[0..3] of UInt16 = ($FFCF, $FFCF, $FFCF, $FFCF);
asm
        //      Get the decimal value of one Hex Char (= half byte)
        movzx   eax, HexChar
        mov     ecx, 57
        sub     ecx, eax
        sar     ecx, 31
        and     ecx, 39
        neg     ecx
        add     eax, ecx
        add     eax,  - 48
        //      Produce 4 Chars presenting 4 bits of HexChar
        movd    mm0, eax
        pxor    mm1, mm1
        punpckldq mm0, mm0
        packssdw mm0, mm0
        pand    mm0, qword ptr[DEC_TO_BIN_WORD_MASK]
        pcmpeqw mm0, mm1
        psubw   mm0, qword ptr[DEC_TO_BIN_FF_TO_CHARONE_DISTANCE]
        pshufw  mm0, mm0, $1B                     // reverse the result
        movq    qword ptr[HexBuffer], mm0
        emms
end;

Now that is a beauty. Only "emms" will kill a big part of the performance, but that loss can be partly recovered by delaying it until a full string has been processed, which means paying its price only once.

The advantage of the MMX version is that it can easily be modified to convert two hex chars at the same speed, while an XMM version would double that to four hex chars, and likewise for YMM. Also, since we have plenty of registers, we can process two bytes in parallel at the same time.

Another thing: the constants in the MMX version can be loaded into mm2 and mm3, which makes any loop a little faster.


Nice!  My ASM knowledge predates MMX, so this was a learning experience 🙂
How does it measure up speedwise to the others?

16 minutes ago, Lars Fosdal said:

How does it measure up speedwise to the others?

How do you think it will compare against a lookup table? How would you expect computing an answer at runtime to compare to computing the answer before compile time?

