Hex2Binary

Hi,

What's the best method to convert hexadecimal (e.g. F0) to binary (e.g. 11110000)?

Thank you.

Define a simple map from the 16 hex digits to the 4 character binary strings. Iterate over each hex digit and concatenate.

2 hours ago, karl Jonson said:

Hi,

What's the best method to convert hexadecimal (e.g. F0) to binary (e.g. 11110000)?

Thank you.

This is what I use:

```
// aHex is expected to be a hex string of chars: 0..9, A..F
function Hex2Bin(const aHex: string): string;
const
  // Array of [hex, binary] pairs
  cBinArray: array[0..15, 0..1] of string =
    (('0', '0000'), ('1', '0001'), ('2', '0010'), ('3', '0011'),
     ('4', '0100'), ('5', '0101'), ('6', '0110'), ('7', '0111'),
     ('8', '1000'), ('9', '1001'), ('A', '1010'), ('B', '1011'),
     ('C', '1100'), ('D', '1101'), ('E', '1110'), ('F', '1111'));
var
  i: Integer;
  x: string;
begin
  Result := '';

  // Iterate hex string
  for x in aHex do
    // For each hex char find binary result in cBinArray
    for i := Low(cBinArray) to High(cBinArray) do
      if cBinArray[i, 0] = x then
      begin
        // Concatenate binary results
        Result := Result + cBinArray[i, 1];
        Break;
      end;
end;
```

Note: it expects a valid hex string as input (chars 0..9 and A..F). If you need to validate the input, or upper-case it first (a..f -> A..F), add the necessary checks.
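A minimal sketch of the checks mentioned above, assuming a hypothetical helper named `NormalizeHex` (it is not part of the RTL or of the code in this thread). It upper-cases the input and rejects anything outside 0..9, A..F, so the result can be passed to `Hex2Bin`:

```
program NormalizeHexDemo;
{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: returns the upper-cased hex string,
// or raises EConvertError if any character is not a hex digit.
function NormalizeHex(const aHex: string): string;
var
  c: Char;
begin
  Result := UpperCase(aHex);
  for c in Result do
    if not CharInSet(c, ['0'..'9', 'A'..'F']) then
      raise EConvertError.CreateFmt('Invalid hex digit ''%s'' in ''%s''', [c, aHex]);
end;

begin
  Writeln(NormalizeHex('f0')); // prints F0
end.
```

Whether invalid input should raise, be skipped, or be reported some other way is a design choice; raising is just one option.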

NOTE: RAD Studio 10.3.3 already has a HexToBin function in the "System.Classes.pas" unit:

• function HexToBin(Text: PWideChar; Buffer: PAnsiChar; BufSize: Integer): Integer; overload;
• function HexToBin(Text: PAnsiChar; Buffer: PAnsiChar; BufSize: Integer): Integer; overload;
• function HexToBin(Text: PWideChar; var Buffer; BufSize: Integer): Integer; overload; inline;
• function HexToBin(Text: PAnsiChar; var Buffer; BufSize: Integer): Integer; overload; inline;
• function HexToBin(Text: PWideChar; Buffer: Pointer; BufSize: Integer): Integer; overload; inline;
• function HexToBin(Text: PAnsiChar; Buffer: Pointer; BufSize: Integer): Integer; overload; inline;
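Note that these RTL overloads decode hex text into raw bytes in a buffer; they do not produce a string of '0' and '1' characters. A small sketch of calling the `var Buffer` overload, assuming a Unicode Delphi version such as the 10.3.3 mentioned above (the variable names are made up for illustration):

```
program RtlHexToBinDemo;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

var
  Hex: string;
  Bytes: TBytes;
  Converted: Integer;
begin
  Hex := 'F0A5';
  SetLength(Bytes, Length(Hex) div 2);   // two hex digits per byte
  // Returns the number of bytes actually converted
  Converted := HexToBin(PChar(Hex), Bytes[0], Length(Bytes));
  Writeln(Converted);                    // 2
  Writeln(Bytes[0], ' ', Bytes[1]);      // 240 165
end.
```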

Maybe something like this, using Mike's concept:

```
function fncMyHexToBin(const lHexValue: string): string;
const
  lHexChars: array [0 .. 15] of char        = ('0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F');
  lBinValues: array [0 .. 15] of Ansistring = ('0000', '0001', '0010', '0011', '0100', '0101', '0110', '0111', '1000', '1001', '1010', '1011', '1100', '1101', '1110', '1111');
var
  lEachHexChar: char;
begin
  Result := '';
  //
  for lEachHexChar in lHexValue do
    try
      Result := Result + lBinValues[Pos(UpperCase(lEachHexChar), lHexChars) - 1];
    except // case the "char" is not found, we have a "AV"! then.... doesnt matter for us!
    end;
end;

procedure TForm1.Button1Click(Sender: TObject);
begin
  Memo1.Lines.Add('Hex2Binxxxxxx = ' + Hex2Bin('zFF0r0ABu11')); // chars that not allow to Hex values, will be = ''
  //
end;
```

hug

3 hours ago, emailx45 said:

for lEachHexChar in lHexValue do try Result := Result + lBinValues[Pos(UpperCase(lEachHexChar), lHexChars) - 1]; except // case the "char" is not found, we have a "AV"! then.... doesnt matter for us! end;

Ugh. You can't rely on getting an AV.

Don't ever write code like this.

I'd probably write it something like this:

```
function HexToBin(const HexValue: string): string;
const
  BinaryValues: array [0..15] of string = (
    '0000', '0001', '0010', '0011',
    '0100', '0101', '0110', '0111',
    '1000', '1001', '1010', '1011',
    '1100', '1101', '1110', '1111'
  );
var
  HexDigit: Char;
  HexDigitValue: Integer;
  Ptr: PChar;
begin
  SetLength(Result, Length(HexValue) * 4);
  Ptr := Pointer(Result);
  for HexDigit in HexValue do
  begin
    case HexDigit of
      '0'..'9':
        HexDigitValue := Ord(HexDigit) - Ord('0');
      'a'..'f':
        HexDigitValue := 10 + Ord(HexDigit) - Ord('a');
      'A'..'F':
        HexDigitValue := 10 + Ord(HexDigit) - Ord('A');
    else
      raise EConvertError.CreateFmt('Invalid hex digit ''%s'' found in ''%s''', [HexDigit, HexValue]);
    end;
    Move(Pointer(BinaryValues[HexDigitValue])^, Ptr^, 4 * SizeOf(Char));
    Inc(Ptr, 4);
  end;
end;
```

Some notes:

1. A case statement makes this quite readable in my view.
2. You really don't want to be wasting time using Pos to search within a string. You can get the value directly with arithmetic.
3. I prefer to perform just a single allocation, rather than use repeated allocations with concatenation.
4. You might want to consider how to treat leading zeros. For instance how should you treat 0F, should that be 00001111 or 1111? I'd expect that both would be desirable in different situations, so an option in an extra argument to the function would be needed.
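As a rough sketch of the leading-zeros point in note 4, one option is a small post-processing step applied to the converted string. `TrimLeadingZeros` is a hypothetical name, and the code assumes 1-based string indexing as on the desktop compilers:

```
program TrimZerosDemo;
{$APPTYPE CONSOLE}

// Hypothetical helper: strips leading '0' characters but always
// keeps at least one digit, so '0000' becomes '0' rather than ''.
function TrimLeadingZeros(const Bits: string): string;
var
  I: Integer;
begin
  I := 1;
  while (I < Length(Bits)) and (Bits[I] = '0') do
    Inc(I);
  Result := Copy(Bits, I, MaxInt);
end;

begin
  Writeln(TrimLeadingZeros('00001111')); // 1111
  Writeln(TrimLeadingZeros('0000'));     // 0
end.
```

A Boolean parameter on the conversion function could then decide whether this step is applied.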

57 minutes ago, David Heffernan said:

Move(Pointer(BinaryValues[HexDigitValue])^, Ptr^, 4 * SizeOf(Char));

@David Heffernan  What does this do?

5 minutes ago, Mike Torrettinni said:

@David Heffernan  What does this do?

Copies four characters from the BinaryValues constant array item at index HexDigitValue to the location pointed to by Ptr.
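A tiny standalone illustration of the same pattern, with made-up variable names. It copies a four-character string twice into a preallocated buffer and advances the pointer after each copy:

```
program MoveDemo;
{$APPTYPE CONSOLE}

var
  Source, Dest: string;
  P: PChar;
begin
  Source := '1010';
  SetLength(Dest, 8);        // room for two groups of four characters
  P := Pointer(Dest);
  // Move counts bytes, not characters, hence the SizeOf(Char) factor
  Move(Pointer(Source)^, P^, 4 * SizeOf(Char));
  Inc(P, 4);                 // PChar arithmetic advances by characters
  Move(Pointer(Source)^, P^, 4 * SizeOf(Char));
  Writeln(Dest);             // 10101010
end.
```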

3 minutes ago, Alexander Elagin said:

Copies four characters from the BinaryValues constant array item at index HexDigitValue to the location pointed to by Ptr.

Aha, pretty neat trick. Thanks!

You can also optimize David's Move by replacing it with this:

PUInt64(Ptr)^ := PUInt64(BinaryValues[HexDigitValue])^; // move 4 chars in Unicode

@David Heffernan A few remarks about your code, if you don't mind:

1- It's pointless to use string when the items have a fixed size. Simply use a static array of X chars.

2- It's also pointless to calculate the index when you already use a case. Simply declare your arrays indexed by char ranges. In your version, the compiler generates additional instructions to compute the index.

```
function HexToBin2(const HexValue: string): string;
type
  TChar4 = array [0 .. 3] of Char;
  PChar4 = ^TChar4;
const
  Table1: array ['0' .. '9'] of TChar4 = ('0000', '0001', '0010', '0011', '0100', '0101', '0110', '0111', '1000', '1001');
  Table2: array ['a' .. 'f'] of TChar4 = ('1010', '1011', '1100', '1101', '1110', '1111');
var
  HexDigit: Char;
  P: PChar4;
begin
  SetLength(Result, Length(HexValue) * 4);
  P := PChar4(Result);
  for HexDigit in HexValue do
  begin
    case HexDigit of
      '0' .. '9':
        P^ := Table1[HexDigit];
      'a' .. 'f':
        P^ := Table2[HexDigit];
      'A' .. 'F':
        P^ := Table2[Chr(Ord(HexDigit) xor $20)];
    else
      raise EConvertError.CreateFmt('Invalid hex digit ''%s'' found in ''%s''', [HexDigit, HexValue]);
    end;
    Inc(P);
  end;
end;
```


Holding the nibble binary text in a fixed length array is rather nice, I approve of that. Good stuff.

9 hours ago, David Heffernan said:

You might want to consider how to treat leading zeros. For instance how should you treat 0F, should that be 00001111 or 1111?

Where is the problem with "0F" or just "F" in my function or in Mike's?

9 hours ago, David Heffernan said:

You really don't want to be wasting time using Pos to search within a string. You can get the value directly with arithmetic.

Can you measure the time lost?

```
function HexToBin(Text: PWideChar; Buffer: PAnsiChar; BufSize: Integer): Integer;
var
  I: Integer;
  b1, b2: Byte;
begin
  I := BufSize;
  while I > 0 do
  begin
    if (Ord(Text[0]) > 255) or (Ord(Text[1]) > 255) then
      Break;
    b1 := H2BConvert[Ord(Text[0])];
    b2 := H2BConvert[Ord(Text[1])];
    if (b1 = $FF) or (b2 = $FF) then
      Break;
    Buffer[0] := AnsiChar((b1 shl 4) + b2);
    Inc(Buffer);
    Inc(Text, 2);
    Dec(I);
  end;
  Result := BufSize - I;
end;
```

9 hours ago, David Heffernan said:

Don't ever write code like this.

Sorry! Don't ever answer like this!

The world is round, so the sun lights only one side at a time!

That lets others see the beauty of moonlight!

Exception treated:

```
function fncMyHexToBin(const lHexValue: string): string;
const
lHexChars: array [0 .. 15] of char        = ('0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F');
lBinValues: array [0 .. 15] of Ansistring = ('0000', '0001', '0010', '0011', '0100', '0101', '0110', '0111', '1000', '1001', '1010', '1011', '1100', '1101', '1110', '1111');
var
lEachHexChar: char;
begin
Result := '';
//
for lEachHexChar in lHexValue do
try
Result := Result + lBinValues[Pos(UpperCase(lEachHexChar), lHexChars) - 1];
except // case the "char" is not found, we have a "AV"! then.... doesnt matter for us!
// If Embarcadero use... I can too!
end;
end;

procedure TForm1.Button1Click(Sender: TObject);
begin
try
Memo1.Lines.Add('Hex2Binxxxxxx = ' + Hex2Bin('zFF0r0ABu11')); // chars that not allow to Hex values, will be = ''
//
//
except
on E: Exception do
showMessage('Exception dont treated: ' + sLineBreak + E.ClassName + sLineBreak + E.Message)
end;
end;
```

Exception dont treated:

hug

No guarantee that an out of bounds array access leads to an exception. You have just been unlucky that you've seen one every time you ran your code.

Once again, nobody should ever write code like that.

@emailx45 I'm a fan of letting people do whatever they like, so here is a better version of yours, without try..except, and it is safe:

lBinValues: array[0..16] of Ansistring = ('', '0000', '0001', '0010', '0011', '0100', '0101', '0110', '0111', '1000', '1001', '1010', '1011', '1100', '1101', '1110', '1111');

for lEachHexChar in lHexValue do
Result := Result + lBinValues[Pos(UpperCase(lEachHexChar), lHexChars)];

You can also remove UpperCase by adding the lowercase letters to the table.
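A sketch of that suggestion, under the assumption that invalid characters should simply be skipped, as in the version above. The function name `HexToBinByPos` and the table layout are illustrative only:

```
program PosTableDemo;
{$APPTYPE CONSOLE}

function HexToBinByPos(const HexValue: string): string;
const
  // '0'..'9', 'A'..'F', then 'a'..'f'. Pos is 1-based and returns 0
  // when the character is not found, which selects the '' entry.
  HexChars = '0123456789ABCDEFabcdef';
  BinValues: array [0 .. 22] of string = ('',
    '0000', '0001', '0010', '0011', '0100', '0101', '0110', '0111',
    '1000', '1001', '1010', '1011', '1100', '1101', '1110', '1111',
    '1010', '1011', '1100', '1101', '1110', '1111');
var
  C: Char;
begin
  Result := '';
  for C in HexValue do
    Result := Result + BinValues[Pos(C, HexChars)];
end;

begin
  Writeln(HexToBinByPos('fF0')); // 111111110000
  Writeln(HexToBinByPos('zF'));  // 1111 (the 'z' is skipped)
end.
```

Every index produced by Pos falls inside the array bounds, so no exception handling is needed.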

@emailx45 Relying on an AV is potentially dangerous!

```
Result := Result + lBinValues[Pos(UpperCase(lEachHexChar), lHexChars) - 1];
{
Result  = Result + Content
Address = @lBinValues[Pos(UpperCase(lEachHexChar), lHexChars) - 1]
If pos fails      => Address =  lBinValues - 1
Address^          => if Address points to a valid location that has a read access then no AV ! Otherwise an AV.
Result + Content  => An exception may occur if content does not point to a valid location / invalid AnsiString ... otherwise no exception (HAZARD) !
}
```

So far you have just been lucky, because the location (lBinValues - 1) does not point to a valid location/AnsiString. Why? Because you declared an array of char before lBinValues. But remember, compilers in general can optimize, insert, remove, align, or reorder things!

Here is what happens when I simulate what I just explained:

```
const
Boom: AnsiString = 'Boooom!!!'; // lBinValues - 1

function fncMyHexToBin(const lHexValue: string): string;
// I just reordered constants
const
lBinValues: array [0 .. 15] of AnsiString = ('0000', '0001', '0010', '0011', '0100', '0101', '0110', '0111', '1000', '1001', '1010', '1011', '1100', '1101',
'1110', '1111');
lHexChars: array [0 .. 15] of char = ('0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F');

var
lEachHexChar: char;
begin
Result := '';

for lEachHexChar in lHexValue do
try
Result := Result + lBinValues[Pos(UpperCase(lEachHexChar), lHexChars) - 1];
except // case the "char" is not found, we have a "AV"! then.... doesnt matter for us!
// If Embarcadero use... I can too!
end;
end;

procedure test;
var
s: string;
begin
Writeln(Boom); // Just to prevent compiler from omitting Boom.
s := fncMyHexToBin('123x2');
Writeln(s); // <------ Booooooommmmmm
end;

begin
test();
end.
```

Thank you Mahdi, and I can't agree more about letting exceptions loose and their dangerous, insecure behaviour. That is why I fixed it (for him!), and I think you missed that I removed the "-1" and added an empty string '' for the failed Pos (= 0), which makes it safe.

5 minutes ago, Kas Ob. said:

Thank you Mahdi, and I can't agree more about letting exceptions loose and their dangerous, insecure behaviour. That is why I fixed it (for him!), and I think you missed that I removed the "-1" and added an empty string '' for the failed Pos (= 0), which makes it safe.

Yep I missed that ... my bad 🙂

But still doesn't handle invalid chars.

1 hour ago, Mahdi Safsafi said:

procedure test; var s: string; begin Writeln(Boom); // Just to prevent compiler from omitting Boom. s := fncMyHexToBin('123x2'); Writeln(s); // <------ Booooooommmmmm end; begin test(); readln; end.

```
const
Boom: Ansistring = 'Boooom!!!'; // lBinValues - 1

procedure test;
var
s: string;
begin
Form1.Memo1.Lines.Add('Boom: ' + Boom); // Just to prevent compiler from omitting Boom.
s := fncMyHexToBin('123x2');
Form1.Memo1.Lines.Add('s: ' + s); // <------ Booooooommmmmm
end;

procedure TForm1.btnTestMahdiClick(Sender: TObject);
begin
test;
end;

initialization

ReportMemoryLeaksOnShutdown := true;

finalization

end.
```

@Mahdi Safsafi you are the winner! 🙂

I benchmarked 3 methods and results are like this:

(time in ms)

fncMyHexToBin (Emailx45) = 15722 = 100%
Hex2Bin (Mike)           =  6170 =  39%
HexToBin2 (Mahdi)        =   925 =   5%

I assume using Pos, For loop and string concatenation kills our performance, @emailx45, while Mahdi's doesn't use any of it.
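The post does not say how the timings were taken. One way such numbers could be gathered is sketched below with `TStopwatch`; the iteration count, the input string, and the stand-in function `HexToBinUnderTest` are all assumptions made for illustration, to be replaced by the routine being measured:

```
program BenchDemo;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Diagnostics;

// Stand-in for the routine being measured (hypothetical, deliberately simple)
function HexToBinUnderTest(const HexValue: string): string;
var
  C: Char;
  V, K: Integer;
begin
  Result := '';
  for C in HexValue do
  begin
    V := StrToInt('$' + C);  // raises EConvertError on a non-hex char
    for K := 3 downto 0 do
      Result := Result + Chr(Ord('0') + ((V shr K) and 1));
  end;
end;

var
  SW: TStopwatch;
  I: Integer;
  S: string;
begin
  SW := TStopwatch.StartNew;
  for I := 1 to 1000000 do
    S := HexToBinUnderTest('0123456789ABCDEF');
  SW.Stop;
  Writeln(SW.ElapsedMilliseconds, ' ms');
end.
```

Running each candidate over the same input in a release build, several times, gives figures that are more comparable than a single run.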

Of course, credit also goes to @David Heffernan, because Mahdi's function is an evolution of David's example.

Thanks! 🙂


I wonder how it would look in assembly if you filled the out buffer with zeros, then swapped out the 1's by going shr/shl on a 64bit register.

I guess the potential gain would be eaten by the time required for stuffing the hex data into the register.

@Lars Fosdal Special For You !

I did it differently from what you suggested. The following handles one hex char, to show the assembly, and, as you wanted, uses no lookup table. The code I used to get the decimal value from a hex char is the only trick I have in mind, by old habit I think, but searching the internet shows many tricks that can be used, a few of them branch-free too.

```
// Convert one hex char into 4 chars holding the 4-bit binary string of HexChar
// HexChar must be '0'..'9' or lowercase 'a'..'f' (no validation)
// HexBuffer must point to 4 Char (8 bytes) of allocated space
procedure CharToBin_ASM32(HexChar: Char; HexBuffer: PChar);
asm
        push    edi
        mov     edi, edx
        // Get the decimal value of one hex char (= half byte)
        movzx   eax, HexChar
        mov     ecx, 57
        sub     ecx, eax
        sar     ecx, 31
        and     ecx, 39
        neg     ecx
        lea     eax, [eax + ecx - 48]   // eax := 0..15 (the listing never used ecx)
        // Produce 4 chars representing the 4 bits of HexChar
        xor     ecx, ecx
        mov     dx, $1
        test    al, 4
        cmovne  cx, dx
        shl     ecx, 16
        test    al, 8
        cmovne  cx, dx
        add     ecx, $00300030          // turn the 0/1 words into '0'/'1'
        mov     [edi], ecx
        xor     ecx, ecx
        test    al, 1
        cmovne  cx, dx
        shl     ecx, 16
        test    al, 2
        cmovne  cx, dx
        add     ecx, $00300030          // turn the 0/1 words into '0'/'1'
        mov     [edi+4], ecx
        pop     edi
end;
```

Branch free! I also tried an MMX instruction approach:

```
// Convert one hex char into 4 chars holding the 4-bit binary string of HexChar
// HexChar must be '0'..'9' or lowercase 'a'..'f' (no validation)
// HexBuffer must point to 4 Char (8 bytes) of allocated space
procedure CharToBin_MMX(HexChar: Char; HexBuffer: PChar);
const
  DEC_TO_BIN_WORD_MASK: array[0..3] of UInt16 = ($01, $02, $04, $08);
  DEC_TO_BIN_FF_TO_CHARONE_DISTANCE: array[0..3] of UInt16 = ($FFCF, $FFCF, $FFCF, $FFCF);
asm
        // Get the decimal value of one hex char (= half byte)
        movzx   eax, HexChar
        mov     ecx, 57
        sub     ecx, eax
        sar     ecx, 31
        and     ecx, 39
        neg     ecx
        lea     eax, [eax + ecx - 48]   // eax := 0..15 (the listing never used ecx)
        // Produce 4 chars representing the 4 bits of HexChar
        movd    mm0, eax
        pxor    mm1, mm1
        punpckldq mm0, mm0
        packssdw mm0, mm0                // four copies of the value, one per word
        pand    mm0, qword ptr [DEC_TO_BIN_WORD_MASK]  // isolate one bit per word (mask was unused)
        pcmpeqw mm0, mm1                 // $FFFF where the bit is clear, 0 where set
        psubw   mm0, qword ptr [DEC_TO_BIN_FF_TO_CHARONE_DISTANCE] // -> $30 ('0') or $31 ('1')
        pshufw  mm0, mm0, $1B            // reverse the result
        movq    qword ptr [HexBuffer], mm0
        emms
end;
```

Now that is a beauty. Only "emms" will kill a big part of the performance, but that loss can be partly recovered by delaying it until a full string has been processed, which means paying its price only once.

The advantage of the MMX version is that it can easily be modified to convert two hex chars at the same speed, while with XMM registers that doubles to 4 hex chars, and likewise again with YMM. Also, since we have plenty of registers, we can process two bytes in parallel at the same time.

Another thing is that the constants in the MMX version can be loaded into mm2 and mm3, which makes any loop a little faster.
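For readers who do not use assembly, here is a plain Pascal sketch of the arithmetic that the assembly comments describe. It is not a translation of the routines above, and `NibbleToBits` is a hypothetical name. It assumes the same input restriction ('0'..'9' or lowercase 'a'..'f') and 1-based strings, and does no validation:

```
program NibbleBitsDemo;
{$APPTYPE CONSOLE}

function NibbleToBits(C: Char): string;
var
  V, K: Integer;
begin
  // '0'..'9' -> 0..9 and 'a'..'f' -> 10..15, without a lookup table.
  // 48 = Ord('0'); 39 = Ord('a') - Ord('0') - 10; Ord(Boolean) is 0 or 1.
  V := Ord(C) - 48 - 39 * Ord(C > '9');
  SetLength(Result, 4);
  for K := 0 to 3 do
    // Most significant bit first
    Result[K + 1] := Chr(48 + ((V shr (3 - K)) and 1));
end;

begin
  Writeln(NibbleToBits('a')); // 1010
  Writeln(NibbleToBits('7')); // 0111
end.
```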

Nice!  My ASM knowledge predates MMX, so this was a learning experience 🙂
How does it measure up speedwise to the others?

16 minutes ago, Lars Fosdal said:

How does it measure up speedwise to the others?

How do you think it will compare against a lookup table? How would you expect computing an answer at runtime to compare with computing the answer before compile time?