Jud 1 Posted June 27, 2023 TBits is still limited to 2^31 bits, even on the 64-bit platform. Is there a replacement for it that will allow > 2^32 bits (i.e. int64?) (I could write one, but a lot of people are more efficient with this kind of thing than I,) Share this post Link to post
Der schöne Günther 316 Posted June 27, 2023 The maximum size TBits can be before bugging out is Integer.MaxValue - 31. That's roughly 256 Megabytes of consecutive boolean storage spage. You really need that much? Share this post Link to post
Jud 1 Posted June 27, 2023 10 hours ago, Der schöne Günther said: The maximum size TBits can be before bugging out is Integer.MaxValue - 31. That's roughly 256 Megabytes of consecutive boolean storage spage. You really need that much? Yes, I need more. I current;y need about 300 billion bits. I can write it myself, but I thought there might be one already available. Share this post Link to post
Uwe Raabe 2057 Posted June 27, 2023 3 hours ago, Jud said: I current;y need about 300 billion bits. Wow! If my math is correct that needs more than 32GB of memory. I have no idea what the purpose is, but perhaps there are other approaches to achieve the same. 1 Share this post Link to post
David Heffernan 2345 Posted June 27, 2023 5 hours ago, Jud said: Yes, I need more. I current;y need about 300 billion bits. I can write it myself, but I thought there might be one already available. How many such instances of this type do you need in memory at any time? And what's the expected number of bits that are set at any time? Do you really need to store all bits, both 0 and 1. Can't you just stor the 1s and infer the 0s from the fact that they aren't stored as 1s? Share this post Link to post
Jud 1 Posted June 27, 2023 32 minutes ago, David Heffernan said: How many such instances of this type do you need in memory at any time? And what's the expected number of bits that are set at any time? Do you really need to store all bits, both 0 and 1. Can't you just stor the 1s and infer the 0s from the fact that they aren't stored as 1s? I estimate that about 5% of the bits will be 1s. They will be read-only in the actual run, so the data will be shared among 20 threads. I don't understand how storing just the 1s would work. That would result in a series of about 15,000,000 1s, but that would be useless. Share this post Link to post
Jud 1 Posted June 27, 2023 3 hours ago, Uwe Raabe said: Wow! If my math is correct that needs more than 32GB of memory. I have no idea what the purpose is, but perhaps there are other approaches to achieve the same. 1. 32GB is nothing today. Every computer here that it might be running on has at last 128GB of RAM. 2. No, this is the best approach. Share this post Link to post
David Heffernan 2345 Posted June 28, 2023 5 hours ago, Jud said: That would result in a series of about 15,000,000 1s, but that would be useless. No. You use a hashed collection. Like a dictionary, but one without a value, which is known as a set. But at 5% full then I don't think it will get you any benefit, because of the overhead of the hashed collection. Share this post Link to post
Rollo62 536 Posted June 28, 2023 How about the idea to store this in a 2D-image as Bitfield, of maybe 17320x17320 pixel ? Of course storage will be maybe more efficient, but access would be slower. Thinking further, using such Bitfield with involvement of the GPU .... 2 Share this post Link to post
David Heffernan 2345 Posted June 28, 2023 2 hours ago, Rollo62 said: How about the idea to store this in a 2D-image as Bitfield, of maybe 17320x17320 pixel ? How could this be more efficient than a contiguous array? Share this post Link to post
David Heffernan 2345 Posted June 28, 2023 2 hours ago, Rollo62 said: Thinking further, using such Bitfield with involvement of the GPU .... What about the cost of moving data between gpu and main memory? Benefits of gpu are in highly parallel computation. Where is that in this scenario. Just saying GPU doesn't make something fast or efficient. Share this post Link to post
Rollo62 536 Posted June 28, 2023 Like said, just an alternative idea, with pros and cons. Only @Jud may know if this is workable for him somehow, or not. Share this post Link to post
Brandon Staggs 277 Posted June 28, 2023 (edited) EDIT: Bad brain day. Of course there is no problem getting whatever chunk of memory you want out of the available address space. Edited June 28, 2023 by Brandon Staggs Share this post Link to post
David Heffernan 2345 Posted June 28, 2023 18 minutes ago, Brandon Staggs said: I would not want to count on always being able to allocate a 32 GB block, even if I could guarantee all my hardware had 128 GB of RAM. If you don't want to use a hash bucket for your 5% of items that are true, because of performance, you are going to have to write your own class for this. Personally I would stripe it into 1GB chunks or something. I have written my own bits class because I wanted fast boolean operations on all of the bits with other instances. It's not a hard thing to design and striping your values across multiple allocated blocks would not add much overhead. Why would you not expect to get a 32GB block when the available address space is 128TB? Share this post Link to post
David Heffernan 2345 Posted June 28, 2023 unit Bitset; interface uses SysUtils, Math; type TBitSet = record private FBitCount: Int64; FSets: array of set of 0..255; class function SetCount(BitCount: Int64): Int64; static; procedure MakeUnique; procedure GetSetIndexAndBitIndex(Bit: Int64; out SetIndex: Int64; out BitIndex: Integer); function GetIsEmpty: Boolean; procedure SetBitCount(Value: Int64); function GetSize: Int64; public class operator In(const Bit: Int64; const BitSet: TBitSet): Boolean; class operator Equal(const bs1, bs2: TBitSet): Boolean; class operator NotEqual(const bs1, bs2: TBitSet): Boolean; property BitCount: Int64 read FBitCount write SetBitCount; property Size: Int64 read GetSize; property IsEmpty: Boolean read GetIsEmpty; procedure Clear; procedure IncludeAll; procedure Include(const Bit: Int64); procedure Exclude(const Bit: Int64); end; implementation { TBitSet } procedure TBitSet.MakeUnique; begin // this is used to implement copy-on-write so that the type behaves like a value SetLength(FSets, Length(FSets)); end; procedure TBitSet.GetSetIndexAndBitIndex(Bit: Int64; out SetIndex: Int64; out BitIndex: Integer); begin Assert(InRange(Bit, 0, FBitCount-1)); SetIndex := Bit shr 8; // shr 8 = div 256 BitIndex := Bit and 255; // and 255 = mod 256 end; function TBitSet.GetIsEmpty: Boolean; var i: Int64; begin for i := 0 to High(FSets) do begin if FSets[i]<>[] then begin Result := False; Exit; end; end; Result := True; end; procedure TBitSet.SetBitCount(Value: Int64); var Bit, BitIndex: Integer; SetIndex: Int64; begin if (Value<>FBitCount) or not Assigned(FSets) then begin Assert(Value>=0); FBitCount := Value; SetLength(FSets, SetCount(Value)); if Value>0 then begin (* Ensure that unused bits are cleared, necessary give the CompareMem call in Equal. This also means that state does not persist when we decrease and then increase BitCount. For instance, consider this code: var bs: TBitSet; ... bs.BitCount := 2; bs.Include(1); bs.BitCount := 1; bs.BitCount := 2; Assert(not (1 in bs)); *) GetSetIndexAndBitIndex(Value - 1, SetIndex, BitIndex); for Bit := BitIndex + 1 to 255 do begin System.Exclude(FSets[SetIndex], Bit); end; end; end; end; function TBitSet.GetSize: Int64; begin Result := Length(FSets)*SizeOf(FSets[0]); end; class function TBitSet.SetCount(BitCount: Int64): Int64; begin Result := (BitCount + 255) shr 8; // shr 8 = div 256 end; class operator TBitSet.In(const Bit: Int64; const BitSet: TBitSet): Boolean; var SetIndex: Int64; BitIndex: Integer; begin BitSet.GetSetIndexAndBitIndex(Bit, SetIndex, BitIndex); Result := BitIndex in BitSet.FSets[SetIndex]; end; class operator TBitSet.Equal(const bs1, bs2: TBitSet): Boolean; begin Result := (bs1.FBitCount=bs2.FBitCount) and CompareMem(Pointer(bs1.FSets), Pointer(bs2.FSets), bs1.Size); end; class operator TBitSet.NotEqual(const bs1, bs2: TBitSet): Boolean; begin Result := not (bs1=bs2); end; procedure TBitSet.Clear; var i: Int64; begin MakeUnique; for i := 0 to High(FSets) do begin FSets[i] := []; end; end; procedure TBitSet.IncludeAll; var i: Int64; begin for i := 0 to BitCount-1 do begin Include(i); end; end; procedure TBitSet.Include(const Bit: Int64); var SetIndex: Int64; BitIndex: Integer; begin MakeUnique; GetSetIndexAndBitIndex(Bit, SetIndex, BitIndex); System.Include(FSets[SetIndex], BitIndex); end; procedure TBitSet.Exclude(const Bit: Int64); var SetIndex: Int64; BitIndex: Integer; begin MakeUnique; GetSetIndexAndBitIndex(Bit, SetIndex, BitIndex); System.Exclude(FSets[SetIndex], BitIndex); end; end. This is based on code of mine that has is limited to integer bit count. I've not tested it extended to Int64, but I'm sure anyone that wanted to use random code like this would test. 2 Share this post Link to post
Brandon Staggs 277 Posted June 28, 2023 33 minutes ago, David Heffernan said: Why would you not expect to get a 32GB block when the available address space is 128TB? LOL, I don't know what I was thinking. That was obviously wrong. I'm deleting the post since it's pointless noise now. Share this post Link to post
Jud 1 Posted June 28, 2023 17 hours ago, David Heffernan said: No. You use a hashed collection. Like a dictionary, but one without a value, which is known as a set. But at 5% full then I don't think it will get you any benefit, because of the overhead of the hashed collection. Performance is an issue. The plan is to do at least 10 trillion accesses to the bit vector - more if I can manage it. Share this post Link to post
Jud 1 Posted June 28, 2023 12 hours ago, David Heffernan said: What about the cost of moving data between gpu and main memory? Benefits of gpu are in highly parallel computation. Where is that in this scenario. Just saying GPU doesn't make something fast or efficient. Interesting idea, but I don't know how to deal with the GPU. This will be running 20 instances in parallel, but the accesses will be pretty much random. The target calls for doing at least 10 trillion accesses. Share this post Link to post
Jud 1 Posted June 28, 2023 11 hours ago, Rollo62 said: Like said, just an alternative idea, with pros and cons. Only @Jud may know if this is workable for him somehow, or not. I doubt that it would be workable, since they are natually one dimentional, and the program just needs to see if a bit is 1 or 0, and it needs to do a large number of them. Share this post Link to post
David Heffernan 2345 Posted June 28, 2023 (edited) 11 minutes ago, Jud said: This will be running 20 instances in parallel, but the accesses will be pretty much random. The target calls for doing at least 10 trillion accesses. Depending on what you do when you query these bits you may find that the memory is the bottleneck and threading won't help. Did you do any benchmarking yet? Also if performance matters then delphi is invariably not the answer. Edited June 28, 2023 by David Heffernan Share this post Link to post
Jud 1 Posted June 28, 2023 9 hours ago, David Heffernan said: unit Bitset; interface uses SysUtils, Math; type TBitSet = record private FBitCount: Int64; FSets: array of set of 0..255; class function SetCount(BitCount: Int64): Int64; static; procedure MakeUnique; procedure GetSetIndexAndBitIndex(Bit: Int64; out SetIndex: Int64; out BitIndex: Integer); function GetIsEmpty: Boolean; procedure SetBitCount(Value: Int64); function GetSize: Int64; public class operator In(const Bit: Int64; const BitSet: TBitSet): Boolean; class operator Equal(const bs1, bs2: TBitSet): Boolean; class operator NotEqual(const bs1, bs2: TBitSet): Boolean; property BitCount: Int64 read FBitCount write SetBitCount; property Size: Int64 read GetSize; property IsEmpty: Boolean read GetIsEmpty; procedure Clear; procedure IncludeAll; procedure Include(const Bit: Int64); procedure Exclude(const Bit: Int64); end; implementation { TBitSet } procedure TBitSet.MakeUnique; begin // this is used to implement copy-on-write so that the type behaves like a value SetLength(FSets, Length(FSets)); end; procedure TBitSet.GetSetIndexAndBitIndex(Bit: Int64; out SetIndex: Int64; out BitIndex: Integer); begin Assert(InRange(Bit, 0, FBitCount-1)); SetIndex := Bit shr 8; // shr 8 = div 256 BitIndex := Bit and 255; // and 255 = mod 256 end; function TBitSet.GetIsEmpty: Boolean; var i: Int64; begin for i := 0 to High(FSets) do begin if FSets[i]<>[] then begin Result := False; Exit; end; end; Result := True; end; procedure TBitSet.SetBitCount(Value: Int64); var Bit, BitIndex: Integer; SetIndex: Int64; begin if (Value<>FBitCount) or not Assigned(FSets) then begin Assert(Value>=0); FBitCount := Value; SetLength(FSets, SetCount(Value)); if Value>0 then begin (* Ensure that unused bits are cleared, necessary give the CompareMem call in Equal. This also means that state does not persist when we decrease and then increase BitCount. For instance, consider this code: var bs: TBitSet; ... bs.BitCount := 2; bs.Include(1); bs.BitCount := 1; bs.BitCount := 2; Assert(not (1 in bs)); *) GetSetIndexAndBitIndex(Value - 1, SetIndex, BitIndex); for Bit := BitIndex + 1 to 255 do begin System.Exclude(FSets[SetIndex], Bit); end; end; end; end; function TBitSet.GetSize: Int64; begin Result := Length(FSets)*SizeOf(FSets[0]); end; class function TBitSet.SetCount(BitCount: Int64): Int64; begin Result := (BitCount + 255) shr 8; // shr 8 = div 256 end; class operator TBitSet.In(const Bit: Int64; const BitSet: TBitSet): Boolean; var SetIndex: Int64; BitIndex: Integer; begin BitSet.GetSetIndexAndBitIndex(Bit, SetIndex, BitIndex); Result := BitIndex in BitSet.FSets[SetIndex]; end; class operator TBitSet.Equal(const bs1, bs2: TBitSet): Boolean; begin Result := (bs1.FBitCount=bs2.FBitCount) and CompareMem(Pointer(bs1.FSets), Pointer(bs2.FSets), bs1.Size); end; class operator TBitSet.NotEqual(const bs1, bs2: TBitSet): Boolean; begin Result := not (bs1=bs2); end; procedure TBitSet.Clear; var i: Int64; begin MakeUnique; for i := 0 to High(FSets) do begin FSets[i] := []; end; end; procedure TBitSet.IncludeAll; var i: Int64; begin for i := 0 to BitCount-1 do begin Include(i); end; end; procedure TBitSet.Include(const Bit: Int64); var SetIndex: Int64; BitIndex: Integer; begin MakeUnique; GetSetIndexAndBitIndex(Bit, SetIndex, BitIndex); System.Include(FSets[SetIndex], BitIndex); end; procedure TBitSet.Exclude(const Bit: Int64); var SetIndex: Int64; BitIndex: Integer; begin MakeUnique; GetSetIndexAndBitIndex(Bit, SetIndex, BitIndex); System.Exclude(FSets[SetIndex], BitIndex); end; end. This is based on code of mine that has is limited to integer bit count. I've not tested it extended to Int64, but I'm sure anyone that wanted to use random code like this would test. Share this post Link to post
Jud 1 Posted June 28, 2023 9 hours ago, David Heffernan said: unit Bitset; interface uses SysUtils, Math; type TBitSet = record private FBitCount: Int64; FSets: array of set of 0..255; class function SetCount(BitCount: Int64): Int64; static; procedure MakeUnique; procedure GetSetIndexAndBitIndex(Bit: Int64; out SetIndex: Int64; out BitIndex: Integer); function GetIsEmpty: Boolean; procedure SetBitCount(Value: Int64); function GetSize: Int64; public class operator In(const Bit: Int64; const BitSet: TBitSet): Boolean; class operator Equal(const bs1, bs2: TBitSet): Boolean; class operator NotEqual(const bs1, bs2: TBitSet): Boolean; property BitCount: Int64 read FBitCount write SetBitCount; property Size: Int64 read GetSize; property IsEmpty: Boolean read GetIsEmpty; procedure Clear; procedure IncludeAll; procedure Include(const Bit: Int64); procedure Exclude(const Bit: Int64); end; implementation { TBitSet } procedure TBitSet.MakeUnique; begin // this is used to implement copy-on-write so that the type behaves like a value SetLength(FSets, Length(FSets)); end; procedure TBitSet.GetSetIndexAndBitIndex(Bit: Int64; out SetIndex: Int64; out BitIndex: Integer); begin Assert(InRange(Bit, 0, FBitCount-1)); SetIndex := Bit shr 8; // shr 8 = div 256 BitIndex := Bit and 255; // and 255 = mod 256 end; function TBitSet.GetIsEmpty: Boolean; var i: Int64; begin for i := 0 to High(FSets) do begin if FSets[i]<>[] then begin Result := False; Exit; end; end; Result := True; end; procedure TBitSet.SetBitCount(Value: Int64); var Bit, BitIndex: Integer; SetIndex: Int64; begin if (Value<>FBitCount) or not Assigned(FSets) then begin Assert(Value>=0); FBitCount := Value; SetLength(FSets, SetCount(Value)); if Value>0 then begin (* Ensure that unused bits are cleared, necessary give the CompareMem call in Equal. This also means that state does not persist when we decrease and then increase BitCount. For instance, consider this code: var bs: TBitSet; ... bs.BitCount := 2; bs.Include(1); bs.BitCount := 1; bs.BitCount := 2; Assert(not (1 in bs)); *) GetSetIndexAndBitIndex(Value - 1, SetIndex, BitIndex); for Bit := BitIndex + 1 to 255 do begin System.Exclude(FSets[SetIndex], Bit); end; end; end; end; function TBitSet.GetSize: Int64; begin Result := Length(FSets)*SizeOf(FSets[0]); end; class function TBitSet.SetCount(BitCount: Int64): Int64; begin Result := (BitCount + 255) shr 8; // shr 8 = div 256 end; class operator TBitSet.In(const Bit: Int64; const BitSet: TBitSet): Boolean; var SetIndex: Int64; BitIndex: Integer; begin BitSet.GetSetIndexAndBitIndex(Bit, SetIndex, BitIndex); Result := BitIndex in BitSet.FSets[SetIndex]; end; class operator TBitSet.Equal(const bs1, bs2: TBitSet): Boolean; begin Result := (bs1.FBitCount=bs2.FBitCount) and CompareMem(Pointer(bs1.FSets), Pointer(bs2.FSets), bs1.Size); end; class operator TBitSet.NotEqual(const bs1, bs2: TBitSet): Boolean; begin Result := not (bs1=bs2); end; procedure TBitSet.Clear; var i: Int64; begin MakeUnique; for i := 0 to High(FSets) do begin FSets[i] := []; end; end; procedure TBitSet.IncludeAll; var i: Int64; begin for i := 0 to BitCount-1 do begin Include(i); end; end; procedure TBitSet.Include(const Bit: Int64); var SetIndex: Int64; BitIndex: Integer; begin MakeUnique; GetSetIndexAndBitIndex(Bit, SetIndex, BitIndex); System.Include(FSets[SetIndex], BitIndex); end; procedure TBitSet.Exclude(const Bit: Int64); var SetIndex: Int64; BitIndex: Integer; begin MakeUnique; GetSetIndexAndBitIndex(Bit, SetIndex, BitIndex); System.Exclude(FSets[SetIndex], BitIndex); end; end. This is based on code of mine that has is limited to integer bit count. I've not tested it extended to Int64, but I'm sure anyone that wanted to use random code like this would test. That looks like what I'm looking for!!! I'm going to try it shortly. Thanks! Share this post Link to post
Jud 1 Posted June 29, 2023 49 minutes ago, David Heffernan said: Depending on what you do when you query these bits you may find that the memory is the bottleneck and threading won't help. Did you do any benchmarking yet? Also if performance matters then delphi is invariably not the answer. No benchmarking with the bit vector yet, but I've definitely encountered the memory bottleneck. In old tests with 8 threads (hyperthreaded 4-core i7), I would get about 5.5x over a single thread on CPU-intensive tasks. But on tests that were memory-intensive, I would get around 3x on 16 threads. Share this post Link to post
Jud 1 Posted June 29, 2023 1 hour ago, David Heffernan said: Depending on what you do when you query these bits you may find that the memory is the bottleneck and threading won't help. Did you do any benchmarking yet? Also if performance matters then delphi is invariably not the answer. Actually I did benchmarking. The first version of this program read a 30GB boolean array from a drive and used that. Naturally I did a single-thread before multithreading. I forgot the ratio, but 20 threads on a 12th generation i7 was something like 8-9x faster than the single-thread version. Now I'm working on replacing the 30GB boolean array with a much larger bit vector. Share this post Link to post
David Heffernan 2345 Posted June 29, 2023 Hyoerthreading is probably useless. I've never found a task which benefits from it. I guess they must exist though, or is it just pure marketing? Modern AMD chipsets seems to have remarkable scaling though. Share this post Link to post