dummzeuch

Delphi’s TZipFile working on a stream

Recent versions of Delphi (for a suitable definition of “recent”) come with a TZipFile class implemented in the unit System.Zip. This class has a really neat feature: it can not only read the contents of a ZIP file directly from a stream but also extract a file from that stream to a TBytes array, so it does not require any file system access.

E.g. imagine you have got a memory stream containing the contents of a ZIP archive. [read on in the blog post]
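
For readers who do not follow the link, here is a minimal sketch of the idea, not the code from the blog post. It assumes a Delphi version whose System.Zip offers the stream overload of Open and the TBytes overload of Read; ExtractEntry, the entry name and the test text are made up for illustration.

```delphi
program ZipFromStreamDemo;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.Zip;

// Hypothetical helper: extracts the entry named EntryName from a ZIP archive
// held in ArchiveStream. The archive is assumed to start at position 0.
function ExtractEntry(ArchiveStream: TStream; const EntryName: string): TBytes;
var
  Zip: TZipFile;
begin
  Zip := TZipFile.Create;
  try
    ArchiveStream.Position := 0;      // point at the start of the archive
    Zip.Open(ArchiveStream, zmRead);  // no file system access involved
    Zip.Read(EntryName, Result);      // decompress straight into a TBytes
    Zip.Close;
  finally
    Zip.Free;
  end;
end;

var
  Archive: TMemoryStream;
  Zip: TZipFile;
  Data: TBytes;
begin
  Archive := TMemoryStream.Create;
  try
    // Build a small archive in memory so the demo needs no files on disk.
    Zip := TZipFile.Create;
    try
      Zip.Open(Archive, zmWrite);
      Zip.Add(TEncoding.UTF8.GetBytes('hello zip'), 'hello.txt');
      Zip.Close;                      // writes the central directory
    finally
      Zip.Free;
    end;

    Data := ExtractEntry(Archive, 'hello.txt');
    Writeln(TEncoding.UTF8.GetString(Data));
  finally
    Archive.Free;
  end;
end.
```

The TZipFile instance does not own a stream passed to Open, so the caller remains responsible for freeing it.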

Great post! I think the reason the Position matters is that you can put multiple ZIP archives into the same stream, so it always has to point to the start of the archive when you load a TZipFile instance from the stream.

Did TZipFile ever get a proper update? Although it's good to have something built in, https://github.com/ccy/delphi-zip adds LZMA and probably ZIP64 support. I forked that one too, to fix a memory leak and to make it open some corrupted ZIP files.

I wonder if I can drop Zip2 from my projects and use the built-in version again.


What if there are multiple files within the Zip archive?  I need to extract a specific item within a Zip file.  I'm using a third party component that works like this (streamlined version):

 

(n.b.  MyUnArchiver is a third party zip component on a Delphi Form)
  
MyUnArchiver.FileName := 'D:\testdata\largefile.zip';  // the zip file on disk

MyUnzippedStream := TStringStream.Create;  // I use StringStream for processing, but any stream works here
try
  MyUnArchiver.OpenArchive; 
  MyUnArchiver.ExtractToStream('archive item name', MyUnzippedStream);  // name of the given archive item within the zip file
  MyUnzippedStream.Position := 0;  // the extraction will set the position to end of stream, so this is necessary
  // process MyUnzippedStream
  // ...
finally
  MyUnzippedStream.Free;
end;


  I'd like to reduce my third party dependencies, but it's a "maybe later" priority.  The third party component (and I am not affiliated with them) is called ZipForge.

 

 


It does:

 

[Two screenshots, zip1.PNG and zip2.PNG, showing the TZipFile Read and Extract overloads]

 

However, as I mentioned, Delphi's TZipFile still has its limitations, which ZipForge, for example, has already overcome. These include (but are possibly not limited to) LZMA compression, proper Zip64 support, and replacing a single file inside the ZIP archive.
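
To make the "multiple files" case concrete, here is a rough sketch of listing the entries and pulling out one of them by name. It assumes the FileNames property, IndexOf and the index-based TBytes overload of Read as found in current System.Zip; the archive contents and entry names are invented test data.

```delphi
program ListAndExtractDemo;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.Zip;

var
  Archive: TMemoryStream;
  Zip: TZipFile;
  EntryName: string;
  Index: Integer;
  Data: TBytes;
begin
  Archive := TMemoryStream.Create;
  Zip := TZipFile.Create;
  try
    // Hypothetical test archive with two entries, built in memory.
    Zip.Open(Archive, zmWrite);
    Zip.Add(TEncoding.UTF8.GetBytes('first'), 'a.txt');
    Zip.Add(TEncoding.UTF8.GetBytes('second'), 'b.txt');
    Zip.Close;

    Archive.Position := 0;
    Zip.Open(Archive, zmRead);

    // Enumerate every entry name in the archive.
    for EntryName in Zip.FileNames do
      Writeln(EntryName);

    // Pick one specific entry; IndexOf returns -1 if it is missing.
    Index := Zip.IndexOf('b.txt');
    if Index < 0 then
      raise Exception.Create('b.txt not found in archive');
    Zip.Read(Index, Data);
    Writeln(TEncoding.UTF8.GetString(Data));
    Zip.Close;
  finally
    Zip.Free;
    Archive.Free;
  end;
end.
```

Read also has overloads that take the entry name directly, so the IndexOf step is only needed when you want to test for existence first.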

4 minutes ago, aehimself said:

and replacing one file inside the ZIP archive.

Yeah, that one is really annoying. I used to have a class helper that added TZipFile.Delete and Remove methods but one of the Delphi versions after XE2 broke that one as the required TZipFile internal data structures are no longer accessible to class helpers.

Watch out with the TZipFile class: there is a bug that causes empty ZIPs when the archive gets too large (or when you insert very big files).

I reported the bug a few years ago.

(I'll go and check whether it has been fixed.)

I can't find the report, but the problem was that everything executed normally without raising an exception, yet you ended up with an empty ZIP file.

 

 

8 minutes ago, Anders Melander said:

Yeah, that one is really annoying. I used to have a class helper that added TZipFile.Delete and Remove methods but one of the Delphi versions after XE2 broke that one as the required TZipFile internal data structures are no longer accessible to class helpers.

I'd kill for those, they are truly a missing feature.

12 minutes ago, mvanrijnen said:

Look out with the TZipFile class, there is a bug which causes empty zips when it gets too large (or you insert very big files)

Do you mean Zip64?

"The original .ZIP format had a 4 GiB (2^32 bytes) limit on various things (uncompressed size of a file, compressed size of a file, and total size of the archive), as well as a limit of 65,535 (2^16) entries in a ZIP archive. In version 4.5 of the specification (which is not the same as v4.5 of any particular tool), PKWARE introduced the "ZIP64" format extensions to get around these limitations, increasing the limits to 16 EiB (2^64 bytes)."

56 minutes ago, Anders Melander said:

Yes, but I don't see how this extracts a single file to a stream of some sort. Extracting to a file on disk makes these methods useless for my purposes. It's almost there, but I think this is why I started using the third-party component years ago. Well, that, and as I recall there was a problem at the time with Unicode characters in the items within the ZIP file (memory fades on this, though).

12 minutes ago, John Terwiske said:

I don't see how this extracts a single file to a stream of some sort

 

Something like this:

var
  ZipFile: TZipFile;
  SourceStream: TStream;
  TargetStream: TStream;
  LocalHeader: TZipHeader;
begin
  // ...
  TargetStream := TMemoryStream.Create;
  try
    // Read returns a new stream over the named entry; the caller must free it
    ZipFile.Read('foobar.dat', SourceStream, LocalHeader);
    try
      // A count of 0 copies the whole source stream from its start
      TargetStream.CopyFrom(SourceStream, 0);
    finally
      SourceStream.Free;
    end;
    // ...do something with TargetStream...
  finally
    TargetStream.Free;
  end;
end;

 

19 minutes ago, John Terwiske said:

at the time as I recall there was a problem with unicode characters in the items within the zip file

There used to be a problem with Unicode in comments, but I believe that has been fixed.


I'm using this helper method to unzip the first file from a Base64 encoded string:

 

// Needs System.SysUtils, System.Classes, System.NetEncoding and System.Zip in the uses clause
Function UnzipBase64(Const inEncodedString: String): TBytes;
Var
 ms: TMemoryStream;
 tb: TBytes;
 zip: TZipFile;
 zipstream: TStream;
 header: TZipHeader;
Begin
 ms := TMemoryStream.Create;
 Try
  // Decode the Base64 text into the raw bytes of the ZIP archive
  tb := TNetEncoding.Base64.DecodeStringToBytes(inEncodedString);
  ms.Write(tb, Length(tb));
  ms.Position := 0; // rewind so that Open starts at the beginning of the archive
  zip := TZipFile.Create;
  Try
   zip.Open(ms, zmRead);
   If zip.FileCount = 0 Then Raise Exception.Create('ZIP file is valid, but it does not contain any files!');
   zipstream := nil;
   zip.Read(0, zipstream, header); // zipstream yields the uncompressed data of the first entry
   Try
    zip.Close;
    SetLength(Result, zipstream.Size);
    zipstream.Read(Result, zipstream.Size);
   Finally
    FreeAndNil(zipstream);
   End;
  Finally
   FreeAndNil(zip);
  End;
 Finally
  FreeAndNil(ms);
 End;
End;

It could be shortened as far as I can see, but it shows what you want: open a ZIP file from a stream and extract a file from it into another stream.

3 minutes ago, Anders Melander said:

Tsk, tsk.

Kind of a habit, I'm always using FreeAndNil. At least If Assigned(something) works properly 🙂

36 minutes ago, aehimself said:

Kind of a habit, I'm always using FreeAndNil. At least If Assigned(something) works properly 🙂

Ignore the Tsk, tsk.

32 minutes ago, Mark- said:

Ignore the Tsk, tsk.

I'm not ignoring it but I doubt I'll change my coding style because of this. It's all a matter of personal taste... and while nil-ing a local variable is indeed useless right before exiting a method... I already got used to it and the overhead is negligible.


Thank you. It would appear that whatever issue I was having with Unicode is no longer a problem with the Delphi zip classes. Your solution works with all of the non-spanning ZIP files I've got on hand for testing. It's about 6% slower than the third-party solution I'm currently using, and this matters with files in the 2 GB range. I'll look into this some more... someday. 🙂

55 minutes ago, John Terwiske said:

It's about 6% slower vs. the third party solution I'm currently using, and this matters with these files that are in the 2Gbyte range.

One thing you can do to make it a bit faster is to read directly from the decompression stream (SourceStream in my example) instead of copying it to a memory stream. AFAIR the decompression stream buffers internally and is bidirectional so you should be able to treat it as a memory stream.
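
A rough sketch of that suggestion, reading the entry front to back in fixed-size chunks rather than copying it into a TMemoryStream first. Whether this actually closes the 6% gap is untested; CountEntryBytes, the 64 KiB buffer size and the test payload are all assumptions for illustration, and only sequential reads are used, so it does not rely on the stream being seekable.

```delphi
program ReadEntryInChunks;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.Zip;

// Reads one entry front to back in 64 KiB chunks and returns the number of
// uncompressed bytes seen. A real consumer would process each chunk instead.
function CountEntryBytes(Zip: TZipFile; const EntryName: string): Int64;
var
  Source: TStream;
  Header: TZipHeader;
  Buffer: TBytes;
  BytesRead: Integer;
begin
  Result := 0;
  SetLength(Buffer, 64 * 1024);
  Zip.Read(EntryName, Source, Header);  // the caller owns Source
  try
    repeat
      BytesRead := Source.Read(Buffer[0], Length(Buffer));
      Inc(Result, BytesRead);           // process Buffer[0..BytesRead-1] here
    until BytesRead = 0;
  finally
    Source.Free;
  end;
end;

var
  Archive: TMemoryStream;
  Zip: TZipFile;
  Payload: TBytes;
begin
  Archive := TMemoryStream.Create;
  Zip := TZipFile.Create;
  try
    // Hypothetical test data: 200,000 zero bytes stored as 'big.dat'.
    SetLength(Payload, 200000);
    Zip.Open(Archive, zmWrite);
    Zip.Add(Payload, 'big.dat');
    Zip.Close;

    Archive.Position := 0;
    Zip.Open(Archive, zmRead);
    Writeln(CountEntryBytes(Zip, 'big.dat'));
    Zip.Close;
  finally
    Zip.Free;
    Archive.Free;
  end;
end.
```

Memory use stays at one buffer regardless of entry size, which is the main attraction for files in the gigabyte range.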

2 hours ago, aehimself said:

zip.Read(0, zipstream, header);
Try
  zip.Close;
  SetLength(Result, zipstream.Size);
  zipstream.Read(Result, zipstream.Size);
Finally
  FreeAndNil(zipstream);
End;

 

...or the shorter version (yes, I know you said it):

zip.Read(0, Result);

 


Yep, that was exactly my idea when I made the shots about .Read and .Extract and I saw that they can output to TBytes directly... 🙂
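
Putting the two observations together, the whole helper might shrink to something like the sketch below. It is one possible shortening, not the code anyone in this thread is running. It assumes System.NetEncoding is available and that TBytesStream can wrap the decoded bytes; ZipToBase64 is a made-up helper that exists only to produce test input.

```delphi
program UnzipBase64Demo;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.NetEncoding, System.Zip;

// Decodes a Base64 string holding a ZIP archive and returns the first entry.
function UnzipBase64(const EncodedString: string): TBytes;
var
  Source: TBytesStream;
  Zip: TZipFile;
begin
  // TBytesStream wraps the decoded bytes directly; its Position starts at 0.
  Source := TBytesStream.Create(
    TNetEncoding.Base64.DecodeStringToBytes(EncodedString));
  try
    Zip := TZipFile.Create;
    try
      Zip.Open(Source, zmRead);
      if Zip.FileCount = 0 then
        raise Exception.Create('ZIP file is valid, but it does not contain any files!');
      Zip.Read(0, Result);  // TBytes overload: no intermediate stream needed
    finally
      Zip.Free;
    end;
  finally
    Source.Free;
  end;
end;

// Hypothetical helper for the demo: zips Text as 'demo.txt', returns Base64.
function ZipToBase64(const Text: string): string;
var
  Archive: TBytesStream;
  Zip: TZipFile;
begin
  Archive := TBytesStream.Create(nil);
  Zip := TZipFile.Create;
  try
    Zip.Open(Archive, zmWrite);
    Zip.Add(TEncoding.UTF8.GetBytes(Text), 'demo.txt');
    Zip.Close;
    // Bytes may be larger than Size, so copy only the part actually written.
    Result := TNetEncoding.Base64.EncodeBytesToString(
      Copy(Archive.Bytes, 0, Integer(Archive.Size)));
  finally
    Zip.Free;
    Archive.Free;
  end;
end;

begin
  Writeln(TEncoding.UTF8.GetString(UnzipBase64(ZipToBase64('round trip'))));
end.
```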
