chkaufmann  Posted September 30, 2021

Hi,

I use ExtractHeaderFields() from Web.HTTPApp when I parse a POST upload. With the following value for "Content" (containing special German characters), this function fails:

    'form-data; name="File1"; filename="Test1MitäÄ-Umlaut.pdf"'

I get this error:

    System.SysUtils     33477  TEncoding.GetString
    System.NetEncoding   1007  TURLEncoding.Decode
    Web.HTTPApp          2108  ExtractHeaderFields

Now I'm not sure whether the input is wrong or whether I have to use a different function to parse the content of this header (Content-Disposition:).

Thanks for any help.

Christian
Remy Lebeau  Posted September 30, 2021

chkaufmann said:

    With the following value for "Content" (containing special german characters) this function fails:
    'form-data; name="File1"; filename="Test1MitäÄ-Umlaut.pdf"'

That header is malformed. HTTP headers simply can't have un-encoded non-ASCII characters like that. There are competing standards for how they need to be encoded, though: RFC 2183, RFC 2047, RFC 7578, RFC 8187, HTML5, etc.

chkaufmann said:

    I get this error:
    System.SysUtils     33477  TEncoding.GetString
    System.NetEncoding   1007  TURLEncoding.Decode
    Web.HTTPApp          2108  ExtractHeaderFields

TURLEncoding decodes %HH sequences into bytes, and then charset-decodes those bytes into Unicode. IIRC, it expects the bytes to be UTF-8 encoded by default.
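To make the decoding step described above concrete, here is a minimal console sketch. It assumes that TNetEncoding.URL.Decode treats the %HH bytes as UTF-8, as suggested (from memory) in the reply; the program name and the sample strings are made up for illustration. What happens with the raw, un-encoded name may differ between Delphi versions, so the second call is wrapped in try/except and no particular result is claimed.

```delphi
program UrlDecodeSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.NetEncoding;

var
  RawName: string;
begin
  // Percent-encoded UTF-8 bytes: %C3%A4 is 'ä', %C3%84 is 'Ä'.
  Writeln(TNetEncoding.URL.Decode('Test1Mit%C3%A4%C3%84-Umlaut.pdf'));

  // The same name with raw non-ASCII characters, as in the failing header.
  // Written with character codes so the source file encoding does not matter.
  RawName := 'Test1Mit'#$00E4#$00C4'-Umlaut.pdf';
  try
    Writeln(TNetEncoding.URL.Decode(RawName));
  except
    // Assumption: on affected versions this is where the error surfaces.
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.
```

Console output of non-ASCII characters depends on the console code page, so compare the strings in the debugger if the printed text looks wrong.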
chkaufmann  Posted October 2, 2021

OK, the request that comes with this malformed Content-Disposition is created by another Delphi application where I use Indy components. The code looks like this:

    mPartStream := TIdMultiPartFormDataStream.Create;
    postDataStream := mPartStream;
    FHttp.Request.ContentType := mPartStream.RequestContentType;
    for ix := 0 to FPostNames.Count - 1 do begin
      if FPostFiles[ix].IsNull then
        mPartStream.AddFormField(FPostNames[ix], FPostValues[ix], 'UTF-8').ContentTransfer := '8bit'
      else
        mPartStream.AddFile(FPostNames[ix], FPostFiles[ix].PathName, FPostContentTypes[ix]);
    end;

FPostFiles[ix].PathName is the Windows path of a file. Should I encode it on my side? Or do I have to set another parameter to ensure correct encoding?

Christian
Remy Lebeau  Posted October 4, 2021 (edited)

Which version of Delphi is that other app written in, and what version of Indy is it using? TIdMultiPartFormDataStream encodes non-ASCII characters in all field names and filenames according to RFC 2047, which your earlier example is NOT encoded as, so I doubt the example is coming from Indy, unless maybe it is a really old version.

On Windows, depending on what the OS system language is set to, TIdMultiPartFormDataStream uses either UTF-8 or the OS language as the charset to encode characters to bytes. Then, depending on the charset used, it uses either Quoted-Printable or Base64 to encode those bytes in the Content-Disposition header. These values are reflected in the HeaderCharSet and HeaderEncoding properties of each TIdFormDataField object that the TIdMultiPartFormDataStream.Add(...) methods create.

Double-check what these values are actually being set to on your system, but you can also set them yourself as needed. I would suggest using HeaderCharSet='utf-8' and HeaderEncoding='B', e.g.:

    var
      mPartStream: TIdMultiPartFormDataStream;
      field: TIdFormDataField;
    ...
    mPartStream := TIdMultiPartFormDataStream.Create;
    FHttp.Request.ContentType := mPartStream.RequestContentType;
    for ix := 0 to FPostNames.Count - 1 do begin
      if FPostFiles[ix].IsNull then begin
        field := mPartStream.AddFormField(FPostNames[ix], FPostValues[ix], 'UTF-8');
        field.ContentTransfer := '8bit';
      end else begin
        field := mPartStream.AddFile(FPostNames[ix], FPostFiles[ix].PathName, FPostContentTypes[ix]);
      end;
      // Encode non-ASCII header values as RFC 2047 Base64 words in UTF-8.
      field.HeaderCharSet := 'UTF-8';
      field.HeaderEncoding := 'B';
    end;

Though, I suspect even this will not give you the end result you are looking for with ExtractHeaderFields(), if it is really trying to url-decode fields that are not url-encoded to begin with. HTTP does not use url-encoding in header content, so what you are experiencing really sounds like a logic bug in the HttpApp framework.

But at least this should give your code access to the stream's encoded Base64 data, which you can then decode manually to a Unicode string, such as with Indy's DecodeHeader() function in the IdCoderHeader unit.

Edited October 4, 2021 by Remy Lebeau
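The manual decoding step mentioned above could look roughly like this. It is only a sketch: the program name and the sample value are hypothetical, and the value is a hand-written RFC 2047 encoded-word in the Quoted-Printable ('Q') form, chosen because it is easy to read. A Base64 ('B') word is passed to DecodeHeader() in the same way.

```delphi
program DecodeHeaderSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, IdCoderHeader;

var
  EncodedName, FileName: string;
begin
  // Hypothetical filename value as it might arrive in Content-Disposition:
  // =C3=A4 and =C3=84 are the UTF-8 bytes of 'ä' and 'Ä'.
  EncodedName := '=?UTF-8?Q?Test1Mit=C3=A4=C3=84-Umlaut.pdf?=';

  // DecodeHeader() turns RFC 2047 encoded-words back into a Unicode string.
  FileName := DecodeHeader(EncodedName);
  Writeln(FileName);
end.
```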
chkaufmann  Posted October 4, 2021

I use Delphi 10.4.2 and the Indy library that comes with the default installation.

I found that I can set the "Decode" parameter of ExtractHeaderFields() to False, and then it works fine.

Christian
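For reference, a minimal sketch of that workaround. The program name, the sample header, and the separator and whitespace sets are assumptions for illustration; the actual call in the application is not shown in the thread. With Decode set to False the values come back exactly as sent, so any RFC 2047 encoded-word still has to be decoded separately.

```delphi
program ExtractFieldsSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, Web.HTTPApp;

var
  Disposition: string;
  Fields: TStringList;
  I: Integer;
begin
  // Hypothetical header value with an RFC 2047 encoded filename.
  Disposition := 'form-data; name="File1"; ' +
    'filename="=?UTF-8?Q?Test1Mit=C3=A4=C3=84-Umlaut.pdf?="';

  Fields := TStringList.Create;
  try
    // Decode = False skips the URL-decoding step that raised the error;
    // StripQuotes = True removes the surrounding double quotes.
    ExtractHeaderFields([';'], [' '], PChar(Disposition), Fields, False, True);

    // Print each extracted field as stored in the list.
    for I := 0 to Fields.Count - 1 do
      Writeln(Fields[I]);
  finally
    Fields.Free;
  end;
end.
```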