Author Topic: Synapse (TImapSend, TMimePart) Another problem with encoding  (Read 1502 times)

mith

  • New Member
  • *
  • Posts: 19
Synapse (TImapSend, TMimePart) Another problem with encoding
« on: November 26, 2019, 11:33:20 pm »
Hello,

I'm trying to fetch emails from a mail account over IMAP. Fetching works, but there is one issue I can't solve.
The messages use different charsets (UTF-8, ISO-*, us-ascii). Probably because of that, I can't get proper values from the headers, and the body also contains unreadable characters.
Some of these can be fixed with the DecodeQuotedPrintable function, but that doesn't resolve the problem.
My question is: how does Synapse handle the different charsets and encodings (automatically?). Is there a simple way to end up with UTF-8 strings?
Let's say I want to list all senders and mail subjects in a string grid with the proper characters.
Code: Pascal
imaps.FetchMess(idx, MimeMess.Lines);
MimeMess.DecodeMessage;
Memo1.Lines.Add(MimeMess.Header.Subject);
Memo1.Lines.AddStrings(MimeMess.MessagePart.Lines);

This is a very simple example. I'm currently doing nothing with the encoding/charset, although I have tried many hints found via Google.
I'd appreciate any hints and advice.

3mptyCode

  • Newbie
  • Posts: 3
Re: Synapse (TImapSend, TMimePart) Another problem with encoding
« Reply #1 on: March 29, 2020, 02:53:52 pm »
Hello mith,
did you find a solution to your problem? Hopefully you can tell me how to decode the characters correctly.

Also, I couldn't find any documentation or possible options for MimeMess.DecodeMessage.

best regards

mith

  • New Member
  • *
  • Posts: 19
Re: Synapse (TImapSend, TMimePart) Another problem with encoding
« Reply #2 on: March 29, 2020, 03:30:34 pm »
Unfortunately I didn't find a 100% effective solution, and I didn't have time to look for a better one.
The best result I got was by setting UTF-8 as the target charset before the message (and its parts) are decoded.

Code: Pascal
imap.FetchMess(n, mess.Lines);
mess.MessagePart.TargetCharset := UTF_8;
mess.MessagePart.ConvertCharset := true;
mess.DecodeMessage;


Code: Pascal
if mess.MessagePart.GetSubPartCount > 0 then // multipart
begin
  for cnt := 0 to mess.MessagePart.GetSubPartCount - 1 do
  begin
    m := mess.MessagePart.GetSubPart(cnt);
    m.TargetCharset := UTF_8;
    m.ConvertCharset := true;
    m.DecodePart;
    [...]

Because I fetch the messages from a Windows service, I convert strings with the WinCPToUTF8() function (part of the LazUTF8 unit), e.g.
Code: Pascal
files.Add(WinCPToUTF8(m.Filename));

For headers, subjects, and content I use the same kind of workaround:

Code: Pascal
subject := WinCPToUTF8(mess.Header.Subject);
   
Hope it helps you a little.
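
An alternative to converting each header afterwards with WinCPToUTF8 may be to tell Synapse the target charset for headers up front. As far as I know, TMessHeader has a CharsetCode property that defaults to the system code page and is used when the header fields are decoded. The sketch below assumes that behaviour and uses a made-up raw message in place of imap.FetchMess, so treat it as something to try rather than a confirmed fix:

```pascal
program HeaderCharsetDemo;
{$mode objfpc}{$H+}

uses
  Classes,
  synachar,  // TMimeChar values such as UTF_8
  mimemess;  // TMimeMess, TMessHeader

var
  Mess: TMimeMess;
begin
  Mess := TMimeMess.Create;
  try
    // Hypothetical raw message; in the thread this comes from imap.FetchMess(n, Mess.Lines)
    Mess.Lines.Add('From: =?ISO-8859-2?Q?Micha=B3?= <michal@example.com>');
    Mess.Lines.Add('Subject: =?UTF-8?Q?Za=C5=BC=C3=B3=C5=82=C4=87?=');
    Mess.Lines.Add('Content-Type: text/plain; charset=us-ascii');
    Mess.Lines.Add('');
    Mess.Lines.Add('body');

    // Assumption: headers are decoded into this charset instead of the system code page
    Mess.Header.CharsetCode := UTF_8;
    Mess.DecodeMessage;

    WriteLn(Mess.Header.From);
    WriteLn(Mess.Header.Subject);
  finally
    Mess.Free;
  end;
end.
```

If this works in your Synapse version, the header strings should already be UTF-8 and the extra WinCPToUTF8 call on them would no longer be needed. A Windows console may still display the output incorrectly even when the bytes are right.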

3mptyCode

  • Newbie
  • Posts: 3
Re: Synapse (TImapSend, TMimePart) Another problem with encoding
« Reply #3 on: March 30, 2020, 09:31:26 pm »
Hello mith,

first of all, thank you for the very quick response, even though your post was not that recent.

Sadly I didn't get satisfying results with your procedure, or something went wrong, but you gave me one important hint: "format before the DecodeMessage command".

Do you really get well-formatted subject, from, and to strings out of your procedures (including special characters, emojis, and Japanese characters)?

I spent about 2-3 hours today working on a solution and wrote a new function (tested with around 50 emails of different types).

Regular code for decoding (pay attention to the order of the commands):
Code: Pascal
lvMailContentSL := TStringList.Create();
lvMailHeaderSL := TStringList.Create();

imap.SelectROFolder('Test');

// get the last SelectedRecent messages
for j := 1 to lvMailCnt do
begin
  lvMailContentSL.Clear;
  lvMailHeaderSL.Clear;

  imap.FetchMess(j, MimeMess.Lines);
  lvMailContentSL.Text := MimeMess.Lines.Text;

  // first pass: decode only to find out where the header block ends
  MimeMess.DecodeMessage;
  lvMailHeaderSL.Text := MimeMess.MessagePart.Headers.Text;
  lvMailHeaderLastLine := lvMailHeaderSL[lvMailHeaderSL.Count-1]; // last header line
  // cut the raw header block out of the raw message text
  lvMailHeaderSL.Text := copy(lvMailContentSL.Text, 1, pos(lvMailHeaderLastLine, lvMailContentSL.Text)+length(lvMailHeaderLastLine));
  // second pass: swap in the pre-decoded headers, then decode again
  MimeMess.Lines.Text := ReplaceString(lvMailContentSL.Text, lvMailHeaderSL.Text, fDecodeMIME(lvMailHeaderSL.Text));
  MimeMess.DecodeMessage;

  // decoded header lines
  Memo1.Lines.Add('--------------------------------------------------');
  Memo1.Lines.Add('[ Message-ID: ' + IntToStr(j) + ' ]');
  Memo1.Lines.Add('From: ' + MimeMess.Header.From);
  Memo1.Lines.Add('Ref: ' + MimeMess.Header.Subject);
  Memo1.Lines.Add('To: ' + MimeMess.Header.ToList.Text);
end;


Decode function:
Code: Pascal
// Note: PosEx comes from the StrUtils unit.
function TfMain.fDecodeMIME(eString: string): string;
var
  vStrOut: string;
  vTempStr, vStrNew: string;
  vPosStart, vPosEnd: integer;
begin
  vStrOut := eString;
  // ISO-8859-1 pre-format
  vStrOut := ReplaceString(vStrOut, '=?iso-8859-1?', '=?ISO-8859-1?');
  // will be treated like UTF-8 (only the label changes, the bytes are not converted)
  vStrOut := ReplaceString(vStrOut, '=?ISO-8859-1?', '=?UTF-8?');

  // UTF-8 pre-format
  vStrOut := ReplaceString(vStrOut, '=?utf-8?', '=?UTF-8?');
  vStrOut := ReplaceString(vStrOut, #13+#10 + ' =?UTF-8?', '=?UTF-8?'); // line break & space
  vStrOut := ReplaceString(vStrOut, #13+#10 + #9 + '=?UTF-8?', '=?UTF-8?'); // line break & TAB
  vStrOut := ReplaceString(vStrOut, '?= =?UTF-8?', '?==?UTF-8?');
  vStrOut := ReplaceString(vStrOut, '-8?q?', '-8?Q?');
  vStrOut := ReplaceString(vStrOut, '-8?b?', '-8?B?');

  // replace all UTF-8 occurrences
  while Pos('=?UTF-8?', vStrOut) > 0 do
  begin
    vPosStart := Pos('=?UTF-8?', vStrOut);
    vTempStr := Copy(vStrOut, vPosStart, Length(vStrOut) - vPosStart + 1);
    // look for the terminator only after the 10-character '=?UTF-8?X?' prefix,
    // so a payload that starts with '=' (e.g. '=C3') is not mistaken for '?='
    vPosEnd := PosEx('?=', vTempStr, 11);
    if vPosEnd = 0 then
      Break; // unterminated encoded word: leave the rest untouched
    vTempStr := Copy(vTempStr, 1, vPosEnd + 1);
    vStrNew := vTempStr;
    // now decode depending on the encoding type
    if Pos('=?UTF-8?Q?', vStrNew) > 0 then // (Q)uoted
    begin
      vStrNew := ReplaceString(vStrNew, '_', ' ');
      vStrNew := ReplaceString(vStrNew, '=?UTF-8?Q?', '');
      vStrNew := DecodeQuotedPrintable(vStrNew);
      vStrNew := Copy(vStrNew, 1, Length(vStrNew) - 1); // last char after decoding is '?' -> cut 1 char
    end
    else if Pos('=?UTF-8?B?', vStrNew) > 0 then // (B)ase64
    begin
      vStrNew := ReplaceString(vStrNew, '=?UTF-8?B?', '');
      vStrNew := DecodeBase64(vStrNew);
    end;

    // trim the formatted result
    vStrNew := Trim(vStrNew);
    // replace the encoded word in the original text with the decoded string
    vStrOut := ReplaceString(vStrOut, vTempStr, vStrNew);
  end;

  Result := vStrOut;
end;
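
Synapse also ships its own decoder for encoded header words, InlineDecode in the mimeinln unit, which may make the manual string replacement above unnecessary. A minimal sketch, assuming the Synapse units are on the unit search path; the sample header value is made up for illustration:

```pascal
program InlineDecodeDemo;
{$mode objfpc}{$H+}

uses
  synachar,  // TMimeChar values such as UTF_8
  mimeinln;  // InlineDecode

const
  // Hypothetical raw header value mixing two charsets and both encodings (Q and B)
  RawSubject = '=?ISO-8859-1?Q?Gr=FC=DFe?= =?UTF-8?B?4pyT?=';

begin
  // Ask Synapse to decode the encoded words and convert them to UTF-8
  WriteLn(InlineDecode(RawSubject, UTF_8));
end.
```

Unlike relabelling ISO-8859-1 words as UTF-8, this converts each word from its declared charset, so it should also cope with charsets other than the two handled by fDecodeMIME. Whether it covers every message you have would still need testing.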

« Last Edit: April 03, 2020, 06:19:36 pm by 3mptyCode »

mith

  • New Member
  • *
  • Posts: 19
Re: Synapse (TImapSend, TMimePart) Another problem with encoding
« Reply #4 on: March 30, 2020, 10:06:36 pm »
To be perfectly sure, I checked the last 100 database records with email data. All the subjects have the correct characters. I'm sorry, I don't know whether the code works with emoji or characters from exotic charsets; I didn't check that. For my purposes it was enough to get correct data for emails sent by Polish users, and that seems to be achieved.
It should be said that my Windows daemon runs on a server with CP1250 encoding (Polish). Maybe that is important information.

 
