CDO and UTF-7

Posted 31 Mar 2005 by Dean Harding

It's been a while since I last posted, sorry. It's hard to find the time to post regularly. Seriously, I don't know how people like Scoble do it!

Anyway, I've had a bit more fun with using CDO from .NET. I was having problems when trying to load an email message that contained body-parts encoded with the "unicode-1-1-utf-7" charset. This charset seems to be most popular with DSN (delivery status notification) messages, but in theory any mail client could use it.

Checking the documentation for the GetDecodedContentStream method on MSDN turned up some rather obscure text about how UTF-7, UTF-8 and a few other encodings get set to some generic Unicode encoding, plus some sample code for copying the content to a temporary stream. Unfortunately, it didn't really explain the problem very well (I mean, why does it use this generic Unicode thing?) and the sample code was for the opposite of what I was doing (i.e. adding body-parts in UTF-7/UTF-8, not decoding an email). Finally, it didn't explain the error I got, which was something like "A disk error occurred during a write operation". Write operation? First of all, I wasn't even doing a write, and second of all, once I'd loaded the message there was no disk involved!

So anyway, after a bunch of messing around, I finally came up with some code that works. Say you have some MIME-encoded text like this:


MIME-Version: 1.0
Content-Type: text/plain; charset=unicode-1-1-utf-7

Here's a unicode character "+ANw-" it doesn't work.

If you just take the body-part of the CDO.Message object and call GetDecodedContentStream on it, then calling ReadText on the resulting stream returns that error.
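As an aside, it helps to know what that "+ANw-" sequence is supposed to turn into. In UTF-7, anything between a "+" and a "-" is base64-encoded UTF-16, and "ANw" works out to U+00DC (a capital U with an umlaut). Here's a minimal sketch that checks this with the framework's own decoder, without CDO involved at all. The Utf7Demo class and Decode method are just names I made up for illustration, and I believe newer .NET versions flag Encoding.UTF7 as obsolete, so treat this as a sanity check rather than something to ship:

```csharp
using System;
using System.Text;

public static class Utf7Demo
{
    // Hypothetical helper: decodes UTF-7 text (which is plain ASCII
    // on the wire) into a regular .NET string.
    public static string Decode(string utf7Text)
    {
        // UTF-7 only ever uses ASCII characters, so this step is lossless.
        byte[] raw = Encoding.ASCII.GetBytes(utf7Text);

        // Note: Encoding.UTF7 may produce an obsolete warning on newer runtimes.
        return Encoding.UTF7.GetString(raw);
    }

    public static void Main()
    {
        string decoded = Decode("Here's a unicode character \"+ANw-\" it doesn't work.");

        // The "+ANw-" sequence should have become the single character U+00DC.
        Console.WriteLine(decoded.IndexOf('\u00DC') >= 0);
    }
}
```

So whatever comes back from ReadText for the sample message should contain that one character where the "+ANw-" used to be.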

Anyway, my solution is below. Basically, I had to call CopyTo to copy the decoded stream into a second stream whose Charset is set to "utf-7", and after that everything worked perfectly.


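// Needs "using System;" and "using System.Reflection;" (for Missing.Value),
// plus references to the CDO and ADODB interop assemblies.
private string GetBodyPartText(CDO.IBodyPart bodyPart)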
{
    ADODB.Stream stream = bodyPart.GetDecodedContentStream();
    try
    {
        if (stream.Type != ADODB.StreamTypeEnum.adTypeText)
        {
            // This is an error (not handled here): we assume the body-part
            // has a content-type of text/*, so the stream should be text.
        }

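        // Special case for unicode-1-1-utf-7. The ordinal, case-insensitive
        // comparison also copes with a null Charset and with cultures where
        // ToLower() maps "I" to something other than "i".
        if (string.Equals(bodyPart.Charset, "unicode-1-1-utf-7",
                StringComparison.OrdinalIgnoreCase))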
        {
            ADODB.Stream tempStream = new ADODB.StreamClass();
            tempStream.Open(Missing.Value,
                ADODB.ConnectModeEnum.adModeUnknown,
                ADODB.StreamOpenOptionsEnum.adOpenStreamUnspecified,
                null, null);
            tempStream.Charset = "utf-7";
            tempStream.Type = ADODB.StreamTypeEnum.adTypeText;
            stream.CopyTo(tempStream, -1);
            tempStream.Position = 0;

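            // Close the original stream before swapping in the copy;
            // otherwise the finally block only ever closes tempStream.
            stream.Close();
            stream = tempStream;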
        }

        return stream.ReadText((int) ADODB.StreamReadEnum.adReadAll);
    }
    finally
    {
        stream.Close();
    }
}

Let's just hope this is useful to someone else in the future!