Convert Word to HTML then render HTML on webpage

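If you are working with DOCX, you can always use the Open XML SDK from Microsoft; it's clean and pretty easy to use. Here is a sample taken from MSDN (the HtmlConverter class it uses comes from the PowerTools for Open XML library, which builds on the SDK):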

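// This example shows the simplest conversion. No images are converted.
// A cascading style sheet is not used.
using System.IO;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;
using OpenXmlPowerTools;

// Copy the file into a resizable in-memory stream so the document can be
// opened for editing without touching the file on disk.
byte[] byteArray = File.ReadAllBytes("Test.docx");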
using (MemoryStream memoryStream = new MemoryStream())
{
    memoryStream.Write(byteArray, 0, byteArray.Length);
    using (WordprocessingDocument doc = WordprocessingDocument.Open(memoryStream, true))
    {
        HtmlConverterSettings settings = new HtmlConverterSettings()
        {
            PageTitle = "My Page Title"
        };
        XElement html = HtmlConverter.ConvertToHtml(doc, settings);

        // Note: the XHTML returned by ConvertToHtml contains objects of type
        // XEntity. PtOpenXmlUtil.cs defines the XEntity class. See
        // http://blogs.msdn.com/ericwhite/archive/2010/01/21/writing-entity-references-using-linq-to-xml.aspx
        // for a detailed explanation.
        //
        // If you further transform the XML tree returned by ConvertToHtml, you
        // must do it correctly, or entities do not serialize properly.

        File.WriteAllText("Test.html", html.ToStringNewLineOnAttributes());
    }
}
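To render the result inside an existing web page rather than as a standalone Test.html file, one option is to pull out only the contents of the body element and drop that markup into your own page. The following is a minimal sketch, not part of the MSDN sample: the HtmlFragment class and ExtractBodyMarkup method are hypothetical names, and the hard-coded XHTML string stands in for the XElement returned by the converter. It only serializes the existing nodes and does not transform the tree.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper (not part of the Open XML SDK or PowerTools).
public static class HtmlFragment
{
    // Returns the serialized child nodes of <body>, or an empty string
    // when the tree has no <body> element. Matching on the local name
    // works whether or not the tree uses the XHTML namespace.
    public static string ExtractBodyMarkup(XElement html)
    {
        XElement body = html.DescendantsAndSelf()
            .FirstOrDefault(e => e.Name.LocalName == "body");
        if (body == null)
        {
            return string.Empty;
        }

        // Serialize each child node as-is; the tree itself is not modified.
        return string.Concat(
            body.Nodes().Select(n => n.ToString(SaveOptions.DisableFormatting)));
    }

    public static void Main()
    {
        // Stand-in for the XElement produced by the conversion above.
        XElement html = XElement.Parse(
            "<html xmlns=\"http://www.w3.org/1999/xhtml\">" +
            "<head><title>My Page Title</title></head>" +
            "<body><p>Hello</p></body></html>");

        // The fragment can then be written into a container element
        // of the host page.
        Console.WriteLine(ExtractBodyMarkup(html));
    }
}
```

Because each child is serialized on its own, elements that live in the XHTML namespace will carry an xmlns attribute in the output; browsers accept this, but you may prefer to strip it.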

You might also want to take a look at Word Automation Services: http://blogs.office.com/b/microsoft-word/archive/2009/12/16/word-automation-services_3a00_-what-it-does.aspx


We use http://www.aspose.com/ (I think the product we use is Aspose.Words) to perform a similar task, and it works quite well. Note that it is a paid product.

I would suggest that converting to HTML gives the least faithful rendition of the document. One solution we use is to generate a JPEG image of the document and display that.

If you need to support operations like finding and copying/pasting text, I would recommend converting the document to PDF and displaying it inline in whichever standard PDF viewer the client machine has installed.
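Displaying a PDF inline mostly comes down to two response headers: a Content-Type of application/pdf and a Content-Disposition of inline (as opposed to attachment, which triggers a download). Here is a minimal sketch, assuming the PDF has already been produced by some converter and saved as Test.pdf. The InlinePdfServer class, the port, and the file name are hypothetical, and HttpListener is used only to keep the example self-contained; in an ASP.NET application you would set the same headers on its response object.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical example class; serves one PDF request and then exits.
public static class InlinePdfServer
{
    // Builds the header value that asks the browser to show the file
    // in its PDF viewer instead of downloading it.
    public static string InlineDisposition(string fileName)
    {
        return "inline; filename=\"" + fileName + "\"";
    }

    // Writes the PDF bytes to the response with the inline headers set.
    public static void WritePdf(HttpListenerResponse response, byte[] pdfBytes, string fileName)
    {
        response.ContentType = "application/pdf";
        response.AddHeader("Content-Disposition", InlineDisposition(fileName));
        response.ContentLength64 = pdfBytes.Length;
        response.OutputStream.Write(pdfBytes, 0, pdfBytes.Length);
        response.Close();
    }

    public static void Main()
    {
        // Assumption: Test.pdf was generated beforehand from the Word document.
        byte[] pdfBytes = File.ReadAllBytes("Test.pdf");

        using (HttpListener listener = new HttpListener())
        {
            listener.Prefixes.Add("http://localhost:8080/");
            listener.Start();

            // Blocks until a browser requests http://localhost:8080/.
            HttpListenerContext context = listener.GetContext();
            WritePdf(context.Response, pdfBytes, "Test.pdf");
        }
    }
}
```

Whether the document actually opens in the browser still depends on the client having a PDF viewer available, as noted above.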

Tags:

C#

Ms Word