Search code examples
c#asp.netexcelxls

Export HTML string to Excel file in ASP.Net using C#


I have this table in database MySql version 8.0.17 which contains HTML code on field contents

<p><h3><span style=color:#0000ff;><strong>test</strong></span></h3></p>
<h4><strong>- test1</strong></h4>
<h4><strong>- test2</strong></h4>

I am not DB admin so I can only read from this table which i cannot modify..

when exporting this table in excel format on the xls file all HTML tags are found

enter image description here

How to do resolve this?

Thanks in advance for any help.

My code below

private void MTxlssp()
{
    MySqlCommand cmd =
        new MySqlCommand("SP");

    DataTable dt = GetData(cmd);
    GridView GridView1 = new GridView
    {
        AllowPaging = false,
        DataSource = dt
    };

    GridView1.DataBind();

    Thread.Sleep(3000);
    Response.Clear();
    Response.Buffer = true;

    Response.AddHeader("content-disposition", attachment;filename=\"test.xls\"");

    Response.ContentEncoding = Encoding.UTF8;
    Response.Charset = "";
    Response.ContentType = "application/vnd.ms-excel";

    HttpCookie cookie2 = new HttpCookie("ExcelDownloadFlag")
    {
        Value = "Flag",
        Expires = DateTime.Now.AddDays(1)
    };
    Response.AppendCookie(cookie2);

    using (StringWriter sw = new StringWriter())
    {
        HtmlTextWriter hw = new HtmlTextWriter(sw);

        GridView1.HeaderRow.BackColor = Color.White;

        foreach (TableCell cell in GridView1.HeaderRow.Cells)
        {
            cell.BackColor = GridView1.HeaderStyle.BackColor;
        }
        foreach (GridViewRow row in GridView1.Rows)
        {
            row.BackColor = Color.White;
            foreach (TableCell cell in row.Cells)
            {
                if (row.RowIndex % 2 == 0)
                {
                    cell.BackColor = GridView1.AlternatingRowStyle.BackColor;
                }
                else
                {
                    cell.BackColor = GridView1.RowStyle.BackColor;
                }
                cell.CssClass = "textmode";
            }
        }

        GridView1.RenderControl(hw);
        string style = @"<style> .textmode { } </style>";
        Response.Write(style);
        Response.Output.Write(sw.ToString());
        Response.Flush();
        Response.End();
    }
}

Solution

  • If its just the display text you need then maybe the library Nuglify will help you. It supports text extraction from HTML:

    var html = @"<p><h3><span style=color:#0000ff;><strong>test</strong></span></h3></p>
    <h4><strong>- test1</strong></h4>
    <h4><strong>- test2</strong></h4>";
    
    var result = Uglify.HtmlToText(html);
    
    Console.WriteLine(result.Code);  
    

    You can download it from here: https://github.com/trullock/NUglify or get it from Nuget.

    I have just run it on your html sample and it produces:

     test - test1 - test2 
    

    I am assuming thats what you want based on your comment to @Ivan Khorin