Search code examples
c#windows-phone-8openxmlopenxml-sdk

Read/import existing Excel file programmatically (cell-by-cell) in Windows Phone 8


I am working on a Windows Phone 8 app to READ/WRITE Excel files. I asked a question here about this and the comment provided and many other links led me to OpenXml.

All of this got me good on how to create an Excel file and how to launch it. But now I am stuck at very basic of these all i.e. How to read an existing Excel file (probably created outside using MS Excel) cell-by-cell i.e. I want to access each cells and their values through my code. In the openXML thing I did this:

Stream localFile = App.GetResourceStream(new Uri("/ReadExcel;component/jai.xlsx"
                                                    ,UriKind.Relative)).Stream;
MemoryStream ms = new MemoryStream();
localFile.CopyTo(ms);

DocumentFormat.OpenXml.Packaging.SpreadsheetDocument spreadsheetDoc =
DocumentFormat.OpenXml.Packaging.SpreadsheetDocument.Open(localFile, true);
{
    var a = spreadsheetDoc.Package;
    // Do work here
}

But it gives me error:

The type 'System.IO.Packaging.Package' is defined in an assembly that is not 
referenced. You must add a reference to assembly 'WindowsBase, Version=4.0.0.0

So basically I am stuck at this WindowsBase.dll. I tried all various ways to import an assembly i.e. unblock and all, but nothing works.

So all I want to do is to programmatically access the content of an existing Excel file in my code cell-by-cell.

Please help or suggest whether it is even possible as of now in WP8.


Solution

  • I used the following method to read cells from an xlsx Excel file on Windows Phone 8:

    1. Add the Microsoft Compression library to your project using NuGet
    2. Adapt the code sample from the developer network to your needs - it shows how to read cells from an Excel file (and it needs the Compression lib)

    Since I already extended the code a bit to handle empty columns and empty files properly you can also use my code:

    public class ExcelReader
    {
        List<string> _sharedStrings;
    
        List<Dictionary<string, string>> _derivedData;
    
        public List<Dictionary<string, string>> DerivedData
        {
            get
            {
                return _derivedData;
            }
        }
        List<string> _header;
    
        public List<string> Headers { get { return _header; } }
    
        // e.g. cellID = H2 - only works with up to 26 cells
        private int GetColumnIndex(string cellID)
        {
            return cellID[0] - 'A';
        }
    
        public void StartReadFile(Stream input)
        {
            ZipArchive z = new ZipArchive(input, ZipArchiveMode.Read);
            var worksheet = z.GetEntry("xl/worksheets/sheet1.xml");
            var sharedString = z.GetEntry("xl/sharedStrings.xml");
    
            // get shared string
            _sharedStrings = new List<string>();
            // if there is no content the sharedStrings will be null
            if (sharedString != null)
            {
                using (var sr = sharedString.Open())
                {
                    XDocument xdoc = XDocument.Load(sr);
                    _sharedStrings =
                        (
                        from e in xdoc.Root.Elements()
                        select e.Elements().First().Value
                        ).ToList();
                }
            }
    
            // get header
            using (var sr = worksheet.Open())
            {
                XDocument xdoc = XDocument.Load(sr);
                // get element to first sheet data
                XNamespace xmlns = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";
                XElement sheetData = xdoc.Root.Element(xmlns + "sheetData");
    
                _header = new List<string>();
                _derivedData = new List<Dictionary<string, string>>();
    
                // worksheet empty?
                if (!sheetData.Elements().Any())
                    return;
                // build header first
                var firstRow = sheetData.Elements().First();
                // full of c
                foreach (var c in firstRow.Elements())
                {
                    // the c element, if have attribute t, will need to consult sharedStrings
                    string val = c.Elements().First().Value;
                    if (c.Attribute("t") != null)
                    {
                        _header.Add(_sharedStrings[Convert.ToInt32(val)]);
                    } else
                    {
                        _header.Add(val);
                    }
    
                }
    
                // build content now
                foreach (var row in sheetData.Elements())
                {
                    // skip row 1
                    if (row.Attribute("r").Value == "1")
                        continue;
                    Dictionary<string, string> rowData = new Dictionary<string, string>();
                    // the "c" elements each represent a column
                    foreach (var c in row.Elements())
                    {
                        var cellID = c.Attribute("r").Value; // e.g. H2
    
                        // each "c" element has a "v" element representing the value
                        string val = c.Elements().First().Value;
                        // a string? look up in shared string file
                        if (c.Attribute("t") != null)
                        {
                            rowData.Add(_header[GetColumnIndex(cellID)], _sharedStrings[Convert.ToInt32(val)]);
                        } else
                        {
                            // number
                            rowData.Add(_header[GetColumnIndex(cellID)], val);
                        }
                    }
                    _derivedData.Add(rowData);
                }
            }
        }
    }
    

    This works for simple Excel files having one work sheet and some text and number cells. It assumes there is a header row.

    Usage is as follows:

    var excelReader = new ExcelReader();
    excelReader.StartReadFile(excelStream);
    

    After reading excelReader.Headers contains the header names, excelReader.DerivedData contains the rows. Each row is a Dictionary having the header as key and the data as value. Empty cells won't be in there.

    Hope this gets you started.