Search code examples
c#excelspreadsheetopenxmlopenxml-sdk

Get embedded objects in OOXML spreadsheet


I am struggling with getting a result with the following code using Open XML SDK. I want to find all embedded objects in a OOXML/XLSX spreadsheet and report each found object on screen. How to do this?

I think "DataParts" might not be the right place to start, but even DataParts do not show anything.

var files = Directory.EnumerateFiles(@"SOME DIRECTORY");
foreach (var file in files)
{
    Console.WriteLine(file);
        using (var spreadsheet = SpreadsheetDocument.Open(file, false))
        {
            var embedded_objects = spreadsheet.DataParts.ToList();
            int count = embedded_objects.Count;

            if (count > 0)
            {
                foreach (var item in embedded_objects)
                {
                    Console.WriteLine(item);
                }
            }
            else
            {
                Console.WriteLine(count);
            }
        }
}

Solution

  • @Sathish Guru V, thanks for giving me a hint on how to progress. Several hours later, I have something that is working for me, though I expected "embedded objects" to be a wider definition than just "OLE" e.g. pictures is also something I thought was an embedded object.

    Here's the code now. Any suggestions on how to improve are, are welcome.

    EDIT: I found a way to also count images. Code has been updated:

    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Spreadsheet;
    
    using (var spreadsheet = SpreadsheetDocument.Open(filepath, false))
    {
        var list = spreadsheet.WorkbookPart.WorksheetParts.ToList();
        foreach (var item in list)
        {
            int count_ole = item.EmbeddedObjectParts.Count(); // Register the number of OLE
            int count_image = item.ImageParts.Count(); // Register number of images
            int count = count_ole + count_image; // Sum
            if (count == 0) // If no embedded objects, inform user
            {
                embedobj_message = $"--> {count} embedded objects detected";
                Console.WriteLine(embedobj_message);
                return embedobj_message;
            }
            else
            {
                embedobj_message = $"--> {count} embedded objects detected";
                Console.WriteLine(embedobj_message);
                var embed_ole = item.EmbeddedObjectParts.ToList(); // Register each OLE to a list
                var embed_image = item.ImageParts.ToList(); // Register each image to a list
                int embedobj_no = 0;
                foreach (var part in embed_ole) // Inform user of each object
                {
                    embedobj_no++;
                    Console.WriteLine($"--> Embedded object #{embedobj_no}");
                    Console.WriteLine($"----> Content Type: {part.ContentType.ToString()}");
                    Console.WriteLine($"----> URI: {part.Uri.ToString()}");
    
                }
                foreach (var part in embed_image) // Inform user of each object
                {
                    embedobj_no++;
                    Console.WriteLine($"--> Embedded object #{embedobj_no}");
                    Console.WriteLine($"----> Content Type: {part.ContentType.ToString()}");
                    Console.WriteLine($"----> URI: {part.Uri.ToString()}");
                }
            }
        }
    }