
Could you use System.IO.Abstractions to treat the differences between a ZipArchive and an actual file system folder transparently?


I have a program that works with projects consisting of multiple files. Each project is saved in a folder of the same name, with a project file of the same name, plus one or more support files.

C:\...\Projects\Project01\
C:\...\Projects\Project01\project01.prj
C:\...\Projects\Project01\support-a.xyz
C:\...\Projects\Project01\support-b.xyz
C:\...\Projects\Project01\Sub\afile.xyz

Now I want to be able to zip a Project folder into a zip archive of the same name:

C:\...\Projects\Project01.zip

I would like my program to be able to access the files in the archive as if they were files in a folder. The problem is that the file system doesn't handle a path that combines the location of the zip file with the relative path to a ZipArchiveEntry inside it, e.g. C:\Temp\Projects\Project01.zip\Sub\afile.xyz. You can't create a FileStream with a path like that. I could write a function that detects an archive in a path and then uses either ZipArchive.GetEntry().Open() or a FileStream to return a Stream. But the problem goes deeper than that: my existing code also uses FileInfo, which won't work with these hybrid paths either.
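The stream-returning function mentioned above could look roughly like the following. This is only a sketch: `HybridPath`, `TrySplit` and `OpenRead` are hypothetical names, the archive is detected simply by looking for an existing file whose name ends in `.zip` in the middle of the path, and the entry is copied into memory so that the returned stream stays usable after the archive is closed (reasonable for small files only).

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, not part of System.IO.
public static class HybridPath
{
    // Splits "<...>\Project01.zip\Sub\afile.xyz" into the archive path and the entry name.
    // Returns false when no existing .zip file is found inside the path.
    public static bool TrySplit(string path, out string zipPath, out string entryName)
    {
        zipPath = string.Empty;
        entryName = string.Empty;
        int idx = path.IndexOf(".zip", StringComparison.OrdinalIgnoreCase);
        while (idx >= 0)
        {
            int end = idx + 4;
            bool atSeparator = end < path.Length && (path[end] == '\\' || path[end] == '/');
            if (atSeparator && File.Exists(path.Substring(0, end)))
            {
                zipPath = path.Substring(0, end);
                // ZIP entry names use forward slashes.
                entryName = path.Substring(end + 1).Replace('\\', '/');
                return true;
            }
            idx = path.IndexOf(".zip", end, StringComparison.OrdinalIgnoreCase);
        }
        return false;
    }

    public static Stream OpenRead(string path)
    {
        // Ordinary path: delegate to the file system.
        if (!TrySplit(path, out string zipPath, out string entryName))
            return File.OpenRead(path);

        using (ZipArchive archive = ZipFile.OpenRead(zipPath))
        {
            ZipArchiveEntry entry = archive.GetEntry(entryName)
                ?? throw new FileNotFoundException("Entry not found in archive.", path);

            // Copy to memory so the caller's stream outlives the archive.
            var buffer = new MemoryStream();
            using (Stream source = entry.Open())
                source.CopyTo(buffer);
            buffer.Position = 0;
            return buffer;
        }
    }
}
```

Calling `HybridPath.OpenRead` with a hybrid path returns a readable stream either way, but it does nothing for code that depends on FileInfo, which is the deeper problem.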

I think I have two choices: either add a bunch of logic to handle the two scenarios properly, or devise implementations of the types in System.IO.Abstractions that handle this transparently, delegating to the file system or to the ZipArchive classes as needed. But I wonder if I'm missing some fundamental reason this won't work, such as problems with accessing multiple files in the archive concurrently, or deciding when to dispose the archive. I suspect that if this worked, it would already have been done. But the fact that Windows Explorer seems to implement this behavior makes me wonder why it isn't available in System.IO.


Solution

  • Windows API functions for file access, as well as their .NET wrappers, work only with filesystems.

    A filesystem in Windows requires a kernel-mode driver. Standard drivers like the one for NTFS or FAT serve only the corresponding formats. They don't know how to represent a compound container as a directory.

    To "inject" a directory into the existing filesystem, you can use one of these techniques:

    • a junction point. In their most common case, junction points redirect the OS requests within the same filesystem (which doesn't solve your problem). There exist more sophisticated cases, but they are rare and hardly apply to your case either.
    • a virtual filesystem that you can mount to a drive letter or to a junction point in the regular filesystem. The complication here is that you need a driver for this task. We offer the CBFS Connect product that one uses for creating virtual filesystems, and there exist some alternatives.
    • a filesystem filter driver that supports virtual files and directories. Our CBFS Filter product supports virtual files, but we intentionally don't support virtual directories - one can use CBFS Connect and junction points in this scenario.

    Now, the above is just the driver-related part. You would also need an actual implementation of the filesystem logic (here, "filesystem" is not a layout on the disk like NTFS or FAT, but the rules for managing a set of files and directories upon requests from the OS and a driver). It is of course possible to present a ZIP file as a filesystem; however, its format is suitable mainly for read-only filesystems. That is, presenting a ZIP as a read-only filesystem is much easier than supporting updates. The reason is that an implementation of a filesystem requires random access to file data (including for writing) and the ability to add and delete files. The ZIP format doesn't support these types of operations easily; when modifying a file, you would have to rewrite large pieces of the ZIP archive.
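The read-only case can also be approximated inside the application, without any driver. The following is a minimal sketch under several assumptions: `IReadOnlyProjectStore`, `FolderStore` and `ZipStore` are hypothetical names (they are not the System.IO.Abstractions interfaces), the archive stores entries relative to the project root, and no attempt is made at thread safety.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;

// Hypothetical interface: a read-only view of a project, folder or archive.
public interface IReadOnlyProjectStore : IDisposable
{
    IEnumerable<string> ListFiles();        // relative paths, '/'-separated
    Stream OpenRead(string relativePath);
}

public sealed class FolderStore : IReadOnlyProjectStore
{
    private readonly string _root;
    public FolderStore(string root) { _root = Path.GetFullPath(root); }

    // Enumerate all files and convert them to '/'-separated relative paths.
    public IEnumerable<string> ListFiles() =>
        Directory.EnumerateFiles(_root, "*", SearchOption.AllDirectories)
                 .Select(f => f.Substring(_root.Length)
                               .TrimStart('\\', '/')
                               .Replace('\\', '/'));

    public Stream OpenRead(string relativePath) =>
        File.OpenRead(Path.Combine(_root, relativePath));

    public void Dispose() { }
}

public sealed class ZipStore : IReadOnlyProjectStore
{
    private readonly ZipArchive _archive;
    public ZipStore(string zipPath) { _archive = ZipFile.OpenRead(zipPath); }

    // Entries whose names end in '/' are directory markers; skip them.
    public IEnumerable<string> ListFiles() =>
        _archive.Entries.Where(e => !e.FullName.EndsWith("/"))
                        .Select(e => e.FullName);

    public Stream OpenRead(string relativePath)
    {
        ZipArchiveEntry entry = _archive.GetEntry(relativePath.Replace('\\', '/'))
            ?? throw new FileNotFoundException("Entry not found.", relativePath);
        return entry.Open();   // usable only until the store is disposed
    }

    // The archive stays open for the lifetime of the store.
    public void Dispose() => _archive.Dispose();
}
```

Tying the archive's lifetime to the store object is one way to answer the "when to dispose" question: callers dispose the store when they are done with the project. Write support is deliberately left out, for the reasons given above.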

    To support ZIP and other formats, File Explorer uses a Shell technology called a "Shell namespace extension". This technology is confined to the Shell: whatever you inject into the Shell namespace is accessible only via the Shell API functions and interfaces. Filesystem-related Windows functions don't work with that namespace.

    With all of the above in mind, it is reasonable to look for a solution that already includes a driver and support for ZIP format.

    If you don't require a specific file format (ZIP), we have an interesting solution named CBFS Storage (also, a component library), that offers a container for data which you can mount to a drive letter or a junction point. It is free from ZIP's limitations and has built-in compression and encryption support. The product is also driver-based and its driver implements SolFS, the filesystem that we've been developing for over 20 years. You can keep the container file (vault) on the disk as a regular file (similar to ZIP), or, there is a MemoryVault component available that creates in-memory disks.