.net windows caching cd

Reading data from a CD/DVD. Does Windows create some sort of temporary cache?


As a brief explanation, I've been rather vexed over the last few weeks while working with some .NET code that reads data from a disc. The program, in the simplest terms, uses iTextSharp to check the page counts of PDF files (if that matters). The read speed from the CD drive has been wildly inconsistent across repeated tests, even when using the same code and the same disc. Then I had a bit of a Eureka moment. At least enough of one to come ask some people much more informed than I am.

It seems that the first time enumerating through the files on the disc takes much, much longer than subsequent iterations. I then realized my disc drive light would light up when I was reading through the data on the disc the first time, but it wouldn't light up on subsequent runs through that same data (re-running the program to look at the same files, basically). Not so coincidentally, those later read times are way, way faster. On the off chance my program was somehow storing this data, I fully closed down Visual Studio, checked Task Manager for leftover processes, and re-opened VS. The results were the same. If I open the disc drive and re-insert the disc, the process resets (first time slow, and so on).

My assumption is that some mechanism in Windows temporarily stores disc data "somewhere", and when you ask for that data again, it uses the stored copy instead of reading from the disc. It certainly makes sense from a layman's point of view when you consider how much faster it is to read information from the average HDD than from a CD. I'm guessing this is common knowledge in certain circles, but I honestly don't use physical media all that much in this sort of context. It's usually more of a "copy it locally then do stuff" sort of process.

So my question would be "is that a thing?" or is there some other sort of madness going on someone might identify? And if I'm not far off the mark with my guess, is there a way I can use this "caching" feature more efficiently? e.g. Force it to temporarily cache the disc or something before I try to read from it? Or really any other sort of "best practice" sort of ideas anybody might come up with. Let me know if there's any additional information I can provide.


Solution

  • Sure. Optical drives are no different from hard disks in this respect; they just have much slower seek times. For both, the file system cache is absolutely crucial: it keeps a copy of the data that was previously read from the drive. Windows dedicates a pretty large amount of RAM to this cache, easily a gigabyte on modern machines with enough RAM. It is also the core reason that, on a 32-bit operating system, half of the entire address space is dedicated to the operating system.

    So inevitably, the first access to the drive is going to be slow: the cache cannot possibly have the data available yet, and the app must wait until it is read from the disc. This operates at mechanical speed instead of electronic speed, and with an optical drive you can hear it. Optical discs are CLV (constant linear velocity): the closer the track is to the center of the disc, the faster the disc must spin. That is great for packing density but very lousy for seek times, because each seek can require accelerating or decelerating the disc, which takes many milliseconds and can make quite a clatter. Hard drives are not that quiet either. Subsequent accesses are very fast, just a memory-to-memory copy from the file system cache. That operates at memory bus speeds, 5 GB/sec and up, with about a microsecond of constant overhead.
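As for forcing the cache to fill before the real work starts, one option that follows from the explanation above is simply to read every file once and throw the data away. The sketch below is a minimal illustration in Python, not a tested recipe for optical media; the function names `warm_cache` and `timed_read` and the 1 MiB chunk size are assumptions made up for this example. The same idea carries over to .NET by reading each file through a `FileStream` into a scratch buffer. Whether the data stays cached depends on how much free RAM Windows has, and, as the question observed, ejecting the disc discards it.

```python
import os
import sys
import time


def warm_cache(root, chunk_size=1024 * 1024):
    """Read every file under root once, discarding the data.

    The reads exist only so the operating system can keep a copy of the
    file contents in its file system cache. Returns the number of bytes read.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                # Read in large sequential chunks; nothing is kept.
                while True:
                    chunk = f.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
    return total


def timed_read(root):
    """Run warm_cache once and return (bytes_read, elapsed_seconds)."""
    start = time.perf_counter()
    total = warm_cache(root)
    return total, time.perf_counter() - start


if __name__ == "__main__":
    # Pass the disc's root folder as the first argument.
    root = sys.argv[1]
    for attempt in ("first pass", "second pass"):
        size, seconds = timed_read(root)
        print(f"{attempt}: {size} bytes in {seconds:.3f} s")
```

Running the script against a freshly inserted disc should show the pattern described in the question: a slow first pass while the drive is busy, then a much faster second pass served from memory. Reading whole files front to back also keeps the drive reading sequentially, so it spends less time seeking than it would if a PDF library jumped around inside each file.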