Our application uses Commons VFS to read various types of files. We rely on the automatic file type detection that VFS provides via its file extension mapping.
The problem: VFS misclassifies gz files (i.e. files whose name ends in .gz) as regular files rather than as GZIP files. This prevents us from using VFS to read the decompressed content of gz files without a manual special-case workaround.
I've traced the problem to org.apache.commons.vfs2.impl.FileContentInfoFilenameFactory.create(), which calls:
    FileNameMap fileNameMap = URLConnection.getFileNameMap();
    contentType = fileNameMap.getContentTypeFor(name);
This loads the file content-types.properties from the current Java installation. This file (on Windows, at least) contains this mapping:
    application/octet-stream: \
        description=Generic Binary Stream;\
        file_extensions=.saveme,.dump,.hqx,.arc,.obj,.lib,.bin,.exe,.zip,.gz
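To check what your own JVM reports before changing anything, a small probe like the one below can help. It is only a sketch: the class name ContentTypeProbe and the sample file names are made up for illustration, and the result differs between Java versions and platforms, so no particular output is assumed here.

```java
import java.net.FileNameMap;
import java.net.URLConnection;

public class ContentTypeProbe {

    /** Returns whatever the JVM-wide FileNameMap reports for the name; may be null. */
    static String probe(String fileName) {
        FileNameMap map = URLConnection.getFileNameMap();
        return map.getContentTypeFor(fileName);
    }

    public static void main(String[] args) {
        // Hypothetical sample names; only the extension matters to the lookup.
        for (String name : new String[] {"archive.gz", "archive.zip", "notes.txt"}) {
            System.out.println(name + " -> " + probe(name));
        }
    }
}
```

If the first line shows application/octet-stream for the .gz name, your JVM has the mapping described above.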
According to the source code, org.apache.commons.vfs2.impl.FileTypeMap allows this mapping to take precedence over the file extension map with which VFS was configured.
Can anyone think of a way of either (a) extending a class or two of VFS to work around this problem or (b) configuring VFS and/or Java itself so that VFS correctly classifies gz files?
Create a class like the following, which wraps the JVM's default FileNameMap and implements getContentTypeFor so that the troublesome application/octet-stream entry is suppressed:
    // Nested class; needs java.net.FileNameMap and java.net.URLConnection
    public static class MyFileNameMap implements FileNameMap
    {
        // The JVM's current map, captured before this wrapper is installed
        private final FileNameMap delegate = URLConnection.getFileNameMap();

        @Override
        public String getContentTypeFor( String fileName )
        {
            String contentType = delegate.getContentTypeFor( fileName );
            if( "application/octet-stream".equals( contentType ) )
            {
                // Sun's Java classifies zip and gzip as application/octet-stream,
                // which VFS then uses instead of looking at its extension
                // map for a more specific MIME type
                return null;
            }
            return contentType;
        }
    }
Install this new class via:
    URLConnection.setFileNameMap( new MyFileNameMap() );
Now when you call FileSystemManager.resolveFile(), VFS will choose the correct file type for gz files by falling back to its extension map.
Note: This is a global change for the current JVM, so be careful if any other code you use relies on this MIME type entry for things like .exe files.
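If that global side effect worries you, one possible variation is to suppress the content type only for the extensions VFS should handle and pass everything else through. The sketch below is an assumption of mine, not something VFS provides: the class name SelectiveFileNameMap and the extension list are hypothetical, and I have only reasoned about it against the behaviour described above (a null content type makes VFS fall back to its extension map).

```java
import java.net.FileNameMap;
import java.net.URLConnection;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class SelectiveFileNameMap implements FileNameMap {

    private final FileNameMap delegate;
    private final List<String> suppressed;

    /** suppressedExtensions are lower-case suffixes such as ".gz" (illustrative choice). */
    public SelectiveFileNameMap(FileNameMap delegate, String... suppressedExtensions) {
        this.delegate = delegate;
        this.suppressed = Arrays.asList(suppressedExtensions);
    }

    @Override
    public String getContentTypeFor(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        for (String ext : suppressed) {
            if (lower.endsWith(ext)) {
                // Report "unknown" so the caller has to decide by extension.
                return null;
            }
        }
        // Everything else (e.g. .exe) keeps the JVM's original answer.
        return delegate.getContentTypeFor(fileName);
    }

    public static void main(String[] args) {
        FileNameMap original = URLConnection.getFileNameMap();
        URLConnection.setFileNameMap(new SelectiveFileNameMap(original, ".gz", ".zip"));
        try {
            FileNameMap active = URLConnection.getFileNameMap();
            System.out.println("data.gz   -> " + active.getContentTypeFor("data.gz"));
            System.out.println("setup.exe -> " + active.getContentTypeFor("setup.exe"));
        } finally {
            // Put the original map back once the VFS work is done.
            URLConnection.setFileNameMap(original);
        }
    }
}
```

Restoring the original map in the finally block keeps the change scoped to the code that needs it, although other threads will still see the replacement map while it is installed.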