I have been closely following the Open ZFS development scene for OS X for the last few years. Things have progressed significantly over the last several months since the sad problems that occurred with Greenbytes, etc., but I have been pleased to see that we're finally close to getting real Spotlight support in place. I noticed this email passing by the other day from Jorgen Lundman (who has put a great deal of personal time into getting this going and contributing to the community) and thought perhaps others here might be interested in chiming in on this, his, topic regarding implementing Spotlight support for ZFS on OS X:
To summarize, I think the crux of this question boils down to this:
So then, what does "mds" use to iterate the mounted file systems? I do not
think the sources for "Spotlight-800.28" was ever released so we can't just
go look and learn, like we did for xnu, and IOkit.
It doesn't use the BSD getfsstat(), more likely it asks IOKit, and for some
reason rejects the lower mounts.
And the body of the email for convenience:
Hey guys,
So one of our long-term issues in OpenZFSonOSX is to play nice with Spotlight.
We have reached the point where everything sometimes pretends to work.
For example;
# mdfind helloworld4
/Volumes/hfs1/helloworld4.jpg
/Volumes/hfs2/helloworld4.jpg
/Volumes/zfs1/helloworld4.jpg
/Volumes/zfs2/helloworld4.jpg
Great, picks it up in our regular (control group) HFS mounted filesystems,
as well as the 2 ZFS mounts.
Mounted as:
/dev/disk2 on /Volumes/zfs1 (zfs, local, journaled)
/dev/disk2s1 on /Volumes/zfs2 (zfs, local, journaled)
# diskutil list
/dev/disk1
#: TYPE NAME SIZE IDENTIFIER
0: GUID_partition_scheme *42.9 GB disk1
1: ZFS 42.9 GB disk1s1
2: 6A945A3B-1DD2-11B2-99A6-080020736631 8.4 MB disk1s9
/dev/disk2
#: TYPE NAME SIZE IDENTIFIER
0: zfs_pool_proxy FEST *64.5 MB disk2
1: zfs_filesystem_proxy ssss 64.5 MB disk2s1
So you can see, the actual pool disk is /dev/disk1, and the fake nodes we
create for mounting as /dev/disk2*, as it appears to be required by
Spotlight to work at all. We internally also let the volumes auto-mount,
from issuing "diskutil mount -mountPoint %s %s".
We are not a VOLFS, so there is no ".vol/" directory, nor will mdutil -t
work. But these two points are true for MS-DOS as well, and that does work
with Spotlight.
We correctly reply to zfs.fsbundle's zfs.util for "-p" (volume name) and
"-k" (get uuid), done pre-flight to mounting by DA.
Using FSMegaInfo tool, we can confirm that stat, statfs, readdir, and
similar tests appear to match that of HFS.
So then, the problem.
The problem comes from mounting zfs inside zfs. Ie,
When we mount
/Volumes/hfs1/
/Volumes/hfs1/hfs2/
/Volumes/zfs1/
/Volumes/zfs1/zfs2/
# mdfind helloworld4
/Volumes/hfs1/helloworld4.jpg
/Volumes/hfs1/hfs2/helloworld4.jpg
/Volumes/zfs1/helloworld4.jpg
Absent is of course, "/Volumes/zfs1/zfs2/helloworld4.jpg".
Interestingly, this works
# mdfind -onlyin /Volumes/zfs1/zfs2/ helloworld4
/Volumes/zfs1/zfs2/helloworld4.jpg
And additionally, mounting in reverse:
/Volumes/hfs2/
/Volumes/hfs2/hfs1/
/Volumes/zfs2/
/Volumes/zfs2/zfs1/
# mdfind helloworld4
/Volumes/hfs2/helloworld4.jpg
/Volumes/hfs2/hfs1/helloworld4.jpg
/Volumes/zfs2/helloworld4.jpg
So whichever ZFS filesystem was mounted first, works, but not the second.
So the individual ZFS filesystems are both equal. It is as if it doesn't
realise the lower mount is its own device.
So then, what does "mds" use to iterate the mounted fileystems? I do not
think the sources for "Spotlight-800.28" was ever released so we can't just
go look and learn, like we did for xnu, and IOkit.
It doesn't use the BSD getfsstat(), more likely it asks IOKit, and for some
reason rejects the lower mounts.
Some observations:
# /System/Library/Filesystems/zfs.fs/zfs.util -k disk2
87F06909-B1F6-742F-7355-F0D597849138
# /System/Library/Filesystems/zfs.fs/zfs.util -k disk2s1
8F60C810-2D29-FCD5-2516-2D02EED4566B
# grep uu /Volumes/zfs1/.Spotlight-V100/VolumeConfiguration.plist
<key>uuid.87f06909-b1f6-742f-7355-f0d597849138</key>
# grep uu /Volumes/zfs1/zfs2/.Spotlight-V100/VolumeConfiguration.plist
<key>uuid.8f60c810-2d29-fcd5-2516-2d02eed4566b</key>
Any assistance is appreciated, the main issue tracking Spotlight is;
https://github.com/openzfsonosx/zfs/issues/116
The branch for it;
https://github.com/openzfsonosx/zfs/tree/issue116
vfs_getattr;
https://github.com/openzfsonosx/zfs/blob/issue116/module/zfs/zfs_vfsops.c#L2307
It appears to be down to some undocumented expectations in the vfs_vget
method, to lookup entries based entirely on the inode number. Ie, stat /.vol/16777222/1102011
It is expected that vfs_vget
sets the vnode_name
correctly here, using a call like vnode_update_identity() or similar.