Why is Solr not extracting based on conventions?

I currently have Solr 5.5.0 installed on a Windows 7 machine.

I am trying to get a project working that was built by a dev who recently left our company. This was dropped in my lap and I have no prior experience with Solr, so I am stumbling along trying to figure it out.

The problem I am having is that when I upload a file, it does not seem to be extracting custom fields that were defined like this:

using SolrNet.Attributes;

// Document type mapped to Solr fields through SolrNet attributes.
public class SolrIndexFile
{
    [SolrUniqueKey("id")]
    public string Id { get; set; }

    // Path of the uploaded file.
    [SolrField("attr_resourcename")]
    public string Path { get; set; }

    [SolrField("extension_s")]
    public string Extension { get; set; }

    [SolrField("bytes_s")]
    public string Bytes { get; set; }
}

At first I thought I needed to specify a schema.xml, but as I read more (and Solr renamed it to schema.xml.bak) I figured out that Solr 5 now uses the managed-schema.

Then I thought I needed to add those field names manually. But then I saw the naming conventions (albeit I think I saw them in the schema.xml file), and it seems like those conventions should still hold true.

So now I am back to square one, trying to figure out how to get those fields into the extract. Here is the code that actually uploads the file:

using (var fileStream = File.OpenRead(tmp))
{
    _solr.Extract(new ExtractParameters(fileStream, index.Id, index.Path)
    {
        ExtractFormat = ExtractFormat.Text,
        ExtractOnly = false,
        AutoCommit = true
    });
}

tmp is the file path to what I am uploading.

Any help is appreciated!

Solution

When you are new to a project, it helps to split the problem into parts and find out which part is actually failing. Testing end-to-end and then trying to fix something in the middle can be too complicated.

In your case, the easiest first step is probably to dump the contents of your SolrIndexFile object to see whether your code actually populates those values. If it does not, the problem is not Solr but your custom code.
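One way to do that dump is a small reflection helper, sketched below. DebugDump and Describe are hypothetical names made up for this illustration, not part of SolrNet or of the original project; the helper only reads the public properties of whatever object it is given.

```csharp
using System;
using System.Linq;

public static class DebugDump
{
    // Returns "Name=value" pairs for every public, readable, non-indexed
    // property of the object, so empty or null fields are easy to spot.
    public static string Describe(object obj)
    {
        if (obj == null) return "<null object>";

        var pairs = obj.GetType()
            .GetProperties()
            .Where(p => p.CanRead && p.GetIndexParameters().Length == 0)
            .Select(p => p.Name + "=" + (p.GetValue(obj, null) ?? "<null>"));

        return string.Join(", ", pairs);
    }
}
```

Calling Console.WriteLine(DebugDump.Describe(index)) just before the _solr.Extract call would show whether Id, Path, Extension and Bytes hold values at that point (assuming index is the SolrIndexFile instance used in the question).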

If it does, then the question is what happens on the Solr side. In the Admin UI, the Schema Browser screen lets you choose a specific field and see which tokens (the indexed representation) it contains. That way you can check whether any content made it into Solr. If it did not, then you look at the schema and the field mapping. You should also run a basic query and check that new documents are actually showing up, with or without those extra fields. If they are not, you may have several Solr instances, a missing commit, or some other problem.
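The basic query can also be run outside SolrNet, which takes the client library out of the picture. The sketch below builds a /select URL for a single document id and prints the raw JSON response. SolrCheck, BuildSelectUrl and PrintDocument are hypothetical names, and the core URL used in the usage note (http://localhost:8983/solr/mycore) is an assumption; substitute the host, port and core name of your own install.

```csharp
using System;
using System.Net.Http;

public static class SolrCheck
{
    // Builds a /select URL that looks up one document by id and asks for
    // all stored fields (fl=*), so any extra fields show up in the response.
    public static string BuildSelectUrl(string coreUrl, string id)
    {
        // Escape backslashes and quotes so the id is a valid quoted phrase.
        var escapedId = id.Replace("\\", "\\\\").Replace("\"", "\\\"");
        var query = "id:\"" + escapedId + "\"";

        return coreUrl.TrimEnd('/')
            + "/select?q=" + Uri.EscapeDataString(query)
            + "&fl=*&wt=json&indent=true";
    }

    // Fetches the URL and writes the raw response to the console.
    public static void PrintDocument(string coreUrl, string id)
    {
        using (var http = new HttpClient())
        {
            Console.WriteLine(http.GetStringAsync(BuildSelectUrl(coreUrl, id)).Result);
        }
    }
}
```

For example, SolrCheck.PrintDocument("http://localhost:8983/solr/mycore", index.Id) would show whether the document exists at all and which fields were stored for it. An empty result points at a missing commit or a different Solr instance rather than at the field mapping.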

If all of that is fine, then focus on the query side and check whether you are perhaps not asking for those fields, or whether there is some other omission.