elasticsearch elasticsearch-plugin elasticsearch-7 fscrawler

FSCrawler can't find existing jobs


I'm quite new to the Elastic Stack and want to index documents using FSCrawler. I'm running into a strange problem:

I create a new job and get a confirmation that it has been successfully created. I can see the newly created folder with the job name.

The problem is that FSCrawler somehow can't find the newly generated job.

I generate the job with the following command in PowerShell:

PS C:\ELK\fscrawler> bin/fscrawler testJobLaaKii
10:22:28,708 INFO  [f.p.e.c.f.c.BootstrapChecks] Memory [Free/Total=Percent]: HEAP [8.4mb/247.5mb=3.43%], RAM [2.4gb/7.8gb=31.33%], Swap [4.6gb/12.5gb=37.33%].
10:22:28,724 WARN  [f.p.e.c.f.c.FsCrawlerCli] job [testJobLaaKii] does not exist
10:22:28,726 INFO  [f.p.e.c.f.c.FsCrawlerCli] Do you want to create it (Y/N)?
y
10:22:31,190 INFO  [f.p.e.c.f.c.FsCrawlerCli] Settings have been created in [C:\Users\<username>\.fscrawler\testJobLaaKii\_settings.yaml]. Please review and edit before relaunch

But whenever I want to start it, it seems like FSCrawler can't find it.

PS C:\ELK\fscrawler> bin/fscrawler
10:24:49,361 INFO  [f.p.e.c.f.c.BootstrapChecks] Memory [Free/Total=Percent]: HEAP [8.6mb/247.5mb=3.48%], RAM [2.4gb/7.8gb=31.38%], Swap [4.6gb/12.5gb=37.06%].
10:24:49,373 INFO  [f.p.e.c.f.c.FsCrawlerCli] No job specified. Here is the list of existing jobs:
10:24:49,378 INFO  [f.p.e.c.f.c.FsCrawlerCli] No job exists in [C:\Users\<username>\.fscrawler].
10:24:49,378 INFO  [f.p.e.c.f.c.FsCrawlerCli] To create your first job, run 'fscrawler job_name' with 'job_name' you want

Even though the job is clearly created:

[Screenshot: file system with the newly generated job]


Solution

  • After finding this video: Indexing many PDF files for full-text search using Elasticsearch

    I solved it by using the command shown in the video:

    bin\fscrawler --config_dir ./DS data_science --loop 1
    

    instead of my shorter version. I can't tell what the problem with the shorter version is, and I still can't see my jobs listed when executing bin\fscrawler, but somehow it works.
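
Since the job list stays empty even though the folder exists, it may help to check what is actually on disk in the config directory. The following is only a diagnostic sketch in Python, not part of FSCrawler: `list_job_folders` is a hypothetical helper, and it assumes that a job is a subfolder of the config directory containing a settings file. The name `_settings.yaml` comes from the log output in the question; `_settings.json` is an assumption about other FSCrawler versions.

```python
from pathlib import Path

# "_settings.yaml" appears in the log output of the question;
# "_settings.json" is an assumed alternative name, not confirmed here.
SETTINGS_NAMES = ("_settings.yaml", "_settings.json")


def list_job_folders(config_dir):
    """Map each subfolder of config_dir to the settings files found in it.

    Hypothetical helper for diagnosis only; it does not reproduce
    FSCrawler's own job lookup.
    """
    root = Path(config_dir)
    if not root.is_dir():
        return {}
    jobs = {}
    for child in sorted(root.iterdir()):
        if child.is_dir():
            jobs[child.name] = [
                name for name in SETTINGS_NAMES if (child / name).is_file()
            ]
    return jobs


if __name__ == "__main__":
    # Default location reported in the logs: <user home>\.fscrawler
    config_dir = Path.home() / ".fscrawler"
    for job_name, found in list_job_folders(config_dir).items():
        print(job_name, found if found else "no settings file found")
```

If the job folder and its settings file show up here while `bin\fscrawler` still reports no jobs, the files themselves are in place, and naming the config directory and job explicitly, as in the command above, is a reasonable thing to try.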