I'm quite new to the Elastic Stack and want to index documents using FSCrawler. I'm running into a strange problem:
I create a new job and get a confirmation that it has been successfully created. I can see the newly created folder named after the job.
The problem is that FSCrawler somehow can't find the newly created job.
I create the job with the following command in PowerShell:
PS C:\ELK\fscrawler> bin/fscrawler testJobLaaKii
10:22:28,708 INFO [f.p.e.c.f.c.BootstrapChecks] Memory [Free/Total=Percent]: HEAP [8.4mb/247.5mb=3.43%], RAM [2.4gb/7.8gb=31.33%], Swap [4.6gb/12.5gb=37.33%].
10:22:28,724 WARN [f.p.e.c.f.c.FsCrawlerCli] job [testJobLaaKii] does not exist
10:22:28,726 INFO [f.p.e.c.f.c.FsCrawlerCli] Do you want to create it (Y/N)?
y
10:22:31,190 INFO [f.p.e.c.f.c.FsCrawlerCli] Settings have been created in [C:\Users\<username>\.fscrawler\testJobLaaKii\_settings.yaml]. Please review and edit before relaunch
But whenever I try to start it, FSCrawler doesn't seem to find it:
PS C:\ELK\fscrawler> bin/fscrawler
10:24:49,361 INFO [f.p.e.c.f.c.BootstrapChecks] Memory [Free/Total=Percent]: HEAP [8.6mb/247.5mb=3.48%], RAM [2.4gb/7.8gb=31.38%], Swap [4.6gb/12.5gb=37.06%].
10:24:49,373 INFO [f.p.e.c.f.c.FsCrawlerCli] No job specified. Here is the list of existing jobs:
10:24:49,378 INFO [f.p.e.c.f.c.FsCrawlerCli] No job exists in [C:\Users\<username>\.fscrawler].
10:24:49,378 INFO [f.p.e.c.f.c.FsCrawlerCli] To create your first job, run 'fscrawler job_name' with 'job_name' you want
Even though the job has clearly been created (the job folder exists on disk).
So, after finding this video: Indexing many PDF files for full-text search using Elasticsearch
I solved it by using the command shown in the video:
bin\fscrawler --config_dir ./DS data_science --loop 1
instead of my shorter version. I can't tell what the problem with the shorter version is, and I still can't see my jobs listed when executing bin\fscrawl,
but somehow it works...
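
For anyone debugging the same thing: a quick way to check what is actually on disk, independent of FSCrawler's own listing, is a small script like the one below. It is only a sketch. It assumes a job is simply a subfolder containing a _settings.yaml file, which is what the log message above suggests; it does not reproduce FSCrawler's real lookup logic. The function name list_job_dirs is made up for this example.

```python
from __future__ import annotations

from pathlib import Path

# Assumption: a "job" is a subdirectory holding this file, as the
# "Settings have been created in [...]" log line above suggests.
SETTINGS_FILE = "_settings.yaml"


def list_job_dirs(config_dir: Path) -> list[str]:
    """Return the names of subfolders of config_dir that contain a settings file."""
    if not config_dir.is_dir():
        return []
    return sorted(
        child.name
        for child in config_dir.iterdir()
        if child.is_dir() and (child / SETTINGS_FILE).is_file()
    )


if __name__ == "__main__":
    # The default location reported in the log output above.
    default_dir = Path.home() / ".fscrawler"
    print(f"Config dir: {default_dir}")
    jobs = list_job_dirs(default_dir)
    if jobs:
        for job in jobs:
            print(f"  job folder: {job}")
    else:
        print("  no job folders with a settings file found")
```

If this lists the job folder while bin/fscrawler still reports no jobs, the files themselves are in place and the difference lies in how the jobs are being looked up. To inspect a custom directory such as the ./DS passed to --config_dir above, call list_job_dirs(Path("./DS")) instead.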