
Harvest a collection from an external repository


I'm trying to harvest a collection from an external repository through its OAI service, but the request fails both when I test the URL from the terminal and when I test the harvest configuration in the xmlui module.

In xmlui, testing the collection's harvest configuration returns this:

* OAI server could not be reached.

From the terminal:

C:\dspace5\bin>dspace harvest -g -a https://repositorioaberto.uab.pt/oai/request -i all

Using DSpace installation in: C:\dspace5

Testing basic PMH access:  8967 [main] DEBUG ORG.oclc.oai.harvester2.verb.HarvesterVerb  - requestURL=https://repositorioaberto.uab.pt/oai/request?verb=Identify
23544 [net.sf.ehcache.CacheManager@6d167f58] DEBUG net.sf.ehcache.util.UpdateChecker  - Update check failed: java.net.ConnectException: Connection timed out: connect
invalidAddress: OAI server could not be reached.

Testing ORE support:  31598 [main] DEBUG ORG.oclc.oai.harvester2.verb.HarvesterVerb  - requestURL=https://repositorioaberto.uab.pt/oai/request?verb=Identify
invalidAddress: OAI server could not be reached.
52641 [main] DEBUG net.sf.ehcache.CacheManager  - CacheManager already shutdown

A second test from the terminal, against a different OAI service URL, returns status 200 OK:

Using DSpace installation in: C:\dspace5
Testing basic PMH access:  38726 [main] DEBUG ORG.oclc.oai.harvester2.verb.HarvesterVerb  - requestURL=https://demo.dspace.org/oai/request?verb=Identify
40198 [main] DEBUG ORG.oclc.oai.harvester2.verb.HarvesterVerb  - responseCode=200
40205 [main] DEBUG ORG.oclc.oai.harvester2.verb.HarvesterVerb  - contentEncoding=null
41265 [main] DEBUG ORG.oclc.oai.harvester2.verb.HarvesterVerb  - requestURL=https://demo.dspace.org/oai/request?verb=ListMetadataFormats
41278 [main] DEBUG ORG.oclc.oai.harvester2.verb.HarvesterVerb  - responseCode=200
41284 [main] DEBUG ORG.oclc.oai.harvester2.verb.HarvesterVerb  - contentEncoding=null
41292 [main] DEBUG ORG.oclc.oai.harvester2.verb.HarvesterVerb  - requestURL=https://demo.dspace.org/oai/request?verb=ListMetadataFormats
41350 [main] DEBUG ORG.oclc.oai.harvester2.verb.HarvesterVerb  - responseCode=200
41398 [main] DEBUG ORG.oclc.oai.harvester2.verb.HarvesterVerb  - contentEncoding=null
OK

Testing ORE support:  41541 [main] DEBUG ORG.oclc.oai.harvester2.verb.HarvesterVerb  - requestURL=https://demo.dspace.org/oai/request?verb=Identify
41555 [main] DEBUG ORG.oclc.oai.harvester2.verb.HarvesterVerb  - responseCode=200
41555 [main] DEBUG ORG.oclc.oai.harvester2.verb.HarvesterVerb  - contentEncoding=null
41557 [main] DEBUG ORG.oclc.oai.harvester2.verb.HarvesterVerb  - requestURL=https://demo.dspace.org/oai/request?verb=ListMetadataFormats
41566 [main] DEBUG ORG.oclc.oai.harvester2.verb.HarvesterVerb  - responseCode=200
41570 [main] DEBUG ORG.oclc.oai.harvester2.verb.HarvesterVerb  - contentEncoding=null
41573 [main] DEBUG ORG.oclc.oai.harvester2.verb.HarvesterVerb  - requestURL=https://demo.dspace.org/oai/request?verb=ListMetadataFormats
41591 [main] DEBUG ORG.oclc.oai.harvester2.verb.HarvesterVerb  - responseCode=200
41591 [main] DEBUG ORG.oclc.oai.harvester2.verb.HarvesterVerb  - contentEncoding=null
OK

41604 [main] DEBUG net.sf.ehcache.CacheManager  - CacheManager already shutdown

Is this related to the proxy of my DSpace application, or to the OAI service being harvested?

My proxy settings are configured correctly:

http.proxy.host= http://myproxy.org

http.proxy.port = 8080
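
One way to separate a proxy problem from a problem on the remote server is to send the same Identify request outside DSpace, once directly and once through the proxy. The following is only a minimal sketch in Python using the standard library; the function names `identify_url` and `probe` are hypothetical, and the proxy address in the usage note is assumed from the settings above, not a verified value.

```python
import urllib.request
from urllib.parse import urlencode


def identify_url(base_url):
    """Build the OAI-PMH Identify request URL for a given base URL."""
    return base_url + "?" + urlencode({"verb": "Identify"})


def probe(base_url, proxy=None, timeout=15):
    """Send an Identify request and return the HTTP status code,
    or the error message as a string if the request fails.

    proxy is an assumed value such as "http://myproxy.org:8080";
    pass None to connect directly, ignoring any system proxy settings.
    """
    if proxy:
        handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    else:
        # An empty mapping disables proxies picked up from the environment.
        handler = urllib.request.ProxyHandler({})
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(identify_url(base_url), timeout=timeout) as resp:
            return resp.status
    except OSError as exc:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return str(exc)
```

Calling `probe("https://repositorioaberto.uab.pt/oai/request")` and then `probe("https://repositorioaberto.uab.pt/oai/request", proxy="http://myproxy.org:8080")` shows whether the direct and the proxied request behave differently. If both fail while the demo.dspace.org URL succeeds, the remote service is the more likely cause.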

I also tested from other DSpace environments without a proxy, including demo.dspace.org/xmlui. They return the same error:

  • OAI server could not be reached.

I honestly can't see what the problem is or how to solve it.


Solution

  • When I open your OAI landing page at https://repositorioaberto.uab.pt/oai/request?verb=Identify

    I see that the links for the OAI verbs do not match the original URL:

    http://repositorioaberto.uab.pt/oaiextended/request?verb=ListSets

    Can you explain why "oaiextended" is part of the path? Could this be the source of the problem?
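
    The mismatch between the two URLs can be made explicit with a small comparison. This is only an illustrative sketch in Python; the helper name `url_mismatches` is hypothetical, and it compares nothing more than the scheme, host, and path of the two URLs quoted above.

```python
from urllib.parse import urlsplit


def url_mismatches(requested, advertised):
    """Return the URL components (scheme, host, path) that differ,
    mapped to a (requested, advertised) pair of values."""
    a, b = urlsplit(requested), urlsplit(advertised)
    diffs = {}
    for field in ("scheme", "netloc", "path"):
        if getattr(a, field) != getattr(b, field):
            diffs[field] = (getattr(a, field), getattr(b, field))
    return diffs


diffs = url_mismatches(
    "https://repositorioaberto.uab.pt/oai/request?verb=Identify",
    "http://repositorioaberto.uab.pt/oaiextended/request?verb=ListSets",
)
for field, (req, adv) in diffs.items():
    print(f"{field}: requested {req!r}, advertised {adv!r}")
```

    For these two URLs the host is the same, while the scheme (https vs. http) and the path (/oai/request vs. /oaiextended/request) differ.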