GraphDB Free/9.4.1, RDF4J/3.3.1
I'm working on using the /rest/data/import/server/{repo-id} endpoint to initiate the importing of an RDF/XML file.
Steps:
put SysML.owl in the ${graphdb.workbench.importDirectory} directory. chmod a+r SysML.owl
create repository test1 (in Workbench - using all defaults except RepositoryID := "test1")
curl http://127.0.0.1:7200/rest/data/import/server/test1 => as expected: [{"name":"SysML.owl","status":"NONE"..."timestamp":1606848520821,...]
curl -XPOST --header 'Content-Type: application/json' --header 'Accept: application/json' -d ' { "fileNames":[ "SysML.owl" ] }' http://127.0.0.1:7200/rest/data/import/server/test1 => SC==202
after 60 seconds, curl http://127.0.0.1:7200/rest/data/import/server/test1 => [{"name":"SysML.owl","status":"DONE","message":"Imported successfully in 7s.","context":null,"replaceGraphs":[],"baseURI": "file:/home/steve/graphdb-import/SysML.owl", "forceSerial":false,"type":"file","format":null,"data":null,"timestamp": 1606848637716, [...other json content deleted] Repository test1 now has the 263,119 (824 inferred) statements from SysML.owl loaded
BUT if I then
The main-2020-12-01.log shows the INFO messages pertaining to the repository test1, plugin registrations, etc. Nothing indicating why the prior repository instance's import status is lingering.
And this is of concern because I was expecting to use some polling of the status to determine when the data is loaded so my processing can proceed. Some good news - I can issue the import server file request again and after waiting 60 seconds, the 263,119 statements are present. But the timestamp on the import is the earlier repo instance's timestamp. It was not reset via the latest import request.
I'm probably missing some cleanup step(s), am hoping someone knows which.
Thanks, -Steve
The status is simply for your reference and doesn't represent the actual presence of data in the repository. You could achieve a similar thing simply by clearing all data in the repository without recreating it.
If you really need to rely on those status records you can clear the status for a given file once you polled it and determined it's done (or prior to starting an import) with this curl:
curl -X DELETE http://127.0.0.1:7200/rest/data/import/server/test1/status \
-H 'content-type: application/json' -d '["SysML.owl"]'
Note that this is an undocumented API and it may change without notice.