I want to download all 800k pages of my Confluence wiki. I'd like to use:

`curl -u wikiusername:wikipassword "https://wiki.hostname.com/rest/api/content?start=1"`

and simply increase `start` from 1 to 800,000.
However, the response time increases as `start` increases, and from around 150,000 the requests begin to time out:
| start | response time (seconds) |
|---|---|
| 1 | 0.4 |
| 1,000 | 2.5 |
| 10,000 | 9 |
| 50,000 | 112 |
| 100,000 | 286 |
| 200,000 | timeout |
How can I use `rest/api/content` to download all 800k pages of my Confluence wiki without timing out?
Use the `limit` parameter as in developer.atlassian.com/server/confluence/… - Elazaron
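A minimal sketch of the `limit`/`start` pagination loop that comment describes. The fetcher is stubbed out here to keep the example self-contained; in practice it would wrap the authenticated HTTP call, and note that very deep offsets may still be slow on the server side:

```python
def fetch_all(fetch_page, limit=100):
    """Yield every item by walking the offset in steps of `limit`.

    `fetch_page(start, limit)` must return a dict shaped like the
    Confluence REST response: {"results": [...], ...}.
    """
    start = 0
    while True:
        page = fetch_page(start, limit)
        results = page.get("results", [])
        yield from results
        if len(results) < limit:  # a short page means we reached the end
            break
        start += limit


# Stub standing in for the real HTTP call (hypothetical data, no network):
def fake_fetch_page(start, limit):
    total = 250  # pretend the wiki has 250 pages
    ids = range(start, min(start + limit, total))
    return {"results": [{"id": i} for i in ids]}

pages = list(fetch_all(fake_fetch_page, limit=100))
print(len(pages))  # 250
```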
Option 2: Download the wiki space by space, as done by confluence-dumper, a Python 2 script that exports Confluence spaces and pages recursively via the API: https://github.com/siemens/confluence-dumper (mirror).
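A sketch of why the per-space approach avoids the timeout: `spaceKey` is a standard filter on `/rest/api/content`, so each space is paginated independently and `start` never grows into the slow range. The base URL and space keys below are placeholders:

```python
import urllib.parse

def space_content_url(base, space_key, start=0, limit=100):
    """Build a /rest/api/content URL filtered to one space.

    Filtering by spaceKey keeps each pagination run short, so `start`
    stays small even when the wiki as a whole has 800k pages.
    """
    query = urllib.parse.urlencode(
        {"spaceKey": space_key, "start": start, "limit": limit}
    )
    return f"{base}/rest/api/content?{query}"


# Hypothetical host and space keys, for illustration only:
base = "https://wiki.hostname.com"
for key in ["DOCS", "ENG"]:
    print(space_content_url(base, key))
```

In practice you would first list the spaces (e.g. via `/rest/api/space`) and then run the pagination loop once per space key.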
I confirm option 2 works.