Add new datastore during upgrade


I want to migrate binary content from the segmentstore to a new datastore during a repository upgrade. My repository is currently at version 1.6.1; it uses a TarMK segmentstore and has no datastore. In my experience, keeping binaries in a file datastore gives a considerable performance boost, so I want to introduce one as part of the upgrade to 1.26.0. But how does one copy binaries to a new datastore?

This command migrates my content to the new repository. The application loads the content properly. But there is no datastore.

$ java -jar ~/oak-upgrade-1.26.0.jar --copy-binaries \
    --copy-orphaned-versions=false --copy-versions=false \
    --include-paths=/apps/lsa,/content/lsa,/var/recyclebin/content/lsa \
    ../dev-jun-author/sling/repository sling/repository

When I try to copy binaries to a new datastore using this command:

$ java -jar ~/oak-upgrade-1.26.0.jar --copy-binaries \
    --include-paths=/apps/lsa,/content/lsa,/var/recyclebin/content/lsa \
    --datastore=sling/repository/datastore ../dev-jun-author/sling/repository sling/repository

it produces the following output:

24.08.2020 16:43:12.263 [main] INFO org.apache.jackrabbit.oak.upgrade.cli.parser.MigrationOptions - copyVersions parameter set to 1969-12-31
24.08.2020 16:43:12.265 [main] INFO org.apache.jackrabbit.oak.upgrade.cli.parser.MigrationOptions - copyOrphanedVersions parameter set to 1969-12-31
24.08.2020 16:43:12.265 [main] INFO org.apache.jackrabbit.oak.upgrade.cli.parser.MigrationOptions - paths to include: [/apps/lsa, /content/lsa, /var/recyclebin/content/lsa]
24.08.2020 16:43:12.265 [main] INFO org.apache.jackrabbit.oak.upgrade.cli.parser.MigrationOptions - Cache size: 256 MB
24.08.2020 16:43:12.269 [main] INFO org.apache.jackrabbit.oak.upgrade.cli.parser.StoreArguments - Source: SEGMENT_TAR[../dev-jun-author/sling/repository]
24.08.2020 16:43:12.271 [main] INFO org.apache.jackrabbit.oak.upgrade.cli.parser.StoreArguments - Destination: SEGMENT_TAR[sling/repository]
24.08.2020 16:43:12.296 [main] INFO org.apache.jackrabbit.oak.segment.file.FileStore - Creating file store FileStoreBuilder{version=1.26.0, directory=../dev-jun-author/sling/repository/segmentstore, blobStore=null, maxFileSize=256, segmentCacheSize=256, stringCacheSize=256, templateCacheSize=64, stringDeduplicationCacheSize=15000, templateDeduplicationCacheSize=3000, nodeDeduplicationCacheSize=1048576, memoryMapping=false, offHeapAccess=false, gcOptions=SegmentGCOptions{paused=false, estimationDisabled=false, gcSizeDeltaEstimation=1073741824, retryCount=5, forceTimeout=60, retainedGenerations=2, gcType=FULL}}
24.08.2020 16:43:12.428 [main] INFO org.apache.jackrabbit.oak.segment.file.ReadOnlyFileStore - TarMK ReadOnly opened: ../dev-jun-author/sling/repository/segmentstore (mmap=false)
24.08.2020 16:43:12.601 [main] INFO org.apache.jackrabbit.oak.segment.file.ReadOnlyFileStore - TarMK closed: ../dev-jun-author/sling/repository/segmentstore
24.08.2020 16:43:12.618 [main] INFO org.apache.jackrabbit.oak.upgrade.cli.parser.DatastoreArguments - Blobs embedded in SEGMENT_TAR[../dev-jun-author/sling/repository] will be copied to FileDataStore[sling/repository/datastore]
24.08.2020 16:43:12.619 [main] INFO org.apache.jackrabbit.oak.upgrade.cli.parser.DatastoreArguments - Source blob store: DummyBlobStore
24.08.2020 16:43:12.619 [main] INFO org.apache.jackrabbit.oak.segment.file.FileStore - Creating file store FileStoreBuilder{version=1.26.0, directory=../dev-jun-author/sling/repository/segmentstore, blobStore=null, maxFileSize=256, segmentCacheSize=256, stringCacheSize=256, templateCacheSize=64, stringDeduplicationCacheSize=15000, templateDeduplicationCacheSize=3000, nodeDeduplicationCacheSize=1048576, memoryMapping=true, offHeapAccess=false, gcOptions=SegmentGCOptions{paused=false, estimationDisabled=false, gcSizeDeltaEstimation=1073741824, retryCount=5, forceTimeout=60, retainedGenerations=2, gcType=FULL}}
24.08.2020 16:43:12.632 [main] INFO org.apache.jackrabbit.oak.segment.file.ReadOnlyFileStore - TarMK ReadOnly opened: ../dev-jun-author/sling/repository/segmentstore (mmap=true)
24.08.2020 16:43:12.635 [main] INFO org.apache.jackrabbit.oak.segment.SegmentNodeStore$SegmentNodeStoreBuilder - Creating segment node store SegmentNodeStoreBuilder{blobStore=inline}
24.08.2020 16:43:12.644 [main] INFO org.apache.jackrabbit.oak.segment.scheduler.LockBasedScheduler - Initializing SegmentNodeStore with the commitFairLock option enabled.
24.08.2020 16:43:12.654 [main] INFO org.apache.jackrabbit.oak.upgrade.cli.parser.DatastoreArguments - Destination blob store: FileDataStore[sling/repository/datastore]
24.08.2020 16:43:12.665 [main] INFO org.apache.jackrabbit.oak.segment.file.FileStore - Creating file store FileStoreBuilder{version=1.26.0, directory=sling/repository/segmentstore, blobStore=DataStore backed BlobStore [org.apache.jackrabbit.oak.plugins.blob.datastore.OakFileDataStore], maxFileSize=256, segmentCacheSize=256, stringCacheSize=256, templateCacheSize=64, stringDeduplicationCacheSize=15000, templateDeduplicationCacheSize=3000, nodeDeduplicationCacheSize=1048576, memoryMapping=true, offHeapAccess=false, gcOptions=SegmentGCOptions{paused=false, estimationDisabled=false, gcSizeDeltaEstimation=1073741824, retryCount=5, forceTimeout=60, retainedGenerations=2, gcType=FULL}}

Sling's error log shows this when I request an image:

25.08.2020 10:32:35.030 ERROR [0:0:0:0:0:0:0:1 [1598365955029] GET /content/lsa/assets/Screen%20Shot%202020-04-15%20at%204.06.14%20PM.png HTTP/1.1] org.apache.sling.engine.impl.SlingRequestProcessorImpl service: Uncaught Throwable
java.lang.IllegalStateException: Attempt to read external blob with blobId [d4c06d8a7e0b3381caa5c918d6403319d603dd153e5ce32f663e940af55f0326#815933] without specifying BlobStore
    at org.apache.jackrabbit.oak.segment.SegmentBlob.getBlob(SegmentBlob.java:248) [org.apache.jackrabbit.oak-segment-tar:1.26.0]
    at org.apache.jackrabbit.oak.segment.SegmentBlob.getNewStream(SegmentBlob.java:253) [org.apache.jackrabbit.oak-segment-tar:1.26.0]
    at org.apache.jackrabbit.oak.segment.SegmentBlob.getNewStream(SegmentBlob.java:84) [org.apache.jackrabbit.oak-segment-tar:1.26.0]
    at org.apache.jackrabbit.oak.plugins.value.jcr.BinaryImpl.getStream(BinaryImpl.java:59) [org.apache.jackrabbit.oak-store-spi:1.26.0]
    at org.apache.sling.jcr.resource.internal.helper.LazyInputStream.getStream(LazyInputStream.java:106) [org.apache.sling.jcr.resource:3.0.20]
    at org.apache.sling.jcr.resource.internal.helper.LazyInputStream.read(LazyInputStream.java:65) [org.apache.sling.jcr.resource:3.0.20]
    at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2314) [org.apache.commons.io:2.6.0]
    at org.apache.commons.io.IOUtils.copy(IOUtils.java:2270) [org.apache.commons.io:2.6.0]
    at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2291) [org.apache.commons.io:2.6.0]
    at org.apache.commons.io.IOUtils.copy(IOUtils.java:2246) [org.apache.commons.io:2.6.0]
    at com.peregrine.rendition.RenditionsServlet$StreamResponse.writeTo(RenditionsServlet.java:191) [com.peregrine-cms.base.core:1.0.0.SNAPSHOT]
    at com.peregrine.commons.servlets.AbstractBaseServlet.doRequest(AbstractBaseServlet.java:133)
    at com.peregrine.commons.servlets.AbstractBaseServlet.doGet(AbstractBaseServlet.java:82)
    at org.apache.sling.api.servlets.SlingSafeMethodsServlet.mayService(SlingSafeMethodsServlet.java:266) [org.apache.sling.api:2.22.0]
    at org.apache.sling.api.servlets.SlingAllMethodsServlet.mayService(SlingAllMethodsServlet.java:137) [org.apache.sling.api:2.22.0]
    at org.apache.sling.api.servlets.SlingSafeMethodsServlet.service(SlingSafeMethodsServlet.java:342) [org.apache.sling.api:2.22.0]
    at org.apache.sling.api.servlets.SlingSafeMethodsServlet.service(SlingSafeMethodsServlet.java:374) [org.apache.sling.api:2.22.0]
    at org.apache.sling.engine.impl.request.RequestData.service(RequestData.java:552) [org.apache.sling.engine:2.7.2]
    at org.apache.sling.engine.impl.filter.SlingComponentFilterChain.render(SlingComponentFilterChain.java:44) [org.apache.sling.engine:2.7.2]
    at org.apache.sling.engine.impl.filter.AbstractSlingFilterChain.doFilter(AbstractSlingFilterChain.java:82) [org.apache.sling.engine:2.7.2]
    at org.apache.sling.engine.impl.SlingRequestProcessorImpl.processComponent(SlingRequestProcessorImpl.java:283) [org.apache.sling.engine:2.7.2]
    at org.apache.sling.engine.impl.filter.RequestSlingFilterChain.render(RequestSlingFilterChain.java:49) [org.apache.sling.engine:2.7.2]
    at org.apache.sling.engine.impl.filter.AbstractSlingFilterChain.doFilter(AbstractSlingFilterChain.java:82) [org.apache.sling.engine:2.7.2]
    at org.apache.sling.engine.impl.debug.RequestProgressTrackerLogFilter.doFilter(RequestProgressTrackerLogFilter.java:110) [org.apache.sling.engine:2.7.2]
    at org.apache.sling.engine.impl.filter.AbstractSlingFilterChain.doFilter(AbstractSlingFilterChain.java:72) [org.apache.sling.engine:2.7.2]
    at org.apache.sling.i18n.impl.I18NFilter.doFilter(I18NFilter.java:131) [org.apache.sling.i18n:2.5.14]
    at org.apache.sling.engine.impl.filter.AbstractSlingFilterChain.doFilter(AbstractSlingFilterChain.java:72) [org.apache.sling.engine:2.7.2]
    at org.apache.sling.engine.impl.filter.AbstractSlingFilterChain.doFilter(AbstractSlingFilterChain.java:78) [org.apache.sling.engine:2.7.2]
    at org.apache.sling.engine.impl.SlingRequestProcessorImpl.doProcessRequest(SlingRequestProcessorImpl.java:151) [org.apache.sling.engine:2.7.2]
    at org.apache.sling.engine.impl.SlingMainServlet.service(SlingMainServlet.java:250) [org.apache.sling.engine:2.7.2]
    at org.apache.felix.http.base.internal.handler.ServletHandler.handle(ServletHandler.java:123) [org.apache.felix.http.jetty:4.0.18]
    at org.apache.felix.http.base.internal.dispatch.InvocationChain.doFilter(InvocationChain.java:86) [org.apache.felix.http.jetty:4.0.18]
    at org.apache.sling.junit.impl.servlet.TestLogServlet$TestNameLoggingFilter.doFilter(TestLogServlet.java:257) [org.apache.sling.junit.core:1.0.26]
    at org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandler.java:142) [org.apache.felix.http.jetty:4.0.18]
    at org.apache.felix.http.base.internal.dispatch.InvocationChain.doFilter(InvocationChain.java:81) [org.apache.felix.http.jetty:4.0.18]
    at org.apache.sling.i18n.impl.I18NFilter.doFilter(I18NFilter.java:131) [org.apache.sling.i18n:2.5.14]
    at org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandler.java:142) [org.apache.felix.http.jetty:4.0.18]
    at org.apache.felix.http.base.internal.dispatch.InvocationChain.doFilter(InvocationChain.java:81) [org.apache.felix.http.jetty:4.0.18]
    at org.apache.sling.engine.impl.log.RequestLoggerFilter.doFilter(RequestLoggerFilter.java:75) [org.apache.sling.engine:2.7.2]
    at org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandler.java:142) [org.apache.felix.http.jetty:4.0.18]
    at org.apache.felix.http.base.internal.dispatch.InvocationChain.doFilter(InvocationChain.java:81) [org.apache.felix.http.jetty:4.0.18]
    at org.apache.sling.engine.impl.parameters.RequestParameterSupportConfigurer.doFilter(RequestParameterSupportConfigurer.java:67) [org.apache.sling.engine:2.7.2]
    at org.apache.felix.http.base.internal.handler.FilterHandler.handle(FilterHandler.java:142) [org.apache.felix.http.jetty:4.0.18]
    at org.apache.felix.http.base.internal.dispatch.InvocationChain.doFilter(InvocationChain.java:81) [org.apache.felix.http.jetty:4.0.18]
    at org.apache.felix.http.base.internal.dispatch.Dispatcher$1.doFilter(Dispatcher.java:146) [org.apache.felix.http.jetty:4.0.18]
    at org.apache.felix.http.base.internal.whiteboard.WhiteboardManager$2.doFilter(WhiteboardManager.java:1002) [org.apache.felix.http.jetty:4.0.18]
    at org.apache.felix.http.sslfilter.internal.SslFilter.doFilter(SslFilter.java:97) [org.apache.felix.http.sslfilter:1.2.6]
    at org.apache.felix.http.base.internal.handler.PreprocessorHandler.handle(PreprocessorHandler.java:136) [org.apache.felix.http.jetty:4.0.18]
    at org.apache.felix.http.base.internal.whiteboard.WhiteboardManager$2.doFilter(WhiteboardManager.java:1008) [org.apache.felix.http.jetty:4.0.18]
    at org.apache.felix.http.base.internal.whiteboard.WhiteboardManager.invokePreprocessors(WhiteboardManager.java:1012) [org.apache.felix.http.jetty:4.0.18]
    at org.apache.felix.http.base.internal.dispatch.Dispatcher.dispatch(Dispatcher.java:91) [org.apache.felix.http.jetty:4.0.18]
    at org.apache.felix.http.base.internal.dispatch.DispatcherServlet.service(DispatcherServlet.java:49) [org.apache.felix.http.jetty:4.0.18]
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:725) [org.apache.felix.http.servlet-api:1.1.2]
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:763) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:551) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1610) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1363) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:489) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1580) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1278) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.server.Server.handle(Server.java:500) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:273) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:375) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806) [org.apache.felix.http.jetty:4.0.18]
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938) [org.apache.felix.http.jetty:4.0.18]
    at java.base/java.lang.Thread.run(Thread.java:834)

I can see that the new repository has a datastore containing some folders, but images in my Sling app do not load. How can I migrate binaries to a datastore using oak-upgrade or another utility?
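One way to tell whether the blob named in the exception actually reached the datastore is to compute where a file datastore would keep it and look there. The sketch below assumes the usual Jackrabbit FileDataStore layout (three directory levels taken from the first six hex characters of the identifier) and that the part after `#` in the blob ID is the length rather than part of the file name. `blob_path` is a hypothetical helper, not part of oak-upgrade.

```shell
# Hypothetical helper: print the path a FileDataStore is assumed to use
# for a given blob ID (layout <store>/ab/cd/ef/<identifier>).
# $1 = datastore directory, $2 = blob ID as printed in the exception
blob_path() (
  store_dir=$1
  id=${2%%#*}                               # drop the "#<length>" suffix
  d1=$(printf '%s' "$id" | cut -c1-2)       # first directory level
  d2=$(printf '%s' "$id" | cut -c3-4)       # second directory level
  d3=$(printf '%s' "$id" | cut -c5-6)       # third directory level
  printf '%s/%s/%s/%s/%s\n' "$store_dir" "$d1" "$d2" "$d3" "$id"
)

# Blob ID taken from the stack trace above; datastore path from the command.
blob_path sling/repository/datastore \
  'd4c06d8a7e0b3381caa5c918d6403319d603dd153e5ce32f663e940af55f0326#815933'
```

If `ls -l "$(blob_path ...)"` finds a file of the expected size, the copy itself worked and the problem is more likely in how the running instance is configured to find the datastore.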


Solution

  • I did this once a long time ago. Your command to split the repository looks correct. After running the oak-upgrade tool, you can check the datastore folder to see whether files were created.

    Don't forget to create the needed configuration files to use the file datastore before starting Sling on the new (split) repository location.

    Create an org.apache.jackrabbit.oak.plugins.blob.datastore.FileDataStore.config file in the install folder which contains at least this property:

    path="./repository/datastore"
    

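    Also create an org.apache.jackrabbit.oak.segment.SegmentNodeStoreService.config file in the install folder. (That is the PID for oak-segment-tar, which your 1.26.0 stack trace shows is in use; org.apache.jackrabbit.oak.plugins.segment.SegmentNodeStoreService is the PID of the older oak-segment module.) The file must contain at least this property: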

    customBlobStore=B"true"
    

    The minRecordLength used by default by the oak-upgrade tool is 16384 bytes.

    See https://jackrabbit.apache.org/oak/docs/osgi_config.html#config-sling for more details.

    I would also advise against doing a partial content migration when splitting off the binaries. Everything in Sling is content, and that includes your OSGi bundles as well.
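The two configuration files described above can be created by hand, or with a small script along these lines. This is only a sketch: `write_oak_configs` is a hypothetical helper, the install folder location depends on how your Sling instance is laid out, and the SegmentNodeStoreService PID is passed in as a parameter because it depends on which segment module your Oak version ships (the stack trace in the question shows oak-segment-tar 1.26.0, whose classes live under org.apache.jackrabbit.oak.segment).

```shell
# Hypothetical helper: write the two OSGi .config files into an install folder.
# $1 = install folder (assumed location, adjust to your instance)
# $2 = datastore path as the running instance should see it
# $3 = PID of the SegmentNodeStoreService for your Oak version
write_oak_configs() (
  install_dir=$1
  datastore_path=$2
  segment_pid=$3
  mkdir -p "$install_dir" || exit 1

  # Tell the FileDataStore where the migrated binaries live.
  printf 'path="%s"\n' "$datastore_path" \
    > "$install_dir/org.apache.jackrabbit.oak.plugins.blob.datastore.FileDataStore.config" || exit 1

  # Tell the segment node store to use an external blob store.
  printf 'customBlobStore=B"true"\n' \
    > "$install_dir/$segment_pid.config" || exit 1
)

# Example call (paths are assumptions, run it before starting Sling):
# write_oak_configs sling/install ./repository/datastore \
#   org.apache.jackrabbit.oak.segment.SegmentNodeStoreService
```

Passing the folder and PID as arguments keeps the script from hard-coding anything that differs between installations.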