After doing some research on JCR versus an RDBMS, and reading other posts, I am still uncertain whether to use JCR over JPA for a document management system that has to deal with different document types, very large files and a lot of concurrent access from many users.
My main reason for considering JCR is that documents look like content to me, and the specification already deals with some of the problems that come with it - mostly I am interested in storage and versioning. I would also like to encapsulate the document handling within a JCR implementation and use JPA for everything else that is application specific.
Maybe someone can help me with my remaining questions:
UPDATE: even though this question has been answered in detail, somebody might have a more critical view of its use from a more practical standpoint. Personally I am getting more and more concerned about the following non-technical issues:
Short version: Documents are structured or semi-structured content. That's THE use case for a hierarchically organized data store. You should go for JCR if you don't want to implement all the basic DMS/CMS functionality yourself (consider this: you're probably doing it for the first time, while the JCR implementers have been doing it all along).
Long version: JCR covers many of the basic use cases of document or content management systems by specification, like versioning, locking, lifecycle management or referential integrity. It also allows you to extend your data without changing the schema (of course you can define your node types in a model, but you don't have to). Most JCR implementations (like Jackrabbit) use a database in the backend, making them "little more" than an abstraction layer over your relational backend. To deal with large data, you can use filesystem storage for the binaries (which is much faster than storing all binary data in the database) while storing the structured data (nodes and properties) in the database.
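To make the "extend your data without changing the schema" and versioning points a bit more concrete, here is a minimal sketch. It is a toy in-memory model, not the javax.jcr API: the names ContentNode, addNode, setProperty, checkin and version are hypothetical and only loosely echo JCR vocabulary. A real repository adds persistence, node types, locking and a proper version history on top of this idea.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for a content node (hypothetical, NOT the javax.jcr API).
final class ContentNode {
    private final String name;
    // Free-form properties: a new key needs no schema change.
    private final Map<String, Object> properties = new LinkedHashMap<>();
    private final Map<String, ContentNode> children = new LinkedHashMap<>();
    // Each check-in stores a frozen copy of the properties.
    private final List<Map<String, Object>> versions = new ArrayList<>();

    ContentNode(String name) { this.name = name; }

    String name() { return name; }

    // Returns the existing child with that name, or creates it.
    ContentNode addNode(String childName) {
        return children.computeIfAbsent(childName, ContentNode::new);
    }

    ContentNode child(String childName) { return children.get(childName); }

    void setProperty(String key, Object value) { properties.put(key, value); }

    Object property(String key) { return properties.get(key); }

    // Snapshots the current properties and returns the 1-based version number.
    int checkin() {
        versions.add(new LinkedHashMap<>(properties));
        return versions.size();
    }

    // Read-only view of an earlier snapshot.
    Map<String, Object> version(int number) {
        return Collections.unmodifiableMap(versions.get(number - 1));
    }
}

final class ContentNodeDemo {
    static ContentNode sample() {
        ContentNode root = new ContentNode("");
        ContentNode doc = root.addNode("documents").addNode("contract-001");
        doc.setProperty("title", "Supplier contract");
        doc.checkin();                          // version 1
        doc.setProperty("reviewedBy", "alice"); // new property, no migration
        doc.checkin();                          // version 2
        return root;
    }
}
```

With JPA, the reviewedBy property in this sketch would typically mean a new column or a generic key/value table plus a migration, and the snapshot list would become a hand-written history table.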
When going for JPA you have to deal with all this DMS/CMS functionality yourself. Of course you can do it, but it is much more low-level programming that has already been done in the JCR implementation. Every model change requires a schema change, and the table layout is not that trivial (do you want one big table for your documents, with every property being a column? Do you want a separate table for each document class? How do you model lifecycles, how do you model versioning?)
For your first steps with JCR, I'd recommend David's Model: consider everything in your application as content. I worked on a project where we decided against a mix of JCR and JPA so that we didn't have to deal with two different storage APIs.
And there are at least some JCR implementations out there
By the way, the JCR API and its implementations were designed pretty much with a RESTful architecture in mind. So if you are considering a REST API, the mapping is rather simple, too. Further, it allows consumers to explore the content directly via the JCR API, making it easy to integrate the content into other applications (e.g. read-only), whereas with JPA you have to reveal the internal design of your database, making consumer contracts more likely to break on changes.
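The reason the REST mapping is simple is that a resource URL and a node path have the same shape, so resolving a request is just a walk down the tree. A minimal sketch of that idea follows, again with a toy in-memory tree rather than the real JCR API; RepoNode and PathResolver are hypothetical names introduced only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Toy hierarchical node (hypothetical, NOT the javax.jcr API).
final class RepoNode {
    final String name;
    final Map<String, RepoNode> children = new LinkedHashMap<>();

    RepoNode(String name) { this.name = name; }

    // Returns the existing child with that name, or creates it.
    RepoNode add(String childName) {
        return children.computeIfAbsent(childName, RepoNode::new);
    }
}

final class PathResolver {
    // Walks the tree one URL segment at a time.
    // Returns empty as soon as a segment has no matching child.
    static Optional<RepoNode> resolve(RepoNode root, String urlPath) {
        RepoNode current = root;
        for (String segment : urlPath.split("/")) {
            if (segment.isEmpty()) {
                continue; // skips the empty piece before the leading slash
            }
            current = current.children.get(segment);
            if (current == null) {
                return Optional.empty();
            }
        }
        return Optional.of(current);
    }
}
```

A GET on /documents/invoices/inv-001 would then map to PathResolver.resolve(root, "/documents/invoices/inv-001"), with an empty result translating naturally to a 404. With a relational model the same URL usually has to be translated into joins over tables the consumer should not need to know about.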
Regarding your remaining questions: