I want to ask about a good triplestore to use for large datasets, it should:
You should consider using the OpenLink Virtuoso store. It is available via an OpenSource license and scales to billions of triples. You can use it via the Sesame and Jena APIs.
See here for an overview of large scale triple stores. Virtuoso is definitely easier to set up than BigData. Beside that I have used the Sesame NativeStore, which doesn't scale too well.
4Store is also a good choice, although I haven't used it. One benefit of Virtuoso over 4Store is that you can easily mix standard relational models with RDF, since Virtuoso is under the hood a relational database.