Search code examples
githubgithub-apipastebinopendata

GitHub to share a set of SPARQL queries


I am using github to share a set of SPARQL queries:

http://www.boisvert.me.uk/opendata/sparql_aq+.html?file=specific%20sensor.txt

Currently the simple work allows end-users to access queries stored on the github repository, but ultimately I want to allow them to also modify the queries, as with a pastebin, and make use of the repository to better manage the shared system. Ideally I would want end-users who may not be very tech-savvy, to be able to make minor changes to queries to an open, linked data endpoint: so to keep the technology barrier low.

My problem is this: how best to structure the github project and exploit the API to make the most of the available information? I can think of different points:

  • Currently the project (https://github.com/boisvert/unshaql) holds client code and example queries. Does it make a difference to create an independent project (separate from the web client code) for SPARQL queries?
  • I would use directories within the project to classify/tag queries, and file names to title them. Are there better alternatives? It strikes me that a hierarchical structure is not a good fit to tags.
  • When end-users save, a simpler (and cruder) option is to allow them to push their file into just one branch, which holds the examples. A better engineered one would be to allow them to use their github credentials to fork the set of SPARQL queries and edit theirs, but with unaware users, how do I avoid creating a mess?

Solution

  • I think that a rigular Github repository is a rather bad fit for this kind of content. If your users have a GitHub account, you should probably use Gists instead: https://help.github.com/articles/about-gists/ I never used this myself, but it seems perfectly adapted to what you are planning. Your site could become a DB of tags over user-provided gists. That would however lock you into GitHub-specific solutions.

    Even if you go for a regular repository, you should not allow the users to commit into the repository hosting your code: that would be a serious security hazard as you won't be able to control the parts of the repository to which they are allowed to commit.

    If you setup two repositories, it's rather easy to have the code of a webpage in a repository, and the code automatically commited in another repository (under an anonymous identity so that your users don't have to create a github account).

    Also, note that the oauth token should never be stored in a public repository (or the GitHub robots will invalidate it in a matter of hours). See Hiding GitHub token in .gitconfig for a solution to this sub-problem.