google-app-engine, google-cloud-datastore, blobstore, gae-quotas

App Engine toy program hitting quota limits for reads and writes


Let's say I want to build an App Engine application that stores a 50,000-word dictionary, plus equivalent dictionaries of similar size in 10 other languages.

I got this working locally on my dev server, but when I went to load the first dictionary into the production appserver I immediately went over my writes-per-day quota, with no idea how many dictionary entries had actually made it into the datastore. So, 24 hours later, I tried to bulk-download the dictionary to see how many entries I really had, but that hit the reads-per-day quota and returned nothing. I then tried enabling billing and setting a daily maximum of $1.00, hit that quota through the bulkloader as well, and got no data for my trouble or my $1.00.

Anyway, looking at the datastore viewer, I see that each of my dictionary words required 8 writes to the datastore.
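If write costs follow the documented formula of 2 base writes plus 2 per indexed property value, my 8 writes per word would correspond to three indexed properties, something like this simplified sketch (property names are illustrative):

```python
from google.appengine.ext import db

class DictionaryWord(db.Model):
    # Each indexed property adds 2 write ops per put, on top of
    # 2 base writes: 2 + 3 * 2 = 8 writes per entity.
    word = db.StringProperty()            # indexed by default
    language = db.StringProperty()        # indexed by default
    part_of_speech = db.StringProperty()  # indexed by default
    definition = db.TextProperty()        # TextProperty is never indexed

class CheaperWord(db.Model):
    # Marking properties unindexed drops a put to the 2 base writes.
    language = db.StringProperty(indexed=False)
    part_of_speech = db.StringProperty(indexed=False)
    definition = db.TextProperty()
```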

So, does this mean that this sort of application is inappropriate for App Engine? Should I not be trying to store a dictionary there? Is there a smarter way to do this? For instance, could I store the dictionary as a file in the blobstore and then work with it programmatically from there?

Thank you for any suggestions.


Solution

  • It's likely that your reads will cost much less than your writes, so the problem is getting the data in, not reading it back.

    So all you need to do to keep your current configuration is slow down the write rate. Then, presumably, you'll fetch each word by its ID (the word itself, I hope!), so reads will be fast and small, exactly as you want.
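    As a sketch (assuming a db model kind named DictionaryWord with the word itself as the key name), a lookup is then a single cheap get by key, with no query or index involved:

    ```python
    from google.appengine.ext import db

    class DictionaryWord(db.Model):
        definition = db.TextProperty()

    # Store each word under a key name equal to the word itself...
    DictionaryWord(key_name='aardvark',
                   definition='a burrowing African mammal').put()

    # ...so reading it back is one small get by key, not a query.
    entry = DictionaryWord.get_by_key_name('aardvark')
    ```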

    You could do this: chop your source data into one file per letter and upload those files with your application. Then create a task that reads one file in turn and slowly writes its data to the datastore. Once that task completes, its last action is to call the next task.
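    Here is a sketch of that chain, assuming the taskqueue API, one bundled file per letter (e.g. "data/a.txt" holding tab-separated word/definition pairs), and a DictionaryWord model as above; the batch size, delay, and URL are illustrative:

    ```python
    from google.appengine.api import taskqueue
    from google.appengine.ext import db, webapp

    class DictionaryWord(db.Model):
        definition = db.TextProperty()

    class LoadLetterHandler(webapp.RequestHandler):
        def post(self):
            letter = self.request.get('letter', 'a')
            # Read this letter's bundled file: one "word<TAB>definition" per line.
            entities = []
            with open('data/%s.txt' % letter) as f:
                for line in f:
                    word, definition = line.rstrip('\n').split('\t', 1)
                    entities.append(DictionaryWord(key_name=word,
                                                   definition=definition))
            # Write in small batches rather than one huge put.
            for i in range(0, len(entities), 50):
                db.put(entities[i:i + 50])
            # Last action: chain the task for the next letter, spaced out
            # so the daily write budget isn't burned through in one go.
            if letter != 'z':
                taskqueue.add(url='/tasks/load',
                              params={'letter': chr(ord(letter) + 1)},
                              countdown=6 * 3600)

    app = webapp.WSGIApplication([('/tasks/load', LoadLetterHandler)])
    ```

    Kicking it off is then a single taskqueue.add(url='/tasks/load', params={'letter': 'a'}).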

    It might take a week to complete, but once the data is in, it will be a lot handier than an opaque blob you have to fetch from the blobstore, reading N words for every single one you actually care about, and then unpack and process on every lookup.

    You can also use the bulk loader to upload data, not just download it!
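    If you go that route, appcfg.py's upload_data command accepts throttling options, so the same "slowly" principle applies (a sketch; the config file, kind, and limit values here are illustrative):

    ```
    appcfg.py upload_data \
        --config_file=bulkloader.yaml \
        --filename=dictionary.csv \
        --kind=DictionaryWord \
        --rps_limit=5 --batch_size=10 \
        --url=http://yourapp.appspot.com/_ah/remote_api
    ```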