
Hooking between file uploaded and before model saved


I'm new to Django and am trying to build a project that lets the user upload a file, parses it, and stores the contained information in the same model:

class Track(models.Model):
    Gpxfile = models.FileField("GPS XML", upload_to="tracks/gps/")
    date = models.DateTimeField(blank=True)
    waypoints = models.ForeignKey(Waypoint)
    ...

To start with, I'm fine with working through the admin interface to save effort. So I hooked into the model's save() method:

def save(self, *args, **kwargs):
    """Hook into save() to analyse the uploaded GPX file."""
    super(Track, self).save(*args, **kwargs)  # get the GPX file saved first
    self.__parseGPSfile(self.Gpxfile.path)  # then analyse it

But here I run into a circular dependency:

  • to get the FileField written to a real file, I need to invoke the original save() first
    • this breaks because some fields aren't populated yet, since the file hasn't been parsed at that point
  • if I switch both lines, the file isn't saved yet and can't be parsed

Maybe I just lack some basic knowledge, but even after reading a lot of SO questions and blogs and googling around, I don't have a clear idea how to solve it. I only found these ideas, which don't seem to fit very well:

  • create your own view and connect it to your handler (see here)
    • bad, as this won't work with the admin interface, needs views, puts logic into views, etc.
  • use a validator for the FileField (see here)
    • not sure if this is good design, as it's about file processing in general and not really validating (yet)

So what does the community suggest for implementing file postprocessing and data "import" in Django 1.4?


Solution

  • You can parse the file prior to saving, and generally I like to do this in the model clean() method:

    def clean(self):
        file_contents = self.Gpxfile.read()
        # ...do stuff
    

    If the file doesn't meet your validation criteria, you can raise a ValidationError in clean(), which propagates back to the calling view so you can report the form errors to the user.
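
    As a rough illustration of the "do stuff" step, here is a minimal sketch of a parser kept outside the model, so that clean() only has to call it and translate failures into a ValidationError. The names parse_gpx and GpxParseError are made up for this example, and it assumes the uploads are GPX 1.1 files whose track points carry whole-second UTC timestamps; real files may need more lenient handling.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Namespace used by GPX 1.1 documents.
GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}


class GpxParseError(ValueError):
    """Hypothetical error type: the upload is not usable GPX."""


def parse_gpx(contents):
    """Return (start_time, points) for GPX 1.1 content.

    points is a list of (lat, lon) floats. start_time is the timestamp of
    the first track point that has one, or None if no point has a time.
    """
    try:
        root = ET.fromstring(contents)
    except ET.ParseError as exc:
        raise GpxParseError("not well-formed XML: %s" % exc)

    points = []
    start_time = None
    for trkpt in root.iterfind(".//gpx:trkpt", GPX_NS):
        try:
            points.append((float(trkpt.get("lat")), float(trkpt.get("lon"))))
        except (TypeError, ValueError):
            raise GpxParseError("track point without valid lat/lon")
        if start_time is None:
            stamp = trkpt.findtext("gpx:time", namespaces=GPX_NS)
            if stamp:
                try:
                    # Assumes timestamps like 2012-06-01T10:00:00Z.
                    start_time = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
                except ValueError:
                    raise GpxParseError("unsupported time format: %s" % stamp)

    if not points:
        raise GpxParseError("no track points found")
    return start_time, points
```

    Inside clean() this could be used roughly as: call parse_gpx(self.Gpxfile.read()), catch GpxParseError and re-raise it as a ValidationError, and assign the returned start time to self.date. Keeping the parser free of Django imports also makes it easy to test on its own.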

    If you really need to save the file first and then do something with it, you could use a post_save signal:

    from django.db.models.signals import post_save

    def some_function_not_in_the_model(sender, **kwargs):
        obj = kwargs['instance']
        # ...do stuff with the object

    # connect function to post_save
    post_save.connect(some_function_not_in_the_model, sender=Track)
    

    Django docs on post_save

    Finally, one note about large files: they may end up as temporary files on the server (on Linux in /var/tmp or similar; the location can be set in settings.py via FILE_UPLOAD_TEMP_DIR). It may be a good idea to check for this within the clean() method when trying to access the file, something like:

    # check if file is temporary
    if hasattr(self.Gpxfile.file, 'temporary_file_path'):
        try:
            # path of the temporary file on disk
            file_path = self.Gpxfile.file.temporary_file_path()
        except Exception:
            raise ValidationError(
                "Something bad happened"
            )
    else:
        contents = self.Gpxfile.read()
    

    Oh, and finally finally, be careful about closing the temporary file. Back when I started using Django's FileField and learned how the temporary file worked, I thought I'd be the good programmer and close the file after I was done using it. This causes problems, because Django closes the file itself. Likewise, if you open the temporary file and raise a ValidationError, you will likely want to remove (unlink) the temp file to prevent such files from accumulating in the temp directory.
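
    One way to stay clear of the closing problem is to never touch the framework's own handle when the upload lives on disk. The helper below is a minimal sketch of that idea; read_upload is a made-up name, and it only assumes the object behaves like an uploaded file: it offers read() and seek(), plus temporary_file_path() when it is backed by a temporary file.

```python
def read_upload(uploaded):
    """Return the bytes of an upload without closing the caller's handle.

    `uploaded` is assumed to be file-like; disk-backed uploads are detected
    by the presence of a temporary_file_path() method.
    """
    if hasattr(uploaded, "temporary_file_path"):
        # Open a second, private handle on the temp file. Closing it
        # leaves the handle owned by the framework untouched.
        with open(uploaded.temporary_file_path(), "rb") as handle:
            return handle.read()

    # In-memory upload: read everything, then rewind so that later
    # readers (for example the storage backend) start at byte 0 again.
    uploaded.seek(0)
    data = uploaded.read()
    uploaded.seek(0)
    return data
```

    In clean() this would be called with the underlying file object, for example read_upload(self.Gpxfile.file). Reading the whole file into memory is only reasonable for moderately sized uploads; for very large files you would rather pass the temporary path to a streaming parser.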

    Hope this helps!