Search code examples
pythonv8

PyV8, can I manipulate DOM structure?


Lets assume that we have PyV8:

import PyV8
ctxt = PyV8.JSContext()

and a python DOM structure, for example xml.dom

How can I feed a .js-file to PyV8 so that it could change DOM-structure that I have.
If I had it's content:

$("#id").remove();

I want dom item to be removed.

PyV8 has perfect hello-world example. But I'd like to see something usefull.

To be clear, what I want to do is:
"Javascript file" -->-- magic -->-- DOM, (already built with html file) and changed now with passed javascript file


Solution

  • Appologies for the formatting. I spaced as best I could, but my screen reader doesn't like SO's formatting controls.

    I'm going to take a shot at answering your question, though it seems a tad vague. Please let me know if I need to rewrite this answer to fit a different situation. I assume you are trying to get an HTML file from the web, and run Javascript from inside this file, to act on said document. Unfortunately, none of the Python xml libraries have true DOM support, and W3C DOM compliance is nonexistent in every package I have found. What you can do is use the PyV8 w3c.py dom file as a starting example, and create your own full DOM. W3C Sample Dom You will need to rewrite this module, though, as it does not respect quotes or apostrophys. BeautifulSoup is also not the speediest parser. I would recommend using something like lxml.etree's target parser option. LXML Target Parser Search for "The feed parser interface". Then, you can load an HTML/Script document with LXML, parse it as below, and run each of the scripts you need on the created DOM.

    Find a partial example below. (Please note that the HTML standards are massive, scattered, and _highly browser specific, so your milage may vary).

    class domParser(object):
        def __init__(self):
        #initialize dom object here, and obtain the root for the destination file object.
            self.dom = newAwesomeCompliantDom()
            self.document = self.dom.document
            self.this = self.document
    
        def comment(self, commentText):
        #add commentText to self.document or the above dom object you created
            self.this.appendChild(self.document.DOMImplementation.createComment(commentText))
    
        def start(self, tag, attrs):
        #same here
            self.this = self.this.appendChild(self.document.DOMImplimentation.newElement(tag,attrs))
    
        def data(self, dataText):
        #append data to the last accessed element, as a new Text child
            self.this.appendChild(self.document.DOMImpl.createDataNode(dataText))
    
        def end(self):
        #closing element, so move up the tree
            self.this = self.this.parentNode
    
        def close(self):
            return self.document
    
    #unchecked, please validate yourself
    x = lxml.etree.parse(target=domParser)
    x.feed(htmlFile)
    newDom = x.close()