I just tried something rather trivial: get the source code of a web page (by saving it) and count how often a certain phrase occurs in the code.
Turns out, it doesn't work if that page uses Polymer / web components. Is this a browser bug?
Try the following: Go to http://www.google.com/design/icons/ and try to find star_half
in the code (the last icon on the page). If you inspect the element inside of Chrome or Firefox, it will lead you to
<i class="md-icon dp48">star_half</i>
but this won't be in the source if you copy the root node or save the HTML to disk.
Is there a way to get the entire code?
The reason for this behavior is probably how viewing source (and saving it as well?) works in the browser, combined with the fact that shadow roots are attached to web components on the client side.
When you press Ctrl+U on a web page, the browser essentially makes a network request to the same URL again and shows you a copy of what the server returned for that URL.
In this case, when the page renders, the browser identifies the icons-layout component
and then executes code to attach a shadow root to that node. All of this happens after the page reaches the client (the browser).
When you save this page, you are saving what the server returned, not the current state of the page. You'll see the same behavior if you open the Chrome console and try to save an icons-layout
node.
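To make the difference concrete, here is a minimal Python sketch of the counting experiment from the question. The two markup strings are hypothetical, heavily simplified stand-ins (not the real page): one for what the server might return, one for what the inspector shows after the component has rendered. `fetch_server_html` is an illustrative helper that only retrieves the raw response, which is roughly what saving the page gives you.

```python
from urllib.request import urlopen


def count_phrase(html: str, phrase: str) -> int:
    """Count non-overlapping occurrences of phrase in an HTML string."""
    return html.count(phrase)


def fetch_server_html(url: str) -> str:
    """Return the raw response body; no client-side scripts are executed."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Hypothetical stand-in for the server response: only the custom element tag.
server_markup = "<body><icons-layout></icons-layout></body>"

# Hypothetical stand-in for the rendered state seen in the inspector.
rendered_markup = (
    "<body><icons-layout>"
    '<i class="md-icon dp48">star_half</i>'
    "</icons-layout></body>"
)

print(count_phrase(server_markup, "star_half"))    # 0
print(count_phrase(rendered_markup, "star_half"))  # 1
```

Running `count_phrase(fetch_server_html(url), "star_half")` against a page built this way would count only what is in the server response, which is why the phrase appears to be missing.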
Is there a way to get the entire code?
I don't know how to do it from the browser, but PhantomJS provides a way to save client-side rendered HTML.
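A rough sketch of the PhantomJS route, driven from Python. It assumes a `phantomjs` binary is on your PATH; the 2-second wait and the `rendered_html` helper name are my own assumptions, not something the page requires. The embedded script uses PhantomJS's `webpage` module and reads `page.content` after the page has loaded. Whether the output actually contains the component's inner markup depends on how the components get rendered inside PhantomJS, so treat this as a starting point, not a guaranteed fix.

```python
import os
import subprocess
import tempfile

# PhantomJS script: open the URL passed as the first argument, wait a moment
# for client-side code to run, then print the current DOM as HTML.
PHANTOM_SCRIPT = """
var system = require('system');
var page = require('webpage').create();
page.open(system.args[1], function (status) {
    if (status !== 'success') {
        phantom.exit(1);
    } else {
        // Assumed delay so components have time to render.
        window.setTimeout(function () {
            console.log(page.content);
            phantom.exit(0);
        }, 2000);
    }
});
"""


def rendered_html(url, phantomjs="phantomjs", timeout=60):
    """Run the script above with PhantomJS and return what it prints."""
    with tempfile.NamedTemporaryFile("w", suffix=".js", delete=False) as f:
        f.write(PHANTOM_SCRIPT)
        script_path = f.name
    try:
        result = subprocess.run(
            [phantomjs, script_path, url],
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,  # raise if PhantomJS exits with a non-zero status
        )
        return result.stdout
    finally:
        os.unlink(script_path)  # clean up the temporary script file
```

With that in place, something like `rendered_html("http://www.google.com/design/icons/").count("star_half")` would count the phrase in the rendered output instead of the server response.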