algorithm, browser, internals

How does a browser 'find' something on a webpage?


This may be a really trivial question, but how do modern browsers handle the find (Ctrl+F) operation on webpages?

Do they convert it to some plain text representation by regex'ing out all HTML/CSS/JS from the webpage and then running a recursive find?


Solution

  • This is a rather broad question, because browsers can behave slightly differently from one another depending on the purpose and design of each browser's engine. However, they should typically look and feel similar where they follow W3C standards. The best way to find out how a particular browser works is to go to the website of its manufacturer and research how it represents a page internally. HTML is parsed into a tree of nodes, in which a document can branch off into subtrees. One pathing system that can be used to address nodes in such a tree is XPath. Below are links covering, respectively, how browsers work, W3Schools, and XPath. Hopefully these will help you at least understand the concepts behind how a browser works. I would start with the rendering engine link.

    https://www.html5rocks.com/en/tutorials/internals/howbrowserswork/#The_rendering_engine

    https://www.w3schools.com

    https://librarycarpentry.org/lc-webscraping/02-xpath/index.html
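
To make the tree idea a bit more concrete, here is a minimal sketch in Python of searching a page without regex'ing out the markup. It uses the standard library's `html.parser` to visit the text nodes, skips content that is not shown as page text (`script`, `style`, `title`), and then runs a plain case-insensitive substring search. The names `VisibleTextCollector` and `find_in_page` are made up for illustration, and the list of block tags is an assumption chosen to keep the example short. This is not how any particular browser engine implements Ctrl+F; real engines search their own internal document tree and also have to deal with hidden elements, layout, and highlighting.

```python
from html.parser import HTMLParser


class VisibleTextCollector(HTMLParser):
    """Collects text nodes, skipping tags whose content is not shown as page text."""

    # Hypothetical, deliberately incomplete tag lists for this sketch.
    SKIPPED_TAGS = {"script", "style", "title"}
    BLOCK_TAGS = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            # Block-level boundaries separate words; inline tags such as <b> do not.
            self.chunks.append(" ")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in self.BLOCK_TAGS:
            self.chunks.append(" ")

    def handle_data(self, data):
        # Called once per text node; keep it only outside skipped tags.
        if self._skip_depth == 0:
            self.chunks.append(data)


def find_in_page(html, query):
    """Return (visible_text, offsets) for case-insensitive matches of query."""
    collector = VisibleTextCollector()
    collector.feed(html)
    collector.close()
    # Join the text nodes and collapse runs of whitespace, roughly as rendering does.
    text = " ".join("".join(collector.chunks).split())
    haystack, needle = text.lower(), query.lower()
    offsets = []
    if not needle:
        return text, offsets
    start = haystack.find(needle)
    while start != -1:
        offsets.append(start)
        start = haystack.find(needle, start + 1)
    return text, offsets


page = (
    "<style>p { color: red }</style><script>var find = 1;</script>"
    "<h1>Find me</h1><p>You can <b>fin</b>d text here.</p>"
)
text, offsets = find_in_page(page, "find")
print(text)
print(offsets)
```

Note that the match inside `<b>fin</b>d` is still found even though the word is split across two text nodes, while the word "find" inside the script is ignored. A regex over the raw markup gets both of those cases wrong more easily than a tree walk does.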