Class: Senior Web Developer
Alignment: Chaotic Neutral
Welcome to my Page Analyzer tool! Here, you can run a series of tests against any URL on the web.
This tool is provided for demonstration purposes only. To prevent abuse, you are limited to a # of requests per day from your IP.
Once the document has finished loading, its DOM is parsed. Using a predefined list of valid and invalid element tags, the script recursively tracks the content and length of every valid element, then merges eligible elements together until the largest "blob" of text is identified. It then returns the recursive nodeValue of that element, which is the most reliable method of retrieving an element's unformatted contents.
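To make the idea concrete, here is a minimal Python sketch of this kind of "largest blob" search. It is not the tool's actual code: the tag lists, the names `Node`, `TreeBuilder`, and `extract_main_text`, and the scoring rule (sum the text of an element's direct text-bearing children) are all assumptions made for illustration, and it expects reasonably well-formed markup.

```python
from html.parser import HTMLParser

# Hypothetical tag lists; the tool's real lists are not published.
INVALID_TAGS = {"script", "style", "nav", "header", "footer", "aside", "form"}
TEXT_TAGS = {"p", "blockquote", "pre", "li", "h1", "h2", "h3"}
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class Node:
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []  # child Node objects and text strings, in order

    def text(self):
        """Recursive text content, skipping invalid subtrees."""
        if self.tag in INVALID_TAGS:
            return ""
        parts = []
        for child in self.children:
            if isinstance(child, str):
                parts.append(child)
            elif child.tag in TEXT_TAGS:
                # Pad block-like text so adjacent blocks do not run together.
                parts.append(" " + child.text() + " ")
            else:
                parts.append(child.text())
        # Collapse runs of whitespace into single spaces.
        return " ".join("".join(parts).split())


class TreeBuilder(HTMLParser):
    """Builds a very small DOM-like tree from an HTML string."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Walk up to the nearest matching open element; ignore stray end tags.
        node = self.current
        while node is not None and node.tag != tag:
            node = node.parent
        if node is not None and node.parent is not None:
            self.current = node.parent

    def handle_data(self, data):
        self.current.children.append(data)


def walk(node):
    """Yield every element outside invalid subtrees."""
    if node.tag in INVALID_TAGS:
        return
    yield node
    for child in node.children:
        if isinstance(child, Node):
            yield from walk(child)


def merged_length(node):
    """Total text length of the node's direct text-bearing children."""
    return sum(
        len(child.text())
        for child in node.children
        if isinstance(child, Node) and child.tag in TEXT_TAGS
    )


def extract_main_text(html):
    """Return the text of the element with the largest merged text blob."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    best = max(walk(builder.root), key=merged_length)
    return best.text()
```

Calling `extract_main_text(page_source)` on a page whose article sits in a single container of paragraphs returns that container's text with navigation, scripts, and footers left out. A real page would likely need a richer merge rule than this one.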
The element identified as the page's main article is passed to a secondary parsing method that refines the content further by removing extra spaces and punctuation and reducing each word to its root. The result is passed to another function that turns the word list into an array and runs each word through a series of rules that modify its weight. The exact rules are secret, but the number of occurrences clearly has the largest effect on weight. The weighted array is passed back to this page, where a script builds a table of all words with a score over 1. Another script is then triggered to read the table and draw a line chart.
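Since the real rules are secret, the following Python sketch only shows the general shape of such a weighting pass. Everything in it is an assumption made for illustration: the stop-word list, the crude suffix stripping standing in for proper root-word reduction, the small bonus for long words, and the names `root_word` and `weigh_words`. The only parts taken from the description above are that occurrence count dominates the score and that only words scoring over 1 are kept.

```python
import re
from collections import Counter

# Hypothetical stop-word list and suffix rules, chosen only for illustration.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that"}
SUFFIXES = ("ing", "ed", "es", "s")


def root_word(word):
    """Crude suffix stripping; a stand-in for a real stemmer."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def weigh_words(text):
    """Return {root word: score} for every word scoring above 1."""
    # Lowercasing and keeping only letter runs drops punctuation and spacing.
    words = re.findall(r"[a-z]+", text.lower())
    roots = [root_word(w) for w in words if w not in STOP_WORDS]

    scores = {}
    for word, count in Counter(roots).items():
        score = float(count)  # occurrence count dominates the weight
        if len(word) >= 8:    # hypothetical rule: small bonus for long words
            score += 0.5
        if score > 1:
            scores[word] = score

    # Highest score first, ties broken alphabetically, ready for a table.
    return dict(sorted(scores.items(), key=lambda kv: (-kv[1], kv[0])))
```

The returned mapping is already ordered for display, so a table or chart script could iterate over it directly.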