eValid -- Automated Web Quality Solution
Browser-Based, Client-Side, Functional Testing & Validation,
Load & Performance Tuning, Page Timing, Website Analysis,
and Rich Internet Application Monitoring.
eValid -- String Processing Details
Background
This page explains certain details about how eValid handles strings
in the string filter.
String matching is performed against the browser's internal image of the
page (the DOM), and not against the raw HTML file.
This can produce behavior that appears unusual until the distinction is understood.
Even so, this is a superior way to analyze page content because the analysis
happens on precisely what is shown to the user on the screen,
regardless of how it was assembled there.
"What you see is really what you see!"
Searches match what is actually displayed on the screen.
They do not necessarily match the content of the files that
were used to assemble the page you are viewing.
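The difference between searching the displayed text and searching the source file can be illustrated with a minimal Python sketch. It does not use eValid or a real browser: the standard library's `html.parser` stands in for the browser image, and the `VisibleText` class, the `visible_text` helper, and the sample markup are hypothetical names chosen for this illustration.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect the text nodes of a document, ignoring the markup between them."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called once for each run of text found between tags.
        self.parts.append(data)


def visible_text(raw_html):
    """Return the text a reader would see, as one contiguous string."""
    parser = VisibleText()
    parser.feed(raw_html)
    parser.close()
    return "".join(parser.parts)


# Hypothetical page fragment: the number is wrapped in its own tag.
raw = "<p>Total: <b>42</b> items</p>"
rendered = visible_text(raw)

print("Total: 42 items" in rendered)  # found in the displayed text
print("Total: 42 items" in raw)       # not contiguous in the source file
```

The phrase is contiguous on the screen, so a search of the displayed text finds it, while a search of the raw file would have to account for the intervening tags.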
Detailed Explanations
- Line Breaks.
Because searching is done on the internal image of the HTML,
line breaks that appear in the original HTML may not appear
in your string searches.
Words that are hyphenated around a line break in the original HTML
will contain an extra blank, because the hyphen receives no special handling.
- HTML Tags.
The browser "lifts" all HTML tags to UPPER CASE.
Even if the original HTML had, for example, <a ... the
actual matchable string would appear as <A ....
- Modifier/Attributes.
The internal model "lowers" the case of modifier and attribute
names that appear inside tags.
Hence if the HTML has <IMG ALT=example... the
string matching will be done on the processed HTML, which would
appear as (and match as) <IMG alt=example....
- Text Strings.
Text is left alone and is matched exactly in visible text
searches (except for line breaks, see above).
- Comment Strings.
Comments of the form <!-- this is a comment -->
are treated as visible text.
- Instantiated Comments.
There are some forms of HTML in which, because comments are instantiated
within what ultimately appears as visible text, you would be able to
search for strings that appear on the screen that do NOT appear as
contiguous strings in the pure HTML source document.
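
The case and line-break rules listed above can be approximated in a short Python sketch, which may help when predicting the string to search for. This is not eValid's implementation: `DomStyleNormalizer` and `normalize` are hypothetical names, the standard library's `html.parser` stands in for the browser image, and two details are assumptions of the sketch (the blank is placed where the line break was, and attribute values are written without quotes, as in the IMG example).

```python
import re
from html.parser import HTMLParser


class DomStyleNormalizer(HTMLParser):
    """Rebuild markup with upper-case tags and lower-case attribute names."""

    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        # html.parser already lower-cases tag and attribute names,
        # so only the tag name needs to be lifted to upper case.
        rendered = "".join(
            f" {name}" if value is None else f" {name}={value}"
            for name, value in attrs
        )
        self.out.append(f"<{tag.upper()}{rendered}>")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag.upper()}>")

    def handle_data(self, data):
        # Assumption: a line break in the source becomes a single blank.
        self.out.append(re.sub(r"\s*\n\s*", " ", data))

    def handle_comment(self, data):
        # Comments are kept so that they remain searchable.
        self.out.append(f"<!--{data}-->")


def normalize(raw_html):
    """Return an approximation of the string that matching would see."""
    parser = DomStyleNormalizer()
    parser.feed(raw_html)
    parser.close()
    return "".join(parser.out)


# Hypothetical source fragment with a hyphenated line break and mixed case.
sample = '<p>A hyphen-\nated word and <a HREF="x.html">a link</a></p>'
print(normalize(sample))
print(normalize("<img ALT=example>"))
```

Under these assumptions the first call prints `<P>A hyphen- ated word and <A href=x.html>a link</A></P>` and the second prints `<IMG alt=example>`, which mirrors the Line Breaks, HTML Tags, and Modifier/Attributes notes. The exact output of a real browser image may differ in details the notes do not specify, such as attribute quoting.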