I'm writing a userscript that detects arbitrarily designed "ship to" address forms and parses their contents. To do this, I need to find the form "rows" (which may or may not be tr elements) that contain both the label for an address field (such as "Name", "Address1", etc.) and the corresponding input field for that label. For example, in the following snippet:
<div>
<label>MaidenName</label>
<table><tbody>
<tr>
<td><label>FirstName</label></td>
<td><input value = "Bob"></td>
</tr>
<tr>
<td><label>LastName</label></td>
<td><input value = "Smith"></td>
</tr>
<tr>
<td><label>CompanyName</label></td>
<td><input value = "Ink Inc"></td>
</tr>
</tbody></table>
</div>
I would want to match all of the tr elements, because each of them contains a "Name" label and an input field. However, I would not want to match the div on account of the "MaidenName" label, because it has a broader scope than the matches found for the fields inside the table.
My current algorithm to find these rows (which are often div elements instead of tr ones) is to:
Translated from the port I'm working in, the jQuery JavaScript would look like the following:
// set up my two lists
var labelNodes = getLabelNodes();
var nodesWithAddress =
    $("input[type='text']:visible, select:visible");
// collect each label's ancestors up to (not including) the first one shared
// with an input field, then take the parents of those nodes, which adds the
// shared ancestor itself
var pathToCommonParents = labelNodes
    .parentsUntil(nodesWithAddress.parents()).parent();
// keep the highest-level nodes, so we only have the common paths -
// not the nodes between it and the labels.
return pathToCommonParents.filter(
    function (index) { return $(this).find(pathToCommonParents).length === 0; });
This works... but all of that traversing and comparing overhead absolutely wrecks my performance (it can take five seconds or more).
What would be a better way of implementing this? I think the following pseudocode would be better, but I could be wrong, and I don't know how to implement it:
var filteredSet = $().find(*).hasAnyOf(labelNodes).hasAnyOf(nodesWithAddress);
return filteredSet.hasNoneOf(filteredSet);
To generalize micha's answer so that it works in a page-independent way (across different tags and nesting levels):
var labelNodes = getLabelNodes();
var returnNodes = $();
var regexAncestors = labelNodes.parent();
while (regexAncestors.length){
regexAncestors = regexAncestors.not("body");
var commonParentNodes = regexAncestors.has("input[type='text']:visible, select:visible");
returnNodes = returnNodes.add(commonParentNodes); // .add() returns a new set rather than modifying returnNodes
regexAncestors = regexAncestors.not(commonParentNodes).parent();
}
return returnNodes;
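
For comparison, the idea from the pseudocode (nodes that contain a label, contain an input, and contain no other such node) can also be sketched in plain JavaScript without repeated selector matching. This is only a sketch under assumptions: the helper names collectAncestors and findFormRows are hypothetical, and the code relies on nothing but each node's parentNode property. It marks every ancestor of the labels and of the inputs once, intersects the two sets, and then drops any candidate that is the parent of another candidate.

```javascript
// Hypothetical helper: collects every proper ancestor of the given nodes.
// It only reads `parentNode`, so it accepts DOM elements or plain objects.
function collectAncestors(nodes) {
  const seen = new Set();
  for (const node of nodes) {
    let current = node.parentNode;
    // Stop early at an ancestor that is already recorded: everything
    // above it was recorded during an earlier walk.
    while (current && !seen.has(current)) {
      seen.add(current);
      current = current.parentNode;
    }
  }
  return seen;
}

// Hypothetical helper: returns the innermost nodes that contain at least
// one label and at least one input field.
function findFormRows(labelNodes, inputNodes) {
  const labelAncestors = collectAncestors(labelNodes);
  const inputAncestors = collectAncestors(inputNodes);

  // Candidates are ancestors of both a label and an input.
  const candidates = [...labelAncestors].filter((n) => inputAncestors.has(n));

  // Every ancestor of a candidate is itself a candidate, so a candidate
  // wraps another one exactly when it is the parent of some candidate.
  const outer = new Set();
  for (const n of candidates) {
    if (n.parentNode) outer.add(n.parentNode);
  }
  return candidates.filter((n) => !outer.has(n));
}
```

With jQuery on the page, the inputs could be passed as plain arrays, for example findFormRows(getLabelNodes().get(), $("input[type='text']:visible, select:visible").get()). Each node is visited a bounded number of times, so the cost grows with the number of labels and inputs times the nesting depth, rather than with repeated set comparisons.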