Several iPhone Apps (like my “iCab Mobile” or “NewsTap” Apps) provide a search feature that lets you search for text in the content currently displayed within a UIWebView. The found occurrences of the searched text are highlighted with a yellow background, so the search results can be located visually very easily.
This blog post describes how this can be implemented. I’m implementing this feature as a category for the UIWebView class, so you can use the new search feature for all UIWebView objects in your Apps very easily.
First of all, UIWebView doesn’t allow us to access its content directly, so we have to use JavaScript again. But if you’ve read my other blog posts, you already know this approach.
Our goal is to implement two methods for our new UIWebView category. One method should start the search and highlight the search results. It should return the number of occurrences of the text we’ve searched for. The other method should remove all the highlighted search results again to restore the original layout of the web page.
What we need to do is write the main code in JavaScript and a wrapper in Objective C that simply calls the JavaScript code.
We start with the JavaScript code that is doing the real work. As I’ve already described in the blog post WebKit on the iPhone, the JavaScript code will be saved as a resource file in the Xcode project. This way it can be loaded from the application bundle within the Objective C code very easily, and we don’t mix the code of multiple programming languages (JavaScript and Objective C) in the same files.
The following code is the JavaScript implementation; below I’ll explain what it is doing and how it works:
SearchWebView.js:
// We're using a global variable to store the number of occurrences
var MyApp_SearchResultCount = 0;

// helper function, recursively searches in elements and their child nodes
function MyApp_HighlightAllOccurencesOfStringForElement(element,keyword) {
  if (element) {
    if (element.nodeType == 3) {        // Text node
      while (true) {
        var value = element.nodeValue;  // Search for keyword in text node
        var idx = value.toLowerCase().indexOf(keyword);

        if (idx < 0) break;             // not found, abort

        // wrap the found text in a highlighted SPAN element
        var span = document.createElement("span");
        var text = document.createTextNode(value.substr(idx,keyword.length));
        span.appendChild(text);
        span.setAttribute("class","MyAppHighlight");
        span.style.backgroundColor="yellow";
        span.style.color="black";

        // the rest of the text after the found keyword
        text = document.createTextNode(value.substr(idx+keyword.length));
        element.deleteData(idx, value.length - idx);
        var next = element.nextSibling;
        element.parentNode.insertBefore(span, next);
        element.parentNode.insertBefore(text, next);
        element = text;                 // continue searching in the rest of the text
        MyApp_SearchResultCount++;      // update the counter
      }
    } else if (element.nodeType == 1) { // Element node
      if (element.style.display != "none" && element.nodeName.toLowerCase() != 'select') {
        for (var i=element.childNodes.length-1; i>=0; i--) {
          MyApp_HighlightAllOccurencesOfStringForElement(element.childNodes[i],keyword);
        }
      }
    }
  }
}

// the main entry point to start the search
function MyApp_HighlightAllOccurencesOfString(keyword) {
  MyApp_RemoveAllHighlights();
  if (!keyword) return;  // an empty keyword would match at every position and the loop would never end
  MyApp_HighlightAllOccurencesOfStringForElement(document.body, keyword.toLowerCase());
}

// helper function, recursively removes the highlights in elements and their children
function MyApp_RemoveAllHighlightsForElement(element) {
  if (element) {
    if (element.nodeType == 1) {
      if (element.getAttribute("class") == "MyAppHighlight") {
        // replace the SPAN element with the text node it contains
        var text = element.removeChild(element.firstChild);
        element.parentNode.insertBefore(text,element);
        element.parentNode.removeChild(element);
        return true;
      } else {
        var normalize = false;
        for (var i=element.childNodes.length-1; i>=0; i--) {
          if (MyApp_RemoveAllHighlightsForElement(element.childNodes[i])) {
            normalize = true;
          }
        }
        if (normalize) {
          element.normalize();          // concatenate the text nodes we've split
        }
      }
    }
  }
  return false;
}

// the main entry point to remove the highlights
function MyApp_RemoveAllHighlights() {
  MyApp_SearchResultCount = 0;
  MyApp_RemoveAllHighlightsForElement(document.body);
}
The basic principle of searching the text and removing the highlighted search results is the same: We’re working at DOM level (Document Object Model), which means the HTML document is represented as a tree structure where each HTML element, text, comment etc. is represented as a node. All the nodes are linked together with parent and child connections. The root element of each HTML document is the element that is created by the HTML tag. This element usually has two children: the HEAD element and the BODY element. Only the content of the BODY element is visible and displayed on screen, so we only need to process this part of the document tree.
What we need to do is start with the body element and traverse all of its child nodes. From within the child nodes we need to go to their child nodes as well, and so on until we reach a leaf node, which has no child nodes. Text nodes are always leaf nodes, and text nodes are the nodes which might contain the text we’re looking for.
Traversing the whole HTML tree searching for all text nodes can be done by a recursive algorithm called Depth-First-Search (DFS). The DFS algorithm will traverse the tree structure starting from a root element (in our case the BODY element) to the first leaf node in a branch of the tree (for example going to the first child of the root first, from there again going to the first child, etc. until a leaf node is reached). Then the algorithm goes back (backtracking) to the last node where not all child nodes were traversed yet and continues with the next unvisited child node, and so on. This way all nodes of the whole tree are traversed and we are able to find all the text nodes in which we look for the text we are searching for.
The functions “MyApp_HighlightAllOccurencesOfStringForElement(element,keyword)” and “MyApp_RemoveAllHighlightsForElement(element)” are both implementations of this DFS algorithm. These functions are called from “MyApp_HighlightAllOccurencesOfString(keyword)” and “MyApp_RemoveAllHighlights()”, which do the necessary initialization and provide the DFS functions with the proper root element (the BODY element). The initialization for a new search is to make sure that no highlighted text from a previous search is present, so we simply call the function to remove all text highlights.
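The traversal can be sketched in isolation like this. It is only a minimal illustration: the helper name “MyApp_CollectTextNodes” and the sample tree are made up for this sketch and are not part of SearchWebView.js. The tree is built from plain objects that merely mimic the “nodeType” and “childNodes” properties of DOM nodes, so the code also runs outside of a web page.

```javascript
// Hypothetical helper (not part of SearchWebView.js): collects all text nodes
// below `root` in document order using a recursive depth-first search.
function MyApp_CollectTextNodes(root, result) {
  result = result || [];
  if (!root) return result;
  if (root.nodeType === 3) {            // text node: always a leaf
    result.push(root);
  } else if (root.nodeType === 1) {     // element node: descend into its children
    for (var i = 0; i < root.childNodes.length; i++) {
      MyApp_CollectTextNodes(root.childNodes[i], result);
    }
  }
  // all other node types (comments etc.) are ignored
  return result;
}

// A stand-in for <body><p>Hello <b>big</b></p><!-- note -->world</body>,
// made of plain objects (an assumption of this sketch, not real DOM nodes).
var MyApp_SampleTree = {
  nodeType: 1, childNodes: [
    { nodeType: 1, childNodes: [
      { nodeType: 3, nodeValue: "Hello ", childNodes: [] },
      { nodeType: 1, childNodes: [
        { nodeType: 3, nodeValue: "big", childNodes: [] }
      ] }
    ] },
    { nodeType: 8, nodeValue: " note ", childNodes: [] },
    { nodeType: 3, nodeValue: "world", childNodes: [] }
  ]
};
```

Calling MyApp_CollectTextNodes(MyApp_SampleTree) visits the first branch down to its leaves before it backtracks, and it skips the comment node. In a real page you would pass “document.body” instead of the sample tree.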
When searching for a text, we check if the currently inspected node is a text node or if it is an element node. If it is an element node it can have child nodes, and these must be inspected as well. If it is a text node, we have to find out if the text of this node contains the text we’re searching. If yes, we insert the text highlight, otherwise we are finished with this node. Also if the node is neither a text node nor an element node, there aren’t any child nodes which are interesting for us, so we are finished with this node as well.
When the text of a text node contains the searched text, we have to split the text into three parts. Part one contains the text up to the searched text, part two contains the searched text itself and part three contains the rest of the text. A new element is created (a SPAN element) and the second part (the searched text) becomes a child node of this new element. Now we can assign style sheet rules to the newly created SPAN element to create the highlight effect (setting the background color to yellow, setting the text color to black; you can even increase the font size if you want to). The new element is then linked with parts one and three so that it becomes a part of the tree structure of the HTML document. And because the searched text might be found multiple times in the original text node, we continue to search for it in the third part of the original text node. If we find another occurrence of the searched text, we split this third part again into three parts; otherwise we are finished.
When we create a SPAN element for the highlight effect, we also assign a special value (here “MyAppHighlight”) to the class attribute. This is important to be able to find these elements again later when we want to remove the highlight effects using the function “MyApp_RemoveAllHighlights()”. For this task we traverse the tree as well, but now we’re looking for elements whose class attribute has this special value.
To restore the original state of the HTML document, we have to remove the elements we’ve inserted before (the ones with the special value of the class attribute) and we need to concatenate the text nodes we’ve split. JavaScript can help us to concatenate the text nodes again, because it provides the “normalize()” function which can do this for us.
In JavaScript we can find out the type of a node with the “nodeType” property. A value of 1 means that the node is a normal element node (like the “body” node, a “span” node etc.). A value of 3 means that the node is a text node. In this case the property nodeValue contains the text of the node. Other values for nodeType represent comment nodes (in HTML these are written as “<!-- Comment -->”), attribute nodes (for HTML attributes like for example the “HREF” attribute for the “A” tag), document nodes and some more. In our case only the values 1 (element node) and 3 (text node) are important.
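The repeated three-way split can be illustrated on plain strings, without any DOM nodes. This is only a sketch: the function name “MyApp_SplitAroundKeyword” and the shape of the returned objects are assumptions made for this example. Every part marked with “match: true” corresponds to a text that would end up inside a highlighted SPAN element.

```javascript
// Hypothetical helper: splits `text` into alternating plain and matching
// parts, comparing case-insensitively like the search code does.
function MyApp_SplitAroundKeyword(text, keyword) {
  if (!keyword) {
    // an empty keyword would match at every position, so treat it as "no match"
    return [{ text: text, match: false }];
  }
  var parts = [];
  var lowerText = text.toLowerCase();
  var lowerKeyword = keyword.toLowerCase();
  var start = 0;
  while (true) {
    var idx = lowerText.indexOf(lowerKeyword, start);
    if (idx < 0) break;                                   // no further occurrence
    if (idx > start) {                                    // part one: text before the match
      parts.push({ text: text.substring(start, idx), match: false });
    }
    // part two: the match itself, in its original spelling
    parts.push({ text: text.substring(idx, idx + keyword.length), match: true });
    start = idx + keyword.length;                         // part three is searched next
  }
  if (start < text.length) {                              // the remaining text
    parts.push({ text: text.substring(start), match: false });
  }
  return parts;
}
```

For example, splitting “The cat and the hat” around “the” gives the four parts “The” (match), “ cat and ” (plain), “the” (match) and “ hat” (plain).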
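As a small illustration of the nodeType values, here is a sketch that maps the most common numeric values to readable names. The names “MyApp_NodeTypeNames” and “MyApp_DescribeNode” are made up for this example; the numeric values themselves are the ones defined by the DOM standard.

```javascript
// Common nodeType values as defined by the DOM standard.
var MyApp_NodeTypeNames = {
  1: "element",     // e.g. body, span
  2: "attribute",   // e.g. the href attribute of an A tag
  3: "text",        // the text is available in nodeValue
  8: "comment",     // <!-- Comment -->
  9: "document"     // the document node itself
};

// Hypothetical helper: returns a readable name for the type of `node`.
function MyApp_DescribeNode(node) {
  return MyApp_NodeTypeNames[node.nodeType] || "other";
}
```

In a web page, MyApp_DescribeNode(document.body) would give “element”, and MyApp_DescribeNode(document) would give “document”.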
In the above implementation, we count the number of found occurrences in a global variable.
Note: You’ll notice that the JavaScript function names and variables, and also the value for the class attribute I’m using in the above code, are very lengthy and also have a prefix like “MyApp_”. The reason for this is to avoid any conflicts with existing function and variable names of the web page into which we inject our JavaScript code. If you’re generating the HTML code that is displayed in the UIWebView object yourself, you can choose shorter and simpler names. But if you have to deal with the HTML and JavaScript code of arbitrary web pages (like in a web browser such as iCab Mobile), you should use longer names and also add the name of your app as a prefix to all function and variable names to avoid any conflicts.
The Cocoa/Objective C part of the implementation is very simple. We only need to declare the interface and write a simple wrapper which loads and calls the JavaScript code that is actually doing all the hard work. The interface is also simple; we only need two methods: one that starts the search and highlights the found text, and one that removes all the highlights again.
SearchWebView.h:
@interface UIWebView (SearchWebView)

- (NSInteger)highlightAllOccurencesOfString:(NSString*)str;
- (void)removeAllHighlights;

@end
The typical use case would be to provide a search field where the user can enter some text. This text would be passed to the method “highlightAllOccurencesOfString:”. And when the user shakes the device, the App could call the method “removeAllHighlights” to remove all the highlighted search results again.
The implementation would look like this:
SearchWebView.m:
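@implementation UIWebView (SearchWebView)

- (NSInteger)highlightAllOccurencesOfString:(NSString*)str
{
    // Load the JavaScript file from the application bundle and inject it into the web page
    NSString *path = [[NSBundle mainBundle] pathForResource:@"SearchWebView" ofType:@"js"];
    NSString *jsCode = [NSString stringWithContentsOfFile:path encoding:NSUTF8StringEncoding error:nil];
    [self stringByEvaluatingJavaScriptFromString:jsCode];

    // Escape backslashes and single quotes so the text cannot break the JavaScript string literal
    NSString *escaped = [str stringByReplacingOccurrencesOfString:@"\\" withString:@"\\\\"];
    escaped = [escaped stringByReplacingOccurrencesOfString:@"'" withString:@"\\'"];

    // Start the search
    NSString *startSearch = [NSString stringWithFormat:@"MyApp_HighlightAllOccurencesOfString('%@')",escaped];
    [self stringByEvaluatingJavaScriptFromString:startSearch];

    // Read the number of occurrences from the global JavaScript variable
    NSString *result = [self stringByEvaluatingJavaScriptFromString:@"MyApp_SearchResultCount"];
    return [result integerValue];
}

- (void)removeAllHighlights
{
    [self stringByEvaluatingJavaScriptFromString:@"MyApp_RemoveAllHighlights()"];
}

@end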
The first thing we’re doing in the method “highlightAllOccurencesOfString:” is to load the JavaScript file we’ve written above from the application bundle and inject it into the web page that is currently displayed in UIWebView. Because we’re implementing this as a category for UIWebView, we can use “self” to call the method “stringByEvaluatingJavaScriptFromString:” of the UIWebView instances.
After we’ve injected the JavaScript code we simply call the JavaScript function we’ve defined above to do the search.
And finally we access the variable we’ve defined in the JavaScript code, which represents the number of occurrences of the string that were found, and we return its integer value as the result of the method.
In the method “removeAllHighlights” we only need to call the corresponding JavaScript function we’ve defined in the JavaScript code from above. Loading the external JavaScript file and injecting it into the Web page is not necessary here. If we’ve started a search before, the code is already injected and we don’t need to do this again. And if we haven’t started a search before, we just don’t need the JavaScript code because there are no highlights which have to be removed.
As you can see, the Objective C code for the UIWebView category is just a simple wrapper code for the JavaScript code. Now, when you have a UIWebView object in your App, you can simply search for text in its content by calling “highlightAllOccurencesOfString:”, like in this example where we’re searching for the text “keyword”:
// webView is an instance of UIWebView
[webView highlightAllOccurencesOfString:@"keyword"];
Additional notes: In case your App has to deal with web pages which can have frames, you have to add some additional code that looks for frames. You have to traverse all the documents of all the frames to find all the text nodes in all frames. The code from above isn’t doing this to keep it as simple as possible.
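A minimal sketch of how the documents of all frames could be collected is shown below. The function name “MyApp_CollectDocuments” is made up for this example. It relies on the standard “frames” collection of a window object, and it assumes that frames from a different origin should simply be skipped, because the browser refuses access to their documents.

```javascript
// Hypothetical helper: returns the document of `win` and the documents of
// all nested frames (depth-first), skipping frames that refuse access.
function MyApp_CollectDocuments(win, result) {
  result = result || [];
  try {
    if (win.document) result.push(win.document);
  } catch (e) {
    // frames from another origin throw a security error; skip them
    return result;
  }
  var frames = win.frames || [];
  for (var i = 0; i < frames.length; i++) {
    MyApp_CollectDocuments(frames[i], result);
  }
  return result;
}
```

In a web page you would call MyApp_CollectDocuments(window) and then run the search for the “body” element of each returned document. Note that the search code above creates its nodes with “document.createElement”, so for nodes inside frames it would probably have to use the document of the frame (“element.ownerDocument”) instead; I haven’t tested this variant.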
Hello
Thanks for this great tutorial. I implemented the same steps as you mentioned above but I am not able to get it done. Can you help me out?.
What exactly is your problem? You can send me a sample XCode project via email in case you find it too difficult to explain.
Hello Alexander
I just created a simple View based project in which I included a UIWebView. I am loading the URL of this page as a sample. I want to search for and highlight the word “the”. I am calling [webView highlightAllOccurencesOfString:@"the"]; in the - (void)webViewDidFinishLoad:(UIWebView *)webView1 delegate method. There are no changes in my webView. How can I check whether the JavaScript is executing properly? I have also included the .js file in the target and removed it from Compile Sources. Can I have your email so that I can send this project?
thank you.
First of all, there were silly copy-and-paste bugs in the javascript source, which I’ve now corrected. Now it should work fine. Sorry.
Another note: The delegate method “webViewDidFinishLoad:” is usually a bad location to access the DOM tree. This delegate is called when the data of the web page has finished loading, which does not necessarily mean that the page is already fully rendered. So it is possible that the DOM tree is not fully created yet. Unfortunately UIWebView doesn’t provide a way to find out when the page is fully rendered, so the only thing you can do is just wait for half a second or so after the delegate method was called before doing anything with the DOM tree.
Hi Alexander, you are right, now it’s working fine. And I am also not calling that function in webViewDidFinishLoad: but in a button action instead. It worked fine when I loaded HTML content, but it failed when I loaded an XML file. There is no problem displaying the content of the XML file in the UIWebView, but searching fails. What may be the reason? Can you help me with this?
Thank You
XML files do need special treatment. General XML documents don’t use the “body” tag and usually also don’t have “span” tags. XHTML, which is also XML, does have “body” and “span” tags, so in this case the script from above should also work. You may need to replace the “document.body” expression with “document.getElementsByTagName('body')[0]” to get the body element. But in general almost all XHTML documents are delivered as HTML by web servers (because most copies of Internet Explorer do not support XHTML properly), so they are also treated as HTML by the browser and then “document.body” should still work.
Hello
For me it’s working fine for the XHTML files. But it’s still not working when I load the .xml contents. I even changed “document.body” to “document.getElementsByTagName('body')[0]”. The XML file I am using does contain a “body” tag.
Thanks
getElementsByTagName() should work with XML as well. In XML you can also try to use “document.documentElement” (the root element of the document) instead of “document.body”.
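The fallbacks mentioned in these replies could be combined like this. It is only a sketch: the function name “MyApp_FindSearchRoot” is made up for this example, and for general XML the highlighting itself may still need changes even when a root element is found.

```javascript
// Hypothetical helper: finds the element at which the search should start.
function MyApp_FindSearchRoot(doc) {
  if (doc.body) return doc.body;                  // HTML documents
  var bodies = doc.getElementsByTagName("body");  // XML documents with a "body" tag
  if (bodies.length > 0) return bodies[0];
  return doc.documentElement;                     // fall back to the root element
}
```

In the search code, the expression “document.body” would then be replaced by MyApp_FindSearchRoot(document).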
Hi.
The search is working fine when I rename “sample.xml” to “sample.xhtml” . How can I solve this ?
General XML doesn’t have specialized tags like “body” and “span” with predefined meanings and properties, as is the case for XHTML or HTML. But my example uses these tags to find the root element for the visible area of the document and for the highlighting. For instance, my example assumes that the “span” element has the “style” property (which is the case for XHTML and HTML, but not for XML, AFAIK).
So you may need to rewrite/modify the example code so that it uses the properties and elements that are used in your XML document. I’m not yet sure if it is possible to add inline style attributes in XML and have them processed when rendering the document. So you probably have to add these CSS rules in a global stylesheet, which can be a little bit more challenging.
THX ,It’s very useful.
First, thanks for such a detailed explanation of how to achieve this search in UIWebView.
But would it be too much to ask for a small sample project that includes this workaround?
If hosting it is the problem, I am willing to do it for you.
Best Regards
HP
Finally got it working. Is there any way to scroll to the first found item?
Best Regards
HP
@quky
Scrolling can be done in Javascript with “window.scrollBy(x,y)”. So if you know the coordinates of the element you want to scroll into view, you can simply use the scrollBy() function. And for getting the coordinates of an element node, you can use the offsetLeft/offsetTop/offsetParent properties of the element. offsetLeft/offsetTop measure the coordinates of an element relative to its “offsetParent” element, so you have to start with the element node itself and loop through all the offsetParent elements until you reach the root of the document (offsetParent is null) and sum up all offsetTop and offsetLeft values of all the visited elements (the element you’ve started with and all the offsetParent elements). Then you get the absolute coordinates of the element relative to the document itself.
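The loop through the offsetParent elements could be sketched like this. The function name “MyApp_AbsoluteOffset” is made up for this example. It only uses the offsetLeft, offsetTop and offsetParent properties described above.

```javascript
// Hypothetical helper: sums up offsetLeft/offsetTop of `element` and of all
// its offsetParent elements to get coordinates relative to the document.
function MyApp_AbsoluteOffset(element) {
  var left = 0;
  var top = 0;
  for (var node = element; node; node = node.offsetParent) {
    left += node.offsetLeft;
    top += node.offsetTop;
  }
  return { left: left, top: top };
}
```

The element passed to this function must be an element node, for example the highlighted SPAN element. Text nodes don’t have the offset properties. With the result you can either calculate the distance for “window.scrollBy(x,y)” or pass the absolute coordinates directly to “window.scrollTo(x,y)”.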
thanks Alex now you leave me with a big home work. i am new to all this so let me figure out how to implement that .
Best regards
HP
OK Alexander i insert this code in the .js file along the lines
element.parentNode.insertBefore(span, next);
element.parentNode.insertBefore(text, next);
element = text;
MyApp_SearchResultCount++; // update the counter
// code added for position search on first ocurrence
if (MyApp_SearchResultCount == 1) {
var curleft = curtop = 0;
if (element.offsetParent) {
do {
curleft += element.offsetLeft;
curtop += element.offsetTop;
} while (element = element.offsetParent);
}
} // end off code position search on first ocurrence
}
} else if (element.nodeType == 1) { // Element node
if (element.style.display != "none" && element.nodeName.toLowerCase() != 'select') {
but the curleft and curtop always return 0 value
i am pulling the value from
- (NSInteger)highlightAllOccurencesOfString:(NSString*)str
{
NSString *path = [[NSBundle mainBundle] pathForResource:@"SearchWebView" ofType:@"js"];
NSString *jsCode = [NSString stringWithContentsOfFile:path encoding:NSUTF8StringEncoding error:nil];
[self stringByEvaluatingJavaScriptFromString:jsCode];
NSString *startSearch = [NSString stringWithFormat:@"MyApp_HighlightAllOccurencesOfString('%@')",str];
[self stringByEvaluatingJavaScriptFromString:startSearch];
NSString *posX = [self stringByEvaluatingJavaScriptFromString:@"curleft"];
NSLog([NSString stringWithFormat:@"posX %.2f ",[posX integerValue]]);
NSString *posY = [self stringByEvaluatingJavaScriptFromString:@"curtop"];
NSLog([NSString stringWithFormat:@"posY %.2f ",[posY integerValue]]);
NSString *result = [self stringByEvaluatingJavaScriptFromString:@"MyApp_SearchResultCount"];
Can you shed some light on this, please?
Best Regards
HP
@quky
You may need to wait for a short time before calculating the coordinates, because inserting a node requires that the web page must be re-rendered. The new element can change the layout. And if the re-rendering is done in the background you can’t immediately get the correct coordinates.
I just can’t seem to get this to work with my code, I have a few words, each being anchored in the HTML (a href). What I want to do is have the word/link highlighted when clicked, so I search for the link using the above code. Only it doesn’t show, and there are no search results. Is the search only working on plain text (with no anchor), and if so can it be modified? The CSS I am using contains the following:
-webkit-touch-callout:none; -webkit-user-select: none;
Please advise.
@mserougi
Sorry, but I’m not sure what exactly you’re doing. The code from above will work with HTML code, and the HTML code can also contain links. You could send me your code (via email if it’s more than just a few lines) so I can see what you’re trying to do. Probably I can then see what’s going wrong.
Oh, I see the HTML I’ve added to my comment, some of it didn’t show, I apologize, as this code now doesn’t make much sense. Please have the comment deleted, I will send it by email.
Hi, I tried the same but it’s not working.
The sample is available at
http://dl.dropbox.com/u/1658440/Ashish/SeachTest.zip
I need help with that. What’s wrong, why is it not searching the text? I think I have followed all the steps you have described above.
@akp
Your problem is caused by a simple typo. In the method “highlightAllOccurencesOfString” you try to load the file “SearchWebView.js” from the resources, but the real file name of the JavaScript file is “SearchwebView.js”. So the file is not loaded (the path is nil), loading the content of the file fails and so there’s no JavaScript code that can be executed. Correcting the file name will fix your problem.
I was wondering, thanks for your great help and great blog.
I am using a UIWebView in a document viewer. These methods work great for a variety of documents, not just html. I searched doc, xls, and ppt, documents successfully. Any idea of how to utilize these methods with pdfs or excel spreadsheets?
My methods rely on HTML structures, so if these methods do work for doc, xls and ppt, these documents are probably converted to HTML internally. I assume that PDF is displayed directly (I assume that the iPhone OS directly supports PDF, just like the MacOS) and not converted to HTML, so these methods don’t work here.
I’m only guessing, because I don’t know all of the internals of the UIWebView class.