Several iPhone Apps (like my “iCab Mobile” or “NewsTap” Apps) provide a search feature that lets you search for text in the content currently displayed within a UIWebView. The found occurrences of the search text are highlighted with a yellow background, so the search results can be located visually very easily.
This blog post describes how this can be implemented. I’m implementing this feature as a category for the UIWebView class, so you can use the new search feature for all UIWebView objects in your Apps very easily.
First of all, UIWebView doesn’t allow us to access its content directly, so we have to use JavaScript again. But if you’ve read my other blog posts, you already know this approach.
Our goal is to implement two methods for our new UIWebView category. One method should start the search and highlight the search results. As a result, this method should return the number of occurrences of the text we’ve searched for. The other method should remove all the highlighted search results again to restore the original layout of the web page.
What we need to do is to write the main code in JavaScript and a wrapper code in Objective C which is simply calling the JavaScript code.
We start with the JavaScript code that is doing the real work. As I’ve already described in the blog post WebKit on the iPhone, the JavaScript code will be saved as a resource file in the Xcode project. This way it can be loaded very easily from the application bundle within the Objective C code, and we don’t mix the code of multiple programming languages (JavaScript and Objective C) in the same files.
The following code is the JavaScript implementation; below I’ll explain what it is doing and how it works:
SearchWebView.js:
// We're using a global variable to store the number of occurrences
var MyApp_SearchResultCount = 0;

// helper function, recursively searches in elements and their child nodes
function MyApp_HighlightAllOccurencesOfStringForElement(element,keyword) {
    if (element) {
        if (element.nodeType == 3) { // Text node
            while (true) {
                var value = element.nodeValue; // Search for keyword in text node
                var idx = value.toLowerCase().indexOf(keyword);
                if (idx < 0) break; // not found, abort

                var span = document.createElement("span");
                var text = document.createTextNode(value.substr(idx,keyword.length));
                span.appendChild(text);
                span.setAttribute("class","MyAppHighlight");
                span.style.backgroundColor="yellow";
                span.style.color="black";
                text = document.createTextNode(value.substr(idx+keyword.length));
                element.deleteData(idx, value.length - idx);
                var next = element.nextSibling;
                element.parentNode.insertBefore(span, next);
                element.parentNode.insertBefore(text, next);
                element = text;
                MyApp_SearchResultCount++; // update the counter
            }
        } else if (element.nodeType == 1) { // Element node
            if (element.style.display != "none" && element.nodeName.toLowerCase() != 'select') {
                for (var i=element.childNodes.length-1; i>=0; i--) {
                    MyApp_HighlightAllOccurencesOfStringForElement(element.childNodes[i],keyword);
                }
            }
        }
    }
}
// the main entry point to start the search
function MyApp_HighlightAllOccurencesOfString(keyword) {
    MyApp_RemoveAllHighlights();
    // an empty keyword would match at every position and the search loop would never end
    if (!keyword) return;
    MyApp_HighlightAllOccurencesOfStringForElement(document.body, keyword.toLowerCase());
}
// helper function, recursively removes the highlights in elements and their children
function MyApp_RemoveAllHighlightsForElement(element) {
    if (element) {
        if (element.nodeType == 1) {
            if (element.getAttribute("class") == "MyAppHighlight") {
                var text = element.removeChild(element.firstChild);
                element.parentNode.insertBefore(text,element);
                element.parentNode.removeChild(element);
                return true;
            } else {
                var normalize = false;
                for (var i=element.childNodes.length-1; i>=0; i--) {
                    if (MyApp_RemoveAllHighlightsForElement(element.childNodes[i])) {
                        normalize = true;
                    }
                }
                if (normalize) {
                    element.normalize();
                }
            }
        }
    }
    return false;
}

// the main entry point to remove the highlights
function MyApp_RemoveAllHighlights() {
    MyApp_SearchResultCount = 0;
    MyApp_RemoveAllHighlightsForElement(document.body);
}
The basic principle of searching the text and removing the highlighted search results is the same: We’re working at DOM level (Document Object Model), which means the HTML document is represented as a tree structure where each HTML element, text, comment etc. is represented as a node. All the nodes are linked together with parent and child connections. The root element of each HTML document is the element that is created by the HTML tag. This element usually has two children: the HEAD element and the BODY element. Only the content of the BODY element is visible and displayed on screen, so we only need to process this part of the document tree.
What we need to do is to start with the body element and traverse all of its child nodes. From within the child nodes we need to go to their child nodes as well, and so on until we reach a leaf node, which has no child nodes. Text nodes are always leaf nodes, and text nodes are the nodes which might contain the text we’re looking for.
Traversing the whole HTML tree searching for all text nodes can be done by a recursive algorithm called Depth-First-Search (DFS). The DFS algorithm will traverse the tree structure starting from a root element (in our case the BODY element) to the first leaf node in a branch of the tree (for example going to the first child of the root first, from there again going to the first child, etc. until a leaf node is reached). Then the algorithm goes back (backtracking) to the last node where not all child nodes were traversed yet and continues with the next unvisited child node, and so on. This way all nodes of the whole tree are traversed and we are able to find all the text nodes in which we look for the text we are searching for.
The functions “MyApp_HighlightAllOccurencesOfStringForElement(element,keyword)” and “MyApp_RemoveAllHighlightsForElement(element)” are both implementations of this DFS algorithm. These functions are called from “MyApp_HighlightAllOccurencesOfString(keyword)” and “MyApp_RemoveAllHighlights()”, which do the necessary initialization and provide the DFS functions with the proper root element (the BODY element). The initialization for a new search is to make sure that no highlighted text from a previous search is present, so we simply call the function to remove all text highlights.
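To make the traversal order concrete, here is a minimal sketch of a depth-first walk that only collects text nodes. It does not touch a real DOM: the helpers makeText and makeElement are hypothetical stand-ins that model just the nodeType, nodeValue and childNodes properties, so the sketch can be run outside of a web page.

```javascript
// Hypothetical stand-ins for DOM nodes (not part of the article's code):
// only nodeType, nodeValue and childNodes are modelled.
function makeText(value) {
    return { nodeType: 3, nodeValue: value, childNodes: [] };
}
function makeElement(name, children) {
    return { nodeType: 1, nodeName: name, childNodes: children || [] };
}

// Depth-first traversal: follow each branch down to its leaves
// before moving on to the next sibling.
function collectTextNodes(node, result) {
    result = result || [];
    if (!node) return result;
    if (node.nodeType === 3) {
        result.push(node);                      // text nodes are leaf nodes
    } else if (node.nodeType === 1) {
        for (var i = 0; i < node.childNodes.length; i++) {
            collectTextNodes(node.childNodes[i], result);
        }
    }
    return result;
}

// <body><p>Hello <b>world</b></p>!</body>
var body = makeElement("body", [
    makeElement("p", [makeText("Hello "), makeElement("b", [makeText("world")])]),
    makeText("!")
]);
var texts = collectTextNodes(body).map(function (n) { return n.nodeValue; });
```

This sketch walks the children front to back because it only reads the tree. The real search function walks them back to front, which is the safer direction when the loop inserts new nodes into the child list it is iterating over.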
When searching for a text, we check if the currently inspected node is a text node or if it is an element node. If it is an element node it can have child nodes, and these must be inspected as well. If it is a text node, we have to find out if the text of this node contains the text we’re searching. If yes, we insert the text highlight, otherwise we are finished with this node. Also if the node is neither a text node nor an element node, there aren’t any child nodes which are interesting for us, so we are finished with this node as well.
When the text of a text node contains the search text, we have to split the text into three parts. Part one contains the text up to the match, part two contains the matching text itself and part three contains the rest of the text. A new element is created (a SPAN element) and the second part (the matching text) becomes a child node of this new element. Now we can assign style sheet rules to the newly created SPAN element to create the highlight effect (setting the background color to yellow and the text color to black; you can even increase the font size if you want to). Then the new element is linked with parts one and three, so it becomes a part of the tree structure of the HTML document. And because the search text might be found multiple times in the original text node, we continue to search in the third part of the original text node. If we find another occurrence, we split this third part again into three parts, otherwise we are finished.
When we create a SPAN element for the highlight effect, we also assign a special value (here “MyAppHighlight”) to the class attribute. This is important to be able to find these elements again later when we want to remove the highlight effects using the function “MyApp_RemoveAllHighlights()”. For this task we traverse the tree as well, but now we’re looking for elements whose class attribute has this special value. To restore the original state of the HTML document, we have to remove the elements we’ve inserted before (the ones with the special value of the class attribute) and we need to concatenate the text nodes we’ve split. JavaScript can help us to concatenate the text nodes again, because it provides the “normalize()” function which can do this for us.
In JavaScript we can find out the type of a node with the “nodeType” property. A value of 1 means that the node is a normal element node (like the “body” node, a “span” node etc.). A value of 3 means that the node is a text node. In this case the property nodeValue contains the text of the node. Other values for nodeType represent comment nodes (in HTML these are written as “<!-- Comment -->”), attribute nodes (for HTML attributes like for example the “HREF” attribute for the “A” tag), document nodes and some more. In our case only the values 1 (element node) and 3 (text node) are important.
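The repeated three-way split of a text node can be tried out on plain strings, without any DOM. The following is only a sketch of the idea; the function name splitByKeyword and the shape of the returned segment objects are made up for this illustration and are not used by the code above.

```javascript
// Split `text` into plain and matching segments (case-insensitive),
// mirroring the repeated "part one / part two / part three" split.
function splitByKeyword(text, keyword) {
    var segments = [];
    var needle = keyword.toLowerCase();
    if (needle.length === 0) {
        // an empty keyword would "match" at every position, so treat it as no match
        return [{ text: text, match: false }];
    }
    var rest = text;
    while (true) {
        var idx = rest.toLowerCase().indexOf(needle);
        if (idx < 0) break;                                    // nothing more to find
        if (idx > 0) {
            segments.push({ text: rest.slice(0, idx), match: false });           // part one
        }
        segments.push({ text: rest.slice(idx, idx + needle.length), match: true }); // part two
        rest = rest.slice(idx + needle.length);                // part three, searched again
    }
    if (rest.length > 0) {
        segments.push({ text: rest, match: false });
    }
    return segments;
}

var parts = splitByKeyword("The Meaning of Life is life itself", "life");
var matches = parts.filter(function (p) { return p.match; })
                   .map(function (p) { return p.text; });
```

Each segment marked as a match corresponds to one SPAN element in the real implementation, and joining all segments in order gives back the original text, which is what “normalize()” relies on when the highlights are removed.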
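As a small illustration of the nodeType values, the sketch below maps the most common numeric values to readable names and checks whether a node is one of the two kinds the search cares about. The names NODE_TYPE_NAMES, describeNodeType and isRelevantForSearch are hypothetical helpers for this example only.

```javascript
// Common nodeType values as defined by the DOM
var NODE_TYPE_NAMES = {
    1: "element",
    2: "attribute",
    3: "text",
    8: "comment",
    9: "document"
};

// Returns a readable name for a node's type, or "other" for anything not listed
function describeNodeType(node) {
    return NODE_TYPE_NAMES[node.nodeType] || "other";
}

// Only element nodes (1) and text nodes (3) matter for the search
function isRelevantForSearch(node) {
    return node.nodeType === 1 || node.nodeType === 3;
}
```

In a web page you could pass real nodes, for example describeNodeType(document.body); any object with a nodeType property works as well, which makes the helpers easy to try out.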
In the above implementation, we count the number of found occurrences in a global variable.
Note: You’ll notice that the JavaScript function names and variables and also the value for the class attribute I’m using in the above code are very lengthy and they do also have a prefix like “MyApp_”. The reason for this is to avoid any conflicts with existing function and variable names of the web page in which we inject our JavaScript code. If you’re generating the HTML code yourself that is displayed in the UIWebView object, you can choose shorter and simpler names. But if you have to deal with HTML and JavaScript code of any web pages (like in a web browser like iCab Mobile), you should use longer names and also add the name of your app as Prefix to all function and variable names to avoid any conflicts.
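A different way to reduce the risk of name conflicts, which is not what the code above does, is to put everything behind one single global object. The following is only a sketch of that alternative; the name MyApp_Search and its methods are assumptions made for this example.

```javascript
// Only one global name (MyApp_Search) is visible to the web page;
// the counter itself is private to the function scope.
var MyApp_Search = (function () {
    var resultCount = 0;

    return {
        reset: function () { resultCount = 0; },
        recordMatch: function () { resultCount++; return resultCount; },
        count: function () { return resultCount; }
    };
})();
```

With this variant only one name could collide with the web page, so the prefix is still a good idea for that one name. The Objective C side would then have to evaluate “MyApp_Search.count()” instead of reading the global variable.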
The Cocoa/Objective C part of the implementation is very simple. We only need to declare the interface and write a simple wrapper which loads and calls the JavaScript code that is actually doing all the hard work. The interface is simple as well. We only need two methods: one which starts the search and highlights the found text, and one which removes all the highlights again.
SearchWebView.h:
#import <UIKit/UIKit.h>

@interface UIWebView (SearchWebView)

- (NSInteger)highlightAllOccurencesOfString:(NSString*)str;
- (void)removeAllHighlights;

@end
The typical use case would be to provide a search field where the user can enter some text. This text would be passed to the method “highlightAllOccurencesOfString:”. And when the user shakes the device, the App could call the method “removeAllHighlights” to remove all the highlighted search results again.
The implementation would look like this:
SearchWebView.m:
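#import "SearchWebView.h"

@implementation UIWebView (SearchWebView)

- (NSInteger)highlightAllOccurencesOfString:(NSString*)str
{
    // Load the JavaScript file from the application bundle and inject it into the page
    NSString *path = [[NSBundle mainBundle] pathForResource:@"SearchWebView" ofType:@"js"];
    NSString *jsCode = [NSString stringWithContentsOfFile:path encoding:NSUTF8StringEncoding error:nil];
    [self stringByEvaluatingJavaScriptFromString:jsCode];

    // Escape backslashes and single quotes so the search text can't break the JavaScript string literal
    NSString *escaped = [str stringByReplacingOccurrencesOfString:@"\\" withString:@"\\\\"];
    escaped = [escaped stringByReplacingOccurrencesOfString:@"'" withString:@"\\'"];

    // Start the search, then read the number of occurrences from the global counter
    NSString *startSearch = [NSString stringWithFormat:@"MyApp_HighlightAllOccurencesOfString('%@')",escaped];
    [self stringByEvaluatingJavaScriptFromString:startSearch];
    NSString *result = [self stringByEvaluatingJavaScriptFromString:@"MyApp_SearchResultCount"];
    return [result integerValue];
}

- (void)removeAllHighlights
{
    [self stringByEvaluatingJavaScriptFromString:@"MyApp_RemoveAllHighlights()"];
}

@end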
The first thing we’re doing in the method “highlightAllOccurencesOfString:” is to load the JavaScript file we’ve written above from the application bundle and inject it into the web page that is currently displayed in UIWebView. Because we’re implementing this as a category for UIWebView, we can use “self” to call the method “stringByEvaluatingJavaScriptFromString:” of the UIWebView instances.
After we’ve injected the JavaScript code we simply call the JavaScript function we’ve defined above to do the search.
And finally we access the variable we’ve defined in the JavaScript code, which represents the number of occurrences of the string that were found, and we return its integer value as the result of the method.
In the method “removeAllHighlights” we only need to call the corresponding JavaScript function we’ve defined in the JavaScript code from above. Loading the external JavaScript file and injecting it into the Web page is not necessary here. If we’ve started a search before, the code is already injected and we don’t need to do this again. And if we haven’t started a search before, we just don’t need the JavaScript code because there are no highlights which have to be removed.
As you can see, the Objective C code for the UIWebView category is just a simple wrapper code for the JavaScript code. Now, when you have a UIWebView object in your App, you can simply search for text in its content by calling “highlightAllOccurencesOfString:”, like in this example where we’re searching for the text “keyword”:
// webView is an instance of UIWebView
[webView highlightAllOccurencesOfString:@"keyword"];
Additional notes: In case your App has to deal with web pages which can have frames, you have to add some additional code that looks for frames. You have to traverse all the documents of all the frames to find all the text nodes in all frames. The code from above isn’t doing this to keep it as simple as possible.
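As a rough sketch of what such additional code could look like, the following function collects the documents of a window and of all its nested frames. The function name MyApp_CollectDocuments is made up for this example, and the sketch assumes that frames from other domains can't be accessed and should simply be skipped.

```javascript
// Collects the document of `win` and of all nested frames into `docs`.
function MyApp_CollectDocuments(win, docs) {
    docs = docs || [];
    try {
        if (win.document) {
            docs.push(win.document);
        }
    } catch (e) {
        // the document of a frame from another domain is not accessible; skip it
        return docs;
    }
    var frames = win.frames;
    var count = frames ? frames.length : 0;
    for (var i = 0; i < count; i++) {
        MyApp_CollectDocuments(frames[i], docs);
    }
    return docs;
}
```

In a web page you would call MyApp_CollectDocuments(window) and then run the search helper on the body of each returned document instead of only on document.body. For frames it is probably also safer to create the new SPAN and text nodes with the ownerDocument of the element that is being processed, rather than with the global document.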
I’m Korean Developer.
Wow this is really good!!
thx a lot
Hi Alexander.
Sorry to bother, but I would be very glad if you could help.
I’m trying to modify your code so that I can also add anchors to each keyword that has been found, but I haven’t been able so far.
Could you please guide me on this? I know next to nothing about JavaScript, so it’s been really hard trying to modify your code, even with lots of searches.
Thanks in advance.
@Aloha Silver
If you need a link, then you can simply create an “a” tag instead of a “span” tag. And you have to set the “href” attribute for the newly created link element (for example this way: linkElement.setAttribute("href","link-url") ).
Alexander, it works for standard names, actually. I was trying to add two tags at the same time, but it turns out I don’t need to.
I’m actually trying to set the “name” attribute of the “a” tag, so I can look for the words later.
The thing is, I’m trying to set dynamic names, like linkElement.setAttribute("name", "foundText"+MyApp_SearchResultCount). This way, I could scroll through all of the results programmatically by looking for those tags.
But in every test I’ve done so far, it does not work, and I just can’t figure out why.
Do you have any idea about what could be wrong?
Thanks a lot.
@Aloha Silver
What exactly are you doing when you try to scroll to these new elements?
HI Alexander,
This is really good..now I’m trying to put like an image behind that selected text! Is that possible?
@Marie
Yes, this should be possible. I think you only need to set a background image, which can be done by the CSS rule “background-image: url(urlOfImage)”. And when you set this CSS rule in JavaScript, you could do it this way: element.style.backgroundImage = "url(urlOfImage)"
Alexander, I’m trying to scroll to each element by calling the javascript window.location.hash=NSString.
It works for window.location.hash='foundText0', but window.location.hash='foundText1' already returns my scroll position to the last result, even when there are 8 results (so there should be foundText0, foundText1, foundText2 … foundText7 “a” tags).
@Aloha Silver
This does work for me (the semicolon at the end seems to be important, because without the semicolon it doesn’t work. I assume this is a bug in the WebKit of the iOS because the semicolon should be optional here):
window.location.hash='foundText3';
Hi Alexander, your post is great. Thanks a lot for all the information.
For my application however, I am trying to load a pdf document in the UIWebView and want to search text in pdf document. Can you please let me know how this would be possible. Is there any way I can convert pdf to html without letting go of table views etc. I need urgent help on this matter. Thanks in advance
@Arohi
Sorry, but I can’t help you with this. I don’t think you can access the inner details of PDF files that are displayed by UIWebView.
Hi Alexander!
I tried what you told me…but i can’t get the image behind that selected text…i don’t know why! is there another way to do it… :/
Thanks!
Oh! never mind! it was a problem with the UIWebView…thanks! It works perfectly!
Alexander, using the semicolon works!
I just need to understand HTML nodes better before successfully implementing my idea now.
Thanks a lot!
this post is very useful thx!
Thank you, Alexander!
Great note!
Hi Alexander,
Thank you for putting this page together. I have a question though.
I am having the same problem as @Siddu. I cannot get the javascript to execute inside the “if (element.nodeType == 3)” statement. I would like to do a simple search for text in an html script that I created and displayed in the UIWebview. For example, I may display a string defined by the following html:
@”The Meaning of Life … really is 42!”
the search for the string “Life”
Any ideas on why MyApp_SearchResultCount shows up as 0? Thank you kindly.
-Ryan
@Ryan
Could you create a simple test project where this issue can be seen and then send it to me or post a link to the project? The above code works in my projects, so I would need to check this myself in order to help.
@Ryan and others who have the same issue
Please make sure that you call “highlightAllOccurencesOfString:” after the page is completely loaded and rendered. If you call it too early, the HTML code is not yet rendered and so the script won’t be able to find anything. So at least you have to wait until the UIWebView delegate “webViewDidFinishLoad:” is called.
It worked! I had to brush up on what is meant to call a delegate method, but after that it was pretty straight forward.
Thanks for your help Alexander!
Hi Alexander
i had a problem, i need to load a custom css on a uiwebview
NSURL *url = [NSURL URLWithString:indirizzo];
NSURLRequest *requestObj = [NSURLRequest requestWithURL:url];
[webView loadRequest:requestObj];
i must style the page via the css after i loaded it. You think it’s possible to inject javascript with my css rules, and finally show the webpage modified?
Thanks for your help!
@Fabio
Yes, you can inject JavaScript code with new StyleSheets. But this can be only done after the page is loaded, so the page will load and display with the original stylesheets first before you can set the new stylesheets.
Alexander, thank you in advance, but how could I manage the
[webView loadRequest:requestObj];
like in java(i come from it)
UiWebView view= webView.loadRequest(requestObj);
view.injectSomeJs(myjs);
view.show();
how it could be done(if it’s not so much complicated)
actually, i load my webView on
-(void)viewDidLoad{..}
i tried to put the code snippets you suggested on
- (void)viewDidFinishLoad:(UIWebView *)webView {…}
but the debugger never reaches this point
any suggestion?
@Fabio
The delegate method “webViewDidFinishLoad:” is the right place to inject your JavaScript code (note that the method has to be named “webViewDidFinishLoad:”, a method named “viewDidFinishLoad:” is never called by the UIWebView). But you have to use the “setDelegate:” method of the UIWebView object first to set the object which implements this “webViewDidFinishLoad:” method. If you don’t set the delegate, the delegate method is not called.
And you have to use the method “stringByEvaluatingJavaScriptFromString:” of UIWebView to execute JavaScript code. Here you can pass any Javascript code to the UIWebView.