elementFromPoint() under iOS 5

The JavaScript call elementFromPoint(x,y) can be used to find the element of a web page at a certain coordinate. If you have used this call in the past on a web page or in a native iOS App that was developed for iOS 4.x or older, you’ll notice that your web page or App might fail when used under iOS 5. Under iOS 5 the elementFromPoint(x,y) call finds different elements or even returns null instead of an element. It looks like the call is now broken. But this is not the case: under iOS 5 it works correctly for the first time. It was broken before.

elementFromPoint(x,y) was defined to return the element at the given coordinates within the viewport (the visible area) of the web page, or null if the coordinates are outside of the viewport. The coordinates are measured relative to the origin of the viewport. This is how iOS 5 finally works.

Before (iOS 4.x and older), elementFromPoint(x,y) completely ignored the viewport. It measured the coordinates relative to the origin of the document, and even elements outside of the visible area could be found. This behavior seems to make much more sense than the new iOS 5 behavior, but according to the official specification (elementFromPoint() is defined in the W3C CSSOM View Module, not in the JavaScript language itself), it’s not the correct behavior.
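
To make the difference concrete, here is a small simulation that can be run outside of a browser. It is only a sketch: the page object and the functions hitTestViewport() and hitTestDocument() are invented for this illustration and are not part of any browser API. Instead of an element they return the document location that would be hit, or null.

```javascript
// Hypothetical model of a scrolled page; all sizes are in CSS pixels.
var page = {
  docWidth: 320, docHeight: 2000,   // size of the whole document
  viewWidth: 320, viewHeight: 480,  // size of the visible area (viewport)
  scrollX: 0, scrollY: 100          // current scroll offset
};

// iOS 5 behavior: (x, y) is relative to the viewport origin,
// and anything outside of the viewport yields null.
function hitTestViewport(page, x, y) {
  if (x < 0 || y < 0 || x >= page.viewWidth || y >= page.viewHeight) {
    return null;
  }
  return { docX: x + page.scrollX, docY: y + page.scrollY };
}

// iOS 4.x behavior: (x, y) is relative to the document origin,
// and the visible area is ignored completely.
function hitTestDocument(page, x, y) {
  if (x < 0 || y < 0 || x >= page.docWidth || y >= page.docHeight) {
    return null;
  }
  return { docX: x, docY: y };
}

console.log(hitTestViewport(page, 10, 50));   // { docX: 10, docY: 150 }
console.log(hitTestDocument(page, 10, 50));   // { docX: 10, docY: 50 }
console.log(hitTestViewport(page, 10, 600));  // null
console.log(hitTestDocument(page, 10, 600));  // { docX: 10, docY: 600 }
```

The same input (10, 50) addresses two different places on the page, and (10, 600) is only valid in the document-based system.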

The different behavior between iOS 5 and older iOS versions can cause some serious problems. The coordinate systems are no longer compatible, so when the web page or App has to run under old and new iOS versions, it is necessary to find out which coordinate system is actually used.

In a native iOS App you could simply check the iOS version and, based on its value, decide which coordinate system to use when calling elementFromPoint(x,y). But when writing a web page, this is not so easy: the iOS version is not exposed to the web page (it might be part of the user agent information, but because almost all browsers allow a fake user agent to be used, this information is not reliable at all). Also, on the Mac and on other platforms, different WebKit releases might be in use which use different coordinate systems for the elementFromPoint(x,y) call as well. Therefore it makes sense to find a way to identify the coordinate system independently of the iOS version and, if necessary, correct the coordinates.

When we compare the two coordinate systems, we first notice that the coordinates are offset by the scroll location. If we scroll the web page so that the top left corner of the page is visible, both coordinate systems are the same: the origin of the viewport is identical to the origin of the web page, and the scroll offset is 0 in both directions. If you scroll down 100 px, the origin (the coordinate (0,0)) of the viewport is located at the coordinate (0,100) of the web page. So the scroll offset is exactly the offset between the two coordinate systems. Therefore, transforming one coordinate system into the other is very easy. We only need to add or subtract the current scroll offsets.

function documentCoordinateToViewportCoordinate(x,y) {
  var coord = new Object();
  coord.x = x - window.pageXOffset;
  coord.y = y - window.pageYOffset;
  return coord;
}

function viewportCoordinateToDocumentCoordinate(x,y) {
  var coord = new Object();
  coord.x = x + window.pageXOffset;
  coord.y = y + window.pageYOffset;
  return coord;
}

These JavaScript functions take a coordinate of one system and transform it into a coordinate of the other system.
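
To check the arithmetic without a browser, the same transformation can be written with the scroll offset passed in explicitly. This is only a sketch: toViewport(), toDocument() and scrollOffset are names made up for this example, and they stand in for the two functions above and for window.pageXOffset / window.pageYOffset.

```javascript
// Same arithmetic as the two functions above, but the scroll offset is an
// explicit argument instead of being read from the window object.
function toViewport(point, scrollOffset) {
  return { x: point.x - scrollOffset.x, y: point.y - scrollOffset.y };
}

function toDocument(point, scrollOffset) {
  return { x: point.x + scrollOffset.x, y: point.y + scrollOffset.y };
}

var scrollOffset = { x: 0, y: 100 };   // page scrolled down by 100 px

// The viewport origin lies at (0,100) in the document...
var docPoint = toDocument({ x: 0, y: 0 }, scrollOffset);
console.log(docPoint);                 // { x: 0, y: 100 }

// ...and converting back yields the viewport origin again.
var roundTrip = toViewport(docPoint, scrollOffset);
console.log(roundTrip);                // { x: 0, y: 0 }
```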

But in order to find out whether we need one of these functions, and which one, we have to find out in which coordinate system elementFromPoint(x,y) expects the coordinates. To do this we use the fact that elementFromPoint() returns null for coordinates outside of the viewport when it expects coordinates measured relative to the viewport. (As noted above, when the coordinates are relative to the origin of the document, elementFromPoint() returns an element for any coordinate within the document, even outside of the visible area, so we can distinguish between the two cases.)

Good test coordinates would be (0, window.pageYOffset + window.innerHeight - 1) and (window.pageXOffset + window.innerWidth - 1, 0), for vertical scrolling and horizontal scrolling. As noted above, when no scrolling is done, both coordinate systems are identical, and we don’t need to take care of anything. But if the page is scrolled, we need to check which system is used. The test coordinates take the current scroll offset, add the width or height of the visible area (the innerWidth and innerHeight of the “window” object) and subtract 1. This makes sure that the coordinate addresses the very last pixel row or column of the visible area, measured relative to the document origin. This is always a valid document-based coordinate which lies within the document boundaries (a coordinate outside of the document boundaries would return null even with the document-based elementFromPoint() call). If the page is scrolled by at least one pixel, the test coordinates lie outside of the viewport when interpreted as relative to the viewport, so elementFromPoint() returns null. For example, with a scroll offset of 100 px and a visible height of 480 px, the test coordinate is y = 579, which is beyond the last viewport row 479. When elementFromPoint() interprets the coordinates relative to the document, they are always valid and always return an element. And this is how we can easily detect which coordinate system elementFromPoint() is using.

function elementFromPointIsUsingViewPortCoordinates() {
  if (window.pageYOffset > 0) {     // page scrolled down
    return (window.document.elementFromPoint(0, window.pageYOffset + window.innerHeight -1) == null);
  } else if (window.pageXOffset > 0) {   // page scrolled to the right
    return (window.document.elementFromPoint(window.pageXOffset + window.innerWidth -1, 0) == null);
  }
  return false; // no scrolling, don't care
}
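
The detection can be tried out against a stand-in for the browser. The following is only a sketch under stated assumptions: makeFakeWindow() is a hypothetical helper that mimics one of the two behaviors with fixed sizes, and detectViewportCoordinates() repeats the logic of the function above with the window object passed in as a parameter so that the fake can be used.

```javascript
// Same logic as elementFromPointIsUsingViewPortCoordinates(), but the window
// object is a parameter so that it can be replaced by a fake.
function detectViewportCoordinates(win) {
  if (win.pageYOffset > 0) {          // page scrolled down
    return win.document.elementFromPoint(0, win.pageYOffset + win.innerHeight - 1) == null;
  } else if (win.pageXOffset > 0) {   // page scrolled to the right
    return win.document.elementFromPoint(win.pageXOffset + win.innerWidth - 1, 0) == null;
  }
  return false;                       // no scrolling, don't care
}

// Hypothetical stand-in for a browser window (320x480 visible area,
// 1000x2000 document). It only answers "some element" or null.
function makeFakeWindow(usesViewportCoordinates, pageXOffset, pageYOffset) {
  var innerWidth = 320, innerHeight = 480;
  var docWidth = 1000, docHeight = 2000;
  return {
    pageXOffset: pageXOffset,
    pageYOffset: pageYOffset,
    innerWidth: innerWidth,
    innerHeight: innerHeight,
    document: {
      elementFromPoint: function (x, y) {
        var maxX = usesViewportCoordinates ? innerWidth : docWidth;
        var maxY = usesViewportCoordinates ? innerHeight : docHeight;
        var inside = x >= 0 && y >= 0 && x < maxX && y < maxY;
        return inside ? { tagName: "BODY" } : null;
      }
    }
  };
}

console.log(detectViewportCoordinates(makeFakeWindow(true, 0, 100)));   // true
console.log(detectViewportCoordinates(makeFakeWindow(false, 0, 100)));  // false
console.log(detectViewportCoordinates(makeFakeWindow(true, 50, 0)));    // true
console.log(detectViewportCoordinates(makeFakeWindow(true, 0, 0)));     // false (not scrolled)
```

The last line shows the “don’t care” case: without scrolling the function cannot tell the two systems apart, but then no conversion is needed anyway.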

We can combine this to one custom elementFromPoint() function that is using a document-based coordinate system as input and will internally do all the magic for us:

function elementFromDocumentPoint(x,y) {
  if (elementFromPointIsUsingViewPortCoordinates()) {
    var coord = documentCoordinateToViewportCoordinate(x,y);
    return window.document.elementFromPoint(coord.x,coord.y);
  } else {
    return window.document.elementFromPoint(x,y);
  }
}

And the counterpart for viewport-based coordinates:

function elementFromViewportPoint(x,y) {
  if (elementFromPointIsUsingViewPortCoordinates()) {
    return window.document.elementFromPoint(x,y);
  } else {
    var coord = viewportCoordinateToDocumentCoordinate(x,y);
    return window.document.elementFromPoint(coord.x,coord.y);
  }
}

So instead of using elementFromPoint() directly, you simply use elementFromViewportPoint() or elementFromDocumentPoint(), depending on the coordinates you have to deal with. It will then work correctly in old and new WebKit releases.
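
As a rough check of that claim, the wrapper can be run against two fake windows, one for each behavior. Everything here is an assumption made for the illustration: makeFakeWindow() is a hypothetical helper whose “page” consists of a 100 px high header followed by content, strings are returned instead of real elements, and the functions take the window as a parameter so that they can run outside of a browser.

```javascript
// Hypothetical fake window: a "header" (document rows 0..99) above "content".
function makeFakeWindow(usesViewportCoordinates, pageYOffset) {
  var innerWidth = 320, innerHeight = 480, docHeight = 2000;
  return {
    pageXOffset: 0,
    pageYOffset: pageYOffset,
    innerWidth: innerWidth,
    innerHeight: innerHeight,
    document: {
      elementFromPoint: function (x, y) {
        if (usesViewportCoordinates) {
          if (x < 0 || y < 0 || x >= innerWidth || y >= innerHeight) return null;
          y += pageYOffset;   // translate to document rows for the lookup
        }
        if (x < 0 || y < 0 || x >= innerWidth || y >= docHeight) return null;
        return y < 100 ? "header" : "content";
      }
    }
  };
}

// Detection as described in the post, with the window passed in.
function usesViewportCoordinates(win) {
  if (win.pageYOffset > 0) {
    return win.document.elementFromPoint(0, win.pageYOffset + win.innerHeight - 1) == null;
  } else if (win.pageXOffset > 0) {
    return win.document.elementFromPoint(win.pageXOffset + win.innerWidth - 1, 0) == null;
  }
  return false;
}

// Counterpart of elementFromDocumentPoint(), with the window passed in.
function elementFromDocumentPointIn(win, x, y) {
  if (usesViewportCoordinates(win)) {
    return win.document.elementFromPoint(x - win.pageXOffset, y - win.pageYOffset);
  }
  return win.document.elementFromPoint(x, y);
}

var newWebKit = makeFakeWindow(true, 50);    // viewport-based, scrolled by 50 px
var oldWebKit = makeFakeWindow(false, 50);   // document-based, scrolled by 50 px

// The raw call disagrees between the two...
console.log(newWebKit.document.elementFromPoint(10, 60));   // content
console.log(oldWebKit.document.elementFromPoint(10, 60));   // header

// ...while the wrapper gives the same answer for the same document point.
console.log(elementFromDocumentPointIn(newWebKit, 10, 60)); // header
console.log(elementFromDocumentPointIn(oldWebKit, 10, 60)); // header
```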

Please note: if you use the code of my older blog post “Customize the contextual menu of UIWebView” in your projects, you need to update this as well, because it also uses the elementFromPoint() call. But this should be really easy to do.

Posted in iPhone & iPod Touch, Programming, Tips & tricks, Web Technology.


18 Responses

  1. thantthet says

    Thanks. This solves the problem.

  2. NotAnExpert says

    Alexander,

    Thank you for sharing your ideas about using UIWebView to create a web browser app for iOS.

    Here’s a function that I use to find an element in a document that contains frames, in case anyone might find it useful. The function makes liberal use of your concepts, for which I am deeply indebted.

    function MyElementFromPoint(x, y)
    {
    	/*
    		The isUsingViewportCoords variable stays true unless the window is scrolled
    		and an element is found at a test point beyond the visible area (pre-iOS 5
    		behavior). If the window is not scrolled, both coordinate systems are the
    		same, so the value makes no difference.
    	*/
    	var isUsingViewportCoords = true;
    	// If scrolled in X direction...
    	if (window.pageXOffset > 0) {
    		// ...and if an element is found beyond the right edge of the window...
    		if (document.elementFromPoint(window.pageXOffset + window.innerWidth - 1, 0)) {
    			// ...the operating system is not using viewport coordinates.
    			isUsingViewportCoords = false;
    		}
    	}
    	// Else if scrolled in Y direction...
    	else if (window.pageYOffset > 0) {
    		// ...and if an element is found beyond the lower edge of the window...
    		if (document.elementFromPoint(0, window.pageYOffset + window.innerHeight - 1)) {
    			// ...the operating system is not using viewport coordinates.
    			isUsingViewportCoords = false;
    		}
    	}
    	// If not using viewport coordinates, must first add scroll offsets to get correct 
    	// element.
    	if (!isUsingViewportCoords) {
    		x += window.pageXOffset;
    		y += window.pageYOffset;
    	}
    	// Get element from point in document. Then move down hierarchy of frame and 
    	// iframe elements until a non-frame/iframe element is found. That element 
    	// is the element at the point x,y in the document.
    	var doc = document;
    	var elem = doc.elementFromPoint(x, y);
    	// elem is null when the point lies outside of the document or viewport.
    	while (elem && (elem.nodeName == "FRAME" || elem.nodeName == "IFRAME")) {
    		// If using viewport coordinates, must account for offset of the frame or 
    		// iframe element.
    		if (isUsingViewportCoords) {
    			var boundRect = elem.getBoundingClientRect();
    			x -= boundRect.left;
    			y -= boundRect.top;
    		}
    		doc = elem.contentDocument;
    		elem = doc.elementFromPoint(x, y);
    	}
    	return elem;
    }
    

    Caveat: I’m not an IT guy, and certainly not an expert in html, javascript, or any other computer stuff. But the code above is getting it done so far (tested on iOS 4.2, iOS 4.3, and iOS 5). I have developed a limited-feature web browser for accessing a company internal-use web site (about 5000 users). This site uses, shall we say, antiquated web techniques that just don’t work with any of the standard iOS browsers (frames, double-click handlers, command- and shift-keys, eyebrow-raising javascript, etc.).

    Alexander, I couldn’t have created this app without your ideas. Again, thank you.

  3. Shawn says

    I have an interesting issue … When I change my “Region Format” in Settings->General->International from “United States” to “Dutch->Belgium” the elementFromPoint function always returns null. If I switch it back it works perfectly. This happens in all iOS versions that I have been able to test.

    Any thoughts?

  4. Alexander says

    @Shawn
    I can’t reproduce this on my devices. But maybe some other settings have to be also configured in a certain way to get this effect. In any case, if you see this strange effect, and you’re sure that this is not your own fault, this is probably a bug and you should report this to Apple.

  5. max says

    i put all this code in the JSTools.js.. how do i use this new code?:[

  6. Alexander says

    @max
    Please also read the other blog posts, like http://www.icab.de/blog/2011/08/02/adding-javascript-files-as-resources-to-an-xcode-project/
    Here you’ll find out how to include JS code into your Xcode project and how to inject it into the web page.

  7. max says

    @Alexander
    I have the code working already. I used it from another guy’s post on that blog; it has the JS code and all. I put the functions from here into there, but kept the others. It doesn’t work any differently. I tried lots of ways, like NSLogging the function, replacing the x and y with %i, and using point.x and point.y, but it returns nothing. I don’t get anything from the functions. Can you show me code for how to use these now? They are all in the JS code. Thanks.

  8. Alexander says

    Please make sure that you understand that we’re dealing with two different types of code: Objective-C and JavaScript. You can’t directly call Objective-C from within JS and vice versa. Also, elements of one language do not work in the other language. So, for example, NSLog() does not work for debugging JavaScript. Also, when using placeholders like %i, you have to make sure that you replace these with the real values within your Objective-C code. Within the JS code this won’t work.

    There’s only one way to execute JavaScript code from within Objective-C code, and this is through the method “stringByEvaluatingJavaScriptFromString:” of UIWebView, which expects the JavaScript code as string, and which will return the result of the last expression in the JavaScript code as string.

    You can use this to “inject” any JavaScript code into a web page, by passing in a string with the definition of the functions. Once injected, you can call these injected functions with different parameters multiple times. But in all cases, the code has to be the final one, so all placeholders of an Objective-C string (like %i, %@ etc.) must already be replaced with their correct values, because within JavaScript these placeholders don’t have any meaning.

    Also make sure that the JavaScript code is correctly included in your Xcode project; otherwise loading it may fail and you end up passing an empty string to the web page instead of your code. This is a common issue many developers have when they start programming with JS for UIWebView, because Xcode by default treats the JS code as “code” and not as a “resource”, but in order to get this working, it must be treated as a “resource”.
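
    As a rough sketch of the JavaScript side (the function name MyAppViewportToDocumentString is made up for this example, and the optional third parameter only exists so that the function can be tried outside of a web page): because the result comes back as a string, a function like viewportCoordinateToDocumentCoordinate(), which returns an object, is easier to use if you wrap it so that it returns a plain string such as "x,y".

    ```javascript
    // Hypothetical wrapper: converts a viewport coordinate into a document
    // coordinate and returns it as the string "x,y".
    function MyAppViewportToDocumentString(x, y, win) {
      win = win || window;   // inside the web page the real window object is used
      var docX = x + win.pageXOffset;
      var docY = y + win.pageYOffset;
      return docX + "," + docY;
    }

    // Tried out with a stand-in window object that is scrolled down by 100 px:
    var fakeWindow = { pageXOffset: 0, pageYOffset: 100 };
    console.log(MyAppViewportToDocumentString(120, 300, fakeWindow));   // 120,400
    ```

    From Objective-C you would build the call with the real numbers already filled in, for example the string MyAppViewportToDocumentString(120,300); and then split the returned string at the comma.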

  9. max says

    @alexander
    Thanks for the explanation! But I’m still lost. I know they are two different kinds of code; what I don’t get is how to get the returned point or how to use the JavaScript above. I have code working fine on my iOS 5 device in my app. My problem is that I added a UINavigationBar as a subview to the web view, so if I can see the navigation bar, any link I click won’t work. If I scroll down so that I can’t see it and the actual page starts at the top line, the links work again. I put all this code into the JavaScript file. So now where do I put code to get the returned values? I kept the JavaScript code with x and y, but in the Objective-C class I’m coding in on the web, I was using %i and such to see how to get a value, and could never get any. So can you please show how to get the actual returned points, so that it works even when my navigation bar is showing? Here’s some of my code:

    JSTools.js
    ==============
    
    function MyAppGetHTMLElementsAtPoint(x,y) {
        var tags = ",";
        var e = document.elementFromPoint(x,y);
        while (e) {
            if (e.tagName) {
                tags += e.tagName + ',';
            }
            e = e.parentNode;
        }
        return tags;
    }
    
    function MyAppGetLinkSRCAtPoint(x,y) {
        var tags = "";
        var e = document.elementFromPoint(x,y);
        while (e) {
            if (e.src) {
                tags += e.src;
                break;
            }
            e = e.parentNode;
        }
        return tags;
    }
    
    function MyAppGetLinkHREFAtPoint(x,y) {
        var tags = "";
        var e = document.elementFromPoint(x,y);
        while (e) {
            if (e.href) {
                tags += e.href;
                break;
            }
            e = e.parentNode;
        }
        return tags;
    }
    
    function documentCoordinateToViewportCoordinate(x,y) {
        var coord = new Object();
        coord.x = x - window.pageXOffset;
        coord.y = y - window.pageYOffset;
        return coord;
    }
    
    function viewportCoordinateToDocumentCoordinate(x,y) {
        var coord = new Object();
        coord.x = x + window.pageXOffset;
        coord.y = y + window.pageYOffset;
        return coord;
    }
    
    function elementFromPointIsUsingViewPortCoordinates() {
        if (window.pageYOffset > 0) {     // page scrolled down
            return (window.document.elementFromPoint(0, window.pageYOffset + window.innerHeight -1) == null);
        } else if (window.pageXOffset > 0) {   // page scrolled to the right
            return (window.document.elementFromPoint(window.pageXOffset + window.innerWidth -1, 0) == null);
        }
        return false; // no scrolling, don't care
    }
    
    function elementFromDocumentPoint(x,y) {
        if (elementFromPointIsUsingViewPortCoordinates()) {
            var coord = documentCoordinateToViewportCoordinate(x,y);
            return window.document.elementFromPoint(coord.x,coord.y);
        } else {
            return window.document.elementFromPoint(x,y);
        }
    }
    
    function elementFromViewportPoint(x,y) {
        if (elementFromPointIsUsingViewPortCoordinates()) {
            return window.document.elementFromPoint(x,y);
        } else {
            var coord = viewportCoordinateToDocumentCoordinate(x,y);
            return window.document.elementFromPoint(coord.x,coord.y);
        }
    }
    
    DLWebViewController - Just a view controller
    ======================
    
    if (gestureRecognizer.state == UIGestureRecognizerStateBegan) {
            CGPoint point = [gestureRecognizer locationInView:web];
            
            // convert point from view to HTML coordinate system
            CGSize viewSize = [web frame].size;
            CGSize windowSize = [web windowSize];
            
    
            CGFloat f = windowSize.width / viewSize.width;
            if ([[UIDevice currentDevice].systemVersion doubleValue] >= 5.) {
                point.x = point.x * f;
                point.y = point.y * f;
            } else {
                // On iOS 4 and previous, document.elementFromPoint is not taking
                // offset into account, we have to handle it
                CGPoint offset  = [web scrollOffset];
                point.x = point.x * f + offset.x;
                point.y = point.y * f + offset.y;
            }
            
            
            // Load the JavaScript code from the Resources and inject it into the web page
            NSString *path = [[NSBundle mainBundle] pathForResource:@"JSTools" ofType:@"js"];
            NSString *jsCode = [NSString stringWithContentsOfFile:path encoding:NSUTF8StringEncoding error:nil];
            [web stringByEvaluatingJavaScriptFromString: jsCode];
            
            NSString *tPoint = [web stringByEvaluatingJavaScriptFromString:@"viewportCoordinateToDocumentCoordinate(x,y);"];
            NSLog(@"test: %@",tPoint);
            
            // get the Tags at the touch location
            NSString *tags = [web stringByEvaluatingJavaScriptFromString:
                              [NSString stringWithFormat:@"MyAppGetHTMLElementsAtPoint(%i,%i);",(NSInteger)point.x,(NSInteger)point.y]];
            
            NSString *tagsHREF = [web stringByEvaluatingJavaScriptFromString:
                                  [NSString stringWithFormat:@"MyAppGetLinkHREFAtPoint(%i,%i);",(NSInteger)point.x,(NSInteger)point.y]];
            
            NSString *tagsSRC = [web stringByEvaluatingJavaScriptFromString:
                                 [NSString stringWithFormat:@"MyAppGetLinkSRCAtPoint(%i,%i);",(NSInteger)point.x,(NSInteger)point.y]];
            
            
            NSString *url = nil;
            if ([tags rangeOfString:@",IMG,"].location != NSNotFound) {
                url = tagsSRC;
            }
            if ([tags rangeOfString:@",A,"].location != NSNotFound) {
                url = tagsHREF;
            }
            NSLog(@"url : %@",url);
            
            NSArray *urlArray = [[url lowercaseString] componentsSeparatedByString:@"/"];
            NSString *urlBase = nil;
            if ([urlArray count] > 2) {
                urlBase = [urlArray objectAtIndex:2];
            }
            
            if ((url != nil) &&
                ([url length] != 0)) {
                // Release any previous request
                [req release], req = nil;
                // Save URL for the request
    
    ===================
    
  10. Alexander says

    @max
    Sorry, but I’m not really sure what exactly you’re doing with the navigation bar. Adding a navigation bar as subview to UIWebView will not shift the web page down so the top of the web page would be still located where it was located before, so the navigation bar is now overlapping and hiding parts of the web page. But it seems that according to your description the web page was shifted down by the navigation bar, and scrolling up by 44 px (the height of the navigation bar), the navigation bar is now out of view and the web page will directly start at the top of the screen. Is this what you wanted to say?

    In general, adding a navigation bar as a subview of UIWebView won’t make the navigation bar scroll at all, so you must have done something else if you can scroll the navigation bar along with the web page. I guess you’ve added the navigation bar as subview to one of the internal private subviews of UIWebView. In case this is what you’ve done, please note that this is not really conforming to the AppStore guidelines. Using private APIs or private internal structures is not something you are supposed to do. Maybe this was also part of your problem?
    Maybe all the coordinates you’re working with are therefore wrong, maybe offset by the height of the navigation bar.

    In any case, you might send me a complete Xcode project by email, so I can see your problem myself. Posting code fragments here probably doesn’t help, because something important is always missing. And the navigation bar seems to be important for fully understanding your problem.

  11. Benny says

    Thanks.
    That is a good and clear explanation.

  12. thm says

    Thanks a lot, works perfectly!

  13. Manu says

    Is it possible to create a password for the Filters (Blocked Sites) ?

  14. Alexander says

    @Manu
    You mean in iCab Mobile? No, there’s no password for the filters themselves. But you can set a password for the whole App (in the “Privacy/Access Control” settings).

  15. karthees says

    I have a UIWebView that displays HTML pages. I searched the docs and found this: [wbCont stringByEvaluatingJavaScriptFromString:@"window.getSelection().toString()"]; It displays copy, define. So I moved to @"document.elementFromPoint(%f, %f).innerHTML". It’s working fine. But the problem is that it automatically selects the x,y coordinates. The user can’t select manually. How can I do this?

    -(void)singleTap:(UIGestureRecognizer *)gestureRecognizer
    {
        CGPoint touchPoint = [gestureRecognizer locationInView:wbCont];

        NSString *js = [NSString stringWithFormat:@"document.elementFromPoint(%f, %f).innerHTML", touchPoint.x, touchPoint.y];

        NSLog(@"js is %@",js);

        NSString * tagName = [wbCont stringByEvaluatingJavaScriptFromString:js];
        NSLog(@"Selected Name: %@",tagName);

    }

  16. Alexander says

    @karthees
    You need to convert the UIView-based coordinates you get from the gesture recognizer into HTML-based coordinates first. The reason is that an HTML document can usually be zoomed and scrolled, so a coordinate on the screen is not identical to the coordinate within the HTML document.

    You may find some help on how to do this in another blog post:
    http://www.icab.de/blog/2010/07/11/customize-the-contextual-menu-of-uiwebview/

    This blog post explains how to do this in order to position a contextual menu.

  17. Alen says

    Hi, I am loading an XHTML file in a UIWebView and paginating its content by dividing it into columns using the “-webkit-column” property. I need to find out the text that is currently visible in the UIWebView, not the entire text (the entire text can easily be found by using the “document.body.innerText” property).

    I tried using “elementFromPoint” to find the currently visible text, but it returns the entire text of the tag, which starts on previous pages. I need exactly the text that is currently appearing in the UIWebView.

    Can anybody help me find this out?

  18. Alexander says

    @Alen
    I think the problem is that the “-webkit-column” property will lay out the text in columns, but the HTML tree itself remains the same, so there are no extra elements or blocks exposed in the HTML tree which could help to get finer control of the visible areas. So if the text is in one single “div” element, then you can only get this “div” as the container element. elementFromPoint() only works on the “element” level (so you can only find HTML elements, not internal layout blocks).


