URL filtering for UIWebView on the iPhone

iCab Mobile provides a filter manager which lets you filter out advertising banners and other unwanted content from web pages. It has a list of simple URL-based filter rules (which the user can even edit), and when a web page contains resources (image files, JavaScript files, stylesheets etc.) whose URLs match one of these rules, these resources won’t be loaded.

But implementing filters seems to be impossible. When you look at the public API of the UIWebView class, you won’t see anything which would let you find out which resources the UIWebView object is loading, and even worse, nothing is available which can be used to force the UIWebView not to load the resources you want to filter out.

But of course there is a solution, otherwise this blog post wouldn’t make much sense ;-) .

To implement filters we don’t have to look at UIWebView at all. As I mentioned above, nothing in the UIWebView API lets us implement filtering.

To find a hook where we can intercept all the HTTP requests made by UIWebView, we have to know a little bit about the URL loading system of Cocoa, because UIWebView uses the URL loading system to get all its data from the web. One part of the URL loading system is the NSURLCache class, and this is the hook we’re looking for. The iPhone OS doesn’t cache any data on “disk” at the moment (this may change in later iPhone OS releases), so the cache that is managed by the NSURLCache class is usually empty. Nevertheless, UIWebView checks whether the requested resources are in the cache. So all we need to do is to subclass NSURLCache and override the method

- (NSCachedURLResponse*)cachedResponseForRequest:(NSURLRequest*)request

This method is called for all resources the UIWebView is requesting. So all we need to do is to check if the URL of the request matches one of the filters. If it does, we create a fake response with placeholder content; otherwise we just call the superclass. This is basically all we need to do.

Here are some more details:

1. Subclassing NSURLCache:
In the header file there’s almost nothing to do:

FilteredWebCache.h:

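#import <Foundation/Foundation.h>

// NSURLCache subclass that answers filtered requests with a fake response.
@interface FilteredWebCache : NSURLCache
{
}
@end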

Now the main code for the subclass:

FilteredWebCache.m:

#import "FilteredWebCache.h"
#import "FilterManager.h"

@implementation FilteredWebCache

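// Called by the URL loading system for every resource UIWebView requests.
- (NSCachedURLResponse*)cachedResponseForRequest:(NSURLRequest*)request
{
    NSURL *url = [request URL];
    BOOL blockURL = [[FilterMgr sharedFilterMgr] shouldBlockURL:url];
    if (blockURL) {
        // Build a fake one-byte response that stands in for the blocked resource.
        NSURLResponse *response =
              [[NSURLResponse alloc] initWithURL:url
                                        MIMEType:@"text/plain"
                           expectedContentLength:1
                                textEncodingName:nil];

        // The data must not be empty, otherwise the app crashes.
        NSCachedURLResponse *cachedResponse =
              [[NSCachedURLResponse alloc] initWithResponse:response
                             data:[NSData dataWithBytes:" " length:1]];

        // Store the fake response instead of returning it directly;
        // returning it directly leads to an over-release crash.
        [super storeCachedResponse:cachedResponse forRequest:request];

        [cachedResponse release];
        [response release];
    }
    // For blocked URLs this now returns the fake response stored above.
    return [super cachedResponseForRequest:request];
}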
@end

The code first checks if the URL should be blocked (the FilterMgr class, declared in FilterManager.h, does all these checks; this class isn’t shown here). If yes, it creates a new response object with placeholder content (a single space character) and stores it in the cache. One could assume that it should be possible to just return the fake response object without storing it in the cache. But if we do this, the app crashes very soon because our fake response object is over-released by the iPhone OS. I don’t know exactly why this happens. It might be a bug in the iPhone OS (the same thing happens in Mac OS X 10.5.x, whereas it works fine in 10.4.x and all older Mac OS X releases), or it might be caused by some undocumented internal dependencies between the different classes of the URL loading system. So we just store our fake response in the cache. This makes sure that all response objects we return are really stored in the cache, which is what the iPhone OS expects, and then it won’t crash.
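Since the FilterMgr class isn’t shown, here is a minimal sketch of what it could look like. Only the names sharedFilterMgr and shouldBlockURL: come from the code above; the rule list, the substring matching and everything else are assumptions for illustration and not the implementation used in iCab Mobile.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for the FilterMgr class used by FilteredWebCache.
// A rule is just a substring that is searched for in the absolute URL.
@interface FilterMgr : NSObject
{
    NSArray *rules;
}
+ (FilterMgr *)sharedFilterMgr;
- (BOOL)shouldBlockURL:(NSURL *)url;
@end

static FilterMgr *sharedInstance = nil;

@implementation FilterMgr

// +initialize runs once, before the class receives its first message,
// so the shared object exists before any thread can ask for it.
+ (void)initialize
{
    if (self == [FilterMgr class]) {
        sharedInstance = [[FilterMgr alloc] init];
    }
}

+ (FilterMgr *)sharedFilterMgr
{
    return sharedInstance;
}

- (id)init
{
    self = [super init];
    if (self != nil) {
        // Example rules only; a real app would load a user-editable list.
        rules = [[NSArray alloc] initWithObjects:
                     @"/ads/", @"/banner", @"adserver.", nil];
    }
    return self;
}

// Returns YES if any rule occurs somewhere in the URL (case-insensitive).
- (BOOL)shouldBlockURL:(NSURL *)url
{
    NSString *urlString = [url absoluteString];
    if (urlString == nil) {
        return NO;
    }
    for (NSString *rule in rules) {
        NSRange match = [urlString rangeOfString:rule
                                         options:NSCaseInsensitiveSearch];
        if (match.location != NSNotFound) {
            return YES;
        }
    }
    return NO;
}

@end
```

The cache method may be called from threads other than the main thread, so a real filter manager whose rule list can be edited at runtime would also need to protect that list against concurrent access. The sketch avoids the issue by never changing the rules after init.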

Update: It seems that the “fake” response must also be initialized with an NSData object whose size is larger than zero.

2. Creating a new Cache:
We also need to create a new cache and tell the iPhone OS that it has to use this new cache instead of the default one so we really get called when the URL loading system checks the cache for a resource. This should be done before any of the UIWebView objects are starting to load web pages, very early within the launching process of the app.

NSString *path = ...// the path to the cache file
NSUInteger diskCapacity = 10*1024*1024;   // 10 MB on disk
NSUInteger memoryCapacity = 512*1024;     // 512 KB in memory

FilteredWebCache *cache =
      [[FilteredWebCache alloc] initWithMemoryCapacity: memoryCapacity
                             diskCapacity: diskCapacity diskPath:path];
// From now on the URL loading system asks our subclass for cached responses.
[NSURLCache setSharedURLCache:cache];
[cache release];

We have to provide a path where the cache file is stored. The cache file is automatically created by the NSURLCache objects, so we don’t need to create the file, we only have to define where the file should be saved (this must be somewhere in the “sandbox” of our application, for example in the “tmp” folder or in the “Documents” folder).
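As a sketch of the “tmp” folder option mentioned above, the path could be built like this. The function name and the file name “FilteredWebCache” are made up for this example; any location inside the sandbox should work the same way.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns a cache path inside the app's "tmp" folder.
// The last path component is an arbitrary name chosen for this example.
static NSString *FilteredWebCachePath(void)
{
    return [NSTemporaryDirectory()
               stringByAppendingPathComponent:@"FilteredWebCache"];
}
```

Later SDK documentation describes the diskPath argument on iOS as the name of a subdirectory of the application’s default cache directory, so it is worth checking the documentation of the SDK you are targeting before relying on a full path.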

This is all the magic to implement URL-based filters for UIWebView on the iPhone. You see, it’s not that complicated.

Note: If the filters can change while your app is running, you may need to remove the fake responses from the cache again. The NSURLCache class provides a method for this, so this isn’t a problem. If your filters are static, then you don’t need to care about this.
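A minimal sketch of such a cleanup, assuming you either know the request whose rule was removed or simply want to throw everything away. The function name is hypothetical; the two NSURLCache methods it calls are removeCachedResponseForRequest: and removeAllCachedResponses.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, to be called after the filter rules have changed.
// Pass the request that is no longer filtered, or nil to empty the cache.
static void FilterRulesDidChange(NSURLRequest *unblockedRequest)
{
    NSURLCache *cache = [NSURLCache sharedURLCache];
    if (unblockedRequest != nil) {
        // Remove only the fake response stored for this request.
        [cache removeCachedResponseForRequest:unblockedRequest];
    } else {
        // Simplest option: drop all cached responses, fake or not.
        [cache removeAllCachedResponses];
    }
}
```

Removing everything is the easier choice as long as the cache holds little besides the fake responses, which is the situation described in this post.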

Posted in Programming, iPhone & iPod Touch.


25 Responses

  1. iavian says

    Nice article. How to use FilteredWebCache in Webview ?

  2. Alexander says

    You mean WebView on the Mac? It works the same way. But you don’t need the FilteredWebCache there, because on the Mac the WebView class has several delegate methods where you can intercept all HTTP requests and filter them out much more easily.

    For example you can do all the filtering in the delegate method…
    webView:resource:willSendRequest:redirectResponse:fromDataSource:

  3. phoenix says

    Good dig.
    I’m just wondering why you don’t use the UIWebView delegate method to intercept the requests? For example, use

    (BOOL)webView:(UIWebView*)webView shouldStartLoadWithRequest:(NSURLRequest*)request navigationType:(UIWebViewNavigationType)navigationType

  4. Alexander says

    @phoenix
    This UIWebView delegate is only called for the main URL of a page (the HTML code) but not for embedded resources like images, stylesheets, external javascript code etc. This means for filtering advertising banners this delegate is useless because it is not called for these.

    On the Mac you would use WebView instead of UIWebView and for WebView there’s a delegate method which is called for all resources so here you can use the delegate and don’t need to use the Cache.

  5. iavian says

    Thanks for clarification , but how to use it on UIWebView [iPhone SDK] ?

  6. Alexander says

    You only need to initialize an object of the class FilteredWebCache and set it as “shared Cache” as explained in the blog post. Checking if a certain URL must be filtered is done in the “FilterMgr” class in my example, and this class must be implemented according to your requirements. UIWebView will internally call the shared cache object to find out if a resource needs to be loaded from the internet. So you don’t need to create any connections to your UIWebView objects, the iPhone OS already has all the required connections. You only need to set the FilteredWebCache as the “shared Cache” as soon as possible, before any of the UIWebView objects will load any data.

  7. iavian says

    I made it work 90% , however when it goes inside if block , the app crashes @ return [super cachedResponseForRequest:request];

  8. Alexander says

    @iavian
    OK, I’ve checked this again and you’re right. The problem seems to be the “fake” response which was created with an empty NSData object. If you create the NSData object with at least one byte (doesn’t matter which value it has), then it doesn’t crash anymore. I think this can be called a bug of the iPhone OS, because empty server responses are valid and there’s no reason why these shouldn’t be cached as well. Also in the Apple docs there’s nothing mentioned about any restrictions for the data object.

    In iCab Mobile I’ve used a static replacement object for filtered data, wrapped in an NSData object, so it was never empty. But for this tutorial these details are not important and so I just used an empty NSData object. I should have tested this before, but I didn’t expect that this would cause any crashes.

    I’ve updated the source of the blog post now. Now it will no longer crash.

  9. iavian says

    That works, thanks for the help & detailed article

  10. Mirko says

    Thanks for a great post. I have one question.
    What about POST requests, they are not cached (RFC 2616 section 13), so Safari will not even check the cache for POST requests.
    How to solve that problem?

  11. Alexander says

    @Mirko
    I think POST requests are usually not a real problem because these requests are usually coming from form submissions only. And the user usually submits the form himself/herself and it is unlikely that you need to filter these requests.

    But if you nevertheless need to filter these requests, you should know that UIWebView will call the delegate method

    - (BOOL)webView:shouldStartLoadWithRequest:navigationType:

    for form submissions (the “navigationType” argument will have the value UIWebViewNavigationTypeFormSubmitted in this case). And you only need to return NO in this delegate method to block the request.

  12. Mirko says

    Thanks Alexander! ;)

    I have one more short question, not really related to the post.
    How did you make the iCab Mobile navigation bar?
    It looks like a UINavigationBar with a prompt displaying the page title, but it’s thinner than a regular UINavigationBar. It also contains two buttons and an input field, and Interface Builder doesn’t allow me to do that.

    Thanks,
    Mirko

  13. Alexander says

    @Mirko
    You’re right, you can’t do this with IB. Basically this is just an empty UINavigationBar where the buttons, title and URL field are added as subviews programmatically. UINavigationBar is a subclass of UIView, so you can add subviews as you can do this with other UIViews.

  14. Mirko says

    Thanks for your answer Alex,
    I couldn’t get the label into navbar’s top view.
    Can you post your code for this if possible?

    Once again huge Thanks!

  15. Claus Kinkel says

    Thank you for your helpful articles! Good iPhone tutorials are really hard to find.

    I’m also interested in the code for programmatically filling UINavigationBar items and title.

    Also can you please write an article about creating progressbar for UIWebView, there is really little information about that on the Web.

  16. Alexander says

    @Mirko & Claus Kinkel
    I’ll write an article about populating a UINavigationBar object. This is probably better than posting the code here in the comments. But it’s really easy, because you only need to use the UINavigationBar like an ordinary UIView object in which you place other objects.

  17. BiB1 says

    Hi,
    I’m trying to play with webView, but I’ve run into a small problem.
    For example, for a mail address, [request URL] contains “mailto:myName@myFai.com”.
    [[request URL] scheme] gives me the “prefix”: “mailto”,
    but how can I get just “myName@myFai.com”?

    Thanks
    BiB1

  18. Alexander says

    @BiB1
    The NSURL class also has the method “resourceSpecifier” which returns everything after the colon. So for your “mailto” URL it would return “myName@myFai.com”.
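
    A small sketch of this, wrapped in a helper function whose name is made up for the example:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the part after "mailto:" or nil
// if the URL does not use the mailto scheme.
static NSString *MailAddressFromURL(NSURL *url)
{
    NSString *scheme = [[url scheme] lowercaseString];
    if (![scheme isEqualToString:@"mailto"]) {
        return nil;
    }
    // Everything after the first colon of the URL.
    return [url resourceSpecifier];
}
```

    Because “resourceSpecifier” returns everything after the colon, a mailto URL with extra parameters such as “?subject=…” would presumably have them included in the result, and they would need to be split off separately.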

  19. Cheryl Lindsay says

    Hi,
    I have tried playing with caches but I have one problem.
    When I invoke reload method, [webView reload], caches are not used and my filters are not working. How can I handle UIWebView refreshes while still invoking my custom cache?
    If I just reopen URL it will create one history instance.
    I thought of rewriting the UIWebView history mechanism because of this, but I hope that’s not needed :(

  20. Alexander says

    @Cheryl Lindsay
    Yes, reloading will bypass the cache. UIWebView won’t do a smart reload where it would check first if the data on the server is really newer than the data that is already in the cache. So all data is loaded from the internet.

    But you could use a simple line of JavaScript code instead of calling the “reload” method to do the reload. For example you could use

    [webView stringByEvaluatingJavaScriptFromString:@"location.replace(location.href)"];

    instead of

    [webView reload]

    to reload the page. The “location.replace()” function loads the current page again, but should replace the current history entry instead of adding a new entry to the history.

  21. Mark Aufflick says

    The pain with this approach (not that I have any better ideas) is that the first hit still results in a web download – so you save no time or bandwidth. In the simulator anyway (no firewall logs on the device for me to check, though i could setup a proxy).

  22. Alexander says

    @Mark Aufflick
    Are you sure? Why should the first request result in a download?

    The first request of a resource that is filtered will be answered by a newly created “fake” response from the web cache. So UIWebView should have no need to get the data from the web anymore.

    I’ve just checked this with a real device (iPod Touch) and a proxy which logs all the requests from my device. None of the “filtered” requests can be found in the proxy logs. So I don’t see any problems so far.

  23. Manu says

    Thanks a lot, clever hook for a closed source code.

  24. Dj says

    I was trying to cache a webpage but noticed that the “- (void)storeCachedResponse:(NSCachedURLResponse *)cachedResponse forRequest:(NSURLRequest *)request” method doesn’t store the cached response. I have customized the NSURLCache the way you have mentioned, but the cached response still isn’t getting stored. Because of this, I get nil data when retrieving from the cache. What could be a possible solution to this issue?

  25. Alexander says

    @Dj
    I don’t know what exactly you’re doing. I think you could have done something wrong, or maybe my solution doesn’t match to your problem.
    You can send me an example project by email and I’ll check what’s going wrong.


