URL filtering for UIWebView on the iPhone

iCab Mobile provides a filter manager that lets you filter out advertising banners and other unwanted content from web pages. It has a list of simple URL-based filter rules (which the user can even edit), and when a web page contains resources (image files, JavaScript files, stylesheets etc.) whose URLs match one of these rules, the resources won’t be loaded.

But implementing filters seems to be impossible. When you look at the public API of the UIWebView class, you won’t see anything that would let you find out which resources the UIWebView object is loading and, even worse, nothing that could be used to stop the UIWebView from loading the resources you want to filter out.

But of course there is a solution, otherwise this blog post wouldn’t make much sense ;-).

To implement filters we don’t have to look at UIWebView. As I mentioned above, nothing in the UIWebView API would allow us to implement filtering.

To find a hook where we can intercept all the HTTP requests made by UIWebView, we have to know a little bit about the URL loading system of Cocoa, because UIWebView uses the URL loading system to get all its data from the web. One part of the URL loading system is the NSURLCache class, and this is the hook we’re looking for. Though the iPhone OS doesn’t cache any data on “disk” at the moment (this may be different in later iPhone OS releases) and the cache managed by the NSURLCache class is therefore usually empty, UIWebView nevertheless checks whether the requested resources are in the cache. So all we need to do is to subclass NSURLCache and override the method

- (NSCachedURLResponse*)cachedResponseForRequest:(NSURLRequest*)request

This method is called for all resources the UIWebView is requesting. So all we need to do is to check if the URL of the request matches one of the filters. If it does, we create a fake response with practically no content; otherwise we just call the superclass implementation. This is basically all we need to do.

Here are some more details:

1. Subclassing NSURLCache:
In the header file there’s almost nothing to do:

FilteredWebCache.h:

#import <Foundation/Foundation.h>

@interface FilteredWebCache : NSURLCache
{
}
@end

Now the main code for the subclass:

FilteredWebCache.m:

#import "FilteredWebCache.h"
#import "FilterManager.h"

@implementation FilteredWebCache

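- (NSCachedURLResponse*)cachedResponseForRequest:(NSURLRequest*)request
{
    NSURL *url = [request URL];
    // Ask the filter manager whether this URL matches one of the filter rules.
    BOOL blockURL = [[FilterMgr sharedFilterMgr] shouldBlockURL:url];
    if (blockURL) {
        // Build a fake response whose body is a single space character.
        NSURLResponse *response =
              [[NSURLResponse alloc] initWithURL:url
                                        MIMEType:@"text/plain"
                           expectedContentLength:1
                                textEncodingName:nil];

        NSCachedURLResponse *cachedResponse =
              [[NSCachedURLResponse alloc] initWithResponse:response
                             data:[NSData dataWithBytes:" " length:1]];

        // Store the fake response in the cache instead of returning it
        // directly (returning it directly leads to a crash, see below).
        [super storeCachedResponse:cachedResponse forRequest:request];

        [cachedResponse release];
        [response release];
    }
    // Blocked URLs now yield the stored fake response; all other URLs
    // are handled by the normal cache lookup.
    return [super cachedResponseForRequest:request];
}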
@end

The code first checks if the URL should be blocked (the filter manager class, FilterMgr in the code above, does all these checks; this class isn’t shown here). If yes, it creates a new response object with practically no content and stores it in the cache. One could assume that it should be possible to just return the fake response object without storing it in the cache. But if we do this, the app crashes very soon because our fake response object is over-released by the iPhone OS. I don’t know exactly why this happens. It might be a bug in the iPhone OS (and also in Mac OS X 10.5.x, where the same thing happens; it works fine in 10.4.x and all older Mac OS X releases), or it might be caused by some undocumented internal dependencies between the different classes of the URL loading system. So we just store our fake response in the cache. This makes sure that all response objects we return are really stored in the cache, which is what the iPhone OS expects, and then it won’t crash.

Update: It seems that it is also necessary to initialize the “fake” response with an NSData object whose size is larger than zero.

2. Creating a new Cache:
We also need to create a new cache and tell the iPhone OS to use it instead of the default one, so that our subclass really gets called when the URL loading system checks the cache for a resource. This should be done before any UIWebView object starts to load web pages, very early in the launch process of the app.

NSString *path = ...// the path to the cache file
NSUInteger discCapacity = 10*1024*1024;
NSUInteger memoryCapacity = 512*1024;

FilteredWebCache *cache =
      [[FilteredWebCache alloc] initWithMemoryCapacity: memoryCapacity
                             diskCapacity: discCapacity diskPath:path];
[NSURLCache setSharedURLCache:cache];
[cache release];

We have to provide a path where the cache file is stored. The NSURLCache object creates the cache file automatically, so we don’t need to create it ourselves; we only have to define where it should be saved (this must be somewhere in the “sandbox” of our application, for example in the “tmp” folder or in the “Documents” folder).

This is all the magic to implement URL-based filters for UIWebView on the iPhone. You see, it’s not that complicated.

Note: If the filters can change while your app is running, you may need to remove the fake responses from the cache again. The NSURLCache class provides a method for this, so this isn’t a problem. If your filters are static, then you don’t need to care about this.
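
A minimal sketch of that cleanup, assuming it runs right after the filter list was edited: removeCachedResponseForRequest: drops a single entry and removeAllCachedResponses empties the whole cache. Both are regular NSURLCache methods; the two function names and the url parameter are hypothetical, and whether a freshly built request matches the stored entry in every case is an assumption worth testing.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: remove the fake response for one URL that is
// no longer filtered, so the next request for it is loaded normally.
static void ForgetBlockedURL(NSURL *url)
{
    NSURLRequest *request = [NSURLRequest requestWithURL:url];
    [[NSURLCache sharedURLCache] removeCachedResponseForRequest:request];
}

// Hypothetical helper: simpler but coarser, throws away every cached
// response, including all fake ones.
static void ForgetAllBlockedURLs(void)
{
    [[NSURLCache sharedURLCache] removeAllCachedResponses];
}
```

Clearing everything is probably the easier choice when the user can edit many rules at once, because the app then doesn’t need to remember which URLs it has blocked so far.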

Posted in iPhone & iPod Touch, Programming.


78 Responses

  1. Jim Huang says

    Why did my comment always fail to post? Great post, thanks Alexander.

  2. waiwaier says

    Is it possible to filter the https data and the data posted to server?

  3. Alexander says

    @waiwaier
    The filter method I’ve described here is based on the URL only. So it will work for https as well, but you cannot filter based on the content that is received or sent this way.

  4. David Schiefer says

    This works on many websites, but for example on nytimes.com no banners are shown, but I get a “?” icon in the place where they used to be. Why is this?

  5. Alexander says

    @David Schiefer
    This is because the “?” is the standard replacement icon for missing or damaged image files. Because the filters block these image files, iOS now draws the replacement icon. When the filter blocks non-image files, like JavaScript etc., you won’t get the replacement icons, either because the missing file is not an image file or because the blocked JavaScript file can no longer create the IMG tags.

  6. David Schiefer says

    Yes I know that – but maybe we should replace it with a transparent image…will play with this a bit and will let you know of any updates.

  7. Peter Bauer says

    It does not detect calls of MPVideoController

  8. Anupam says

    hi,
    I tried the above in the iPhone simulator by loading the google.com homepage in a web view and blocking the image URLs.
    In the filtering logic, I blocked the URLs that have the suffix ‘.png’. But all the images are still loaded every time. Can you tell me what might be going wrong?

  9. Alexander says

    @Anupam
    Did you create a new cache object and set it as the new “shared URL Cache”? And have you done this before the UIWebView actually loads the web page? The best would be to do this even before creating the UIWebView objects themselves.
    And are you sure that your test for the suffix is done correctly?

  10. Amy says

    Hi, I ran into the same problem as Anupam. I did set the new shared URL cache, and I did it as early as possible. However, I still can’t block the images. Waiting for your kind help.

  11. Alexander says

    @Amy
    First of all, please check in the debugger if your own method cachedResponseForRequest: is actually called. If it is not called, then you might have done something wrong when creating a new cache object and setting it as the new shared cache.

    It is definitely working. I’m using this method in iCab Mobile not only for the filters, but also for the offline cache. But please note that iOS also uses caches in memory, so when you’ve loaded an image before without filtering it, and it is still in the memory cache, iOS won’t check the cache object you’ve created. Also note that in “real world” web pages, images do not always have an extension like “.jpg” or “.png”: the extension can be missing completely, the URL can contain search parameters so that the extension is not at the end of the URL, etc.

    In case you can’t figure it out by yourself, please send me a sample Xcode project to my email address. Maybe I can find out what’s missing.

  12. Eric says

    hi
    Following your tutorial I have finished the image filter, but how can I make an ad blocker? With keywords or some other method? Thank you so much.

  13. Alexander says

    @Eric
    An ad blocker is very similar. Instead of checking the extension, simply check for the well-known domain names of ad networks (like doubleclick.net) and common parts of path names like “banner”, “ads” etc.
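
A rough sketch of such a check, assuming a tiny hard-coded rule list: the function name LooksLikeAdURL and both lists are made up for this example, and a real filter list would be much longer and editable.

```objc
#import <Foundation/Foundation.h>

// Hypothetical rule check: YES if the host belongs to a known ad network
// or the path contains a typical ad-related fragment.
static BOOL LooksLikeAdURL(NSURL *url)
{
    NSString *host = [[url host] lowercaseString];
    NSString *path = [[url path] lowercaseString];

    // Example lists only (assumptions, not a complete filter set).
    NSArray *adHosts = [NSArray arrayWithObjects:@"doubleclick.net", nil];
    NSArray *adPathParts = [NSArray arrayWithObjects:@"/banner", @"/ads/", nil];

    for (NSString *adHost in adHosts) {
        // Match the domain itself and any subdomain of it.
        if ([host isEqualToString:adHost] ||
            [host hasSuffix:[@"." stringByAppendingString:adHost]]) {
            return YES;
        }
    }
    for (NSString *part in adPathParts) {
        // Guard against URLs without a path before searching in it.
        if (path != nil &&
            [path rangeOfString:part].location != NSNotFound) {
            return YES;
        }
    }
    return NO;
}
```

Matching the host by suffix rather than by substring avoids blocking unrelated sites that merely contain the ad network’s name somewhere in their domain.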

  14. Nick says

    Hi,
    I was trying to implement this method to modify the URLs of outgoing resource requests made by a UIWebView. I followed the tutorial and set up the shared URL cache to be the subclassed cache before anything is loaded in the UIWebView. In the subclassed method, I create a modified NSURLRequest with the modified URL and pass it to [super cachedResponseForRequest:request]. However, the UIWebView does not load the modified URL for the resource, but instead just loads the original URL. What am I doing wrong in this method?

  15. Alexander says

    @Nick
    It is very important to set your own cache object as the new sharedURLCache. And do this before you create the UIWebView or at least before you load any data in the UIWebView. UIWebView is using memory caches internally, so if a certain resource is still in the memory cache, UIWebView won’t check the NSURLCache.

  16. Amy says

    @Alexander
    Thanks for your kind help. BTW, how can I clear the cache in memory, and is there another general way to detect image URLs instead of extension names?

  17. Bennsen says

    hi alexander,
    it works perfectly – thanks for sharing – but when I add more than 3 ad-block URLs I get the “?” icon too.
    did you find any solution to replace / remove them?

  18. Alexander says

    @Bennsen
    The “?” icon in the web view is a replacement for missing images, which means the web view has requested an image, but the data it received could not be interpreted as an image, or there was no data at all. So if you want to get rid of the replacement images, you would need to return valid image data, for example the data of a 1×1 px transparent GIF image, which would then be completely invisible.

    But depending on the task you want to achieve, it can nevertheless be a good idea to show these “missing image” icons to make it obvious that you’ve blocked something, at least if the filter is a feature the user can control (as in iCab Mobile). This way the user can see that something is blocked, and if something important is obviously missing on the page, the user has the chance to suspect an over-blocking filter and do something to fix it. But the user has to see that something is blocked to get suspicious.
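
A sketch of how such a transparent replacement image could be built, assuming manual reference counting as in the post’s code. The byte array is a complete 1×1 transparent GIF (42 bytes); the helper name TransparentImageResponse is hypothetical.

```objc
#import <Foundation/Foundation.h>

// A complete 1x1 transparent GIF (42 bytes).
static const unsigned char kTransparentGIF[] = {
    0x47, 0x49, 0x46, 0x38, 0x39, 0x61,             // "GIF89a"
    0x01, 0x00, 0x01, 0x00, 0x80, 0x00, 0x00,       // 1x1 px, 2-color table
    0x00, 0x00, 0x00, 0xFF, 0xFF, 0xFF,             // colors: black, white
    0x21, 0xF9, 0x04, 0x01, 0x00, 0x00, 0x00, 0x00, // color 0 is transparent
    0x2C, 0x00, 0x00, 0x00, 0x00,                   // image at position 0,0
    0x01, 0x00, 0x01, 0x00, 0x00,                   // image size 1x1
    0x02, 0x01, 0x44, 0x00,                         // compressed pixel data
    0x3B                                            // end of file
};

// Hypothetical helper: returns an autoreleased fake response that
// carries the transparent GIF instead of a single space character.
static NSCachedURLResponse *TransparentImageResponse(NSURL *url)
{
    NSData *data = [NSData dataWithBytes:kTransparentGIF
                                  length:sizeof(kTransparentGIF)];
    NSURLResponse *response =
        [[NSURLResponse alloc] initWithURL:url
                                  MIMEType:@"image/gif"
                     expectedContentLength:(NSInteger)[data length]
                          textEncodingName:nil];
    NSCachedURLResponse *cached =
        [[NSCachedURLResponse alloc] initWithResponse:response data:data];
    [response release];
    return [cached autorelease];
}
```

In cachedResponseForRequest: this response would be stored with storeCachedResponse:forRequest: in place of the text/plain one. Since the URL alone doesn’t reliably tell whether a resource is an image, it is an open question whether returning image data for blocked scripts or stylesheets has side effects.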

  19. Suvarna says

    hi alexander,
    this code is working fine for me, but I have one problem. I can block images on a web page, but if I refresh the web page more than 2 times it shows me the blocked images. I think it uses the main memory cache for loading. Can you give me a solution for this problem? How do I clear the main memory cache?
    Thanks
    -Suvarna.

  20. Alexander says

    @Suvarna
    You cannot access or clear the internal memory caches of UIWebView. If you call the “reload” method of UIWebView, it will bypass the cache (NSURLCache) and load the page directly. This way the image can end up in the memory cache.

    To avoid this you either have to implement your own HTTP protocol handler (by implementing the methods of the NSURLProtocol class) so you can intercept every single network request of UIWebView, or you have to make sure that you never use the “reload” method of UIWebView.
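
A minimal sketch of the NSURLProtocol route, under the assumption that blocked requests should simply fail. The class name BlockingURLProtocol and the function RegisterBlockingProtocol are hypothetical; FilterMgr, sharedFilterMgr and shouldBlockURL: are taken from the post’s code, whose implementation isn’t shown there.

```objc
#import <Foundation/Foundation.h>
#import "FilterManager.h"   // the post's filter manager (not shown there)

// Hypothetical protocol handler that fails every request the filter
// manager wants blocked.
@interface BlockingURLProtocol : NSURLProtocol
@end

@implementation BlockingURLProtocol

// Claim only the requests that should be blocked; all other requests
// keep using the normal loading path.
+ (BOOL)canInitWithRequest:(NSURLRequest *)request
{
    return [[FilterMgr sharedFilterMgr] shouldBlockURL:[request URL]];
}

+ (NSURLRequest *)canonicalRequestForRequest:(NSURLRequest *)request
{
    return request;
}

- (void)startLoading
{
    // Report a failure right away instead of touching the network.
    NSError *error = [NSError errorWithDomain:NSURLErrorDomain
                                         code:NSURLErrorCancelled
                                     userInfo:nil];
    [[self client] URLProtocol:self didFailWithError:error];
}

- (void)stopLoading
{
    // Nothing to cancel, because no connection was ever opened.
}

@end

// Call this early during app launch, before any web view loads a page.
static void RegisterBlockingProtocol(void)
{
    [NSURLProtocol registerClass:[BlockingURLProtocol class]];
}
```

Because canInitWithRequest: returns YES only for blocked URLs, the handler never has to load real data itself, which keeps the sketch short.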

  21. Jonathan says

    Hi Alexander,

    Does this still work on iOS 6.0?

  22. Alexander says

    @Jonathan
    Yes, this should still work under iOS 6.
    But please be aware that UIWebView also does some caching internally. So the NSURLCache won’t get asked each time a resource is requested.

  23. zack says

    How can I create the FilterManager class here? I am new to iOS.

  24. Alexander says

    @zack
    The filter manager class is not an iOS-specific thing. It’s just a normal class with a method that takes a URL and returns whether this URL should be filtered or not. How exactly you do this is up to you. If your filtering requirements are very simple, it could even be OK not to write a separate class, but instead just add another method to the FilteredWebCache class which decides if a URL needs to be filtered.

  25. Max Litteral says

    How are you checking the URL in FilterManager.h? Could you please post your code?

  26. Alexander says

    @Max Litteral
    The criteria for blocking URLs are completely up to you. In simple cases this could be done by a simple string comparison; for more complex filtering requirements you may need to do some more. I don’t know what exactly you need to filter, so I can’t tell you how this needs to be done.

    For example the following would block the http://www.apple.com site and all secure web sites using the HTTPS scheme:

    - (BOOL)shouldBlockURL:(NSURL*)url
    {
      if ([[url host] isEqualToString:@"www.apple.com"]) {
        return YES;
      } else if ([[url scheme] isEqualToString:@"https"]) {
        return YES;
      }
      return NO;
    }
    

    For more complex filtering you would have an array of URLs or URL parts that you check against in a loop, so the filter manager class would also need methods to load and maintain the list of filters (URLs or URL components) etc. This is why I’ve assumed that this is done in a special “filter manager” class. But depending on your requirements, you can do this differently as well.
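
As a sketch of what such a class might look like, assuming the rules are plain substrings of the URL: the names FilterMgr, sharedFilterMgr and shouldBlockURL: come from the post’s code, while addFilter: and everything inside the class are assumptions made for this example (manual reference counting, as in the post).

```objc
#import <Foundation/Foundation.h>

@interface FilterMgr : NSObject
{
    NSMutableArray *filters;   // URL parts; a match on any of them blocks the URL
}
+ (FilterMgr *)sharedFilterMgr;
- (void)addFilter:(NSString *)urlPart;   // hypothetical method
- (BOOL)shouldBlockURL:(NSURL *)url;
@end

@implementation FilterMgr

+ (FilterMgr *)sharedFilterMgr
{
    static FilterMgr *shared = nil;
    @synchronized(self) {
        if (shared == nil) {
            shared = [[FilterMgr alloc] init];
        }
    }
    return shared;
}

- (id)init
{
    self = [super init];
    if (self != nil) {
        filters = [[NSMutableArray alloc] init];
    }
    return self;
}

- (void)dealloc
{
    [filters release];
    [super dealloc];
}

- (void)addFilter:(NSString *)urlPart
{
    // The cache may be queried from other threads, so guard the list.
    @synchronized(self) {
        [filters addObject:urlPart];
    }
}

- (BOOL)shouldBlockURL:(NSURL *)url
{
    NSString *urlString = [url absoluteString];
    if (urlString == nil) {
        return NO;
    }
    @synchronized(self) {
        for (NSString *part in filters) {
            NSRange range = [urlString rangeOfString:part
                                             options:NSCaseInsensitiveSearch];
            if (range.location != NSNotFound) {
                return YES;
            }
        }
    }
    return NO;
}

@end
```

Usage would be along the lines of [[FilterMgr sharedFilterMgr] addFilter:@"/banner"] during app launch. Loading and saving the list is left out here.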

  27. josh adblock says

    That’s really interesting. this would work great for building a browser that blocks advertisements in ios

  28. John says

    Thanks for the great post. Will this method work to check for links that won’t work in UI WebView? I am accessing a public website using a ui web view. The links open in a new window in Safari but not in a UI Web View.
    Can I create
    - (BOOL)shouldOpenURL:(NSURL *)url
    {
      if ([[url host] isEqualToString:@" …the link used in safari"]) {
        return YES;
      }
      return NO;
    }
    will this open the links?

    Thank you.
