FAQ - Adblock Plus internals
- Where do I find the meaning of all Adblock Plus preferences?
- How do I access Adblock Plus from my extension?
- How does Adblock Plus block addresses?
- How does Adblock Plus process its filters and which filters are faster?
- How does element hiding work?
- How often will subscriptions be downloaded?
- My subscription has moved, how do I make sure everybody updates the address?
- What can the first line of a filters file look like?
- How do I protect my filter subscription against accidental download corruption?
Where do I find the meaning of all Adblock Plus preferences?
Adblock Plus uses a number of preferences that are accessible via about:config. All of them start with the prefix "extensions.adblockplus." (this differs from Adblock and Adblock Plus 0.5, which use the prefix "adblock."). A full list with explanations can be found here.
How do I access Adblock Plus from my extension?
To allow other extensions to integrate with Adblock Plus, the interface IAdblockPlus is exported. See the interface documentation for details.
How does Adblock Plus block addresses?
The hard work here is actually done by Gecko, the engine on which Firefox, Thunderbird and other applications are built. It supports something called "content policies". A content policy is simply a JavaScript (or C++) object that gets called whenever the browser needs to load something. It can then look at the address to be loaded, along with some other data, and decide whether the load should be allowed. There are a number of built-in content policies (when you define which sites shouldn't be allowed to load images in Firefox or SeaMonkey, you are actually configuring one of them), and any extension can register its own. So all Adblock Plus has to do is register its content policy. Beyond that, there is only application logic to decide which addresses to block and user interface code for configuring filters.
For developers: to register a content policy you have to write an XPCOM component that implements the nsIContentPolicy interface. Make sure to adjust the module's registerSelf method to register your component in the "content-policy" category (use the category manager for this). That's it: your component's shouldLoad method will now be called, and you can decide whether each request should be accepted or not.
How does Adblock Plus process its filters and which filters are faster?
All filters are translated into regular expressions internally, even the ones that haven't been specified as such. For example, the filter ad*banner.gif| will be translated into the regular expression /ad.*banner\.gif$/. However, when Adblock Plus is given an address that should be checked against all filters, it doesn't simply test all filters one after another, as that would slow down browsing unnecessarily.
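As a rough illustration of this translation, the following Python sketch handles only the syntax used in the example above: * as a wildcard and | as an anchor at the start or end of a filter. The function name filter_to_regex is made up for this example, and the actual translation inside Adblock Plus covers more cases than this.

```python
import re


def filter_to_regex(filter_text):
    """Translate a simple filter into regular expression source text."""
    anchored_start = filter_text.startswith("|")
    anchored_end = len(filter_text) > 1 and filter_text.endswith("|")
    body = filter_text[1:] if anchored_start else filter_text
    if anchored_end:
        body = body[:-1]
    # Escape each literal piece; every * between two pieces becomes ".*".
    pattern = ".*".join(re.escape(piece) for piece in body.split("*"))
    return ("^" if anchored_start else "") + pattern + ("$" if anchored_end else "")


pattern = filter_to_regex("ad*banner.gif|")
print(pattern)  # ad.*banner\.gif$
print(bool(re.search(pattern, "http://example.com/ad/top_banner.gif")))  # True
```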
Besides translating filters into regular expressions, Adblock Plus also tries to extract text information from them. What it needs is a unique string of eight characters (a "shortcut") that must be present in every address matched by the filter (the length is arbitrary, eight just seems reasonable here). For example, if you have a filter |http://ad.* then Adblock Plus has the choice between "http://a", "ttp://ad" and "tp://ad."; any of these strings will always be present in whatever this filter matches. Unfortunately, finding a shortcut is impossible for filters that simply don't have eight characters unbroken by wildcards, or for filters that have been specified as regular expressions.
All shortcuts are put into a lookup table, so Adblock Plus can find a filter by its shortcut very efficiently. When a specific address has to be tested, Adblock Plus first looks for known shortcuts in it (this can be done very fast, and the time needed is almost independent of the number of shortcuts). Only when a shortcut is found will the address be tested against the regular expression of the corresponding filter. However, filters without a shortcut still have to be tested one after another, which is slow.
To sum up: which filters should be used to make a filter list fast? You should use as few regular expressions as possible, since those are always slow. You should also make sure that simple filters have at least eight characters of unbroken text (meaning text that doesn't contain any characters with a special meaning like *), otherwise they will be just as slow as regular expressions. For filters that qualify, it doesn't matter how many you have: the processing time is always the same. That means that if you need 20 simple filters to replace one regular expression, it is still worth it. Speaking of which, the deregifier is highly recommended.
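The shortcut lookup can be sketched in Python as follows. This is a simplified model and not the actual Adblock Plus code: the names FilterIndex, find_shortcut and to_regex are invented for the example, the filter-to-regex translation only understands * and |, and a filter written as /.../ is treated as a regular expression.

```python
import re

SHORTCUT_LENGTH = 8  # the shortcut length mentioned in the text


def is_regex_filter(filter_text):
    return len(filter_text) > 2 and filter_text.startswith("/") and filter_text.endswith("/")


def to_regex(filter_text):
    """Simplified translation: * is a wildcard, | anchors the start or end."""
    if is_regex_filter(filter_text):
        return re.compile(filter_text[1:-1])
    start = "^" if filter_text.startswith("|") else ""
    end = "$" if len(filter_text) > 1 and filter_text.endswith("|") else ""
    pieces = filter_text.strip("|").split("*")
    return re.compile(start + ".*".join(re.escape(p) for p in pieces) + end)


def find_shortcut(filter_text, taken):
    """Return an unused eight-character literal substring, or None."""
    if is_regex_filter(filter_text):
        return None
    for segment in filter_text.strip("|").split("*"):
        for i in range(len(segment) - SHORTCUT_LENGTH + 1):
            candidate = segment[i:i + SHORTCUT_LENGTH]
            if candidate not in taken:
                return candidate
    return None


class FilterIndex:
    def __init__(self):
        self.by_shortcut = {}        # shortcut -> (filter text, compiled regex)
        self.without_shortcut = []   # filters that must be tested one by one

    def add(self, filter_text):
        entry = (filter_text, to_regex(filter_text))
        shortcut = find_shortcut(filter_text, self.by_shortcut)
        if shortcut is None:
            self.without_shortcut.append(entry)
        else:
            self.by_shortcut[shortcut] = entry

    def match(self, address):
        # Fast path: one table lookup per eight-character substring of the address.
        for i in range(len(address) - SHORTCUT_LENGTH + 1):
            entry = self.by_shortcut.get(address[i:i + SHORTCUT_LENGTH])
            if entry is not None and entry[1].search(address):
                return entry[0]
        # Slow path: every filter without a shortcut is tested in turn.
        for filter_text, regex in self.without_shortcut:
            if regex.search(address):
                return filter_text
        return None


index = FilterIndex()
index.add("|http://ad.*")
index.add(r"/banner\d+/")
print(index.match("http://ad.example.com/image.png"))  # |http://ad.*
```

The fast path costs one dictionary lookup per character of the address regardless of how many filters are indexed, which is why qualifying filters add almost nothing to the matching time.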
The filter matching algorithm in detail
How does element hiding work?
Element hiding rules are translated into CSS and applied to all web pages the user is visiting. A rule like example.com#div(evil_ad) then looks like:
@-moz-document domain(example.com) { div#evil_ad, div.evil_ad { display: none !important; } }
@-moz-document is a proposed extension to the CSS standard; you can read more about it in the Mozilla Developer Center.
Rules that are not restricted to a certain domain will be restricted to the protocols http:// and https:// to prevent them from hiding elements of the browser's user interface (which uses the chrome:// protocol scheme). For example, the rule #div(evil_ad) will be translated into:
@-moz-document url-prefix(http://),url-prefix(https://) { div#evil_ad, div.evil_ad { display: none !important; } }
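The two translations shown in this section can be reproduced with a small Python sketch. It only understands the rule form used in the examples (an optional domain, then #tag(name)), and the function name rule_to_css is hypothetical; it is not how Adblock Plus itself is implemented.

```python
import re

# Optional domain, "#", a tag name, and an id-or-class name in parentheses.
RULE_RE = re.compile(r"^([^#]*)#(\w+)\((\w+)\)$")


def rule_to_css(rule):
    """Translate a rule such as example.com#div(evil_ad) into a CSS block."""
    match = RULE_RE.match(rule)
    if match is None:
        raise ValueError("unsupported rule: " + rule)
    domain, tag, name = match.groups()
    # The name is matched both as an id and as a class.
    selector = f"{tag}#{name}, {tag}.{name}"
    if domain:
        scope = f"domain({domain})"
    else:
        # No domain given: restrict to web pages so browser UI is never hidden.
        scope = "url-prefix(http://),url-prefix(https://)"
    return f"@-moz-document {scope} {{ {selector} {{ display: none !important; }} }}"


print(rule_to_css("example.com#div(evil_ad)"))
print(rule_to_css("#div(evil_ad)"))
```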
For developers: Adblock Plus uses the stylesheet service here. This interface came with Gecko 1.8 and allows extensions to add user stylesheets dynamically (before that you could only modify userContent.css, which required restarting the browser). User stylesheets override the CSS code of all web sites; they have the highest possible importance.
How often will subscriptions be downloaded?
By default a subscription is downloaded once daily. However, subscription authors can adjust this value, e.g. to avoid wasting traffic. Anything between every hour and every 21 days is possible. One way to do this is to set the HTTP Expires header. The Apache mod_expires module allows you to do this by simply writing in the .htaccess file:
ExpiresActive on
ExpiresByType text/plain "access plus 5 days"
That would make the filter list expire 5 days after it is downloaded. If your filter list is produced by a Perl script you can do the same with the command:
$cgi->header(-expires => "+5d");
This works similarly for other scripting languages like PHP. If for some reason you can't tweak the HTTP headers, you can put a comment into your filter list, like this:
[Adblock]
! This list expires after 5 days
It doesn't matter where you put this comment. Adblock Plus will look for any comment with the keyword "expires after" or "expires:" followed by a number. By default the number is interpreted as a number of days. However, a number followed by the letter "h" will be interpreted as a number of hours, e.g. "Expires: 3h" or "expires after 3 hours" will tell Adblock Plus to download the list again after 3 hours.
Note that this is only the minimum time between downloads. If the user doesn't have the browser running, the list will only be downloaded when it is started again.
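The comment rules described above can be sketched in Python. The function name parse_expiration_hours is made up for this example, and clamping the result to the range of one hour to 21 days is an assumption based on the limits stated earlier in this answer, not a description of the actual Adblock Plus code.

```python
import re

# "expires after" or "expires:", then a number, then an optional "h" for hours.
EXPIRES_RE = re.compile(r"expires(?:\s+after|:)\s*(\d+)\s*(h)?", re.IGNORECASE)

MIN_HOURS = 1
MAX_HOURS = 21 * 24


def parse_expiration_hours(lines):
    """Return the expiration interval in hours, or None if no comment sets one."""
    for line in lines:
        if not line.lstrip().startswith("!"):
            continue  # only comments are inspected
        match = EXPIRES_RE.search(line)
        if match:
            number = int(match.group(1))
            hours = number if match.group(2) else number * 24
            return max(MIN_HOURS, min(hours, MAX_HOURS))
    return None


print(parse_expiration_hours(["[Adblock]", "! This list expires after 5 days"]))  # 120
print(parse_expiration_hours(["[Adblock]", "! Expires: 3h"]))  # 3
```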
My subscription has moved, how do I make sure everybody updates the address?
Starting with Adblock Plus 0.7.5 permanent redirects are fully supported. This means that not only will the subscription be downloaded from its new location, but the address of the subscription changes in Adblock Plus so that further downloads will use the new address immediately. How can you make use of this feature?
First option: redirect with HTTP headers. The request for the old filter list address should result in a "301 Moved Permanently" response. Adblock Plus will follow the redirect and adjust the subscription address on success. You can create such a redirect, for example, with Apache's mod_alias module by adding something like the following line to your .htaccess file:
Redirect permanent /old_list.txt http://example.com/new_list.txt
Second option: if you cannot create an HTTP redirect on your hosting, you can still use a special comment to specify the new address of your subscription. It should look like this:
[Adblock]
! Redirect: http://example.com/new_list.txt
This comment can be anywhere in your filter list; Adblock Plus will scan all comments for the keywords "redirect:" and "redirect to" followed by an address. If such a comment is found, a new subscription update will be initiated after one hour with the new address. Should this update succeed, the address of the subscription will be adjusted.
Finally, it could happen that your server is unavailable and users get a download error every time your subscription should be updated. Even if you cannot prevent this error, there is still a solution. After a number of failed download attempts (determined by the extensions.adblockplus.subscriptions_fallbackerrors preference) Adblock Plus will contact the address defined in the extensions.adblockplus.subscriptions_fallbackurl preference for further instructions. If the problem is known, this address will give out the new location of your filter list. So if you are unable to indicate a location change for your subscription by regular means, please send an email.
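Scanning for such a comment might look like the Python sketch below. The function name find_redirect is invented for this example, and treating everything up to the next whitespace as the address is an assumption.

```python
import re

# "redirect:" or "redirect to", followed by the new address.
REDIRECT_RE = re.compile(r"redirect(?:\s+to|:)\s*(\S+)", re.IGNORECASE)


def find_redirect(lines):
    """Return the address from the first redirect comment, or None."""
    for line in lines:
        if not line.lstrip().startswith("!"):
            continue  # only comments are inspected
        match = REDIRECT_RE.search(line)
        if match:
            return match.group(1)
    return None


filter_list = ["[Adblock]", "! Redirect: http://example.com/new_list.txt", "ad*banner.gif|"]
print(find_redirect(filter_list))  # http://example.com/new_list.txt
```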
What can the first line of a filters file look like?
Usually the first line of a filters file is simply [Adblock]. However, you might have noticed that recent versions of Adblock Plus sometimes put different text there instead. This is done when your list contains filters that use advanced filter syntax only supported by newer versions of Adblock Plus but not by the original Adblock. One example would be:
(Adblock Plus 0.6.1.2 or higher required) [Adblock]
This is simply a comment. Adblock (and Adblock Plus, for that matter) will ignore anything before the actual mark. The required Adblock Plus version is not enforced because Adblock Plus 0.6.1.2 didn't support it. However, if you use even newer filter syntax, you might get something like:
[Adblock Plus 0.7.1]
This type of header is supported starting with Adblock Plus 0.7.1. Older Adblock Plus versions and Adblock cannot open files starting with this header. Current versions will check the version number in the header and compare it with their own version number. If the file happens to require a newer Adblock Plus, the user will be shown a message on import asking them to upgrade. Subscriptions will still load files meant for newer Adblock Plus versions but display a warning in the preferences dialog.
Finally, if you want to require Adblock Plus but don't want to specify the version number, you can start the file with [Adblock Plus]. Of course, this file will again only be accepted by Adblock Plus 0.7.1 or higher.
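The header variants described in this answer can be told apart with a short Python sketch. The names parse_header and meets_requirement are hypothetical, and comparing versions as tuples of integers is an assumption about how the comparison could be done, not a description of the actual Adblock Plus code.

```python
import re

# Matches [Adblock], [Adblock Plus] and [Adblock Plus 0.7.1]; text before the mark is ignored.
HEADER_RE = re.compile(r"\[Adblock(\s+Plus(?:\s+(\d+(?:\.\d+)*))?)?\]")


def parse_header(first_line):
    """Return (requires_plus, required_version) or None if there is no header."""
    match = HEADER_RE.search(first_line)
    if match is None:
        return None
    requires_plus = match.group(1) is not None
    version = None
    if match.group(2):
        version = tuple(int(part) for part in match.group(2).split("."))
    return requires_plus, version


def meets_requirement(first_line, own_version):
    """Check whether a client with own_version satisfies the header's version."""
    header = parse_header(first_line)
    if header is None:
        return False
    required = header[1]
    return required is None or required <= own_version


print(parse_header("(Adblock Plus 0.6.1.2 or higher required) [Adblock]"))  # (False, None)
print(parse_header("[Adblock Plus 0.7.1]"))  # (True, (0, 7, 1))
print(meets_requirement("[Adblock Plus 0.7.1]", (0, 7, 5)))  # True
```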
How do I protect my filter subscription against accidental download corruption?
Proxy servers as well as antivirus and firewall software might modify downloads. Sometimes a filter "*/example/*" mutates into "**" because of that and causes everything to be blocked. To prevent this, subscription maintainers can insert a checksum towards the start of the filter list, like this:
[Adblock]
! Checksum: OaopkIiiAl77sSHk/VAWDA
test
Adblock Plus will ignore the download if the checksum doesn't match the contents of the file. When exporting filters from Adblock Plus the checksum will be generated automatically. It is calculated as follows:
- Take UTF-8 encoded text of the filter list (including the first line)
- Convert all line breaks to UNIX style (remove \r characters if present)
- Remove empty lines (replace sequences of multiple \n characters by a single \n character)
- Remove existing checksum comment
- Calculate a base64-encoded MD5 checksum of the text, remove trailing = characters if any
The Perl code to achieve this looks like this (assuming that the file encoding is UTF-8):
use Digest::MD5 qw(md5_base64);

# readFile() is a helper (not shown here) that returns the contents of $file
my $data = readFile($file);

# Normalize data
$data =~ s/\r//g;
$data =~ s/\n+/\n/g;

# Remove existing checksum
$data =~ s/^\s*!\s*checksum[\s\-:]+[\w\+\/=]+.*\n//mi;

# Calculate new checksum
my $checksum = md5_base64($data);
Reference implementations exist to validate a checksum and to add a checksum to a file.
PHP code for checksum calculation (again assuming that the file encoding is UTF-8):
$data = file_get_contents($file);

# Normalize data
$data = preg_replace('/\r/', '', $data);
$data = preg_replace('/\n+/', "\n", $data);

# Remove existing checksum
$data = preg_replace('/^\s*!\s*checksum[\s\-:]+([\w\+\/=]+).*\n/mi', '', $data);

# Calculate new checksum
$checksum = base64_encode(pack('H*', md5($data)));
$checksum = preg_replace('/=+$/', '', $checksum);
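For comparison, the same steps can be sketched in Python, including the validation side that the snippets above do not show. The function names are made up for this example and this is not one of the reference implementations mentioned above. Only the first checksum comment is removed, and a newly added checksum is placed directly after the first line; both are assumptions of this sketch.

```python
import base64
import hashlib
import re

CHECKSUM_RE = re.compile(
    r"^\s*!\s*checksum[\s\-:]+([\w+/=]+).*\n", re.MULTILINE | re.IGNORECASE
)


def normalize(text):
    """Convert to UNIX line breaks and collapse empty lines."""
    return re.sub(r"\n+", "\n", text.replace("\r", ""))


def compute_checksum(text):
    """Base64-encoded MD5 of the normalized text without its checksum comment."""
    data = CHECKSUM_RE.sub("", normalize(text), count=1)
    digest = hashlib.md5(data.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")


def validate_checksum(text):
    """Return True or False, or None if the text has no checksum comment."""
    match = CHECKSUM_RE.search(normalize(text))
    if match is None:
        return None
    return match.group(1).rstrip("=") == compute_checksum(text)


def add_checksum(text):
    """Return the normalized text with a checksum comment after the first line."""
    body = CHECKSUM_RE.sub("", normalize(text), count=1)
    first_line, _, rest = body.partition("\n")
    body = first_line + "\n" + rest
    return first_line + "\n! Checksum: " + compute_checksum(body) + "\n" + rest


signed = add_checksum("[Adblock]\r\n\r\nad*banner.gif|\r\n")
print(signed)
print(validate_checksum(signed))  # True
```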