[Rabbit-dev] ad blocking

Luis Soltero lsoltero at globalmarinenet.com
Tue Dec 14 06:49:48 CET 2010


Hello All,

Does rabbit have the ability to block ads and malware sites using lists from aggregator sites such as the following?

http://pgl.yoyo.org/adservers/serverlist.php?hostformat=hosts&showintro=1&mimetype=plaintext
http://someonewhocares.org/hosts/hosts
http://www.montanamenagerie.org/hostsfile/hosts.txt
http://www.hosts-file.net/hphosts-partial.asp
http://www.mvps.org/winhelp2002/hosts.txt

I know that rabbit has the following blocking facility:

---
[rabbit.filter.BlockFilter]
# This is a filter that blocks access to resources.

# return a 403 forbidden for these requests.
blockURLmatching=(\.sex\.|[-.]ad([sx]?)\.|/ad\.|adserving\.|ad101com-|pagead/imgad|as-us.falkag.net|clicktorrent.info)
---
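
To illustrate how such a pattern behaves, here is a minimal Java sketch. The class name BlockPatternDemo and the sample
URLs are hypothetical, and I am assuming the filter blocks a URL when the pattern matches anywhere in it; I have not
checked how rabbit actually applies the expression.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of rabbit.
class BlockPatternDemo {
    // The blockURLmatching value quoted above, escaped for a Java string literal.
    static final Pattern BLOCK = Pattern.compile(
        "(\\.sex\\.|[-.]ad([sx]?)\\.|/ad\\.|adserving\\.|ad101com-"
        + "|pagead/imgad|as-us.falkag.net|clicktorrent.info)");

    // Assumption: a URL is blocked when the pattern matches anywhere in it.
    static boolean isBlocked(String url) {
        return BLOCK.matcher(url).find();
    }
}
```

Under that assumption, http://www.ads.example.com/banner.gif is blocked but http://ads.example.com/banner.gif is not,
because the pattern expects a "-" or "." in front of "ads".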

This facility is very flexible but difficult to maintain in an ever-changing internet landscape.  A better approach
might be to use lists from aggregators whose mission is to keep up-to-date lists of sites that serve ads and malware.

My first naive approach to solving this problem was to augment /etc/hosts on our proxy server with lists from the above
sites. I soon discovered that rabbit ignored these entries. It seems that rabbit uses dnsjava to query the DNS service
directly, bypassing the system resolver. So replacing /etc/hosts does not work.

A better solution would be for rabbit to check against a preconfigured "block" list of sites and then return 403 errors
when URLs containing these host names are requested.  It should be a pretty simple thing to query the bad host
table prior to doing the DNS query.  I can see two implementations of this approach.
1. Rabbit reads the bad host table on startup and then keeps an internal table for lookups (our current host table has
over 600K entries so this approach should be manageable).
2. A better approach might be to query an SQL table for bad hosts prior to the lookup.  This would be faster and more
dynamic since the table could be updated automatically from an external process.
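
As a rough sketch of option 1, something like the following could load a hosts-format list into an in-memory set and
answer lookups before any DNS query is made. The class BadHostTable and its methods are hypothetical names of my own,
not existing rabbit code, and hooking it into the request path is left out.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; not part of rabbit.
class BadHostTable {
    private final Set<String> hosts = new HashSet<>();

    // Reads hosts-file lines such as "127.0.0.1 ads.example.com # comment".
    void load(Reader in) {
        try {
            BufferedReader br = new BufferedReader(in);
            String line;
            while ((line = br.readLine()) != null) {
                int hash = line.indexOf('#');
                if (hash >= 0) line = line.substring(0, hash); // drop comments
                String[] fields = line.trim().split("\\s+");
                // fields[0] is the address; every later field is a host name.
                for (int i = 1; i < fields.length; i++) {
                    String host = fields[i].toLowerCase(Locale.ROOT);
                    // Hosts files usually carry a localhost entry; do not block it.
                    if (!host.equals("localhost")) hosts.add(host);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // True if the host name appeared in a loaded list (case-insensitive).
    boolean isBad(String host) {
        return hosts.contains(host.toLowerCase(Locale.ROOT));
    }

    int size() {
        return hosts.size();
    }
}
```

A caller would do something like table.load(new FileReader("/etc/hosts")) at startup and answer with a 403 whenever
table.isBad(host) is true. Other bookkeeping entries that some lists carry besides localhost would need the same
treatment.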

I think that adding this facility to Rabbit should be pretty easy and quite valuable to the community.  Those of us
using rabbit are mostly running in a bandwidth limited environment and what better way to save bandwidth than to strip
out ads. This approach also has the benefit of protecting users from known malware sites.

Anyone have any thoughts on this?

Thanks,

--luis


-- 


Luis Soltero, Ph.D., MCS
Director of Software Development, CTO
Global Marine Networks, LLC
StarPilot, LLC
Tel: 865-379-8723
Fax: 865-681-5017
E-Mail: lsoltero at globalmarinenet.net
Web: http://www.globalmarinenet.net
Web: http://www.starpilotllc.com




