# py-squid-blacklists

Squid helper, written in Python, that handles squidGuard blacklists.

* Currently supports domain blacklists only (e.g. google.com, www.google.com, mail.google.com)
* Blacklists specified in the config are loaded into RAM or into a CDB backend using https://github.com/acg/python-cdb
* Usable as an external ACL plugin for Squid
* Written because of little development activity on squidGuard and some issues using blacklists with squid3

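
As an illustration of what a domain-only lookup can look like, here is a minimal sketch in Python 3. It assumes that a blacklist entry should also cover its subdomains (so `google.com` would match `mail.google.com`), which may differ from what the helper actually does. The names `candidate_domains` and `is_blacklisted` are hypothetical and not taken from the project.

```python
from urllib.parse import urlsplit


def candidate_domains(url):
    """Yield the host of `url`, then each parent domain, most specific first."""
    # urlsplit only fills in hostname when a scheme or a leading "//" is
    # present, so bare "host:port" values get a "//" prefix first.
    if "://" not in url and not url.startswith("//"):
        url = "//" + url
    host = (urlsplit(url).hostname or "").rstrip(".")
    labels = host.split(".") if host else []
    for i in range(len(labels)):
        yield ".".join(labels[i:])


def is_blacklisted(url, blacklist):
    """Return True if the host or any parent domain is in `blacklist`.

    Assumption: an entry also covers its subdomains.
    """
    return any(domain in blacklist for domain in candidate_domains(url))
```

With `blacklist = {"google.com"}`, both `http://mail.google.com/` and `www.google.com:443` would be reported as blacklisted under this assumption.
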
## Usage

Add this configuration to squid.conf:

```
external_acl_type urlblacklist_lookup ttl=5 %URI /usr/bin/python /usr/local/py-squid-blacklists/py-squid-blacklists.py
...
acl urlblacklist external urlblacklist_lookup
...
http_access deny urlblacklist
```
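
For context, the configuration above makes Squid start the helper and send it one line per lookup, containing the requested URI. The helper answers `OK` when the ACL matches (so `http_access deny urlblacklist` rejects the request) or `ERR` when it does not. The following Python 3 sketch shows that loop; it assumes the default non-concurrent helper protocol (no channel ID on each line), and `serve` and `is_blocked` are hypothetical names rather than the project's own.

```python
import sys


def serve(is_blocked, stdin=sys.stdin, stdout=sys.stdout):
    """Answer one Squid external ACL lookup per input line.

    Assumption: no `concurrency=` option is set, so each line starts
    with the URI and carries no channel ID.
    """
    for line in stdin:
        fields = line.split()
        url = fields[0] if fields else ""
        # "OK" means the ACL matches (request gets denied); "ERR" means no match.
        stdout.write("OK\n" if is_blocked(url) else "ERR\n")
        # Squid waits for each answer, so flush after every reply.
        stdout.flush()
```

A script would end with something like `serve(my_lookup)`, where `my_lookup` is whatever function decides whether a URL is blacklisted.
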

config.py must include the following statements:

```
url = "http://dsi.ut-capitole.fr/blacklists/download/blacklists.tar.gz"
base_dir = "/usr/local/py-squid-blacklists/blacklists/"
categories = ["adult","malware"]
db_backend = "ram"
```

* url: location of the squidGuard-style blacklist archive; this variable is not used yet
* categories: blacklists to use for filtering
* base_dir: path containing the blacklist files
* db_backend: database flavour (ram|cdb)

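
To show how `base_dir` and `categories` could fit together, here is a minimal Python 3 sketch of loading the selected blacklists into memory. It assumes the usual squidGuard layout of one `domains` file per category directory (`<base_dir>/<category>/domains`, one domain per line); the function name `load_domains` is hypothetical.

```python
import os


def load_domains(base_dir, categories):
    """Return the set of domains listed for the given categories."""
    domains = set()
    for category in categories:
        # Assumed layout: <base_dir>/<category>/domains, one domain per line.
        path = os.path.join(base_dir, category, "domains")
        with open(path, encoding="utf-8", errors="replace") as handle:
            for line in handle:
                entry = line.strip().lower()
                # Skip blank lines and comment lines.
                if entry and not entry.startswith("#"):
                    domains.add(entry)
    return domains
```

With the settings above, this would be called as `load_domains(base_dir, categories)`.
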
## TODO

* Auto-fetcher using url if blacklists are not already downloaded or stored on the Squid machine
* Compatibility with Python 3 only
* Filters for regex URLs
* Reduce memory footprint (wip with CDB backend alternative)
* Code optimisation and cleaning (wip)
* Object-oriented programming (wip)
* Tests (wip)
* ...

## DBs support ideas

* dict(): high performance but heavy RAM usage
* Sqlite3: tested, small memory footprint, but very slow
* CDB backend: seems to be as fast as expected, with a very small footprint

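
The trade-off between these backends can be sketched with two interchangeable lookup classes in Python 3. This is only an illustration using the standard library: the class names are hypothetical, the CDB variant is left out because it needs the third-party python-cdb module, and the SQLite schema is an assumption rather than what was actually tested.

```python
import sqlite3


class RamBackend:
    """Keep every domain in a Python set: fast lookups, high memory use."""

    def __init__(self, domains):
        self._domains = set(domains)

    def __contains__(self, domain):
        return domain in self._domains


class SqliteBackend:
    """Keep domains in an SQLite table and run one query per lookup."""

    def __init__(self, domains, path=":memory:"):
        # Pass a file path instead of ":memory:" to keep the data on disk.
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS domains (name TEXT PRIMARY KEY)"
        )
        self._db.executemany(
            "INSERT OR IGNORE INTO domains (name) VALUES (?)",
            ((d,) for d in domains),
        )
        self._db.commit()

    def __contains__(self, domain):
        row = self._db.execute(
            "SELECT 1 FROM domains WHERE name = ?", (domain,)
        ).fetchone()
        return row is not None
```

Both classes support `domain in backend`, so the rest of a helper would not need to know which one is in use.
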
## DBs Benchmarks

RAM usage for one thread with categories ["adult","malware"]

Debian 8 / python 2.7.9 / squid 3.4.8

* ram: 90 MB
* cdb: 6 MB
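
How the figures above were measured is not stated. As one possible way to get a comparable number for an in-memory structure, the Python 3 sketch below uses the standard `tracemalloc` module. It only counts memory allocated by Python itself, not the whole process, so its results are not directly comparable with the values listed; `measure_peak_bytes` is a hypothetical helper name.

```python
import tracemalloc


def measure_peak_bytes(build):
    """Run `build()` and return (result, peak bytes allocated by Python)."""
    tracemalloc.start()
    try:
        result = build()
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak
```

For example, `measure_peak_bytes(lambda: {"host%d.example" % i for i in range(100000)})` reports the peak allocation while building a set of synthetic domains.
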