# py-squid-blacklists
A Squid helper, written in Python, that handles squidguard blacklists.
* Currently supports domain blacklists only (e.g. google.com, www.google.com, mail.google.com)
* Blacklists specified in the config are loaded into RAM or into a CDB backend, using https://github.com/acg/python-cdb
* Usable as an external ACL helper for squid
* Written because of slow development on squidguard and some issues using blacklists with squid3
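
A minimal sketch of what domain-only matching could look like, assuming that a blacklisted domain also covers its subdomains (as squidguard domain lists usually do). The names `candidate_domains` and `is_blacklisted` are hypothetical and are not taken from this project's code.

```python
def candidate_domains(host):
    """Yield the host and each of its parent domains, most specific first."""
    labels = host.lower().strip(".").split(".")
    # Stop before the bare top-level label ("com"); a single-label host
    # such as "localhost" is still yielded once.
    for i in range(max(len(labels) - 1, 1)):
        yield ".".join(labels[i:])


def is_blacklisted(host, blacklist):
    """Return True if the host or one of its parent domains is listed."""
    return any(domain in blacklist for domain in candidate_domains(host))


blacklist = {"google.com"}  # hypothetical in-memory category content
print(is_blacklisted("mail.google.com", blacklist))
print(is_blacklisted("example.org", blacklist))
```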
## Usage
Add this configuration to squid.conf:
```
external_acl_type urlblacklist_lookup ttl=5 %URI /usr/bin/python /usr/local/py-squid-blacklists/py-squid-blacklists.py
...
acl urlblacklist external urlblacklist_lookup
...
http_access deny urlblacklist
```
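
With this configuration, squid writes one request per line to the helper's standard input and expects one reply line back: `OK` when the ACL matches (here, the URL is blacklisted, so `http_access deny` applies) and `ERR` otherwise. The sketch below illustrates that exchange only. The function names and the blacklist content are assumptions for illustration, not the actual code of `py-squid-blacklists.py`.

```python
import io
import sys
from urllib.parse import urlsplit


def host_from_uri(uri):
    """Extract the hostname from a full URL or a CONNECT-style host:port."""
    if "://" not in uri:
        uri = "//" + uri  # lets urlsplit parse "host:port" as a network location
    try:
        return (urlsplit(uri).hostname or "").lower()
    except ValueError:
        return ""


def answer(line, blacklist):
    """Return "OK" if the requested host is blacklisted, "ERR" otherwise."""
    fields = line.split()
    if not fields:
        return "ERR"
    labels = host_from_uri(fields[0]).split(".")
    # Check the host itself, then each parent domain.
    for i in range(max(len(labels) - 1, 1)):
        if ".".join(labels[i:]) in blacklist:
            return "OK"
    return "ERR"


def serve(blacklist, stdin=sys.stdin, stdout=sys.stdout):
    """Answer every request line until squid closes the pipe."""
    for line in stdin:
        stdout.write(answer(line, blacklist) + "\n")
        stdout.flush()  # squid waits for each reply, so never leave it buffered


# Demo with in-memory streams instead of a real squid process.
demo_in = io.StringIO("http://mail.google.com/inbox\nhttp://example.org/\n")
demo_out = io.StringIO()
serve({"google.com"}, demo_in, demo_out)
print(demo_out.getvalue())
```

Flushing after every reply matters because squid blocks on the helper's answer before it continues processing the request.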
The config file must include the following statements:
```
url = http://dsi.ut-capitole.fr/blacklists/download/blacklists.tar.gz
base_dir = /usr/local/py-squid-blacklists/
categories = adult,malware
db_backend = cdb
```
* url : location of the squidguard-like blacklist files; this variable is not usable yet
* base_dir : root path containing the blacklist files and metadata (update datetime)
* categories : blacklists to use for filtering
* db_backend : database flavour (ram|cdb)
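
As an illustration of how these `key = value` statements could be read, here is a small sketch. It assumes a flat file without section headers, as in the example above. `parse_config` is a hypothetical helper, not the project's actual loader.

```python
SAMPLE = """\
url = http://dsi.ut-capitole.fr/blacklists/download/blacklists.tar.gz
base_dir = /usr/local/py-squid-blacklists/
categories = adult,malware
db_backend = cdb
"""


def parse_config(text):
    """Parse "key = value" lines into a dict and validate db_backend."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    # categories is a comma-separated list of blacklist names
    names = settings.get("categories", "").split(",")
    settings["categories"] = [name.strip() for name in names if name.strip()]
    if settings.get("db_backend") not in ("ram", "cdb"):
        raise ValueError("db_backend must be 'ram' or 'cdb'")
    return settings


config = parse_config(SAMPLE)
print(config["categories"], config["db_backend"])
```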
## TODO
* Auto-fetcher using url if blacklists are not already downloaded or stored on the squid machine (wip)
* Compatibility with python3 only
* Filters for regex urls
* Code optimisation (profiling) and cleaning (wip)
* Tests (wip)
* ...
## DBs support ideas
* High performance but heavy RAM usage when using dict()
* Sqlite3 tested, small memory footprint, but very slow
* CDB backend seems to be as fast as expected, with a very small footprint
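
To make the trade-off concrete, the sketch below puts an in-memory set and a sqlite3 table behind the same membership test. Both classes are hypothetical illustrations and do not reflect how this project implements its backends. The CDB variant is left out because it depends on the third-party python-cdb module.

```python
import sqlite3


class RamBackend:
    """Keeps every domain in a Python set: fast lookups, high memory use."""

    def __init__(self, domains):
        self._domains = set(domains)

    def __contains__(self, domain):
        return domain in self._domains


class SqliteBackend:
    """Keeps domains in a sqlite3 table: low memory use, one query per lookup."""

    def __init__(self, domains, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS domains (name TEXT PRIMARY KEY)"
        )
        self._conn.executemany(
            "INSERT OR IGNORE INTO domains (name) VALUES (?)",
            ((domain,) for domain in domains),
        )
        self._conn.commit()

    def __contains__(self, domain):
        row = self._conn.execute(
            "SELECT 1 FROM domains WHERE name = ?", (domain,)
        ).fetchone()
        return row is not None


for backend in (RamBackend(["google.com"]), SqliteBackend(["google.com"])):
    print(type(backend).__name__, "google.com" in backend, "example.org" in backend)
```

Because both classes answer `domain in backend`, the lookup code does not need to know which storage is in use.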
## DBs Benchmarks
RAM usage for one thread with categories ["adult","malware"]
Debian 8 / python 2.7.9 / squid 3.4.8
* ram : 90 MB
* cdb : 6 MB