# pybl

A Squid helper, written in Python, that handles squidGuard blacklists.

* Currently supports only domain blacklists (e.g. google.com, www.google.com, mail.google.com)
* Blacklists listed in the config are loaded either into memory or into a CDB backend (using https://github.com/bbayles/python-pure-cdb)
* Usable as an external ACL plugin for Squid 3
* Written because of limited development activity on squidGuard and some issues using blacklists with Squid
* Python 3 supported as of 2020
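
As an illustration of domain-only matching, the sketch below checks a hostname and each of its parent domains against a set of blacklisted domains. The names `candidate_domains` and `is_blacklisted` and the sample set are hypothetical, and matching parent domains is an assumption modeled on squidGuard-style domain lists; pybl's actual lookup may differ.

```python
def candidate_domains(host):
    """Yield the host itself, then each parent domain except the bare top-level label."""
    labels = host.lower().rstrip(".").split(".")
    # A single-label host (e.g. "localhost") is still yielded once.
    for i in range(max(len(labels) - 1, 1)):
        yield ".".join(labels[i:])


def is_blacklisted(host, blacklist):
    """Return True if the host or one of its parent domains is listed."""
    return any(domain in blacklist for domain in candidate_domains(host))


# Hypothetical sample data, for illustration only.
SAMPLE_BLACKLIST = {"google.com"}

print(is_blacklisted("mail.google.com", SAMPLE_BLACKLIST))  # True
print(is_blacklisted("example.org", SAMPLE_BLACKLIST))      # False
```

Each lookup costs at most one set membership test per label in the hostname, which is why a hash-based store suits this kind of list.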

## Usage

Add this configuration to `squid.conf`:

```
external_acl_type urlblacklist_lookup ttl=5 %DST /usr/bin/python /usr/local/pybl/pybl.py
...
acl urlblacklist external urlblacklist_lookup
...
http_access deny urlblacklist
```
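
For context, Squid talks to an external ACL helper over stdin/stdout: with the `%DST` format above, each request line carries the destination host, and the helper answers `OK` (the ACL matches) or `ERR` (it does not). The loop below is a minimal sketch of that exchange, not pybl's actual code. The names `answer` and `serve` and the sample blacklist are hypothetical, and the sketch assumes the helper runs without concurrency (no channel ID prefix on each line). Because the ACL feeds an `http_access deny` rule, it answers `OK` for listed domains.

```python
import io


def answer(line, blacklist):
    """Map one request line from Squid to a helper reply."""
    fields = line.split()
    if not fields:
        return "ERR"  # empty request: nothing to match
    return "OK" if fields[0].lower() in blacklist else "ERR"


def serve(stdin, stdout, blacklist):
    """Answer one reply per request line until the input is exhausted."""
    for line in stdin:
        stdout.write(answer(line, blacklist) + "\n")
        stdout.flush()  # Squid waits for each reply, so never buffer it


# Simulate two requests from Squid with a hypothetical blacklist.
replies = io.StringIO()
serve(io.StringIO("blocked.example\nwww.example.org\n"), replies, {"blocked.example"})
print(replies.getvalue())
```

In a real helper, the entry point would call `serve(sys.stdin, sys.stdout, blacklist)` with the blacklist loaded from the configured backend.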

The config file must include the following statements:

```
url = http://dsi.ut-capitole.fr/blacklists/download/blacklists.tar.gz
basedir = /usr/local/pybl/
categories = adult,malware # categories are comma-separated values
backend = cdb
```

* url : location of squidGuard-style blacklist files; this setting is not usable yet
* basedir : root path containing the blacklist files and metadata (last update datetime)
* categories : blacklists to use for filtering
* backend : database flavour (ram|cdb)
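
To make the settings above concrete, here is a minimal sketch of reading such a file with Python's standard `configparser`. How pybl itself parses its config is not stated here, so this is an assumption: the function name `parse_config` and the `[pybl]` section header (prepended because `configparser` requires one) are hypothetical.

```python
import configparser

SAMPLE = """\
url = http://dsi.ut-capitole.fr/blacklists/download/blacklists.tar.gz
basedir = /usr/local/pybl/
categories = adult,malware # categories are comma-separated values
backend = cdb
"""


def parse_config(text):
    """Parse key = value settings into a plain dict."""
    # Treat " # ..." after a value as an inline comment.
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    # The file has no section header, so add a hypothetical one.
    parser.read_string("[pybl]\n" + text)
    section = parser["pybl"]

    backend = section.get("backend")
    if backend not in ("ram", "cdb"):
        raise ValueError("backend must be 'ram' or 'cdb', got %r" % (backend,))

    categories = [c.strip() for c in section.get("categories", fallback="").split(",")]
    return {
        "url": section.get("url"),
        "basedir": section.get("basedir"),
        "categories": [c for c in categories if c],
        "backend": backend,
    }


print(parse_config(SAMPLE))
```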

## TODO

* Auto-fetcher that uses `url` when the blacklists are not already downloaded or stored on the Squid machine (wip)
* Compatibility with Python 3 only
* Filters for regex URLs
* Code optimisation (profiling) and cleaning (wip)
* Tests (wip)
* ...

## DBs support ideas

* High performance but heavy RAM usage when using dict()
* Sqlite3 tested: small memory footprint, but very slow
* CDB backend seems to be as fast as expected, with a very small footprint

## DBs Benchmarks (2016)

RAM usage for one thread with categories ["adult","malware"]

Debian 8 / Python 2.7.9 / Squid 3.4.8

* ram : 90 MB
* cdb : 6 MB

Ubuntu 20.04 / Python 3.8.2 / Squid 4.9

* ram : 249 MB
* cdb : 12 MB

## License

Copyright (c) 2016, 2020 PaulBSD
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
  list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
  this list of conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.