# pybl

Squid helper, written in Python, that handles squidGuard blacklists.

* Currently supports domain blacklists only (e.g. google.com, www.google.com, mail.google.com)
* Blacklists specified in the config are loaded into memory or into a CDB backend using https://github.com/bbayles/python-pure-cdb
* Usable as an external ACL plugin for Squid 3
* Written because of slow development on squidGuard and some issues using blacklists with Squid
* Python 3 supported as of 2020

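The domain matching listed above could be sketched as follows. This is only an illustration, not pybl's actual code: the function names `candidate_domains` and `is_blacklisted` are hypothetical, and the sketch assumes squidGuard-style semantics where an entry such as google.com also covers its subdomains.

```python
def candidate_domains(host):
    """Yield the host itself, then each parent domain (never the bare TLD)."""
    labels = host.lower().rstrip(".").split(".")
    # Stop before the last label so a bare TLD such as "com" is never tested.
    for i in range(max(len(labels) - 1, 1)):
        yield ".".join(labels[i:])


def is_blacklisted(host, blacklist):
    """Return True if host or one of its parent domains is in the blacklist."""
    return any(domain in blacklist for domain in candidate_domains(host))


# Illustrative entries only; real lists come from the blacklist files.
blacklist = {"google.com", "ads.example.net"}
print(is_blacklisted("mail.google.com", blacklist))  # True
print(is_blacklisted("example.net", blacklist))      # False
```

Walking up the parent domains keeps each lookup to a handful of exact-match queries, which suits both a set held in memory and a key/value store.
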
## Usage

Add this configuration to squid.conf:
```
external_acl_type urlblacklist_lookup ttl=5 %DST /usr/bin/python /usr/local/pybl/pybl.py
...
acl urlblacklist external urlblacklist_lookup
...
http_access deny urlblacklist
```

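For readers unfamiliar with Squid external ACL helpers, here is a minimal sketch of the request/reply loop such a helper runs. With the configuration above, Squid writes one line per lookup containing the `%DST` value, and the helper answers `OK` (the ACL matches, so `http_access deny` applies) or `ERR`. The names `answer` and `serve` are hypothetical, the sketch assumes no helper concurrency (no channel ID prefix), and it is not taken from pybl.py.

```python
import io


def answer(line, blacklist):
    """Map one request line from Squid to a helper reply."""
    parts = line.split()
    if not parts:
        return "ERR"  # empty request: treat as "not listed"
    host = parts[0].lower()
    return "OK" if host in blacklist else "ERR"


def serve(stdin, stdout, blacklist):
    """Answer every request line until the input stream is closed."""
    for line in stdin:
        stdout.write(answer(line, blacklist) + "\n")
        stdout.flush()  # Squid waits for each reply, so never buffer output


# Demonstration with in-memory streams; a real helper would pass
# sys.stdin and sys.stdout instead.
replies = io.StringIO()
serve(io.StringIO("bad.example.com\ngood.example.org\n"), replies,
      {"bad.example.com"})  # illustrative entry only
print(replies.getvalue().split())  # ['OK', 'ERR']
```
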
The config file must include the following statements:
```
url = http://dsi.ut-capitole.fr/blacklists/download/blacklists.tar.gz
basedir = /usr/local/pybl/
categories = adult,malware # categories are comma-separated values
backend = cdb
```

* url: squidGuard-style blacklist files; this variable is not yet usable
* basedir: root path containing the blacklist files and metadata (update datetime)
* categories: blacklists to use for filtering
* backend: database flavour (ram|cdb)

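As an illustration of the format above, the following sketch parses such a file into a dictionary. It is not pybl's own parser: `parse_config` is a hypothetical name, and the handling of `#` comments (everything after the first `#` on a line is dropped, so a value containing `#` would be truncated) is an assumption.

```python
def parse_config(text):
    """Parse 'key = value' lines into a dict, dropping '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # strip inline comments
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    # categories is a comma-separated list of blacklist names
    settings["categories"] = [
        c.strip() for c in settings.get("categories", "").split(",") if c.strip()
    ]
    return settings


sample = """\
url = http://dsi.ut-capitole.fr/blacklists/download/blacklists.tar.gz
basedir = /usr/local/pybl/
categories = adult,malware # categories are comma-separated values
backend = cdb
"""
config = parse_config(sample)
print(config["categories"])  # ['adult', 'malware']
print(config["backend"])     # cdb
```
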
## TODO

* Auto-fetcher using url if blacklists are not already downloaded or stored on the Squid machine (wip)
* Compatibility with Python 3 only
* Filters for regex URLs
* Code optimisation (profiling) and cleaning (wip)
* Tests (wip)
* ...

## DBs support ideas

* High performance but heavy RAM usage when using dict()
* Sqlite3 tested, small memory footprint, but very slow
* CDB backend seems to be as fast as expected, with a very small footprint

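To make the trade-off above concrete, here is a rough sketch of the same membership test against an in-memory set and against SQLite. The helper names `build_sqlite_backend` and `sqlite_contains` and the table layout are assumptions made for illustration; they do not describe how pybl's backends were actually written.

```python
import sqlite3


def build_sqlite_backend(domains, path=":memory:"):
    """Create a one-column table of domains; the primary key acts as an index."""
    conn = sqlite3.connect(path)  # pass a file path to keep the data on disk
    conn.execute("CREATE TABLE IF NOT EXISTS domains (name TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT OR IGNORE INTO domains (name) VALUES (?)",
        ((d,) for d in domains),
    )
    conn.commit()
    return conn


def sqlite_contains(conn, domain):
    """Return True if the domain is stored in the table."""
    row = conn.execute(
        "SELECT 1 FROM domains WHERE name = ? LIMIT 1", (domain,)
    ).fetchone()
    return row is not None


domains = ["bad.example.com", "ads.example.net"]  # illustrative entries only
ram_backend = set(domains)                   # whole list held by the process
sql_backend = build_sqlite_backend(domains)  # each lookup runs a SQL query

print("bad.example.com" in ram_backend)                 # True
print(sqlite_contains(sql_backend, "bad.example.com"))  # True
print(sqlite_contains(sql_backend, "example.org"))      # False
```

The set answers from a hash table held entirely by the process, while every SQLite lookup goes through statement execution, which is consistent with the speed and memory observations listed above.
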
## DBs Benchmarks (2016)

RAM usage for one thread with categories ["adult","malware"]

Debian 8 / python 2.7.9 / squid 3.4.8

* ram: 90 MB
* cdb: 6 MB

Ubuntu 20.04 / python 3.8.2 / squid 4.9

* ram: 249 MB
* cdb: 12 MB

## License

Copyright (c) 2016, 2020 PaulBSD
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
  list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
  this list of conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.