From clamav-devel-bounces@lists.clamav.net  Wed May 30 07:02:19 2007
Return-Path: <clamav-devel-bounces@lists.clamav.net>
X-Original-To: list@tad.clamav.net
Delivered-To: list@tad.clamav.net
X-Virus-Scanned: Debian amavisd-new at tad.clamav.net
Received: from tad.clamav.net ([127.0.0.1])
	by localhost (tad.clamav.net [127.0.0.1]) (amavisd-new, port 10024)
	with ESMTP id 2GjSEe7M7wuc; Wed, 30 May 2007 07:02:19 +0200 (CEST)
Received: from tad.clamav.net (localhost.localdomain [127.0.0.1])
	by tad.clamav.net (Postfix) with ESMTP id 25D5131C01B;
	Wed, 30 May 2007 07:02:18 +0200 (CEST)
X-Original-To: clamav-devel@tad.clamav.net
Delivered-To: clamav-devel@tad.clamav.net
X-Virus-Scanned: Debian amavisd-new at tad.clamav.net
Received: from tad.clamav.net ([127.0.0.1])
	by localhost (tad.clamav.net [127.0.0.1]) (amavisd-new, port 10024)
	with ESMTP id 8QQ2tTe22YzL for <clamav-devel@tad.clamav.net>;
	Wed, 30 May 2007 07:02:15 +0200 (CEST)
Received: from mail.netfarm.it (skin.netfarm.it [151.1.32.181])
	by tad.clamav.net (Postfix) with ESMTP id 2DC8E31C005
	for <clamav-devel@lists.clamav.net>;
	Wed, 30 May 2007 07:02:15 +0200 (CEST)
X-Virus-Scanned: by AMaViS New (Debian) at mail.netfarm.it
Received: from mail.netfarm.it ([127.0.0.1])
	by localhost (mail.netfarm.it [127.0.0.1]) (amavisd-new, port 10024)
	with LMTP id 1pQIsD+CSqhJ for <clamav-devel@lists.clamav.net>;
	Wed, 30 May 2007 07:00:13 +0200 (CEST)
Received: from [192.168.129.2] (unknown [151.65.233.108])
	by mail.netfarm.it (Netfarm MailServer v1.2 [Powered by Postfix]) with
	ESMTP id F105949C90C for <clamav-devel@lists.clamav.net>;
	Wed, 30 May 2007 07:00:12 +0200 (CEST)
Message-ID: <465D056B.9050602@netfarm.it>
Date: Wed, 30 May 2007 07:02:35 +0200
From: Gianluigi Tiesi <sherpya@netfarm.it>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US;
	rv:1.8) Gecko/20051201 Thunderbird/1.5 Mnenhy/0.7.3.0
MIME-Version: 1.0
To: ClamAV Development <clamav-devel@lists.clamav.net>
X-Enigmail-Version: 0.95.0
Content-Type: multipart/mixed; boundary="------------070307070005090909040801"
X-Content-Filtered-By: Mailman/MimeDel 2.1.9
Subject: [Clamav-devel] Bloom Hash AV Matcher
X-BeenThere: clamav-devel@lists.clamav.net
X-Mailman-Version: 2.1.9
Precedence: list
Reply-To: ClamAV Development <clamav-devel@lists.clamav.net>
List-Id: ClamAV Development <clamav-devel.lists.clamav.net>
List-Unsubscribe: <http://lists.clamav.net/cgi-bin/mailman/listinfo/clamav-devel>,
	<mailto:clamav-devel-request@lists.clamav.net?subject=unsubscribe>
List-Post: <mailto:clamav-devel@lists.clamav.net>
List-Help: <mailto:clamav-devel-request@lists.clamav.net?subject=help>
List-Subscribe: <http://lists.clamav.net/cgi-bin/mailman/listinfo/clamav-devel>,
	<mailto:clamav-devel-request@lists.clamav.net?subject=subscribe>
Sender: clamav-devel-bounces@lists.clamav.net
Errors-To: clamav-devel-bounces@lists.clamav.net

This is a multi-part message in MIME format.
--------------070307070005090909040801
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Some time ago two guys wrote a patch for ClamAV
that adds a Bloom filter matcher working
alongside the bm (Boyer-Moore) matcher.
In brief, bloom av gives no false negatives
but may give false positives, so a file it flags is then passed to
the bm matcher for confirmation.
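
To illustrate the "no false negatives, possible false positives" property,
here is a toy Bloom filter in Python. It is only a sketch and not the code
from the patch: the class name, the bit-array size, the number of hashes and
the use of salted SHA-1 are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter (illustrative; not the patch's implementation)."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Assumption: derive k bit positions by salting a single hash.
        for i in range(self.num_hashes):
            digest = hashlib.sha1(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # True for every added item; may also be True for an item that was
        # never added (false positive), hence the confirming exact matcher.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


signatures = [b"sig-one", b"sig-two", b"sig-three"]
bloom = BloomFilter()
for sig in signatures:
    bloom.add(sig)
```

Every added pattern always answers "maybe present", so a negative answer
can be trusted and the slower exact scan skipped.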

The attached patch is rather old; with the recent changes to the
engine I don't know if it still works
or how easy it is to adapt.
It also needs to be tweaked to support scanning with offsets
(right now I treat that case as a false positive, so the scan is passed to bm).

bloom av is faster than bm, so the overall scan speed
improves, on the hypothesis that
non-virus files far outnumber virus files.

I've also attached a profiled scan;
look at the details:

[bm + ac]
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 54.63    166.07   166.07     8012     0.02     0.02  cli_bm_scanbuff
 22.41    234.20    68.13 139428866     0.00     0.00  cli_findpos
 15.26    280.59    46.39     8012     0.01     0.01  cli_ac_scanbuff

[(bloom | bm) + ac]
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 27.85     67.22    67.22 139428866     0.00     0.00  cli_findpos
 27.31    133.15    65.93      245     0.27     0.27  cli_bm_scanbuff
 19.52    180.26    47.11     8012     0.01     0.01  cli_ac_scanbuff

and
  2.20    217.76     5.31     8012     0.00     0.00  cli_bloom_filter_scanbuff

so we avoid 8012 - 245 bm scans, replacing them with 8012 bloom scans, which are faster
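
As a quick check of that arithmetic, the snippet below uses only the rows
quoted above: the call counts and the third numeric column (gprof's
"self seconds"). The variable names are mine, and the result ignores any
helper functions the bloom matcher calls, so read it as a rough estimate.

```python
# Figures copied from the two profile excerpts quoted above.
bm_calls_before, bm_calls_after = 8012, 245
bm_secs_before, bm_secs_after = 166.07, 65.93
bloom_secs = 5.31  # self seconds of cli_bloom_filter_scanbuff

saved_calls = bm_calls_before - bm_calls_after
net_saved_secs = bm_secs_before - (bm_secs_after + bloom_secs)

print(saved_calls)               # 7767
print(f"{net_saved_secs:.2f}")   # 94.83
```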

the logic of the overall scan is:
bloom first; if positive, bm;
then ac as in the normal flow
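
The flow above can be sketched roughly like this in Python. The three
callables are hypothetical stand-ins, not ClamAV functions, and the early
return on a confirmed bm hit is my assumption about the "normal flow".

```python
def scan_buffer(buf, bloom_might_match, bm_scan, ac_scan):
    """Return a signature name or None; all matchers are stand-ins."""
    if bloom_might_match(buf):       # cheap pre-filter, no false negatives
        hit = bm_scan(buf)           # exact check, only on bloom positives
        if hit is not None:
            return hit               # assumption: a bm hit ends the scan
    return ac_scan(buf)              # ac runs as usual


calls = {"bloom": 0, "bm": 0, "ac": 0}

def toy_bloom(buf):
    calls["bloom"] += 1
    # b"EVEN" plays the role of a bloom false positive.
    return b"EVIL" in buf or b"EVEN" in buf

def toy_bm(buf):
    calls["bm"] += 1
    return "Toy.Signature" if b"EVIL" in buf else None

def toy_ac(buf):
    calls["ac"] += 1
    return None

samples = (b"clean data", b"EVEN odds", b"xxEVILxx")
results = [scan_buffer(b, toy_bloom, toy_bm, toy_ac) for b in samples]
```

In this toy run the clean buffer never reaches bm, the false positive costs
one extra bm scan, and only the real match is reported.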

the patch does not yet include a GPL header, but the guy gave me permission
to distribute it as GPL

Hope this helps

- --
Gianluigi Tiesi <sherpya@netfarm.it>
EDP Project Leader
Netfarm S.r.l. - http://www.netfarm.it/
Free Software: http://oss.netfarm.it/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGXQVr3UE5cRfnO04RAvs6AJ0bp5AofqW5c/ssdW9BdVCd4rwaVQCcCQP9
WrTvFvzBrCKjr3ELiamVvgI=
=cN7P
-----END PGP SIGNATURE-----

--------------070307070005090909040801
Content-Type: text/plain;
 name="profiler.txt"
Content-Transfer-Encoding: base64
Content-Disposition: inline;
 filename="profiler.txt"

W2JtICsgYWNdDQpGbGF0IHByb2ZpbGU6DQoNCkVhY2ggc2FtcGxlIGNvdW50cyBhcyAwLjAx
IHNlY29uZHMuDQogICUgICBjdW11bGF0aXZlICAgc2VsZiAgICAgICAgICAgICAgc2VsZiAg
ICAgdG90YWwNCiB0aW1lICAgc2Vjb25kcyAgIHNlY29uZHMgICAgY2FsbHMgICBzL2NhbGwg
ICBzL2NhbGwgIG5hbWUNCiA1NC42MyAgICAxNjYuMDcgICAxNjYuMDcgICAgIDgwMTIgICAg
IDAuMDIgICAgIDAuMDIgIGNsaV9ibV9zY2FuYnVmZg0KIDIyLjQxICAgIDIzNC4yMCAgICA2
OC4xMyAxMzk0Mjg4NjYgICAgIDAuMDAgICAgIDAuMDAgIGNsaV9maW5kcG9zDQogMTUuMjYg
ICAgMjgwLjU5ICAgIDQ2LjM5ICAgICA4MDEyICAgICAwLjAxICAgICAwLjAxICBjbGlfYWNf
c2NhbmJ1ZmYNCiAgNS41NCAgICAyOTcuNDMgICAgMTYuODQgICAgMTQ1MTQgICAgIDAuMDAg
ICAgIDAuMDAgIGJvZHkNCiAgMC40OSAgICAyOTguOTEgICAgIDEuNDggICAgICA1ODIgICAg
IDAuMDAgICAgIDAuMDAgIGNsaV9odG1sX25vcm1hbGlzZQ0KICAwLjI0ICAgIDI5OS42NCAg
ICAgMC43MyAgNTQ4MTE3MiAgICAgMC4wMCAgICAgMC4wMCAgY2xpX2hleDJpbnQNCiAgMC4x
OSAgICAzMDAuMjEgICAgIDAuNTcgICAxNDk4MDYgICAgIDAuMDAgICAgIDAuMDAgIGNsaV9y
ZWFkbGluZQ0KICAwLjE3ICAgIDMwMC43MiAgICAgMC41MSAgICAgNTE4OCAgICAgMC4wMCAg
ICAgMC4wMCAgY2xpX3Zlcm1kNQ0KICAwLjE2ICAgIDMwMS4yMSAgICAgMC40OSAgNTEyNTEy
MiAgICAgMC4wMCAgICAgMC4wMCAgaHRtbF9vdXRwdXRfYw0KICAwLjE2ICAgIDMwMS43MCAg
ICAgMC40OSAgICAzOTE3MyAgICAgMC4wMCAgICAgMC4wMCAgY2xpX2hleDJzdHINCiAgMC4x
MCAgICAzMDEuOTkgICAgIDAuMjkgICAgMzU2NzcgICAgIDAuMDAgICAgIDAuMDAgIGNsaV9w
YXJzZV9hZGQNCiAgMC4wNyAgICAzMDIuMjAgICAgIDAuMjEgICAgIDM0NDIgICAgIDAuMDAg
ICAgIDAuMDAgIGNsaV9maWxldHlwZQ0KICAwLjA3ICAgIDMwMi40MCAgICAgMC4yMCAgICA0
MTQ1NyAgICAgMC4wMCAgICAgMC4wMCAgY2xpX2Nob21wDQogIDAuMDYgICAgMzAyLjU4ICAg
ICAwLjE4ICAgIDUyMjI5ICAgICAwLjAwICAgICAwLjAwICBjbGlfc3RydG9rDQogIDAuMDUg
ICAgMzAyLjcyICAgICAwLjE0ICAgICA1MTg4ICAgICAwLjAwICAgICAwLjA2ICBjbGlfc2Nh
bmRlc2MNCiAgMC4wNCAgICAzMDIuODMgICAgIDAuMTEgICAgICAgIDcgICAgIDAuMDIgICAg
IDAuMDIgIGNsaV9ibV9mcmVlDQogIDAuMDMgICAgMzAyLjkyICAgICAwLjA5ICAgMTA3NjA5
ICAgICAwLjAwICAgICAwLjAwICBodG1sX3RhZ19hcmdfYWRkDQogIDAuMDMgICAgMzAzLjAx
ICAgICAwLjA5ICAgICA0NTEyICAgICAwLjAwICAgICAwLjAwICBjbGlfaGV4MnNpDQogIDAu
MDMgICAgMzAzLjA5ICAgICAwLjA4ICAgIDMzMzY3ICAgICAwLjAwICAgICAwLjAwICBjbGlf
Ym1fYWRkcGF0dA0KICAwLjAyICAgIDMwMy4xNSAgICAgMC4wNiAgICAgMzQ0MiAgICAgMC4w
MCAgICAgMC4wNiAgY2xpX21hZ2ljX3NjYW5kZXNjDQogIDAuMDIgICAgMzAzLjIxICAgICAw
LjA2ICAgICAyNDc3ICAgICAwLjAwICAgICAwLjAwICBjbGlfZGVxdWV1ZQ0KICAwLjAyICAg
IDMwMy4yNyAgICAgMC4wNiAgICAgMjE0NCAgICAgMC4wMCAgICAgMC4wMCAgaXNfdGFyDQog
IDAuMDIgICAgMzAzLjMzICAgICAwLjA2ICAgICAgICA3ICAgICAwLjAxICAgICAwLjAxICBj
bGlfYm1faW5pdA0KICAwLjAyICAgIDMwMy4zOCAgICAgMC4wNSAgICA1NzExNSAgICAgMC4w
MCAgICAgMC4wMCAgaHRtbF90YWdfYXJnX2ZyZWUNCiAgMC4wMSAgICAzMDMuNDMgICAgIDAu
MDQgICAxMjk4NDYgICAgIDAuMDAgICAgIDAuMDAgIGNsaV9jYWxsb2MNCiAgMC4wMSAgICAz
MDMuNDcgICAgIDAuMDQgICAgIDI3MjcgICAgIDAuMDAgICAgIDAuMDAgIHp6aXBfZmlsZV9v
cGVuDQogIDAuMDEgICAgMzAzLjUxICAgICAwLjA0ICAgICAgIDIyICAgICAwLjAwICAgICAw
LjI1ICBjbGlfc2NhbnppcA0KDQpbKGJsb29tIHwgYm0pICsgYWNdDQpGbGF0IHByb2ZpbGU6
DQoNCkVhY2ggc2FtcGxlIGNvdW50cyBhcyAwLjAxIHNlY29uZHMuDQogICUgICBjdW11bGF0
aXZlICAgc2VsZiAgICAgICAgICAgICAgc2VsZiAgICAgdG90YWwNCiB0aW1lICAgc2Vjb25k
cyAgIHNlY29uZHMgICAgY2FsbHMgICBzL2NhbGwgICBzL2NhbGwgIG5hbWUNCiAyNy44NSAg
ICAgNjcuMjIgICAgNjcuMjIgMTM5NDI4ODY2ICAgICAwLjAwICAgICAwLjAwICBjbGlfZmlu
ZHBvcw0KIDI3LjMxICAgIDEzMy4xNSAgICA2NS45MyAgICAgIDI0NSAgICAgMC4yNyAgICAg
MC4yNyAgY2xpX2JtX3NjYW5idWZmDQogMTkuNTIgICAgMTgwLjI2ICAgIDQ3LjExICAgICA4
MDEyICAgICAwLjAxICAgICAwLjAxICBjbGlfYWNfc2NhbmJ1ZmYNCiAgNy4wMiAgICAxOTcu
MjEgICAgMTYuOTUgICAgMTQ1MTQgICAgIDAuMDAgICAgIDAuMDAgIGJvZHkNCiAgNC4xMSAg
ICAyMDcuMTMgICAgIDkuOTIgNzE4OTY5MzYgICAgIDAuMDAgICAgIDAuMDAgIGJhX3ZhbHVl
DQogIDIuMjAgICAgMjEyLjQ1ICAgICA1LjMyICAgOTkzMzI1ICAgICAwLjAwICAgICAwLjAw
ICBsb29rdXBfaW5fcmJfdGFibGUNCiAgMi4yMCAgICAyMTcuNzYgICAgIDUuMzEgICAgIDgw
MTIgICAgIDAuMDAgICAgIDAuMDAgIGNsaV9ibG9vbV9maWx0ZXJfc2NhbmJ1ZmYNCiAgMS4y
NyAgICAyMjAuODEgICAgIDMuMDYgIDEwMjY2OTIgICAgIDAuMDAgICAgIDAuMDAgIHJiX2Zp
bmQNCiAgMS4wNCAgICAyMjMuMzEgICAgIDIuNTAgMTM5ODQ1MjggICAgIDAuMDAgICAgIDAu
MDAgIGNvbXBhcmVfaW50cw0KICAwLjk0ICAgIDIyNS41OSAgICAgMi4yNyAyOTQ5MzU5NiAg
ICAgMC4wMCAgICAgMC4wMCAgYmFfYXNzaWduDQogIDAuODggICAgMjI3LjcyICAgICAyLjEz
IDE1NDA3Mzc2ICAgICAwLjAwICAgICAwLjAwICBjbGlfaGV4MmludA0KICAwLjczICAgIDIy
OS40OCAgICAgMS43NiAgICAgIDU4MiAgICAgMC4wMCAgICAgMC4wMSAgY2xpX2h0bWxfbm9y
bWFsaXNlDQogIDAuNzMgICAgMjMxLjI0ICAgICAxLjc1ICAzNDY1NDcxICAgICAwLjAwICAg
ICAwLjAwICBmYXN0X2hhc2gNCiAgMC42MyAgICAyMzIuNzYgICAgIDEuNTIgICAgICAgIDcg
ICAgIDAuMjIgICAgIDAuNTQgIGluaXRfYmxvb21fZmlsdGVyDQogIDAuNjAgICAgMjM0LjIx
ICAgICAxLjQ1ICAgMTA1OTA3ICAgICAwLjAwICAgICAwLjAwICBjbGlfaGV4MnN0cg0KICAw
LjQ5ICAgIDIzNS4zOSAgICAgMS4xOCAxMzg3ODYzMCAgICAgMC4wMCAgICAgMC4wMCAgeG9y
X2hhc2gNCiAgMC40NSAgICAyMzYuNDggICAgIDEuMDkgIDIxMDk5MTYgICAgIDAuMDAgICAg
IDAuMDAgIHNkYm0NCiAgMC4yNSAgICAyMzcuMDkgICAgIDAuNjEgICAxNDk4MDYgICAgIDAu
MDAgICAgIDAuMDAgIGNsaV9yZWFkbGluZQ0KICAwLjIxICAgIDIzNy42MCAgICAgMC41MiAg
NTEyNTEyMiAgICAgMC4wMCAgICAgMC4wMCAgaHRtbF9vdXRwdXRfYw0KICAwLjE3ICAgIDIz
OC4wMyAgICAgMC40MiAgMzQzMjEwNCAgICAgMC4wMCAgICAgMC4wMCAgaXNfcG9zc2libGVf
bWF0Y2gNCiAgMC4xNCAgICAyMzguMzcgICAgIDAuMzQgICAgIDUxODggICAgIDAuMDAgICAg
IDAuMDAgIGNsaV92ZXJtZDUNCiAgMC4xMyAgICAyMzguNjggICAgIDAuMzEgICAgNTIyMjkg
ICAgIDAuMDAgICAgIDAuMDAgIGNsaV9zdHJ0b2sNCiAgMC4xMiAgICAyMzguOTcgICAgIDAu
MjkgICAgMzU2NzcgICAgIDAuMDAgICAgIDAuMDAgIGNsaV9wYXJzZV9hZGQNCiAgMC4xMSAg
ICAyMzkuMjQgICAgIDAuMjcgICAgIDQ1MTIgICAgIDAuMDAgICAgIDAuMDAgIGNsaV9oZXgy
c2kNCiAgMC4xMCAgICAyMzkuNDcgICAgIDAuMjMgICAgIDM0NDIgICAgIDAuMDAgICAgIDAu
MDAgIGNsaV9maWxldHlwZQ0KICAwLjA2ICAgIDIzOS42MCAgICAgMC4xNCAgICA0MTQ1NyAg
ICAgMC4wMCAgICAgMC4wMCAgY2xpX2Nob21wDQogIDAuMDYgICAgMjM5Ljc1ICAgICAwLjE0
ICAgICAgICA3ICAgICAwLjAyICAgICAwLjAyICBjbGlfYm1fZnJlZQ0KICAwLjA1ICAgIDIz
OS44OCAgICAgMC4xMyAgIDEwNjIwOCAgICAgMC4wMCAgICAgMC4wMCAgY2xpX3JuZG51bQ0K
ICAwLjA1ICAgIDI0MC4wMCAgICAgMC4xMiAgICAgICAgNyAgICAgMC4wMiAgICAgMC4wMiAg
cmJfY3JlYXRlDQogIDAuMDUgICAgMjQwLjExICAgICAwLjEyICAgIDMwNTE2ICAgICAwLjAw
ICAgICAwLjAwICByYl9wcm9iZQ0KICAwLjA1ICAgIDI0MC4yMiAgICAgMC4xMSAgICAzMzM2
NyAgICAgMC4wMCAgICAgMC4wMCAgY2xpX2JtX2FkZHBhdHQNCiAgMC4wNSAgICAyNDAuMzMg
ICAgIDAuMTEgICAgMzMzNjcgICAgIDAuMDAgICAgIDAuMDAgIGluc2VydF9pbnRvX3JiX3Rh
YmxlDQogIDAuMDUgICAgMjQwLjQ0ICAgICAwLjExICAgICAgIDIyICAgICAwLjAxICAgICAw
LjI2ICBjbGlfc2NhbnppcA0KICAwLjA0ICAgIDI0MC41NCAgICAgMC4xMCAgICAgNTE4OCAg
ICAgMC4wMCAgICAgMC4wNCAgY2xpX3NjYW5kZXNjDQogIDAuMDMgICAgMjQwLjYxICAgICAw
LjA3ICAgICAzNDQyICAgICAwLjAwICAgICAwLjA0ICBjbGlfbWFnaWNfc2NhbmRlc2MNCiAg
MC4wMiAgICAyNDAuNjcgICAgIDAuMDYgICAgIDI0NzcgICAgIDAuMDAgICAgIDAuMDAgIGNs
aV9kZXF1ZXVlDQogIDAuMDIgICAgMjQwLjcyICAgICAwLjA1ICAgICAgICA3ICAgICAwLjAx
ICAgICAwLjAxICBjbGlfYm1faW5pdA0KICAwLjAyICAgIDI0MC43NiAgICAgMC4wNCAgIDE5
NjU4MCAgICAgMC4wMCAgICAgMC4wMCAgY2xpX2NhbGxvYw0K
--------------070307070005090909040801
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
http://lurker.clamav.net/list/clamav-devel.html
Please submit your patches to our Bugzilla: http://bugs.clamav.net
--------------070307070005090909040801--

