Re: [Clamav-users] Signature dups

Author: Tom Shaw
Date:  
To: ClamAV users ML
New-Topics: [Clamav-users] List bounces
Subject: Re: [Clamav-users] Signature dups
At 11:05 PM +0200 6/30/09, Tomasz Kojm wrote:
>On Tue, 30 Jun 2009 11:26:25 -0700
>"Bill Landry" <> wrote:
>
>> So if I were to include a signature in my 3rd party database, and then a
>> few days later ClamAV adds the same signature to the official signature
>> database, that is not your problem, but rather my problem? Seems like if
>> you (ClamAV) are providing the means for including 3rd party databases,
>> then wouldn't you agree that it really is ClamAV's responsibility to make
>> sure that duplicate signatures do not get loaded and used?
>
>Hi Bill,
>
>taking care of duplicates in the engine doesn't make sense (see below).
>Without the centralized system for signature maintenance we offered to 3rd
>parties, it's not possible to avoid duplicates. Having said that, even if
>there were a few thousand duplicated sigs, it shouldn't cause any
>significant slowdown to the engine.
>
>> > We had an idea to allow 3rd party signature
>> > creators to use our mechanisms for signature maintenance ([1], easy
>> > checking for FPs, dups, name collisions) and also our network
>> > infrastructure and freshclam to make everything more smooth but
>> > unfortunately this idea didn't get much interest.
>>
>> Hmmm, first I've heard of this. Why was there a lack of interest?
>
>Well, I don't know why.. AFAIK, only Securiteinfo was interested in using
>that solution. And in my opinion it would only have advantages - all the
>mechanisms we developed for the last 7 years, including the mirror
>infrastructure, could be used to maintain and distribute the 3rd party
>sigs making all processes much more efficient!
>
>> > It would be inefficient (and could be even unsafe in some cases) to do
>> > such things in the engine.
>>
>> Why is that? If ClamAV sorts all signatures when reloading, and ignores
>> duplicate signatures, why would that be dangerous in the engine?
>
>Because detecting duplicated signatures is not that easy and must be
>done with great care so that we don't incorrectly skip some unique sigs!
>
>Eg. the following logical sigs are all duplicates:
>
>Sig1;Target:0;0&1&(2|3);dead;beef;feed;face
>Sig2;Target:0;0&((1&2)|(1&3));dead;beef;feed;face
>Sig3;Target:0;0&1&(2|3);dead;beef;face;feed
>Sig4;Target:0;(0|1)&2&3;feed;face;dead;beef
>
>but this one is not (and still is very similar):
>
>Sig5;Target:0;(0|1)&2&3;feed;dead;face;beef
>
>Even for some very simple hex signatures there may be cases where
>it's not easy to detect dups, eg. dead{3}beef is in practice a duplicate
>of dead??????beef but since the engine handles these signatures
>differently, the situation complicates again. So in the engine we could
>only implement some very limited checks, but then the other day
>someone would open a bug report that this "feature" doesn't work
>nicely for some sigs... (take the issue with local.ign for example)
>
>The centralized system for signature development eliminates the
>problem because one can easily see that a sample is already detected
>(such samples automatically get "closed"). It could also provide some
>detection of duplicates which could be later handled manually. It's
>working really well for us, which is why we made that offer to 3rd party
>signature developers. Hopefully, we will close the bug #781 some day...
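
To make the two examples above concrete, here is a rough Python sketch (not
ClamAV code, and not how the engine works). It treats every subsignature as
an independent true/false variable and compares truth tables, after
rewriting fixed gaps such as {3} into ?? wildcards. The function names are
made up for illustration, only the & and | operators and fixed {n} gaps are
handled, and the brute-force comparison is exponential in the number of
subsignatures, so it only suggests why a general check is awkward to do at
load time.

```python
import itertools
import re

SIG1 = "Sig1;Target:0;0&1&(2|3);dead;beef;feed;face"
SIG2 = "Sig2;Target:0;0&((1&2)|(1&3));dead;beef;feed;face"
SIG3 = "Sig3;Target:0;0&1&(2|3);dead;beef;face;feed"
SIG4 = "Sig4;Target:0;(0|1)&2&3;feed;face;dead;beef"
SIG5 = "Sig5;Target:0;(0|1)&2&3;feed;dead;face;beef"


def expand_fixed_gaps(hex_sig):
    """Rewrite each fixed-size gap {n} as n '??' single-byte wildcards."""
    return re.sub(r"\{(\d+)\}", lambda m: "??" * int(m.group(1)), hex_sig.lower())


def parse_logical(sig):
    """Split 'Name;Target:0;expr;sub0;sub1;...' into (expr, subsigs)."""
    _name, _target, expr, *subsigs = sig.split(";")
    return expr, [expand_fixed_gaps(s) for s in subsigs]


def truth_table(sig, universe):
    """Evaluate the logical expression for every assignment over `universe`."""
    expr, subsigs = parse_logical(sig)
    words = {"&": " and ", "|": " or "}

    def to_python(tok):
        t = tok.group()
        # Operators become Python keywords; indices become pattern lookups.
        return words[t] if t in words else f"hit[{subsigs[int(t)]!r}]"

    py_expr = re.sub(r"\d+|[&|]", to_python, expr)
    rows = []
    for bits in itertools.product((False, True), repeat=len(universe)):
        hit = dict(zip(universe, bits))
        # eval is acceptable only because this is a toy with trusted input.
        rows.append(eval(py_expr, {"__builtins__": {}}, {"hit": hit}))
    return rows


def same_logic(sig_a, sig_b):
    """True if both sigs fire for exactly the same sets of matched patterns."""
    universe = sorted(set(parse_logical(sig_a)[1]) | set(parse_logical(sig_b)[1]))
    return truth_table(sig_a, universe) == truth_table(sig_b, universe)
```

With these definitions, same_logic(SIG1, other) holds for Sig2, Sig3 and
Sig4 but not for Sig5, and expand_fixed_gaps("dead{3}beef") gives the same
string as "dead??????beef". Patterns that overlap or imply each other, and
range gaps such as {n-m}, are outside what this sketch can see.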


Tomasz,

I like having a central DB. In fact, I think the central DB should be
queryable (e.g. submit signatures and get feedback on whether they are
already superseded by other detections).

On a similar note, I suggested to Luca a while ago that it would be good
if you maintained a DB of MD5 signatures of the files that you have
processed. I have submitted over 1600 unique malware files since 23
Mar, and I am pretty sure that 99% are real malware because they show
up in my honeypot. Unfortunately, I have 1054 outstanding files in my
winnow_malware.hdb sig file that still do not have an "official"
signature.

As for the MD5 DB, I would like it to include the following statuses:
in queue, verified benign, and in work. This would let me know that
you have a file and when something has been found benign. I know you
must have something like this internally, if for no other reason than
to cull dups and to check files out for signature creation, so adding
some exposure of the DB shouldn't be an issue.
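
As a rough illustration of the kind of lookup I mean, here is a minimal
Python sketch. The Status names mirror the three statuses above; the
in-memory dict, the function names and the sample bytes are all
hypothetical and only stand in for whatever the real database would be.

```python
import hashlib
from enum import Enum


class Status(Enum):
    IN_QUEUE = "in queue"
    VERIFIED_BENIGN = "verified benign"
    IN_WORK = "in work"


def md5_hex(data: bytes) -> str:
    """Hex MD5 digest of a submitted file's contents."""
    return hashlib.md5(data).hexdigest()


def lookup(db: dict, data: bytes):
    """Return the Status recorded for this file, or None if it is unknown."""
    return db.get(md5_hex(data))


# Hypothetical contents: one sample that has been submitted and queued.
sample = b"example sample bytes"
db = {md5_hex(sample): Status.IN_QUEUE}
```

Here lookup(db, sample) returns Status.IN_QUEUE, while a file that was
never submitted returns None, which would tell a submitter it is worth
sending.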

Unfortunately nothing has come from this....

Tom
_______________________________________________
Help us build a comprehensive ClamAV guide: visit http://wiki.clamav.net
http://www.clamav.net/support/ml