Topic

    Pavol Procka
    Barracuda SPAM filtering Best Practices? (Answered)
    Topic posted September 8, 2017 by Pavol Procka (Apprentice), last edited September 8, 2017
    31 Views, 5 Comments
    Content:

    Hi,

    I am wondering whether anyone can suggest some "best practices" for using the Barracuda SPAM filtering?

    It seems there must be something we are not doing exactly right as the tool gives us way too many false positives in the Bayesian analysis.

    I understand that at least 200 messages must be classified in each of the SPAM and NOT SPAM categories before the Bayesian filter works, but I was wondering whether having too many in either one could confuse the database. We have performed a reset, as we had close to 90k in the SPAM category, and are currently at about 2k there and just reaching 200 in the NOT SPAM category.

    The question is whether it makes more sense to stop adding messages once they reach certain numbers (and just continue deleting them without submitting them to Barracuda), or whether it is better to keep adding them (and if so, until when)?

    Thanks

    Pavol

    Best Answer

    Pramod V

    I am not sure if it's relevant, but I constantly scan through KB answers; here are a few answers I feel will be helpful:

    ~VIP

    Answer

     

    • Pavol Procka

      Hi Pramod,

       

      Thanks for the reply. I must say I have gone through most of the links you provided (the last one was even started by me :-) )

      One behavior I saw is that even if we do not have 200 messages categorized as SPAM and NON-SPAM, we still were getting emails tagged based on the X-Barracuda-Bayes line every day.

      I will mark your answer as the "best" anyway, as I do not believe anyone outside the Barracuda support team can give a better answer from the resources available here.

      In the meantime, I also have a case open with tech support to:

      - answer why some of our innocent-looking emails are scoring so high

      - answer why the Bayesian scoring still worked even though we did not have 200 messages.

      Should I receive any relevant answers from there, I will post them here.

       

      We have now (today) reached 200 in the NON-SPAM category as well and, as per the advice given directly in the settings, I will continue to add only these and not the ones for the SPAM category. Hopefully this will decrease the number of false positives we are getting tagged every day.

       

      Many thanks

       

      Pavol

    • Pramod V

      Well, I am not entirely sure. I would have a look at VCIO or the mailbox filters just to be sure, and also check Techmail -S dbstatus.

      ~VIP

    • Steven House

      Hello Pavol,

      This is a great question and one that can be answered in a few different ways. Instead of overcomplicating it with everything Bayesian can do, I’m just going to boil it down to what I think is important and then expand on what I think could be an ideal setup.

      We know that the Bayesian filtering relies on the user defining Spam/NotSpam. We also know that it uses the message content to define the rules and when they get applied. This can cause issues if what you’re trying to solve is already being addressed by other filters built into the firewall. For example, if you think a link is spammy, well, that’s already run through various link reputation filters to weed out possible ‘malicious links’ (it’s also given a spamminess score). Using Bayesian to try and weed out those messages will likely have a negative impact on legitimate sends. That’s because spammers take real, legitimate email templates, created by reputable brands, and replace one link to try and fool the ‘rules’. If you’re looking to best leverage the Bayesian filter, I would suggest focusing on templates that you know don’t traditionally reflect your customers’ sends. This could be messages with no text and one link. Or it could be a message with many links and no text. The point is, you’d want to create a Bayesian rule that addresses a portion of your mail that Barracuda wouldn’t normally define as ‘spam’.
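      To make the template point concrete, here is a minimal sketch of a generic word-based naive Bayes scorer. It is only an illustration under stated assumptions: it is not Barracuda's implementation, the class name TinyBayes and its methods are hypothetical, and min_per_class simply mirrors the 200-message minimum mentioned earlier in this thread (lowered to 2 so the demo stays short).

```python
import math
from collections import Counter


class TinyBayes:
    """Hypothetical toy classifier; NOT how Barracuda scores mail."""

    def __init__(self, min_per_class: int = 200):
        self.min_per_class = min_per_class
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        # Count each distinct word at most once per message.
        self.word_counts[label].update(set(text.lower().split()))
        self.message_counts[label] += 1

    def ready(self) -> bool:
        # Both categories need enough examples before scores mean anything.
        return all(n >= self.min_per_class for n in self.message_counts.values())

    def spam_probability(self, text: str) -> float:
        # Equal priors: only the words shift the result, not the class sizes.
        log_odds = 0.0
        for word in set(text.lower().split()):
            # Add-one smoothing keeps unseen words from producing log(0).
            p_spam = (self.word_counts["spam"][word] + 1) / (self.message_counts["spam"] + 2)
            p_ham = (self.word_counts["ham"][word] + 1) / (self.message_counts["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))


model = TinyBayes(min_per_class=2)  # demo value; the thread mentions 200
model.train("your order has shipped track it here", "ham")
model.train("your invoice is attached thanks", "ham")
# Spam that copies a legitimate shipping template and swaps in its own link text.
model.train("your order has shipped click this link now", "spam")
model.train("your order has shipped claim prize now", "spam")

print(model.ready())
print(round(model.spam_probability("your order has shipped"), 3))
```

      In this toy data set the perfectly ordinary phrase "your order has shipped" ends up leaning towards spam, because the spam examples reused that wording. That is the same failure mode as training a content-based filter on spam that imitates legitimate templates.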

      My suggestion: I think the focus should be more on the scoring and less on how the Bayesian algorithm is applied (Note: I’m not going to get into specifics here, but I’ll make blanket statements that may or may not apply to your account, Pavol). This includes adjusting the three thresholds at your disposal (Block, Quarantine, Tag). Let’s assume you have the following Default Spam Filter Configuration:

      Block: 9
      Quarantine: 10 (disabled)
      Tag: 3.5

      This, obviously, isn’t an ideal configuration for some groups. That’s because the customer is putting the responsibility for judging the ‘spamminess’ of a message on to the support team, who may not have the training to differentiate Spam from Not Spam. It also doesn’t leverage the ‘Block’ feature within the tool.

      Personally, depending on your industry, I tend to go with the following:

      Block: 7
      Quarantine: 3
      Tag: 2

      I’m a fan of this because, well, I like to be the final filter before messages get delivered (also, I’m not a fan of the ‘tag’ feature, but that’s because I like to be in control of such rules). This can be a lot of work, and it means that I’ll be constantly reviewing messages, but this works for me. That’s because I’d like to be the one to deliver/block messages. Over time, depending on how the scoring is being applied (information can be found in the message header), I may start lowering the ‘Block’ threshold (to 5 or 6). This would decrease the number of messages I would see in the quarantine folder and eliminate a lot of the extra noise I may be receiving.
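      As a rough illustration of how the two threshold sets above treat the same scores, here is a small Python sketch. The classify function, the "deliver" fallback, the use of None for a disabled threshold, and the "score at or above the threshold triggers the action" rule are all assumptions made for the example, not documented Barracuda behavior.

```python
from typing import Optional


def classify(score: float,
             block: Optional[float],
             quarantine: Optional[float],
             tag: Optional[float]) -> str:
    """Return the action for a spam score; None means the threshold is disabled."""
    # Check the most severe action first so a high score is never merely tagged.
    if block is not None and score >= block:
        return "block"
    if quarantine is not None and score >= quarantine:
        return "quarantine"
    if tag is not None and score >= tag:
        return "tag"
    return "deliver"


# The two configurations discussed in this answer (quarantine disabled in the first).
DEFAULT = {"block": 9.0, "quarantine": None, "tag": 3.5}
REVIEW_FIRST = {"block": 7.0, "quarantine": 3.0, "tag": 2.0}

for score in (1.0, 2.5, 4.0, 8.0, 9.5):
    print(score, classify(score, **DEFAULT), classify(score, **REVIEW_FIRST))
```

      Under these assumptions, a message scoring 4.0 is only tagged with the default values but lands in quarantine for review with the second set, and one scoring 8.0 is tagged by the default values but blocked outright by the second set.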

      It’s important to restate that the score is based on a number of different factors: Sandboxing (Maltrace/RUL Trace), Advanced Intent Analysis, Anti-Fraud Intelligence, Polymorphic Virus Detection, Reputation Analysis, Fingerprinting, Human Analysis. This just means that the filter already does a ton of the work before any security admin (AKA: us) lays eyes on the messages. That’s why I tend to be more comfortable with applying the block rule to the high-scoring messages.
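      Since the scoring details are reported in the message header, a quick way to review them is to pull every X-Barracuda-* header out of a saved message. The sketch below uses only Python's standard email module. The function name barracuda_headers is hypothetical, the sample message and its header values are made up, and X-Barracuda-Spam-Score is assumed here as a header name (only X-Barracuda-Bayes is mentioned in this thread).

```python
from email import message_from_string


def barracuda_headers(raw_message: str) -> dict:
    """Collect all X-Barracuda-* headers from a raw message, keeping repeats."""
    msg = message_from_string(raw_message)
    found = {}
    for name, value in msg.items():
        # Header names are case-insensitive, so compare in lowercase.
        if name.lower().startswith("x-barracuda-"):
            found.setdefault(name, []).append(value.strip())
    return found


# Made-up sample; real header values will look different.
SAMPLE = (
    "From: sender@example.com\n"
    "To: support@example.com\n"
    "Subject: Order question\n"
    "X-Barracuda-Spam-Score: 4.10\n"
    "X-Barracuda-Bayes: (placeholder value)\n"
    "\n"
    "Hello, I have a question about my order.\n"
)

for header, values in barracuda_headers(SAMPLE).items():
    print(header, values)
```

      Comparing these headers across a handful of false positives should show whether the Bayesian line or one of the other checks is pushing the score over a threshold.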

      I know this is a lot, so I hope there’s something here in my answer that can help. If not, please feel free to continue working with tech support and I’m sure we can get you a more specific answer to address your concerns.

      Best,

      Steven House | Senior Deliverability Specialist
      Oracle Cloud Operations

    • Pavol Procka

      Thanks, I have now closed the helpdesk case and will continue as per the advice here.