
The perils of AI - and how Malwarebytes got it right


exile360

There is a lot of talk about so-called 'AI' these days. I continue to argue that it isn't true 'intelligence', artificial or otherwise, in any meaningful sense of the words; it's just complex mathematical algorithms and branching datasets at best. There is no true thinking, consciousness, or actual decision making going on beneath all that seemingly complex, morphological code that gets so much praise and fear in the media and tech industry, at least not yet. But with AI, especially in its current state, there are risk factors that must be considered, because a wrong answer from an AI can have catastrophic consequences.

I just watched the following YouTube video of a TEDx Talk on this very subject. I recommend that anyone interested, and especially anyone in the AI development industry or in a profession that relies on data from AIs, watch it and consider the points it makes. It comes from an individual who works in the AI development industry, not some overly paranoid, tinfoil-hat-wearing conspiracy theorist:

The Real Reason to be Afraid of Artificial Intelligence | Peter Haas | TEDxDirigo

The speaker goes into detail about the risks inherent in trusting AI and shares doubts about its future if it is not used carefully and responsibly. I have personal, first-hand knowledge of how AI came to be a part of Malwarebytes, how it was implemented, and the checks and balances put in place to make sure that it was, and continues to be, handled correctly and given the appropriate amount of weight. This differs greatly from many of Malwarebytes' competitors throughout the AV/AM industry, especially in the so-called 'next-gen' segment. Many (including myself) would argue that Malwarebytes not only occupies that segment but was, and continues to be, a pioneer in it to this day, even if some of the existing methods they've carried forward may seem 'dated' in comparison to some of the most notable examples. I'll explain why that's actually a GOOD thing.

This all goes back to the very beginning of Malwarebytes, long before I ever joined the company as an employee. It revolves around the idea that if a human being can figure out the patterns malware infections use to attack systems and to semi-randomize their file names and internal file structures to evade traditional AV signature detection, then those man-made and machine-made patterns can be weaponized and used against them through heuristic techniques to eliminate them very effectively, even when new 'morphs' (or 'variants', as they're more commonly called) emerge. This enables the detection of more threats with smaller databases and fewer updates. It also enables more thorough disinfection of not only the primary threat components (the binary files that run in memory or are written to disk) but every trace of the infection: all of the loading points in the registry, obscure data structures like commands and scripts stored in randomly named temp files, and the hidden drivers and DLLs that most AV researchers weren't even aware of and kept missing. As a result, those vendors' own customers turned to Malwarebytes to clean their systems, because each time they tried to remove a threat with their world-famous AV, it would come right back, either during the current Windows session or on the next reboot, usually under a different name, starting the process all over again. That went on until they finally found Malwarebytes, which nailed the entire threat, all its loading points and hidden components, and eliminated all of it after a single scan and reboot of the system.
For the first time in a long time we had a weapon against malware that behaved far less like a traditional flat file scanner such as an AV and more like a professional threat researcher or expert malware removal technician on a help forum, who at the time would read logs from tools like HijackThis and eliminate threats starting with their loading points and every component on the system that didn't belong, taking it all out in one go.  Since then, major AV vendors have started paying much more attention to these traces, which they once ignored. Believe it or not, until the past several years many AVs didn't even bother scanning the registry; they assumed that removing the files belonging to an infection was enough and that those 'leftovers' were just harmless traces. Had they followed the pointers in the registry to begin with, they would have discovered where the files they didn't see were being loaded from to keep resurrecting the threats they thought they were eliminating.

I personally had a long argument about this over on the Kaspersky forums during the heyday of the threat known as Vundo/Virtumonde (which at the time Kaspersky had classified as 'Monder' or 'Trojan.Monder'). I flat out told them that the reason they were failing to eliminate it permanently was that they were leaving all the loading points in the registry behind and just going after the files, and so they were missing the hidden DLL in System32 that was bringing the threat back (under a new filename every time, no less). That was why Malwarebytes was nailing it while they kept getting beaten by this new threat, just as all the other major AV vendors were. This was also around the time that I really started hanging around the Malwarebytes forums, because I realized they had the secret sauce that the industry was missing. This wonderful technology, however they were doing it, was finally a match for what I'd been doing by hand for years when repairing systems: I would literally boot a system from Linux or WinPE, scour the filesystem, manually delete everything that I knew didn't belong, and then, after booting the system, search the registry for every related entry and delete every key/value belonging to the infection one by one.
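
To make the loading-point idea concrete, here is a toy sketch in Python. It is not Malwarebytes' actual engine, and it does not touch a real registry or filesystem: the artifact names and the `links` list are made up for illustration, and `collect_traces` is a hypothetical helper. It only shows why following references outward from one detected file turns up the hidden pieces that a file-only cleanup leaves behind.

```python
from collections import deque


def collect_traces(seed, links):
    """Return every artifact connected to `seed` through reference links.

    `links` is a list of (a, b) pairs meaning "a and b refer to each other",
    e.g. a registry value that loads a file. Links are treated as undirected,
    which is a simplification that would over-collect on a real system.
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    seen = {seed}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical infection: one visible file, two registry loading points,
# and a hidden DLL that recreates the visible file. The updater entries
# are an unrelated, legitimate program.
links = [
    ("file:visible.exe", "reg:run_value_a"),
    ("reg:notify_key_b", "file:hidden.dll"),
    ("file:hidden.dll", "file:visible.exe"),
    ("reg:run_value_updater", "file:updater.exe"),
]

file_only_cleanup = {"file:visible.exe"}
full_cleanup = collect_traces("file:visible.exe", links)

# Everything the file-only approach would leave behind.
leftovers = full_cleanup - file_only_cleanup
print(sorted(leftovers))
```

In this toy model the file-only cleanup leaves both loading points and the hidden DLL in place, which is exactly the situation where the threat comes back on the next reboot.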

Well, a few years ago Malwarebytes decided to try something new.  At the time I was working for them in Product Management.  The idea was to use AI and cloud technology to create a 'smart' malware detection engine that would improve over time as new data was fed to it and as the Devs tweaked it, so that it could positively identify new/unknown malware while simultaneously reducing false positives (FPs).  The trouble is, and this is where it all connects to the video I linked to and the overall idea of proper AI implementation, it still has FPs, and when that happens it identifies totally clean files as threats, which is obviously something we didn't want.  I have seen plenty of products come and go that were based on the same idea, and while some of them have stuck around, I just don't see any of them dominating, because doing AI right is hard and trusting AI completely is a mistake.  Where Malwarebytes got it right is that, to this day, even though FPs from this new technology still occur, Malwarebytes still employs Threat Researchers: actual human beings who look at, review, and respond to every single FP that gets reported and get it corrected immediately. The Developer behind this new tool also continues to tweak the AI's algorithms as time goes on and as this data on FPs (as well as new threats, of course) is gathered, to ensure that it is working as effectively and efficiently as possible.  Yet even with all the innovation in this new component, it still makes up only a very small percentage of what Malwarebytes is as a product.  The product still contains its more old-school heuristic signatures/pattern matching, rootkit detection and disinfection technology, and more traditional bad-website blacklisting, as well as newer behavior-based technologies like the proactive exploit protection engine and the more reactive ransomware protection technology, which looks for ransomware behavior in processes already in memory.

My point is, while some of the 'innovators' in this space have built their entire products/companies on the kinds of AI technologies in this one module that Malwarebytes is using, Malwarebytes has kept humans in the loop and still relies heavily on additional layers, both more traditional and more innovative, to provide the best protection that they can, and this is how they've handled AI the right way.  They didn't find this new technology and throw everything into it, abandoning all of their other methods of detection and protection; they simply diversified that much more and integrated this new technology into their already rather robust set of technologies to make their existing product that much stronger.  There are certainly other players out there in this field that have done the same, but there are also plenty that have bet everything on this technology, and I truly believe this is a mistake.  You really can't take the human element out of the equation, and you can't expect a machine to always learn from or even recognize its mistakes and determine how to correct them.  You need human eyes on these problems to ensure that the technology is serving its purpose as it should, and you need other engines and layers around it to cover the areas where it isn't as strong.
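
For anyone who thinks better in code, the general shape of "several layers plus a human in the loop" can be sketched roughly like this in Python. This is purely illustrative and is not how Malwarebytes is actually built: the `Scanner` class, the layer names, and the toy checks are all assumptions invented for the example, and the "ml_model" layer is a stand-in rule rather than a real model.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Scanner:
    # Detection layers, keyed by name. Each takes a sample name and
    # returns True if that layer considers it malicious.
    layers: dict[str, Callable[[str], bool]]
    # Samples a human researcher has confirmed as clean.
    allowlist: set[str] = field(default_factory=set)
    # Detections made by the ML layer alone, awaiting human review.
    review_queue: list[str] = field(default_factory=list)

    def scan(self, sample: str) -> list[str]:
        """Return the names of the layers that flagged the sample."""
        if sample in self.allowlist:
            return []
        hits = [name for name, check in self.layers.items() if check(sample)]
        # No other layer corroborates the ML verdict: ask a human.
        if hits == ["ml_model"] and sample not in self.review_queue:
            self.review_queue.append(sample)
        return hits

    def report_false_positive(self, sample: str) -> None:
        """Record a researcher's decision that the sample is clean."""
        self.allowlist.add(sample)
        if sample in self.review_queue:
            self.review_queue.remove(sample)


# Toy stand-ins for real engines (assumptions, not real detection logic).
scanner = Scanner(layers={
    "signature": lambda s: "evil" in s,
    "ml_model": lambda s: s.endswith(".exe") and "setup" in s,
})

first = scanner.scan("setup_tool.exe")       # flagged by the ML layer only
scanner.report_false_positive("setup_tool.exe")
second = scanner.scan("setup_tool.exe")      # human correction now applies
both = scanner.scan("evil_setup.exe")        # corroborated by two layers
print(first, second, both)
```

The point of the sketch is the same as the point of the post: the ML layer is one voice among several, and its uncorroborated verdicts get human eyes rather than being trusted blindly.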

By the way, for the curious, Malwarebytes' own Director of Research, Mieke Verburgh (also known by her online handle Miekemoes; yes, that's literally "Mickey Mouse"), does an excellent job of explaining Malwarebytes' approach to AI in this post.  I highly recommend reading it, particularly the sections entitled 'What does Malwarebytes do instead?' and 'What are the weaknesses?'.
