# How can I detect an infected file through its hash code?

Hello you guys, I'm a student.

While studying malware, I wondered how we can detect that a file has been infected. I googled it and learned that AV products and some sandboxes detect malware through SHA-256 hashes. However, I can't understand how that works. Could you please explain how we hash a file and then compare it with a hash database?

##### Share on other sites

A hash value is a checksum value: a fixed-length value computed from the bytes of a binary. With a cryptographic hash such as SHA-256, two different files will in practice never share the same value, so a checksum can be used to represent a particular file. If a file-infecting virus alters a given binary, the checksum value for that binary will change.

A hash table, in this sense, is basically a library of known checksum values. An algorithm is used to generate a checksum value for a file, and it is then compared to the values in that library.
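
As a rough illustration of the lookup described above, here is a minimal Python sketch. The sample bytes, the detection name `Example.Sample.A`, and the helper names `sha256_of_file`, `lookup` and `write_temp` are all made up for the example. A real product's database and matching logic will differ.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def lookup(path, known_hashes):
    """Return the detection name stored for this file's hash, or None."""
    return known_hashes.get(sha256_of_file(path))


def write_temp(data):
    """Write bytes to a temporary file and return its path (demo helper)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path


# Hypothetical "library" of known checksums: hash -> detection name.
sample = b"stand-in bytes for a known malicious sample"
known_hashes = {hashlib.sha256(sample).hexdigest(): "Example.Sample.A"}

exact_copy = write_temp(sample)
altered_copy = write_temp(sample + b"!")  # a single extra byte

print(lookup(exact_copy, known_hashes))    # Example.Sample.A
print(lookup(altered_copy, known_hashes))  # None

os.remove(exact_copy)
os.remove(altered_copy)
```

The second lookup misses because even a one-byte change produces a completely different hash. A hash match therefore identifies one exact file and nothing else.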

##### Share on other sites
1 hour ago, David H. Lipman said:

A hash value is a checksum value: a fixed-length value computed from the bytes of a binary. With a cryptographic hash such as SHA-256, two different files will in practice never share the same value, so a checksum can be used to represent a particular file. If a file-infecting virus alters a given binary, the checksum value for that binary will change.

A hash table, in this sense, is basically a library of known checksum values. An algorithm is used to generate a checksum value for a file, and it is then compared to the values in that library.

Thank you for the reply! But as you said, a checksum value can be used to represent a particular file, so how can we detect which malware infected a file? For example, if we have an infected file, how can we tell from the checksum that it was infected, and by which malware? And if we upload that file to virustotal.com, how does it detect the malware?

##### Share on other sites

You asked... "so how can we detect which malware infected a file"

To answer the question, I need further information. When you say a file is "infected", are you talking about a file-infecting virus that prepends, appends or cavity injects malicious code into that file?

##### Share on other sites
59 minutes ago, David H. Lipman said:

You asked... "so how can we detect which malware infected a file"

To answer the question, I need further information. When you say a file is "infected", are you talking about a file-infecting virus that prepends, appends or cavity injects malicious code into that file?

I just wonder how antivirus engines can detect that a file was infected, and also report the name and type of the malware that infected it.

##### Share on other sites

If a virus prepends, appends or cavity injects malicious code into a legitimate file, that code is a relatively consistent set of instructions. That consistent set of instructions is then used to generate a signature for detection. Each infector will have a different consistent set of instructions, and based upon their differences one can conclude that a specific set of instructions is tied to a specific infector. A name is then created for that infector, such as Virut, Parite or Sality. If there is a variation of that set of instructions, the new variant is assigned a variant detection such as Parite.A.

The real trick of a true antivirus application is not only to detect the infected file but to return it to its pre-infected state. That is, whatever code was appended, prepended or cavity injected must be removed in a way that leaves the file in a working state that matches the file's pre-infected state as closely as possible. This may or may not return the file to its pre-infected checksum value.
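
The signature idea in this post can be sketched in a few lines of Python. Everything here is hypothetical: the byte patterns, the names `Example.Infector.A` and `Example.Infector.B`, and the `scan` helper are invented for illustration. Real engines use far richer signatures than a plain substring match.

```python
# Hypothetical signature database: detection name -> consistent byte pattern.
SIGNATURES = {
    "Example.Infector.A": bytes.fromhex("deadbeef0badf00d"),
    "Example.Infector.B": bytes.fromhex("cafebabe8badf00d"),
}


def scan(data, signatures=SIGNATURES):
    """Return the sorted names of all signatures whose bytes occur in data."""
    return sorted(name for name, pattern in signatures.items() if pattern in data)


# Stand-in for a clean host file (not a real executable).
host = b"MZ" + b"\x90" * 64

# The three infection styles mentioned above, simulated on raw bytes.
appended = host + SIGNATURES["Example.Infector.A"]
prepended = SIGNATURES["Example.Infector.B"] + host
cavity = host[:10] + SIGNATURES["Example.Infector.A"] + host[18:]

print(scan(host))       # []
print(scan(appended))   # ['Example.Infector.A']
print(scan(prepended))  # ['Example.Infector.B']
print(scan(cavity))     # ['Example.Infector.A']
```

Because the match is on the injected code rather than on the whole file, the same signature fires no matter which host file was infected. That is also how the scanner can report a name for the infector.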

##### Share on other sites
16 minutes ago, David H. Lipman said:

If a virus prepends, appends or cavity injects malicious code into a legitimate file, that code is a relatively consistent set of instructions. That consistent set of instructions is then used to generate a signature for detection. Each infector will have a different consistent set of instructions, and based upon their differences one can conclude that a specific set of instructions is tied to a specific infector. A name is then created for that infector, such as Virut, Parite or Sality. If there is a variation of that set of instructions, the new variant is assigned a variant detection such as Parite.A.

Thanks for your explanation, I understood a little bit. Can you give me some keywords to search more?

##### Share on other sites

Sorry, no, 'cause I don't really have a handle on what you are seeking.

##### Share on other sites
20 hours ago, David H. Lipman said:

Sorry, no, 'cause I don't really have a handle on what you are seeking.

btw, thanks for all your help!

##### Share on other sites
On 8/1/2018 at 3:55 AM, h4niz said:

Thanks for your explanation, I understood a little bit. Can you give me some keywords to search more?

Read David's words: "If a virus prepends, appends or cavity injects malicious code into a legitimate file it is a relatively consistent set of instructions".

You can look these terms up in online dictionaries to get general definitions without getting too specific. Your question is still a bit vague, though, and that is why YOU should be doing a bit more research.

I got a slap on the wrist for asking outside of the school where I was learning Malware Fighting. The secrets lie in coded instructions, like the help that you are getting from David.

EDIT : P.S. Sorry if I over-posted your views, that I found very helpful David ..

Edited by noknojon
##### Share on other sites
• 2 months later...

Hi there,

I want to learn more about viruses, detection, and so on. However, I'm a student and I don't have the basic knowledge yet. Can you recommend some ebooks or online courses?

Thanks!

##### Share on other sites
• Staff

Greetings,

Most scan engines today, including and especially Malwarebytes, rarely use hash calculations any more to decide that a file is malicious. There are several reasons for this.

First, malicious files change quite frequently in order to evade detection by security software like Malwarebytes. A hash-based detection targets only one specific copy of an infection and none of its modified versions (often called "morphs" to reflect the polymorphism of modern threats) or variants (later iterations of the same infection within the same "family" of infections), so technology called "heuristics" is generally used instead. Heuristics is a way of using attributes common to malware in general, or to a specific family or type of malware, to detect other variants within the same threat family.

The bad guys have taken evasion so far that these days, if you download multiple copies of the same threat from a malicious source within a short time span, you will often find that each copy is different from the others in some way. There are many ways to accomplish this, such as making small arbitrary changes to the malicious file without affecting its function and behavior, or compressing ("packing") the malware using different settings or compression tools. These tools are often called "packers", and some packers are known to be used exclusively or almost exclusively by malware, which is why some security programs deliberately target certain types of packed/compressed files based on that factor alone.

Complex mathematical algorithms, whitelists of known safe/good files and signature-less behavior-based detection are now far more common techniques in the antivirus and anti-malware industry than static hash-based threat detection databases.
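
To make the "every copy is different" point concrete, here is a small Python sketch. It uses `zlib` as a stand-in for a packer, which is an assumption made purely for illustration. Real packers are different tools, and the payload bytes here are invented. The same payload is compressed with two different settings, and a blocklist built from the first copy is checked against the second.

```python
import hashlib
import zlib

# Stand-in bytes for one and the same malicious payload.
payload = b"stand-in for the same malicious payload " * 50

# "Pack" the identical payload with two different compression settings.
packed_fast = zlib.compress(payload, 1)
packed_best = zlib.compress(payload, 9)

hash_fast = hashlib.sha256(packed_fast).hexdigest()
hash_best = hashlib.sha256(packed_best).hexdigest()

# A hash blocklist built after seeing only the first copy.
blocklist = {hash_fast}

# The two copies differ on disk, so the second one is not on the blocklist.
second_copy_blocked = hash_best in blocklist

# Both copies still unpack to exactly the same payload.
same_payload = zlib.decompress(packed_fast) == zlib.decompress(packed_best)

print("second copy blocked:", second_copy_blocked)
print("same payload inside:", same_payload)
```

The behavior of the unpacked code is unchanged, but a hash database that has only seen the first copy has nothing to match the second copy against.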

This also has the added benefit of being more proactive. In order to detect a file based on its hash checksum, the threat researcher would need to have a copy of the malicious file already. More generic, heuristic and behavioral detection methods rely on other factors, which often allows them to detect new threats and new variants of existing threats without having seen each specific copy of a particular threat. This enables what is known in the industry as "0-hour" or "0-day" threat detection, something that Malwarebytes and many other security vendors strive for constantly so that their users/customers are protected at hour 0 on day 0 of a new infection going live in the wild. Nobody has to get infected and report the threat before researchers can capture a sample, which makes the web as a whole much safer.
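
As a toy example of judging a file by a generic attribute instead of by its exact hash, the sketch below computes the Shannon entropy of a file's bytes. Compressed or packed data tends to look close to random. Both the choice of entropy as the attribute and the 7.0 threshold are assumptions made for this illustration, not a description of how Malwarebytes or any other product works. High entropy alone would also flag many harmless compressed files.

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Return the entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return sum(
        (count / total) * math.log2(total / count)
        for count in Counter(data).values()
    )


def looks_packed(data, threshold=7.0):
    """Toy heuristic: flag data whose entropy exceeds a hypothetical threshold."""
    return shannon_entropy(data) > threshold


plain = b"\x00" * 1024           # highly repetitive content
uniform = bytes(range(256)) * 4  # every byte value equally frequent

print(shannon_entropy(plain))    # 0.0
print(shannon_entropy(uniform))  # 8.0
print(looks_packed(plain))       # False
print(looks_packed(uniform))     # True
```

A check like this needs no prior copy of the file, which is the sense in which attribute-based detection can be proactive. In practice such a signal would only be one input among many.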

Edited by exile360