Open Source Antivirus: ClamAV

1. Introduction

With so much malware in circulation, checking downloaded files against virus signatures should be routine. ClamAV can scan downloaded files, emails, PDF and RTF documents, and more. It can be installed on all major operating systems, including Linux, Windows, BSD, Solaris and even Mac OS X.

We can install ClamAV on the Ubuntu Linux distribution with the command below:

# apt-get install clamav clamav-base clamav-daemon clamav-dbg \
    clamav-docs clamav-freshclam clamav-testfiles

2. ClamAV Commands

After the installation, we need to update the virus database. We can do that with the command below:

# freshclam

The freshclam command first checks whether we're running the latest version of ClamAV. We aren't, since the latest version is not yet in the Ubuntu repositories, so it prints the warnings below:

WARNING: Your ClamAV installation is OUTDATED!
WARNING: Local version: 0.97.5 Recommended version: 0.97.6

Next, freshclam downloads the files main.cvd, daily.cvd and bytecode.cvd. The main.cvd file contains the main signatures, daily.cvd contains the additional daily updates, and bytecode.cvd contains more advanced signatures. We can also download those files manually with wget:

# wget

# wget

# wget

The CVD is a ClamAV Virus Database archive that contains various files. We can unpack the CVD file with the sigtool command:

# sigtool --unpack main.cvd
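The container can also be inspected without sigtool. The Python sketch below assumes the usual CVD layout as I understand it: a 512-byte, space-padded ASCII header of colon-separated fields (magic string, build time, version, signature count, functionality level, MD5, digital signature, builder), followed by the compressed archive. The names CvdHeader, parse_cvd_header and read_cvd_header are hypothetical helpers, not part of ClamAV.

```python
from typing import NamedTuple

HEADER_SIZE = 512  # assumed size of the fixed CVD header


class CvdHeader(NamedTuple):
    build_time: str
    version: int
    signatures: int
    functionality_level: int
    md5: str
    builder: str


def parse_cvd_header(raw: bytes) -> CvdHeader:
    """Parse the colon-separated header at the start of a CVD file."""
    if len(raw) < HEADER_SIZE:
        raise ValueError("need at least 512 bytes of header")
    fields = raw[:HEADER_SIZE].decode("ascii", "replace").strip().split(":")
    if len(fields) < 8 or fields[0] != "ClamAV-VDB":
        raise ValueError("not a CVD header")
    # fields[6] is the digital signature, which this sketch does not verify.
    return CvdHeader(
        build_time=fields[1],
        version=int(fields[2]),
        signatures=int(fields[3]),
        functionality_level=int(fields[4]),
        md5=fields[5],
        builder=fields[7],
    )


def read_cvd_header(path: str) -> CvdHeader:
    """Read and parse the header of a CVD file on disk."""
    with open(path, "rb") as handle:
        return parse_cvd_header(handle.read(HEADER_SIZE))
```

Calling read_cvd_header("main.cvd") on a downloaded database should report its version and signature count, assuming the layout described above holds for that file.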

After unpacking main.cvd with sigtool, the following files are present: COPYING, main.db, main.fp, main.hdb, main.mdb, main.ndb and main.zmd. A CVD archive can contain files with the following extensions:

  • cdb: container metadata
  • fp: database of known good files
  • hdb: MD5 hashes of known malicious programs
  • ldb: matching signatures, icon signatures and PE metadata strings
  • mdb: MD5 hashes of PE sections in known malicious programs
  • ndb: hexadecimal signatures
  • rmd: RAR archive metadata signatures
  • zmd: ZIP archive metadata signatures

There are other file extensions, such as cfg, db, ftm, hdu, idb, ign, ign2, info, mdu, ndu, pdb and wdb, but we won't describe them here. To read more about signatures in general and what each file type stores, see the ClamAV signature documentation.
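To make the hdb entry from the list above concrete, here is a minimal Python sketch that builds one hash-based signature line. It assumes the hdb line format is MD5:size:name, which is my understanding of what sigtool produces for hash signatures. The function name hdb_entry and the signature name used in the example are hypothetical.

```python
import hashlib


def hdb_entry(data: bytes, name: str) -> str:
    """Build an hdb-style line (MD5:size:name) for the given file contents."""
    digest = hashlib.md5(data).hexdigest()
    return f"{digest}:{len(data)}:{name}"


def hdb_entry_for_file(path: str, name: str) -> str:
    """Same as hdb_entry, but reads the contents from a file on disk."""
    with open(path, "rb") as handle:
        return hdb_entry(handle.read(), name)
```

For example, hdb_entry(b"", "Empty.Test") produces a line for a zero-length file. A hash signature like this only matches a byte-for-byte identical file, which is why the hexadecimal signatures in the ndb files are needed for anything that varies.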

The command-line program clamscan scans files and directories for viruses. Let's test whether ClamAV detects the standard test virus eicar, which is not really a virus, but a safe way to check that antivirus software is working as it should. By convention, every antivirus program is expected to detect the eicar test virus. The contents of the eicar test virus are presented below:

# cat


We can see it's just some gibberish that doesn't actually do anything.

We can download eicar from: . If we scan the file with ClamAV, it is detected as a virus, as shown below:

# clamscan -i -r Eicar-Test-Signature FOUND

----------- SCAN SUMMARY -----------
Known viruses: 1301403
Engine version: 0.97.5
Scanned directories: 0
Scanned files: 1
Infected files: 1
Data scanned: 0.00 MB
Data read: 0.00 MB (ratio 0.00:1)
Time: 4.530 sec (0 m 4 s)

Now we can search for the string “Eicar-Test-Signature” in all previously extracted files (from .cvd). We can quickly observe that the found signature is located in the main.ndb file:

# grep "Eicar-Test-Signature" main.ndb



We can do the same with the -f option of the sigtool command:

# sigtool -f "Eicar-Test-Signature"



If we convert the hexadecimal representation into ASCII representation, we get the output below:



The ASCII representation is exactly the same as the contents of the test virus file. We could modify the signature to match variations of the string and save the new signatures in a separate .ndb database. When running clamscan afterwards, we need to specify the new database with the -d command-line parameter.
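A minimal Python sketch of that workflow follows. It builds a signature line in the name:target:offset:hex layout described in the next paragraph, writes it to a database file, and assembles the clamscan command that uses it. The helper names, the file name custom.ndb and the signature name are hypothetical, and the default offset "*" (match at any position) is an assumption based on my reading of the ClamAV signature documentation.

```python
from typing import Iterable, List


def make_ndb_signature(name: str, pattern: bytes,
                       target: int = 0, offset: str = "*") -> str:
    """Build one ndb line: name, target type, offset and hex-encoded pattern."""
    return f"{name}:{target}:{offset}:{pattern.hex()}"


def write_database(path: str, signatures: Iterable[str]) -> None:
    """Write the signature lines to a new database file, one per line."""
    with open(path, "w", encoding="ascii") as handle:
        for line in signatures:
            handle.write(line + "\n")


def clamscan_command(database: str, target: str) -> List[str]:
    """Assemble a clamscan call that loads only the given database (-d)."""
    return ["clamscan", "-d", database, "-i", "-r", target]
```

A possible use is write_database("custom.ndb", [make_ndb_signature("Custom.Test-Variant", b"STANDARD-ANTIVIRUS-TEST")]), followed by running the list returned by clamscan_command("custom.ndb", ".") with subprocess.run.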

The signature format above is constructed from four fields separated by a colon ':'. First there is a signature name which can be any unique name. What follows is the target parameter, which specifies the type of the file to match. Afterward there's an offset representing a specific position in the file and a hexadecimal signature.

Looking at our test signature, the signature name is “Eicar-Test-Signature”, the target type is 0 (which means any file type), the offset is 0, and the hexadecimal signature is: 58354f2150254041505b345c505a58353428505e2937434329377d2445494341522d5354414e444152442d414e544956495255532d544553542d46494c452124482b482a.
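The field breakdown and the hexadecimal-to-ASCII conversion can be reproduced with a short Python sketch. The signature line is assembled from the values quoted above. The helper parse_ndb_line is a hypothetical name, and it simply ignores any optional fields that may follow the hex signature.

```python
EICAR_LINE = (
    "Eicar-Test-Signature:0:0:"
    "58354f2150254041505b345c505a58353428505e2937434329377d2445494341"
    "522d5354414e444152442d414e544956495255532d544553542d46494c452124"
    "482b482a"
)


def parse_ndb_line(line: str) -> dict:
    """Split an ndb line into its four leading colon-separated fields."""
    fields = line.strip().split(":")
    if len(fields) < 4:
        raise ValueError("expected name:target:offset:hexsignature")
    name, target, offset, hexsig = fields[:4]
    return {"name": name, "target": int(target),
            "offset": offset, "hexsig": hexsig}


signature = parse_ndb_line(EICAR_LINE)

# A signature without wildcards is plain hex, so it decodes directly.
decoded = bytes.fromhex(signature["hexsig"]).decode("ascii")
print(signature["name"], signature["target"], signature["offset"])
print(len(decoded), "bytes:", decoded)
```

The direct decode only works because this particular signature contains no wildcards. A signature that uses wildcards is no longer valid hex and needs the extra handling described next.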

The hexadecimal signature can also contain wildcards, which work much like regular expressions when matching a signature against file contents. ClamAV supports the following wildcards in hexadecimal signatures [1]:

  • ??             : match any byte
  • a?             : match the high four bits
  • ?a             : match the low four bits
  • *              : match any number of bytes
  • {n}            : match n bytes
  • {-n}           : match n or fewer bytes
  • {n-}           : match n or more bytes
  • {n-m}          : match between n and m bytes
  • (aa|bb)        : match aa or bb
  • !(aa|bb)       : match any byte except aa and bb
  • HEXSIG[x-y]aa  : match aa anchored to a hex-signature
  • (B)            : match word boundary
  • (L)            : match CR, CRLF or file boundaries
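To get a feel for how these wildcards behave, the Python sketch below translates a subset of them into a bytes regular expression. It is only an approximation for experimenting and not how ClamAV matches internally. The function hexsig_to_regex is a hypothetical helper that handles ??, a?, ?a, *, the {...} ranges and (aa|bb), and raises an error on the negation, anchoring and boundary forms or on malformed input.

```python
import re


def _esc(value: int) -> bytes:
    """Render one byte value as a regex escape such as \\x41."""
    return b"\\x%02x" % value


def hexsig_to_regex(sig: str) -> "re.Pattern":
    """Translate a subset of the hex-signature wildcards into a bytes regex."""
    parts = []
    i = 0
    while i < len(sig):
        ch = sig[i]
        if ch == "*":                       # any number of bytes
            parts.append(b".*")
            i += 1
        elif ch == "{":                     # {n}, {-n}, {n-}, {n-m}
            end = sig.index("}", i)
            low, dash, high = sig[i + 1:end].partition("-")
            if dash:
                quant = "{%s,%s}" % (low or "0", high)
            else:
                quant = "{%s}" % low
            parts.append(b"." + quant.encode("ascii"))
            i = end + 1
        elif ch in "(|)":                   # (aa|bb) alternation
            parts.append({"(": b"(?:", "|": b"|", ")": b")"}[ch])
            i += 1
        else:                               # one byte written as two characters
            hi_nib, lo_nib = sig[i], sig[i + 1]
            if hi_nib == "?" and lo_nib == "?":
                parts.append(b".")
            elif lo_nib == "?":             # a?: high four bits fixed
                base = int(hi_nib, 16) << 4
                parts.append(b"[" + _esc(base) + b"-" + _esc(base | 0x0F) + b"]")
            elif hi_nib == "?":             # ?a: low four bits fixed
                low_bits = int(lo_nib, 16)
                members = b"".join(_esc((h << 4) | low_bits) for h in range(16))
                parts.append(b"[" + members + b"]")
            else:
                parts.append(_esc(int(hi_nib + lo_nib, 16)))
            i += 2
    # DOTALL lets "." match every byte value, including newline bytes.
    return re.compile(b"".join(parts), re.DOTALL)
```

For instance, hexsig_to_regex("4549??4152") matches the bytes "EI", any single byte, then "AR", so calling its search method on data containing "EICAR" finds a match.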

3. ClamAV Daemon

We can also run ClamAV as a daemon. To start a ClamAV daemon, we need to execute the command below:

# /etc/init.d/clamd start

* Starting clamd ... [ ok ]
* Starting freshclam ... [ ok ]

When ClamAV runs as a daemon, the program is already loaded in memory, which makes scanning quicker. To scan a computer for infected files and folders, we can use the clamdscan command instead of clamscan, with the same parameters. The daemon also makes it easy for other programs, such as email clients, to connect to ClamAV.
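As an illustration of how another program might talk to the daemon, here is a minimal Python sketch of the INSTREAM command described in the clamd manual page: the client sends the command, then the data as chunks that are each prefixed with a 4-byte big-endian length, and finishes with a zero-length chunk. The socket path is an assumption (it depends on the LocalSocket setting in clamd.conf), and frame_instream and scan_bytes are hypothetical helper names.

```python
import socket
import struct


def frame_instream(data: bytes, chunk_size: int = 4096) -> bytes:
    """Encode data as a clamd INSTREAM request (length-prefixed chunks)."""
    out = [b"zINSTREAM\0"]
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        out.append(struct.pack("!I", len(chunk)) + chunk)
    out.append(struct.pack("!I", 0))        # zero-length chunk ends the stream
    return b"".join(out)


def scan_bytes(data: bytes,
               socket_path: str = "/var/run/clamav/clamd.ctl") -> str:
    """Send data to a local clamd over its Unix socket and return the reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)           # path is an assumption, see above
        sock.sendall(frame_instream(data))
        reply = b""
        while not reply.endswith(b"\0"):
            part = sock.recv(4096)
            if not part:
                break
            reply += part
    return reply.rstrip(b"\0").decode("ascii", "replace")
```

With a running daemon, scan_bytes should return a reply ending in OK for clean data, or one naming the signature and ending in FOUND for infected data. A mail filter could use this to check attachments without writing them to disk.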

Another important parameter we can pass to the clamscan or clamdscan commands is the --remove switch, which removes all infected files. I would not advise using that switch, because of false positives. If ClamAV mistakenly identifies a harmless file as infected, it deletes the file without making a backup. Worse, if that file contained important documents that we had not backed up elsewhere, they are lost. It is much better to configure ClamAV to notify us when an infected file is detected, so that we can check whether the file is indeed malicious and delete it ourselves.
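One way to get a report instead of automatic deletion is to wrap clamscan in a small script. The Python sketch below assumes that clamscan prints one line per detection in the form "path: SignatureName FOUND" and exits with status 0 when nothing is found, 1 when something is found, and another value on errors. The helpers parse_found and scan_and_report are hypothetical names.

```python
import subprocess
from typing import List, Tuple


def parse_found(output: str) -> List[Tuple[str, str]]:
    """Extract (path, signature) pairs from clamscan output lines."""
    hits = []
    for line in output.splitlines():
        line = line.rstrip()
        if line.endswith(" FOUND"):
            body = line[:-len(" FOUND")]
            path, _, signature = body.rpartition(": ")
            hits.append((path, signature))
    return hits


def scan_and_report(target: str) -> List[Tuple[str, str]]:
    """Run clamscan recursively and return detections without deleting."""
    result = subprocess.run(
        ["clamscan", "-i", "-r", "--no-summary", target],
        capture_output=True, text=True,
    )
    if result.returncode not in (0, 1):
        raise RuntimeError("clamscan failed: " + result.stderr.strip())
    return parse_found(result.stdout)
```

The returned list can then be mailed or logged, leaving the decision to delete each file to a person who has checked it.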

If we scan the file again with clamdscan, the scan finishes almost instantly, since all the antivirus definitions are already loaded into memory. The output is shown below. Notice that the scan took 0 seconds, whereas the clamscan command took about 4.5 seconds.

# clamdscan -i Eicar-Test-Signature FOUND

----------- SCAN SUMMARY -----------
Infected files: 1
Time: 0.000 sec (0 m 0 s)

4. Conclusion

We can conclude that scanning files with ClamAV is really helpful in detecting malicious programs. The great advantage of ClamAV is that it's open source, which means that we can extract the signature databases and look at the exact signature used to flag a certain file as malicious.


[1]: Creating signatures for ClamAV, accessible on