Dmitry Gryaznov
Proceedings of the Fifth International Virus Bulletin Conference, pp.225-234
1999
- 1. Why scanners
- 2. Why heuristics
Introduction
At the beginning of 1994 the number of known MS-DOS viruses was estimated at around 3,000. One year later, in January 1995, the number of viruses was estimated at about 6,000. At the time of writing (July 1995), the number of known viruses has exceeded 7,000. Several anti-virus experts expect the number of viruses to reach 10,000 by the end of 1995. This large and fast-growing number of viruses is known as the glut problem, and it does cause problems for anti-virus software, especially for scanners. Today scanners are the most commonly used kind of anti-virus software. The fast-growing number of viruses means that scanners must be updated frequently enough to cover new viruses. Also, as the number of viruses grows, so does the size of a scanner or its database. And in some implementations the scanning speed suffers.
It has always been very tempting to find an ultimate solution to the problem: to create a generic scanner which can detect new viruses automatically, without the need to update its code and/or database. Unfortunately, as was proved by Fred Cohen, the problem of distinguishing a virus from a non-virus program is algorithmically unsolvable in the general case.
Nevertheless, some generic detection is still possible. It is based on analysing a program for features typical or not typical for viruses. The set of features, possibly together with a set of rules, is known as heuristics. Today more and more anti-virus software developers are looking towards heuristic analysis as at least a partial solution to the glut problem.
Working at the Virus Lab of S&S International PLC, the author is also carrying out a research project on heuristic analysis. This article explains what heuristics are. Positive and negative heuristics are introduced. Some practical heuristics are presented. Different approaches to heuristic program analysis are discussed. The false alarms problem is pointed out and discussed. Several well-known scanners employing heuristics are compared (without naming the scanners) in both virus detection rate and false alarms rate.
1. Why scanners
If you follow computer virus related publications, such as proceedings of anti-virus conferences, magazine reviews and anti-virus software manufacturers' press releases, you read and hear mainly 'scanners, scanners, scanners'. An average user might even get the impression that there is no anti-virus software other than scanners. This is not true. There are other methods of fighting computer viruses, but they are not as popular or as well known as scanners, and anti-virus packages based on non-scanner technology do not sell well. As a result, people who are trying to promote non-scanner based anti-virus software sometimes even come to the conclusion that there must be some kind of international plot by the producers of popular anti-virus scanners. Why is this so? Let us briefly discuss the existing types of anti-virus software. Those interested in a more detailed discussion and comparison of different types of anti-virus software can find it in [Bontchev1], for example.
1.1. Scanners
So, what is a scanner? Simplifying, a scanner is a program which searches files and disk sectors for byte sequences specific to this or that known virus. Those byte sequences are often called virus signatures. There are many different ways to implement a scanning technique: from so called dumb or grunt scanning of the whole file to sophisticated virus-specific methods of deciding which particular part of the file should be compared to a virus signature.
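As a rough illustration of the simplest ('dumb') form of scanning described above, here is a minimal Python sketch. The signature names and byte strings are invented for the example and do not correspond to any real virus, and the helper names `scan_bytes` and `scan_file` are hypothetical.

```python
from pathlib import Path

# Hypothetical signature database: virus name -> byte sequence.
# These bytes are made up for illustration and match no real virus.
SIGNATURES = {
    "Example.A": bytes.fromhex("deadbeef0102"),
    "Example.B": bytes.fromhex("cafebabe0304"),
}


def scan_bytes(data: bytes) -> list[str]:
    """Return the names of all signatures found anywhere in data."""
    return [name for name, sig in SIGNATURES.items() if sig in data]


def scan_file(path: Path) -> list[str]:
    """Read a whole file into memory and scan it from start to end."""
    return scan_bytes(path.read_bytes())
```

A real scanner would normally restrict the search to the parts of the file where a given virus can reside. Searching the whole file, as here, is the slowest approach and the one most prone to accidental matches.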
Nevertheless, one thing is common to all scanners: they can detect only known viruses. That is, viruses which have been disassembled or otherwise analysed and from which signatures unique to a specific virus have been selected. In most cases a scanner cannot detect a brand new virus until the virus is passed to the scanner developer, who then extracts an appropriate virus signature and updates the scanner. This all takes time. And new viruses appear virtually every day. This means that scanners have to be updated frequently to provide adequate anti-virus protection. A version of a scanner which was very good half a year ago might be no good today if you get hit by just one of the several thousand new viruses which have appeared since that version was released.
So, are there any other ways to detect viruses? Are there any other anti-virus programs which do not depend so heavily on particular virus signatures and thus might be able to detect even new viruses? The answer is yes, there are: integrity checkers and behaviour blockers (monitors). These types of anti-virus software are almost as old as scanners and have been known to specialists for ages. Why, then, are they not used as widely as scanners?
1.2. Behaviour blockers
A behaviour blocker (or a monitor) is a memory-resident (TSR) program which monitors system activity and looks for virus-like behaviour. In order to replicate, a virus needs to create a copy of itself at some point. Most often viruses modify existing executable files to achieve this. So, in most cases behaviour blockers try to intercept system requests which lead to modifying executable files. When such a request, or another suspicious one, is intercepted, a behaviour blocker typically alerts the user and, based on the user's decision, can prohibit the request from being executed. This way a behaviour blocker does not depend on detailed analysis of a particular virus. Unlike a scanner, a behaviour blocker does not need to know what a new virus looks like to catch it.
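The decision logic of such a blocker can be sketched in a few lines of Python. This is only a model of the idea: a real DOS blocker would hook system calls as a TSR, which is not shown here. The function name `allow_write`, the extension list and the `ask_user` callback are assumptions made for the example.

```python
from typing import Callable

# Assumption: which file extensions count as "executable" for the blocker.
EXECUTABLE_EXTENSIONS = (".com", ".exe")


def allow_write(path: str, ask_user: Callable[[str], bool]) -> bool:
    """Decide whether an intercepted write request may proceed.

    Writes to non-executable files pass silently. Writes to executable
    files are treated as suspicious, and the decision is left to the user.
    """
    if not path.lower().endswith(EXECUTABLE_EXTENSIONS):
        return True
    return ask_user(f"A program is trying to modify {path}. Allow?")
```

Note that the final verdict comes from `ask_user`, so the blocker is only as reliable as the person answering the prompt.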
Unfortunately, it is not that easy to block all virus activity. Some viruses use quite effective and sophisticated techniques, such as tunnelling, to bypass any behaviour blockers which may be present. Even worse, some legitimate programs use virus-like methods which could trigger a behaviour blocker. For example, an install or setup utility often modifies executable files. So, when a behaviour blocker is triggered by such a utility, it's up to the user to decide whether it is a virus or not. And this is often a tough choice - you would not assume all the users are anti-virus experts, would you?
But even an ideal behaviour blocker (there is no such thing in our real world, mind you!), which never triggers on a legitimate program and never misses a real virus, still has a major flaw. To enable a behaviour blocker to detect a virus, the virus must be run on a computer. Quite apart from the fact that virtually any user would reject the very idea of running a virus on his/her computer, by the time a behaviour blocker catches the virus attempting to modify executable files, the virus could already have triggered and destroyed some of your valuable data files, for example.
1.3. Integrity checkers
An integrity checker is a program which should be run periodically (say, once a day) to detect all the changes made to your files and disks. This means, when an integrity checker is first installed to your system, you need to run it to create a database of all files on your system. Then during subsequent runs the integrity checker compares files on your system to the data stored in the database and detects any changes made to the files. Since all the viruses modify either files or system areas of disks in order to replicate, a good integrity checker should be able to spot such changes and to alert the user. Unlike a behaviour blocker, it is much more difficult for a virus to bypass an integrity checker, provided you run your integrity checker in a virus clean environment - e.g. having booted your PC from a known virus free system diskette.
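The two phases described above, building a database and then comparing against it, could be sketched in Python as follows. The function names are hypothetical, and SHA-256 is used here purely as a convenient modern checksum. The text does not say which checksum integrity checkers of the time used.

```python
import hashlib
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map each file under root (by relative path) to the SHA-256 of its contents."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare(baseline: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Report files modified, added or removed since the baseline was taken."""
    return {
        "modified": sorted(f for f in baseline if f in current and baseline[f] != current[f]),
        "added": sorted(f for f in current if f not in baseline),
        "removed": sorted(f for f in baseline if f not in current),
    }
```

The first run would store the result of `snapshot` as the database, and later runs would pass a fresh snapshot to `compare`. Deciding whether a reported change is legitimate is still left to the user.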
But again, as in the case of behaviour blockers, there are many possible situations when the user's expertise is necessary to decide whether changes detected are results of a virus activity. Again, if you run an install or setup utility, this normally results in some modifications made to your files which can trigger an integrity checker. That is, every time you are installing some new software to your system, you have to tell your integrity checker to register these new files in its database.
Also, there is a special type of viruses, aimed at integrity checkers specifically - so called slow infectors. A slow infector only infects objects which are about to be modified anyway - e.g. as a new file being created by a compiler. Then an integrity checker will add this new file to its database to watch its further changes. But in the case of a slow infector the file added to the database is infected already!
But even if integrity checkers were free of the above drawbacks, there would still be a major flaw: an integrity checker can alert you only after a virus has run and modified your files. As in the example given while discussing behaviour blockers, this might well be too late.
1.4. That's why scanners!
So, the main drawbacks of both behaviour blockers and integrity checkers, which prevent them from being widely used by an average user, are:
- Both behaviour blockers and integrity checkers, by their very nature, can detect a virus only after you have run an infected program on your computer and the virus started its replication routine. By this time it might be too late - many viruses can trigger and switch to destructive mode before they make any attempts to replicate. It's somewhat like you decide to find out whether these beautiful yet unknown berries are poisonous by eating them and watching the results. Gosh! You would be lucky to get away with just a dyspepsia!
- Often enough the burden to decide whether it is a virus or not is transferred to the user. It's like your doctor leaves you to decide whether your dyspepsia is simply because the berries were not ripe enough or it is the first sign of a deadly poisoning and you'll be dead in few hours if you don't take an antidote immediately. Tough choice!
On the contrary, a scanner can and should be used to detect viruses before an infected program gets a chance to be executed. That is, by scanning the incoming software prior to installing it to your system, a scanner tells you whether it is safe to proceed with the installation. Continuing our berries analogy, it's like having a portable automated poisonous plants detector, which quickly checks the berries against its database of known plants and tells you whether it is safe to eat the berries.
But what if the berries are not in the database of your portable detector? What if it is a brand new species? What if a software package you are about to install is infected with a new, very dangerous virus unknown to your scanner? Relying on your scanner alone, you might find yourself in big trouble. This is where behaviour blockers and integrity checkers might prove helpful. It is still better to detect the virus while it is trying to infect your system, or even after it has infected it but before it destroys your valuable data. So, the best anti-virus strategy would include all three types of anti-virus software:
- a scanner to ensure the new software is free of at least known viruses before you run the software;
- a behaviour blocker to catch the virus while it is trying to infect your system;
- an integrity checker to detect infected files after the virus propagated to your system but has not triggered yet.
As you can see, the scanners are the first and the most simply implemented line of the anti-virus defence. Moreover, most people have scanners as the only line of the defence.
2. Why heuristics
2.1. Glut problem.
As was mentioned above, the main drawback of scanners is that they can detect only known computer viruses. Six or seven years ago this was not a big deal. New viruses appeared rarely. Anti-virus researchers were literally hunting for new viruses, spending weeks and months tracking down rumours and random reports about a new virus in order to include its detection in their scanners. It was probably in those times that the nastiest computer virus related myth was born: that anti-virus people develop viruses themselves to force users to buy their products and to make a profit this way. Some people believe this myth even today. Whenever I hear it, I can't help hysterical laughter. These days, with two to three hundred new viruses arriving monthly, it would be a total waste of time and money for anti-virus manufacturers to develop viruses. Why should they bother if new viruses arrive by the dozen, virtually daily and completely free of charge?
There were about 3,000 known DOS viruses at the beginning of 1994. A year later, in January 1995, the number of viruses was estimated at at least 5,000. Another six months later, in July 1995, the number exceeded 7,000. Many anti-virus experts expect the number of known DOS viruses to reach the 10,000 mark by the end of 1995. With this tremendous and still fast-growing number of viruses to fight, traditional virus signature scanning software is pushed to its limits [Skulason, Bontchev2]. While several years ago a scanner was quite often developed, updated and supported by a single person, today a team of a dozen skilled employees is barely enough. As the number of viruses increases, so do the time and resources required for R&D and Quality Control. Even monthly scanner updates are often late, by at least a month. Many formerly successful anti-virus vendors are giving up and leaving the anti-virus battleground and market.
The fast-growing number of viruses heavily affects the scanners themselves. They become bigger and sometimes slower. Just a few years ago a 360Kb floppy diskette would be enough to hold half a dozen popular scanners, still leaving plenty of room for system files to make the diskette bootable. Today an average good signature-based scanner alone would occupy at least a 720Kb floppy, leaving virtually no room for anything else.
So, are we losing the war? I would say, not yet. But if we get stuck with just virus signature scanning, we shall lose it sooner or later. Having realised this some time ago, anti-virus researchers started to look for more generic scanning techniques, known as heuristics.
2.1. What are heuristics?
In the anti-virus area, heuristics are a set of rules which are applied to a program to decide whether the program is likely to contain a virus or not. From the very beginning of the history of computer viruses different people have been looking for an ultimate generic solution to the problem. Really, how does an anti-virus expert know the program is a virus? This usually involves some kind of reverse engineering - most often disassembly - and reconstructing and understanding the virus's algorithm: what it does and how it does it. Having analysed hundreds and hundreds of computer viruses, an experienced anti-virus researcher needs just a few seconds to recognise a virus, even if the virus is a new one and has never been seen before. It is a kind of almost subconscious automated process. Automated? Wait a minute! If it is an automated process, let's make a program to do this!
Unfortunately (or rather - fortunately!) the analytic capabilities of the human brain are far beyond those of a computer. As was proved by Fred Cohen [Cohen], it is impossible to construct an algorithm (e.g. a program) to distinguish a virus from a non-virus with 100 per cent reliability. Fortunately, this does not rule out the possibility of 90 or even 99 per cent reliability. And the remaining one per cent of cases we shall hopefully be able to deal with using our traditional virus signature scanning technique. Anyway, it's worth trying at least.
2.2. Simple heuristics
So, how do they do it? How does an anti-virus expert recognise a virus? Let us consider the simplest case: a parasitic non-resident appending .COM file infector. Something like Vienna but even more primitive. Such a virus appends its code to the end of an infected program, stores the first few (usually just three) bytes of the victim file in the virus body and replaces those bytes with code which passes control to the virus. When the infected program is executed, the virus gets control. First it restores the victim's original bytes in its memory image. It then starts looking for other .COM files. When one is found, the file is opened in Read_and_Write mode, and the virus reads the first few bytes of the file and writes itself to the end of the file. So, the primitive set of heuristic rules for a virus of this kind would be:
- The program immediately passes control close to the end of itself;
- it modifies some bytes at the beginning of its copy in memory;
- then it starts looking for executable files on a disk;
- when found, a file is opened;
- some data are read from the file;
- some data are written to the end of the file.
Each of the above rules has a corresponding sequence in binary machine code or assembler language. In general, if you look at such a virus under the DEBUG program, the favourite tool of anti-virus researchers, this is usually represented by code similar to this:
Figure 1. A sample virus code.
When an anti-virus expert sees such code, it is immediately obvious to him/her that this must be a virus. So, our heuristic program should be able to disassemble binary machine-language code in a way similar to DEBUG and to analyse it, looking for particular code patterns, in a way similar to an anti-virus expert. In the simplest cases, such as the one above, a set of simple wildcard signature strings would do for the analysis. In this case the analysis itself is simply checking whether the program in question satisfies rules 1 through 6, in other words, whether the program contains pieces of code corresponding to each of the rules.
In the more general case, there are many very different ways to represent one and the same algorithm in machine code. Polymorphic viruses, for example, do this all the time. So, a heuristic scanner must use much cleverer methods than a simple pattern-matching technique. Those methods may involve some statistical code analysis, partial code interpretation and even CPU emulation, especially to decrypt self-encrypted viruses. But you would be surprised to know how many real-life viruses would be detected by the above six simple heuristics alone! Unfortunately, some non-virus programs would be 'detected' too.
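To make the idea of wildcard matching against the six rules concrete, here is a minimal Python sketch. Everything specific in it is an assumption chosen for readability rather than taken from the paper or from any real scanner: the byte patterns (common DOS INT 21h calling idioms), the 8-byte wildcard window, the 1024-byte 'close to the end' margin and all function names. The pattern for the last rule only checks that a write call is present, not that it targets the end of the file.

```python
import re

# Up to 8 arbitrary bytes, followed by INT 21h (the DOS service call).
DOS_CALL = rb".{0,8}?\xCD\x21"

# Illustrative wildcard patterns for rules 2 to 6 (assumptions, see lead-in).
PATTERNS = {
    "modifies start of memory image": re.compile(rb"\xBF\x00\x01", re.DOTALL),  # MOV DI,0100h
    "searches for files": re.compile(rb"\xB4\x4E" + DOS_CALL, re.DOTALL),       # AH=4Eh find first
    "opens a file": re.compile(rb"\xB8.\x3D" + DOS_CALL, re.DOTALL),            # AH=3Dh open
    "reads from a file": re.compile(rb"\xB4\x3F" + DOS_CALL, re.DOTALL),        # AH=3Fh read
    "writes to a file": re.compile(rb"\xB4\x40" + DOS_CALL, re.DOTALL),         # AH=40h write
}


def jumps_near_end(image: bytes, tail: int = 1024) -> bool:
    """Rule 1: the file starts with a near JMP landing in its last `tail` bytes."""
    if len(image) < 3 or image[0] != 0xE9:
        return False
    rel = int.from_bytes(image[1:3], "little")
    target = (3 + rel) & 0xFFFF  # file offset of the jump target
    return len(image) - tail <= target < len(image)


def matched_rules(image: bytes) -> list[str]:
    """List the rules for which a corresponding code pattern was found."""
    hits = ["jumps close to its own end"] if jumps_near_end(image) else []
    hits += [name for name, rx in PATTERNS.items() if rx.search(image)]
    return hits


def looks_like_simple_infector(image: bytes) -> bool:
    """True only if all six rules are satisfied."""
    return len(matched_rules(image)) == 6
```

Note that the sketch checks only that each pattern occurs somewhere in the file. It cannot tell whether the matching bytes are ever executed or in what order.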
2.3. False alarms problem
Strictly speaking, heuristics are not detecting viruses. Similar to behaviour blockers, heuristics are looking for virus-like behaviour. Moreover, unlike the behaviour blockers, heuristics are able to detect not the behaviour itself, but just the potential ability to perform this or that action. Indeed, the fact a program contains certain piece of code does not necessarily mean this piece of code is ever executed. And the problem to find out whether this or that code in a program ever gets control is known in theory of algorithms as the Halting Problem and is unsolvable in general case. By the way, this was the basis of Fred Cohen's proof of impossibility to write an absolute virus detector. For example, some scanners contain pieces of virus code as the signatures to scan for. Those pieces might correspond to each and every of the above six rules. But they are never executed - the scanner uses them just as its static data. Since in general case there is no way for heuristics to decide whether these code pieces are ever executed or not, this can (and sometimes does) cause false alarms.
A false alarm occurs when anti-virus software reports a virus in a program which in fact does not contain any virus at all. Different types of false alarms, as well as the most widespread causes of false alarms, are described in [Solomon], for example. A false alarm might be even more costly than an actual virus infection. We all keep saying to users: 'The main thing to remember when you think you've got a virus - do not panic!'. Unfortunately, this does not work well. An average user does panic. And the user panics even more if the anti-virus software is itself unsure whether it is a virus or not. When, say, a scanner definitely detects a virus, the scanner is usually able to detect all infected programs and to remove the virus from them. And at this point the panic is usually over. But if it is a false alarm, the scanner will not be able to remove the virus and most likely will report something like 'This file seems to have a virus', naming just a single file as infected. This is when the user really starts to panic. 'It must be a new virus!' - the user thinks. 'What do I do?!' As a result, the user might well format his/her hard disk, causing a far worse disaster than a virus could cause. An unnecessary and unjustified act, by the way. All the more so as there are many viruses which would survive the formatting, unlike the legitimate software and data stored on the disk.
Another problem a false alarm can cause (and has caused) is a negative impact on a software manufacturing company. If anti-virus software falsely detects a virus in a new software package, users will stop buying the package. And the software developer will suffer not only a loss of profit but also a loss of reputation. Even if it is later made known that it was a false alarm, too many people will think 'There is no smoke without fire' and will treat the software with suspicion. This affects the anti-virus vendor as well. There has already been a case when an anti-virus vendor was sued by a software company in whose product the anti-virus software mistakenly reported a virus.
In a corporate environment when a virus is reported by an antivirus software, whether it is a false alarm or not, the normal flow of operation is interrupted. It takes at best several hours to contact the antivirus technical support and to make sure it was a false alarm before the normal operation is resumed. And, as we all know, time is money. And in the case of a big company, time is big money. So, it is not surprising at all that when asked what level of false alarms is acceptable (10 per cent? 1 per cent? 0.1 per cent?), corporate customers answer: 'Zero per cent! We do not want any false alarms!'.
As was explained before, by its very nature heuristic analysis is more prone to false alarms than traditional scanning methods. Indeed, not only viruses but many scanners as well would satisfy the six rules we used as an example: a scanner does look for executable files, opens them, reads some data and even writes something back when removing a virus from a file. Can anything be done to avoid triggering on a scanner? Let's again turn to the experience of a human anti-virus expert. How does one know that this is a scanner, not a virus? Well, this is more complicated than the above example of a primitive virus. Still, there are some general rules too. For example, if a program relies heavily on its parameters or involves an extensive dialogue with the user, it is highly unlikely that the program is a virus. This way we come to the idea of negative heuristics, that is, a set of rules which are true for a non-virus program. Then, while analysing a program, our heuristics should estimate the probability of the program being a virus using both positive heuristics, such as the above six rules, and negative heuristics, typical of non-virus programs and very rarely used by real viruses. So if a program satisfies all six of our positive rules, but also expects some command line parameters and uses an extensive user dialogue as well, we would not call it a virus.
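One simple way to combine positive and negative heuristics is a weighted score. The following Python sketch shows the idea. The feature names, the weights and the threshold are all hypothetical values picked for the example, and real analysers may combine their rules quite differently.

```python
# Hypothetical weights for virus-like features (the six positive rules).
POSITIVE = {
    "jumps_to_end": 2,
    "patches_own_start": 2,
    "searches_executables": 1,
    "opens_file": 1,
    "reads_file": 1,
    "writes_to_file_end": 2,
}

# Hypothetical weights for features typical of ordinary programs.
NEGATIVE = {
    "parses_command_line": 4,
    "extensive_user_dialogue": 5,
}

THRESHOLD = 6  # assumption: minimum score at which a program is reported


def suspicion_score(features: set[str]) -> int:
    """Sum the weights of positive features and subtract those of negative ones."""
    plus = sum(w for name, w in POSITIVE.items() if name in features)
    minus = sum(w for name, w in NEGATIVE.items() if name in features)
    return plus - minus


def is_suspicious(features: set[str]) -> bool:
    return suspicion_score(features) >= THRESHOLD
```

With these weights, a program showing all six positive features scores 9 and is reported, while a scanner-like program that also parses its command line and holds a dialogue with the user scores 0 and is not.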
So far so good. Looks like we found a solution to the virus glut problem, right? Not really! Unfortunately, not all virus writers are stupid. Some of them are also well aware of heuristic analysis. And some of their viruses are written in a way to avoid the most obvious positive heuristics. On the other hand, these viruses include otherwise useless pieces of code with the only aim to trigger the most obvious negative heuristics, so that such a virus does not draw the attention of a heuristic analyser.
2.4. Virus detection vs. false alarms trade-off.
Every developer of a heuristic scanner sooner or later comes to the point when it is necessary to make a decision: 'Do I detect more viruses or do I cause fewer false alarms?'. The best way to decide would be to ask users what they prefer. Unfortunately, the users' answer is: 'I want it all! 100 per cent detection rate and no false alarms!'. As was mentioned above, this is not achievable. So, the trade-off between virus detection and false alarms is up to the developer to decide. It is very tempting to make your heuristic analyser detect almost all viruses, despite the false alarms. After all, reviewers and evaluators, who publish their test results in magazines which are read by thousands of users world-wide, test just the detection rate. It is much more difficult to run a good false alarms test: there are gigabytes and gigabytes of non-virus software in the world, far more than there are viruses. And it is more difficult to get hold of all this software and to keep it for your tests. 'Not enough disk space' is only one of the problems. So, let's forget false alarms and negative heuristics and call a virus each and every program which happens to satisfy just a few of our positive heuristics. This way we shall score top marks in the reviews. But what about the users? They normally run scanners not on a virus collection but on clean disks. Thus, they won't notice our almost perfect detection rate but are very likely to notice our not so perfect false alarms rate. Tough choice. That's why some developers have at least two modes of operation for their heuristic scanners. The default is the so-called 'normal' or 'low sensitivity' mode, in which both positive and negative heuristics are used and a program needs to trigger enough positive heuristics to be reported as a virus. In this mode a scanner is less prone to false alarms, but its detection rate might be far below what is claimed in its documentation or advertisements.
The figures of 'more than 90 per cent' virus detection rate by a heuristic analyser, often used in advertising, refer to the second mode of operation, which is often called 'high sensitivity' or 'paranoid' mode. It really is a paranoid mode: in this mode negative heuristics are usually discarded and the scanner reports as a possible virus any program which happens to trigger just one or two positive heuristics. In this mode a scanner can indeed detect 90 per cent of viruses, but it also produces hundreds and hundreds of false alarms, making the 'paranoid' mode useless and even harmful for real-life everyday use, but still very helpful when it comes to a comparative virus detection test. Some scanners have a special command line option to switch the paranoid mode on; others switch to it automatically whenever they detect a virus in the normal low sensitivity mode. Although the latter approach seems to be a smart one, it takes just a single false alarm among several tens of thousands of programs on a network file server to produce an avalanche of false virus reports.
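The difference between the two modes, and the avalanche effect of automatic switching, can be illustrated with a small Python sketch. The `Mode` class, the threshold values and the `scan` helper are hypothetical. Each program is reduced to a pair of counts: how many positive and how many negative heuristics it triggers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mode:
    min_positive: int   # how many positive heuristics must fire
    use_negative: bool  # whether negative heuristics can veto a report


# Hypothetical settings for the two modes described in the text.
NORMAL = Mode(min_positive=4, use_negative=True)
PARANOID = Mode(min_positive=1, use_negative=False)


def report(positive_hits: int, negative_hits: int, mode: Mode) -> bool:
    """Decide whether a program is reported as a possible virus in this mode."""
    if mode.use_negative and negative_hits > 0:
        return False
    return positive_hits >= mode.min_positive


def scan(programs: list[tuple[str, int, int]], auto_escalate: bool = False) -> list[str]:
    """Scan (name, positive_hits, negative_hits) entries in order.

    With auto_escalate, the first report switches the scanner to paranoid
    mode for all remaining programs.
    """
    mode = NORMAL
    flagged = []
    for name, pos, neg in programs:
        if report(pos, neg, mode):
            flagged.append(name)
            if auto_escalate:
                mode = PARANOID
    return flagged
```

For example, with the input `[("a", 1, 1), ("b", 4, 0), ("c", 1, 1), ("d", 2, 0)]` only `b` is reported in normal mode, whereas with `auto_escalate=True` the report on `b` switches the mode and `c` and `d` are flagged as well.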
2.5. How it all works in practice - different scanners compared.
Being himself an anti-virus researcher and working for a leading anti-virus manufacturer, the author has developed a heuristic analyser of his own. And of course, the author could not resist comparing it to other existing heuristic scanners. We believe the results will be of interest to other people. And they underscore what was said about both virus detection and false alarms rates. As the products tested are our competitors, we decided not to publish their names in the test results. So, only FindVirus of Dr.Solomon's AntiVirus Toolkit is called by its real name. All the other scanners are just Scanner_A, Scanner_B, Scanner_C and Scanner_D. The latest versions of the scanners available at the time of the test were used. For FindVirus it was version 7.50 - the first one to employ a heuristic analyser.
Each scanner tested was run in heuristics-only mode, with their normal virus signature scanning disabled. This was achieved by either using a special command line option, where available, or by using a special empty virus signatures database in other cases.
The test consisted of two parts: virus detection rate and false alarms rate. For the virus detection rate S&S International PLC ONEOFEACH virus collection was used, which contained more than 7,000 samples of about 6,500 different known DOS viruses. For the false alarms test the shareware and freeware software collection of SIMTEL20 CD-ROM (fully unpacked), all utilities from different versions of MS-DOS, IBM DOS, PC-DOS and other known not infected files were used (current basic S&S false alarms test set). When measuring false alarms and virus detection rate, all files reported were counted - either reported as 'Infected' or 'Suspicious'. Separate figures for the two categories are given where applicable.
In both parts of the test the products were run in two heuristic sensitivity modes, where applicable: normal or low sensitivity mode and paranoid or high sensitivity mode. The automatic heuristic sensitivity adjustment was prohibited, where applicable.
The results of the tests are given below.
Table 1. Virus detection test results.
Table 2. False alarms test results.
3. Why 'of the year 2000'?
Well, first of all simply because the author could not resist the temptation of splitting the name of the paper into three questions and using them as the titles of the main sections of his presentation. The author believed it was funny. Maybe he has a weird sense of humour. Who knows..
On the other hand, the year 2000 is very attractive by itself. Most people consider it a distinctive milestone in all aspects of human civilisation. This usually happens to the years ending with double zero, still more - to the end of a millennium with its triple zero at the end. The anti-virus area is not an exclusion. For example, during the EICAR'94 conference there were two panel sessions discussing 'Viruses of the year 2000' and 'Scanners of the year 2000' respectively. The general conclusion made by a panel of well-known anti-virus researches was that at the current pace of new virus creating by the year 2000 we well might face dozens if not hundreds of thousands of known DOS viruses. As the author tried to explain in the second section of this paper (and other authors explained elsewhere [Skulason, Bontchev2]), this might be far too much for a current standard scanners' technique, based on known virus signatures scanning. More generic anti-virus tools, such as behaviour blockers and integrity checkers, while being less vulnerable to the growing number of viruses and the rate at which the new viruses appear, can detect a virus only when it is already running on a computer or even only after the virus has run and infected other programs. In many cases the risk of allowing a virus to run on your computer is just not affordable. Using a heuristic scanner, on the other hand, allows to detect most of new viruses with in a regular scanner safe manner: before an infected program is copied to your system and executed. And very much like behaviour blockers and integrity checkers, a heuristic scanner is much more generic than a signature scanner, requires much rare updates and provides an instant response to a new virus. Those 15-20 per cent of viruses a heuristic scanner cannot detect could be dealt with using current well-developed signature scanning techniques. This will effectively decrease the virus glut problem fivefold at least.
Yet another reason for choosing the year 2000 and not, say, 2005 is that the author has very strong doubts whether the current computer virus situation will survive the year 2000 by more than a couple of years. With new operating systems and environments appearing (Windows NT, Windows'95, etc.), the author believes DOS is doomed. So are DOS viruses. So is the modern anti-virus industry. This does not mean viruses are not possible for the new operating systems and platforms. They are possible in virtually any operating environment. We are aware of viruses for Windows, OS/2, Mac OS and even UNIX. But creating viruses for these operating systems, as well as for Windows NT and Windows'95, requires much more skill, knowledge, effort and time than for the virus-friendly DOS. Moreover, it will be much more difficult for a virus to replicate under these operating systems. They are far more secure than DOS, if it is possible to talk about DOS security at all. Thus, there will be far fewer virus writers and they will be capable of writing far fewer viruses. The viruses will not propagate fast and far enough to represent a major problem. Consequently, there will be no virus glut problem. Regrettably, there will be no such vast anti-virus market either, and most of today's anti-virus experts will have to find another occupation.
But until then, DOS lives and anti-virus developers still have a lot of work to be done!
References
- [Bontchev1] Vesselin Bontchev, 'Possible Virus Attacks Against Integrity Programs And How To Prevent Them', Proc. 2nd Int. Virus Bulletin Conf., September 1992, pp. 131-141.
- [Skulason] Fridrik Skulason, 'The Virus Glut. The Impact of the Virus Flood', Proc. 4th EICAR Conf., November 1994, pp. 143-147.
- [Bontchev2] Vesselin Bontchev, 'Future Trends in Virus Writing', Proc. 4th Int. Virus Bulletin Conf., September 1994, pp. 65-81.
- [Cohen] Fred Cohen, 'Computer Viruses - Theory and Experiments', Computer Security: A Global Challenge, Elsevier Science Publishers B. V. (North-Holland), 1984, pp. 143-158.
- [Solomon] Alan Solomon, 'False Alarms', Virus News International, February 1993, pp. 50-52.
Vesselin Bontchev
Proc. 6th Int. Virus Bull. Conf., 1996, pp. 97-127.
1996
Research Associate
Virus Test Center
University of Hamburg
Vogt-Koelln-Str. 30, 22527 Hamburg, Germany
- Introduction.
- Possible virus attacks against the integrity checking programs.
- Companion viruses.
With the advent of the polymorphic viruses it is becoming obvious that the virus-specific scanners have exhausted themselves. Currently one of the most powerful methods to detect viruses is the so-called integrity programs. They will certainly be used more frequently in the future. Yet, they are also not a universal anti-virus protection tool. In the current paper we will try to show the different ways in which viruses could attack integrity checking programs. When appropriate, we will also demonstrate what can be done against these attacks.
1. Introduction.
There are three main kinds of anti-virus programs [McAfee]. Essentially these are scanners, monitors and integrity checkers.
1.1. Scanners.
Scanners are programs that scan the executable objects (files and boot sectors) for the presence of code sequences that are present in the known viruses. Currently, these are the most popular and the most widely used kind of anti-virus programs. There are some variations of the scanning technique, like virus removal programs (programs that can 'repair' the infected objects by removing the virus from them), resident scanners (programs that are constantly active in memory and scan every file before it is executed), virus identifiers (programs that can recognize the particular virus variant exactly by keeping some kind of map of the non-modifiable parts of the virus body and their checksums), heuristic analyzers (programs that scan for particular sequences of instructions that perform some virus-like functions), and so on.
The reason that this kind of anti-virus program is so widely used nowadays is that it is relatively easy to maintain. This is especially true for the programs which just report the infection by a known virus variant, without attempting exact identification or removal. They consist mainly of a searching engine and a database of code sequences (often called virus signatures or scan strings) that are present in the known viruses. When a new virus appears, the author of the scanner just needs to pick a good signature (one which is present in each copy of the virus and at the same time is unlikely to be found in any legitimate program) and to add it to the scanner's database. Often this can be done very quickly and without a detailed disassembly and understanding of the particular virus.
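The structure described above (a searching engine plus a database of scan strings) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the signature names and byte sequences are invented, and a real scanner would use a much faster search than a plain substring test.

```python
# Hypothetical signature database: virus name -> scan string.
# The names and the bytes are invented for illustration only.
SIGNATURES = {
    "Example.A": bytes.fromhex("b8004ccd21e80000"),
    "Example.B": bytes.fromhex("deadbeef9090"),
}


def scan_bytes(data, signatures=SIGNATURES):
    """Return the sorted names of all signatures that occur in data."""
    return sorted(name for name, sig in signatures.items() if sig in data)


def scan_file(path, signatures=SIGNATURES):
    """Read a file in binary mode and scan its whole contents."""
    with open(path, "rb") as handle:
        return scan_bytes(handle.read(), signatures)
```

Covering a new virus in this model means adding one more entry to the dictionary, which is why such scanners are easy to maintain and also why they must be updated constantly.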
Furthermore, scanning of any new software is the only way to detect viruses before they have the chance to get executed. Having in mind that in most operating systems for personal computers the program being executed has the full rights to access and/or modify any memory location (including the operating system itself), it is preferable that the infected programs do not get any chance to be executed.
Finally, even if the computer is protected by another (not virus-specific) defense, a scanner will still be needed. The reason is that when the non virus-specific defense detects a virus-like behavior, the user usually wants to identify the particular virus which is attacking the system - for instance, to figure out the possible side-effects or intentional damage, or at least to identify all infected objects.
Unfortunately, the scanners have several very serious drawbacks.
The main one is that they must be constantly kept up-to-date. Since they can detect only the known viruses, any new virus presents a danger, because it can bypass a scanner-only based protection. In fact, an old scanner is worse than no protection at all - since it provides a false sense of security.
Simultaneously, it is very difficult to keep a scanner up-to-date. In order to produce an update, which can detect a particular new virus, the author of the scanner must obtain a sample of the virus, disassemble it, understand it, pick a good scan string that is characteristic for this virus and is unlikely to cause a false positive alert, incorporate this string in the scanner, and ship the update to the users. This can take quite a lot of time. And new viruses are created every day - with a current rate of up to 100 per month. Very few anti-virus producers are able to keep up-to-date with such a production rate. One can even argue that the scanners are somehow responsible for the existence of so many virus variants. Indeed, since it is so easy to modify a virus in order to avoid a particular scanner, lots of 'wannabe' virus writers are doing it.
However, the fact that scanners are obsolete as a single line of defense against computer viruses became obvious only with the appearance of the polymorphic viruses. These are viruses which use a variable encryption scheme to encode their body and which even modify the small decryption routine, so that the virus looks different in each infected file. It is impossible to pick a simple sequence of bytes that will be present in all infected files and use it as a scan string. Such a sequence simply does not exist. Some polymorphic viruses can be detected using a wildcard scan string, but more and more viruses appear today which cannot be detected even if the scan string is allowed to contain wildcard bytes.
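As a minimal sketch of what a wildcard scan string means, the following Python matches a pattern in which some byte positions are left open. The textual notation (`??` for a wildcard byte) and the function names are assumptions made for this example; they are not taken from any particular scanner.

```python
def parse_scan_string(text):
    """Parse e.g. 'B8 ?? 4C CD 21' into a list of ints, None for wildcards."""
    return [None if tok == "??" else int(tok, 16) for tok in text.split()]


def find_wildcard(data, pattern):
    """Return the first offset where pattern matches data, or -1."""
    last_start = len(data) - len(pattern)
    for start in range(last_start + 1):
        # A wildcard position (None) matches any byte value.
        if all(p is None or data[start + i] == p
               for i, p in enumerate(pattern)):
            return start
    return -1
```

A pattern of this kind tolerates a few varying bytes inside an otherwise constant decryptor. It is of no help when the decryptor shares no fixed bytes at fixed relative positions between infected files, which is the case the paragraph above describes.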
The only possible way to detect such viruses is to understand their mutation engine in detail. Then one has to construct an algorithmic 'scanning engine' specific to the particular virus. However, this is a very time-consuming and effort-expensive task, so many of the existing scanners have problems with the polymorphic viruses. And we are going to see more such viruses in the future. The Bulgarian virus writer known under the handle Dark Avenger has even released a 'mutating engine' - a tool for building extremely polymorphic viruses. Very few scanners are able to detect the viruses which use it with 100% reliability.
One last drawback of the scanners is that scanning for lots of viruses can be very time-consuming. The number of currently existing viruses is about 1,600 and is expected to reach 3,000 at the end of 1992. Indeed, some scanners use clever scanning methods like fixed-point scanning, top-and-tail scanning, hashing and so on. The detailed description of these methods is outside the scope of this paper, but as has been proved in [Cohen90], scanning is not cost-effective in the long run, despite the scanning method used.
1.2. Monitors.
The monitoring programs are memory resident programs, which constantly monitor some functions of the operating system. Those are the functions that are considered to be dangerous and indicative for virus-like behavior. Such functions include modifying an executable file, direct access of the disk bypassing the operating system, and so on. When a program tries to use such a function, the monitoring program intercepts it and either denies it completely or asks the user for confirmation.
Unlike the scanners, the monitors are not virus-specific and therefore need not be constantly updated. Unfortunately, they have other very serious drawbacks - drawbacks that make them even weaker than the scanners as an anti-virus defense and almost unusable today.
The most serious drawback of the monitors is that they can be easily bypassed by the so-called tunneling viruses. The reason for this is the total lack of memory protection in most operating systems for personal computers. Any program that is being executed (including the virus) has full access to read and/or modify any area of the computer's memory - including the parts of the operating system. Therefore, any monitoring program can be disabled because the virus could simply patch it in memory. There are other clever techniques, such as interrupt tracing, DOS scanning, and so on, which allow the viruses to find the original handlers of any operating system function. Afterwards, this function can be called directly, thus bypassing any monitoring programs which watch for it.
Another drawback of the monitoring programs is that they try to detect a virus by its behavior. This is essentially impossible in the general case, as proven in [Cohen84]. Therefore, they cause many false alarms - since the functions that are expected to be used by the computer viruses usually have perfectly legitimate uses in normal programs. And if the user gets used to the false alerts, s/he will be likely to overlook a real one.
The monitoring programs are also completely useless against the slow viruses, described later in this paper.
1.3. Integrity checking programs.
According to Dr. Fred Cohen's definition [Cohen84], a computer virus is a program that can `infect' other programs by modifying them to include a possibly evolved copy of itself. Therefore, in order to be a virus, a program must be able to infect. And, in order to infect, the program must cause modifications to the programs that are infected. Therefore, a program, which can detect that the other executable objects have been modified, will be able to detect the infection. Such programs are usually called integrity checkers.
The integrity checkers compute some kind of checksum of the executable code in a computer system and store it in a database. The checksums are re-computed periodically and compared with the stored originals. Several authors point out that in order to avoid forging attempts from the part of the virus, the checksums must be cryptographically strong. This can be achieved by using some kind of trap-door one-way function, which is algorithmically difficult to be inverted. Such functions include DES, MD4, MD5, and so on. But, as has been shown by [Radai], this is not mandatory. A simple CRC is sufficient, if implemented correctly.
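The compute, store and compare cycle can be sketched as follows. This is a minimal illustration, not the scheme of any particular product: it uses the plain CRC-32 from Python's `zlib` module, which is not necessarily the "implemented correctly" CRC that [Radai] has in mind, and it models the executable objects as an in-memory mapping from name to contents. All function names are assumptions of this example.

```python
import zlib


def checksum(data):
    """CRC-32 of a bytes object, as an unsigned 32-bit integer."""
    return zlib.crc32(data) & 0xFFFFFFFF


def read_objects(paths):
    """Helper: load the given files into a name -> contents mapping."""
    objects = {}
    for path in paths:
        with open(path, "rb") as handle:
            objects[path] = handle.read()
    return objects


def build_database(objects):
    """Store one checksum per object (run this on a known-clean system)."""
    return {name: checksum(data) for name, data in objects.items()}


def check_integrity(database, objects):
    """Compare the current objects with the stored checksums."""
    report = {}
    for name, stored in database.items():
        if name not in objects:
            report[name] = "missing"
        elif checksum(objects[name]) != stored:
            report[name] = "modified"
    for name in objects:
        if name not in database:
            report[name] = "new"
    return report
```

The sketch reports changes, not viruses: a legitimate software update shows up as "modified" in exactly the same way as an infection.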
There are several kinds of integrity checkers. The most widely used ones are the off-line integrity checkers, which are run to check the integrity of all the executable code on a computer system. Another kind are the integrity modules, which can be attached (with the help of a special program) to the executable files, so that when started the latter will check their own integrity. Unfortunately, this is not a good idea, since not all executable objects can be 'immunized' this way. Additionally, the 'immunization' itself can be easily bypassed by stealth viruses, as described later in this paper. The third kind of integrity software are the integrity shells. They are resident programs, similar to the resident scanners, which check the integrity of an object only at the moment when this object is about to be executed. These are the least widespread anti-virus programs today, but the specialists predict them a bright future [Cohen90].
The integrity checking programs are not virus-specific and therefore do not need constant updating like the scanners. They do not try to block virus replication attempts like the monitoring programs and therefore cannot be bypassed by the tunneling viruses. In fact, as demonstrated by [Cohen90], they are currently the most cost-effective and sound line of defense against the computer viruses.
They also have some drawbacks. First, they cannot prevent an infection - they are able only to detect and report it after the fact. Second, they must be installed on a virus-free system, otherwise they will compute and store the checksums of already infected objects. Therefore, they must be used in combination with a scanner, at least before installation, in order to ensure that the system they are being installed on is virus-free. Third, they are prone to false positive alerts. Since they detect changes, not viruses, any change in the programs (like updating the software with a new version) is likely to trigger the alert. Sometimes this can be avoided or at least reduced by using some intelligent heuristics and educating the users. Fourth, while the integrity checkers are able to detect the virus spread and identify the newly infected objects, they usually cannot determine the initially infected object, i.e., the source of the infection.
Despite the drawbacks mentioned, the integrity checking programs are the currently most powerful line of defense against computer viruses and are likely to be used more widely in the future. Therefore, we should expect that new viruses will appear, which will target the integrity programs in the same way as the polymorphic viruses are targeting the scanners and the tunneling viruses are targeting the monitors. Let's see what kinds of attacks are possible against the integrity checking programs and how these programs can be improved to avoid them.
2. Possible virus attacks against the integrity checking programs.
2.1. Stealth viruses and fast infectors.
The first generic attack against both the scanners and the integrity checkers came with the appearance of the stealth viruses. When such a virus is active in memory, it intercepts the access requests to the infected objects. It then modifies their results in such a way, that these objects look as if they are not infected.
The first stealth virus was Brain - a boot sector infector. Writing a stealth file infector is much more difficult, since a lot more things have to be considered by the virus writer. However, such viruses are perfectly possible and several of them already exist. The first fully stealth file infector was the Bulgarian virus Number of the Beast. Currently there are also semi-stealth viruses - viruses which only hide the increase of the file size after infection. However, they are not of particular interest, since they cannot defeat most intelligent integrity checking programs.
If a fully stealth virus is active in the computer memory during the integrity check, all infected objects will look as non-infected, or in fact not modified. Therefore, the integrity checking program will not report anything unusual.
Also, if the stealth virus is also a fast infector (a virus which infects files not only when they are executed, but also when they are accessed for whatever reason), the process of computing the checksum of every file will cause its infection. The reason for this is that the files must be opened and read (and therefore accessed) in order for their checksums to be computed. This will provide the virus (if it is present in memory and active) with the possibility to infect them. And all currently known stealth viruses are fast infectors as well.
The only remedy to this is to ensure that no virus is active in the computer memory during the integrity check. There is only one 100% foolproof way to do this - to cold boot the computer that is about to be checked from a non-infected write-protected system diskette. This way we can render both the stealth and the fast infecting viruses harmless - since they will not be active in memory during the check, they have no way to play their subverting tricks.
Unfortunately, most users consider this way of using an anti-virus program inconvenient. In fact, most users never use the program this way. That is why most integrity checking programs have different levels of safety and only the most secure one requires the user to boot from a system diskette. The intelligent integrity checkers can even keep track of how often the most secure mode is used and remind the user if s/he does not use it often enough.
There is yet another possibility, which gets more and more widely used lately. It consists of using the so-called anti-stealth techniques. These techniques are, in fact, very similar to the tricks that the tunneling viruses use to bypass the monitoring programs. The integrity checkers that use anti-stealth techniques try to bypass the stealth viruses much in the same way as the tunneling viruses try to bypass the monitors. Such techniques are often very useful, but they must not be abused. It is important to emphasize that they are not fool-proof, due to the complete lack of memory protection in the operating systems for personal computers. Usually it is possible to circumvent them by using just a combination of the known stealth techniques.
If a program wants to bypass the stealth viruses reliably, it must access the disk sector by sector, using direct calls to the ROM BIOS (not interrupts!). It must bypass the whole operating system and interpret the disk structure itself. Even then, great care has to be taken to avoid some pitfalls that can be used by a clever stealth virus. However, if the program does all this, it will become incompatible with any kind of disk accessed through an installable device driver. At first sight this may not look like a great restriction, but such disks include SCSI disks, hardcards, CD-ROMs, networked drives, encrypted partitions (e.g., with the program DiskReet from the Norton Utilities package), compressed disks (e.g., with Stacker, SuperStor, DoubleDisk), special large partitions (e.g., created by Disk Manager), SUBSTed and JOINed disks, and so on. A virus writer can safely ignore this - his creation will spread widely enough even if it does not infect such disks. However, an anti-virus program can never permit itself to be so widely incompatible, or it simply will not be used. Therefore, the integrity checking programs that use anti-stealth techniques must turn the tricks off at least in the cases mentioned above.
Therefore, the users should be aware that any claims of the sort of 'our program is able to bypass any stealth virus' are nothing more than marketing tricks. They should be taken with a pinch of salt.
Remember: the only 100% foolproof anti-stealth technique is to cold boot the computer from a non-infected write-protected system diskette, to ensure that no virus is present in memory.
A third way to avoid the stealth viruses is described in [Cohen91]. It consists of taking a snapshot of a clean state of the system (including the boot sectors, the operating system, the device drivers, the command interpreter, the startup files, and even the memory structure and the contents of the registers of the CPU). Each time the system is rebooted, it is first restored to that known clean state. This way, we can make sure that no virus is active in memory at that time. This method is very useful and convenient, especially against boot sector viruses. It is significantly less convenient if it is implemented completely - especially in environments where the system startup configuration is changed relatively often.
2.2. Companion viruses.
The integrity checking programs detect modifications of the executable files. Therefore, to avoid detection, the viruses could try not to modify the files themselves. For instance, they change their execution path instead. The particular implementation is operating system dependent; in our examples we shall assume mainly MS-DOS.
When the user enters the name of a file to be executed from the command line, the command interpreter first looks for a file with the same name as the one entered by the user and a .COM extension. Only if such a file is not found does the command interpreter try the .EXE and .BAT extensions (in this order), and if none is found, the same search (in the same order) is performed in every directory listed in the PATH variable. Only if no file that has any of these three extensions is found does the command interpreter output the well-known 'Bad command or filename' message.
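The search order just described can be modelled with a short Python sketch. It assumes that the first directory searched is the current one, and it represents the file system as a hypothetical mapping from directory name to a set of upper-case file names, so that no real disk is touched. The function and variable names are inventions of this example.

```python
# Extension search order used by the command interpreter, per the text.
EXTENSIONS = ("COM", "EXE", "BAT")


def resolve_command(name, directories, search_dirs):
    """Return (directory, file name) of the file that would be executed.

    directories: dict mapping a directory to a set of upper-case file names.
    search_dirs: directories in search order (current directory first,
                 then the entries of PATH).
    Returns None when the interpreter would print 'Bad command or filename'.
    """
    for directory in search_dirs:
        files = directories.get(directory, set())
        for ext in EXTENSIONS:
            candidate = f"{name.upper()}.{ext}"
            if candidate in files:
                return directory, candidate
    return None
```

In this model, placing a file named EDIT.COM next to an existing EDIT.EXE changes what `resolve_command("edit", ...)` returns without modifying EDIT.EXE itself.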
2.2.1. Regular companions.
The search procedure described above can be used by a virus. It can locate a file with an .EXE extension and put the virus body in the same directory and with the same file name as the file being 'infected' but with a .COM extension. This way the virus will make sure that the next time the user enters the name of the main file from the command line, the virus will be executed instead. It can then perform its task (e.g., 'infect' in this way other file(s)) and pass control to the main file by executing it directly (the Exec function call does not perform any PATH or file extension search - it is done only by the command interpreter).
Such viruses already exist. They are called regular companion viruses or more simply 'companions.' The first one was the relatively unknown Bulgarian virus, called TP Worm. Today several other such viruses exist. Some of them are memory resident, and some even employ a limited range of stealth tricks.
Such viruses are usually not considered particularly dangerous. The reason is that they tend to spread only inside a particular computer system. Since people do not often execute files from floppy disks or include the floppy disk drives in their paths, these viruses have almost no chance to spread between computers.
Unfortunately, relying on this might be dangerous. It is perfectly possible to design and implement a well-spreading companion virus. Such a virus would be memory resident. It would place its copies and mark them as hidden. (Files with the Hidden attribute set can still be executed - unlike files with the System attribute set.) Then, it could intercept the FindFirst and FindNext functions of the operating system and 'hide' the presence of these files, even if a file manager that can show hidden files is used. The virus could even be a fast infector - it could infect (i.e., create its copy in a hidden COM file) not only when EXE files are executed, but also when they are copied. This way it would have more chances to spread to floppy disks and from them to other computers.
The companion viruses are a particularly dangerous form of virus attack in some environments, like Novell NetWare. Under Novell NetWare it is possible to mark some executable files as ExecuteOnly. This will make them unreachable for any kind of access (except execution) by anyone, including the supervisor. Once set, this attribute cannot be reset any more. The only thing that can be done is to delete the file - and even this can be done only by the supervisor. If the directory that contains .EXE files protected in this way has access rights which allow file creation and/or renaming, a companion virus could spoof the protected file by creating in the same directory a file with the same name and a .COM extension. If the effective access rights of the newly created files permit that, the virus could even set its attributes to ExecuteOnly, thus preventing even the supervisor from detecting the attack. This kind of attack was first described in [Cohen92].
In the attack described above, it is obvious that the ExecuteOnly attribute not only fails to stop the virus spread, but also effectively prevents the supervisor from performing regular backups and integrity checks, and therefore from noticing the attack. And, from an MS-DOS workstation, this attribute does not effectively prevent read access to the files protected by it. Indeed, the attacker could use a LoadOverlay function call in order to get a copy of the file in the memory of the local workstation. Once the executable image is there, it could be stored in a local file, examined, etc. The EXE files present some difficulties to this approach, since the LoadOverlay function will perform the necessary relocation and the image of the file in memory will not be an exact copy of the file. However, this could be easily bypassed by loading the file twice at different memory segments and using the difference between the two loaded images to construct an EXE header and an equivalent set of relocation items.
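The "load twice and compare" idea rests on simple arithmetic: the loader adds the load segment to every 16-bit word named by a relocation item, so those words, and only those, differ between two images loaded at different segments. The following Python sketch illustrates just that arithmetic. It does not call any DOS function: `simulate_load` is a stand-in written for this example, and the requirement that the segment difference has a non-zero low byte is a simplifying assumption that guarantees the low byte of every relocated word changes.

```python
import struct


def simulate_load(image, relocations, segment):
    """Stand-in for the loader: add segment to each relocated 16-bit word."""
    loaded = bytearray(image)
    for offset in relocations:
        (word,) = struct.unpack_from("<H", loaded, offset)
        struct.pack_into("<H", loaded, offset, (word + segment) & 0xFFFF)
    return bytes(loaded)


def recover_relocations(image_a, image_b, seg_a, seg_b):
    """Return (relocation offsets, unrelocated image) from two loaded copies."""
    if len(image_a) != len(image_b):
        raise ValueError("images must have the same length")
    delta = (seg_b - seg_a) & 0xFFFF
    if delta & 0xFF == 0:
        raise ValueError("segment difference must have a non-zero low byte")
    original = bytearray(image_a)
    offsets = []
    i = 0
    while i < len(image_a):
        if image_a[i] == image_b[i]:
            i += 1
            continue
        if i + 1 >= len(image_a):
            raise ValueError("difference in the last byte of the image")
        # The first differing byte is the low byte of a relocated word.
        (word_a,) = struct.unpack_from("<H", image_a, i)
        (word_b,) = struct.unpack_from("<H", image_b, i)
        if (word_b - word_a) & 0xFFFF != delta:
            raise ValueError("unexpected difference at offset %d" % i)
        offsets.append(i)
        struct.pack_into("<H", original, i, (word_a - seg_a) & 0xFFFF)
        i += 2
    return offsets, bytes(original)
```

The recovered offsets correspond to the "equivalent set of relocation items" mentioned above; building the actual EXE header from them is outside the scope of this sketch.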
With the appearance of more modern command interpreters, other kinds of companion spoofing became possible. Even with COMMAND.COM it is possible to spoof a BAT file by a COM or EXE companion much in the same way as an EXE file can be spoofed by a COM companion. With 4DOS used as command interpreter a new kind of executable extension is introduced - the BTM files, which are searched for after the EXE files but before the BAT files in the execution path. Therefore, under 4DOS a BAT file can be spoofed by a BTM, an EXE, or a COM companion. 4DOS also allows the user to define new executable extensions by setting special environment variables. All files with these extensions can be spoofed by either BAT, or BTM, or EXE, or COM companions. A clever virus could randomly generate companions of the possible extension range, in order to reduce the possibility of being detected.
What can be done against this kind of attack? It is relatively trivial to make the integrity checking programs aware of it. They just need to inspect every directory for files with the same name but with different executable extensions and to alert the user if such files are found. Some care must be taken to handle new extensions, as in the 4DOS example above. For this purpose the integrity checkers must give the user the ability to define all the executable extensions that have to be checked. Also, s/he must be able to indicate the search order of these extensions (e.g., COM -> EXE -> BTM -> BAT -> ZIP).
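A check of this kind might look like the following Python sketch. The default extension order and the function name are assumptions of this example; the `order` parameter plays the role of the user-defined list of executable extensions and their search order.

```python
import os
from collections import defaultdict

# Assumed default search order; the user may supply a different one.
DEFAULT_ORDER = ("COM", "EXE", "BTM", "BAT")


def find_companion_candidates(filenames, order=DEFAULT_ORDER):
    """Map each base name that has several executable extensions to the
    list of those extensions, sorted by search order (first one wins)."""
    rank = {ext.upper(): i for i, ext in enumerate(order)}
    groups = defaultdict(set)
    for filename in filenames:
        base, ext = os.path.splitext(filename.upper())
        ext = ext.lstrip(".")
        if ext in rank:
            groups[base].add(ext)
    return {
        base: sorted(exts, key=rank.__getitem__)
        for base, exts in groups.items()
        if len(exts) > 1
    }
```

For a real directory the file names could come from `os.listdir(directory)`. A hit is only a reason to alert the user, since some legitimate packages do ship, for example, both a COM and an EXE file with the same name.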
2.2.2. PATH companions.
Instead of putting its body in a file in the same directory, but with an extension that is searched earlier than the original one, a virus could simply put its body in a file with any executable extension, but in a directory that comes earlier in the PATH variable than the directory of the original file. This will have the same effect - when the user types the name of the file, the virus will be executed first. Such viruses are called PATH companions.
This trick is less operating system dependent, since relatively many operating systems with a hierarchical file structure have the concept of the PATH variable, while multiple executable extensions exist mainly in a few weird OSes like VMS or MS-DOS. Under Unix, for instance, the PATH trick is quite popular, but mainly for creating trojan horses, not viruses.
Again, this attack presents a considerably greater danger in a Novell NetWare environment. A PATH companion virus could infect all executable files in all directories - regardless of how well they are protected. It is sufficient that at least one writeable directory exists - e.g., the user's home directory. The virus could insert the name of the writeable directory at the beginning of the PATH variable and create copies of its body in it, with names designed to spoof the executable files in the protected directories. Even if no writeable directory on the server exists, the virus could use a directory on the local workstation instead. Admittedly, the infection will exist only from the point of view of the user being attacked - it will not be able to spread between users. Nevertheless, the attack provides enough dangerous possibilities.
What can be done to prevent such viruses? Well, basically the same that is done to prevent the regular companions. However, this time the whole file system must be searched for files with the same name and different executable extensions. This often uses too much memory and is too time-expensive, especially if the file system includes all networked drives.
An intelligent shortcut is to store the contents of the PATH variable used by the user and check only the directories in it. An even more intelligent approach consists of parsing the contents of the user's AUTOEXEC.BAT file and fetching the contents of the PATH variable from there, optionally allowing the user to change it (e.g., to add more directories to be checked). This approach is even more useful when the user boots from a diskette and the normal contents of the PATH variable are not available. And, as we tried to emphasize in the previous section, this is the only safe way to use an integrity checking program.
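The AUTOEXEC.BAT approach can be sketched in Python as follows. The sketch makes several assumptions: it recognizes only the `PATH dirs`, `PATH=dirs` and `SET PATH=dirs` forms, takes the last such statement, does not expand references such as %PATH%, and receives the directory listings as a plain mapping instead of reading a disk. All names are inventions of this example.

```python
import re

# Matches 'PATH C:\DOS;C:\UTIL', 'PATH=...' and 'SET PATH=...' lines.
PATH_LINE = re.compile(
    r"^\s*@?\s*(?:SET\s+)?PATH(?:\s*=\s*|\s+)(.*)$", re.IGNORECASE
)
EXECUTABLE = ("COM", "EXE", "BAT")


def path_from_autoexec(text):
    """Return the directories of the last PATH statement, upper-cased."""
    directories = []
    for line in text.splitlines():
        match = PATH_LINE.match(line)
        if match:
            directories = [d.strip().upper()
                           for d in match.group(1).split(";") if d.strip()]
    return directories


def find_path_shadowing(path_dirs, listing):
    """Report base names found more than once along the search path.

    listing: dict mapping a directory to an iterable of file names.
    Each reported list is in search order, so its first entry is the file
    that would actually be executed.
    """
    listing = {d.upper(): names for d, names in listing.items()}
    seen = {}
    for directory in path_dirs:
        names = {n.upper() for n in listing.get(directory.upper(), ())}
        for ext in EXECUTABLE:
            for name in sorted(names):
                base, dot, file_ext = name.rpartition(".")
                if dot and file_ext == ext:
                    seen.setdefault(base, []).append(f"{directory}\\{name}")
    return {base: hits for base, hits in seen.items() if len(hits) > 1}
```

Because the directories come from the file on disk and not from the environment of a running system, the check does not depend on the value of PATH that happens to be in memory.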
Still, this shortcut does not work against a PATH companion, which modifies the PATH variable in memory, in order to make it include the special directory, which contains the virus body. Again, this modification cannot be prevented, because of the lack of memory protection in the operating system. It could, however, be detected by some kind of resident integrity checking program (i.e., an integrity shell).
2.2.3. Alias companions.
The latest additions to the MS-DOS operating system, namely 4DOS and the DOSKEY program, allow the user to define command-line macros, called aliases. The commands defined in them will be executed before any file with executable extension. This, of course, opens a gaping hole for alias-based viruses - viruses that install an alias to spoof a particular executable file (regardless of its extension). Fortunately, such viruses, while possible in theory, do not represent a significant danger. They would be very much system-dependent and the alias technique is not widely used and standardized. However, the approach should be considered, since a virus could use it as an alternative way of spreading (besides a more 'conventional' way of infection).
Since the alias technique is not standardized, it is less easy to look for and prevent this kind of attack. A 4DOS-based approach could be to parse the AUTOEXEC.BAT file for the ALIAS command and also check all files that contain aliases (they are used by this command). A DOSKEY-aware approach is more difficult.
2.3. Infection of unusual objects.
The alias companions are a particular case of a more general approach - to infect unusual objects, that are unlikely to be checked by an integrity program. We shall try to list some possibilities.
According to von Neumann's principle, there is no strict difference between executable code and data. One man's code is another man's data and vice versa. The source code of a program is considered to be a program by humans, but is treated as data by editors and compilers. Therefore it can be infected by a source-code virus. At first glance it seems rather unlikely that such a virus will remain undetected. Still, finding the offending code in a 10,000-line C program could be quite troublesome, especially if the code is written in the style of the 'Obfuscated C Contest.' A very good example of source code tampering that leaves no traces is described in [Thompson].
Some more examples include:
- The macro files used by some spreadsheets (Lotus 1-2-3 for instance) and word processors can be used to spread viruses, as has been demonstrated in [Highland].
- The LIB files contain object libraries - perfectly executable compiled code that can be infected.
- The OBJ files themselves are prone to infection - their format has been described in detail in the manuals, so the implementation of an OBJ-file infector is just a question of time.
- Some integrity programs check only the code in the master boot sector. This can be quite dangerous, as demonstrated by the StarShip virus, which infects the computer by modifying only three bytes of the partition table data! (In fact, it is possible to achieve the same by modifying only one byte.)
- The device drivers are perfectly infectable and indeed there are already several viruses that can infect them.
- The files with extension .PIF under Microsoft Windows contain a pointer to the proper executable file. Therefore, a virus could modify this pointer, so that its spoofing body is executed first.
- Since version 3.1, Microsoft Windows has introduced a concept called Object Linking and Embedding (OLE). Effectively, it consists of introducing into the data files a pointer to the application that is designed to process them. Again, this pointer could be modified and the true application spoofed by a companion-like virus.
- The Dynamically Linked Libraries (DLLs) in Microsoft Windows and OS/2 also contain executable and therefore infectable code.
- The Novell Loadable Modules (NLMs) under Novell NetWare 3.11 also contain executable code that can be infected. Indeed, the format of those files differs considerably from the format of the MS-DOS executables, but it is only a matter of time before the virus writers learn to exploit it.
- Under some operating systems (e.g., Unix), the compiler generates some temporary object files during the compilation. These files are usually created in a world-writeable directory and deleted after the compilation. If the access to these temporary files is not restricted, a virus could infect them just before they are linked with the system libraries to produce an executable file.
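The master boot sector item in the list above can be made concrete. The sketch below assumes the conventional layout of that sector (446 bytes of bootstrap code, a 64-byte partition table and a 2-byte signature) and uses SHA-256 only as a stand-in for whatever checksum the integrity program really uses; the function names are hypothetical.

```python
import hashlib

CODE_END = 446    # bytes 0..445: bootstrap code
TABLE_END = 510   # bytes 446..509: four 16-byte partition entries


def mbr_fingerprints(sector):
    """Fingerprint the code, the partition table and the whole sector."""
    if len(sector) != 512:
        raise ValueError("expected one 512-byte sector")
    return {
        "code": hashlib.sha256(sector[:CODE_END]).hexdigest(),
        "table": hashlib.sha256(sector[CODE_END:TABLE_END]).hexdigest(),
        "whole": hashlib.sha256(sector).hexdigest(),
    }


def changed_parts(baseline, current):
    """Names of the fingerprinted regions that differ from the baseline."""
    return sorted(k for k in baseline if baseline[k] != current[k])
```

A checker that recorded only the "code" fingerprint would report nothing when a few bytes of the partition table are altered; recording the table or the whole sector closes that gap.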
None of these attacks is dangerous by itself, since any virus that depends only on them is not likely to spread very far. However, they can be used in combination with the conventional methods. Therefore, they should be detected; otherwise some copies of the virus may survive the disinfection and cause a second epidemic. This could be quite expensive.
What can be done against these attacks? Most of them are only of theoretical interest nowadays, so it is useless to force regular integrity checking of everything that could be infected in theory. However, the integrity programs must be designed to be flexible enough and to provide the user the possibility to define what files have to be checked (and how often) him/herself. Also, there should be a possibility to check the integrity of all files present on the system - including those that are supposed to contain only non-executable (and non-interpretable) data. The latter is a good practice not only against viruses but also against other kinds of integrity corruption.
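A minimal sketch of such a user-defined checking policy follows. The policy format (a file-name pattern paired with a checking interval in days, first match wins), the sample values and the function names are illustrative assumptions, not a description of any existing product.

```python
from fnmatch import fnmatchcase

# Hypothetical policy: (pattern, check interval in days); first match wins.
POLICY = [
    ("*.COM", 1),
    ("*.EXE", 1),
    ("*.DLL", 7),
    ("*", 30),  # everything else, including pure data files
]


def interval_for(name, policy=POLICY):
    """Return the checking interval for a file, or None if not covered."""
    for pattern, days in policy:
        if fnmatchcase(name.upper(), pattern):
            return days
    return None


def files_due(names, last_checked, today, policy=POLICY):
    """List the files whose interval has elapsed (or never checked)."""
    due = []
    for name in names:
        days = interval_for(name, policy)
        if days is None:
            continue
        if name not in last_checked or today - last_checked[name] >= days:
            due.append(name)
    return due
```

The catch-all "*" entry reflects the recommendation to be able to check every file on the system, only less often than the conventional executables.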
2.4. The DOS file fragmentation attack.
This kind of attack has been described in [Kabay]. It is specific to MS-DOS.
The operating system is contained in two files (IO.SYS and MSDOS.SYS or IBMBIO.COM and IBMDOS.COM), which are loaded during the bootstrap process. However, at boot time there is no operating system to interpret the contents of the disk as a file system. Therefore, the first of these files is loaded in memory not as a file, but as consecutive sectors. At the same time, the integrity checking programs usually treat the operating system files as regular files and check them in the normal way. A virus could use this discrepancy to infect the system and remain undetected by the integrity checking software.
The virus could place its body over the first cluster occupied by the first DOS file. Then it should allocate a new cluster (from the free disk space), store the original contents of the first cluster there, and fix the FAT chain in such a way, that the newly allocated cluster is re-linked at the beginning of the file. The old first cluster (which now contains the virus body) is removed from the chain of clusters allocated for this file and is marked somehow as used - for instance by marking it as bad, or by 'hiding' it using the stealth approach.
A program which now computes the checksum of the file by considering it to be a normal file will be fooled into believing that the file has not been changed - since the logical structure and the contents of the clusters allocated to this file have indeed not been changed. However, the location of the file on the disk will be different and this will cause the virus body to be loaded at boot time. It could then install itself in memory, intercept some controls, then load the original first cluster and transfer control to it.
To avoid this kind of attack, the integrity checking software must be aware of it and treat the two hidden DOS files in a special way. It should not only check their contents as files, but also their position as sectors on the hard disk. This is relatively trivial to achieve, but it is amazing how few producers of integrity checking software know about this kind of attack at all.
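The position check can be sketched as follows, under simplifying assumptions: the FAT is modelled as a mapping from cluster number to next cluster with FAT12-style markers (0xFF7 for a bad cluster, 0xFF8 and above for end of chain), and the bootstrap code is assumed to read the first system file from the first data cluster onwards as one contiguous run. How many sectors must really be contiguous depends on the DOS version, and the function names are hypothetical.

```python
def cluster_chain(fat, start):
    """Follow a FAT chain from `start`, returning the clusters in order."""
    chain, seen = [], set()
    cluster = start
    while 2 <= cluster < 0xFF7:  # stop at bad-cluster or end-of-chain marks
        if cluster in seen:
            raise ValueError("loop in FAT chain")
        seen.add(cluster)
        chain.append(cluster)
        cluster = fat[cluster]
    return chain


def position_ok(fat, start, first_data_cluster=2):
    """True if the file occupies consecutive clusters from the first one."""
    chain = cluster_chain(fat, start)
    expected = list(range(first_data_cluster, first_data_cluster + len(chain)))
    return chain == expected
```

After the attack described above, the file's chain begins at a newly allocated cluster, so position_ok returns False even though a checksum computed over the file contents is unchanged. A more thorough checker would also compare the raw sectors at the boot-load position with the recorded file contents.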
2.5. Deleting the database of checksums.
A very trivial, but surprisingly successful attack a virus could perform consists of just locating the database where the integrity checking program stores the checksums of the executable objects and simply deleting it. When the file containing the database suddenly disappears, many integrity checkers will assume that they are being installed for the first time on this system. They will then duly begin to checksum all executable objects again and to rebuild the database of checksums. Some of the existing integrity checkers do this even without requesting confirmation from the user. Unfortunately, this means that the new database will contain checksums of the already infected objects.
Such viruses already exist. One of them (Peach) targets the database of checksums created by Central Point Anti-Virus. Another virus (Tequila) simply removes the 10-byte checksum attached to the executable files by McAfee Associates' VIRUSCAN (it attaches such checksums to the executable files when it is run with the /AV option). Yet another virus (Groove) targets a whole set of integrity programs. It looks for the files with the default names of the database of checksums that these programs create, and simply deletes them.
Protection against this kind of attack is relatively simple. The integrity checking programs must not automatically assume that they should run in installation mode if no database of checksums is available. They must be installed with a separate installation program instead. Also, they must give the user a flexible way to select the name of the file where the database of checksums is to be stored. They must also allow the user to rename freely the program that performs the integrity check, to prevent the virus from doing some cheap tricks, like examining the two startup files (CONFIG.SYS and AUTOEXEC.BAT) in order to find the exact name of the file with the checksums.
On the other hand, the users must know that the only safe place for the database that contains the checksums of the executable objects is off-line, on a write-protected floppy, out of the reach of any virus.
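The separation between installation and checking can be sketched in a few lines. The JSON storage format and the names install, check and BaselineMissing are assumptions made for the illustration; the point is only that a missing database raises an alarm instead of silently triggering a rebuild, and that its location is a parameter chosen by the user rather than a fixed default.

```python
import json
import os


class BaselineMissing(Exception):
    """Raised in check mode when the checksum database cannot be found."""


def install(db_path, checksums):
    """Explicit installation step: create the database, never overwrite."""
    with open(db_path, "x") as fh:  # mode "x" fails if the file exists
        json.dump(checksums, fh)


def check(db_path, current):
    """Check mode: a missing database is an alarm, not a cue to rebuild."""
    if not os.path.exists(db_path):
        raise BaselineMissing(db_path)
    with open(db_path) as fh:
        baseline = json.load(fh)
    # Report every recorded object whose checksum is absent or different.
    return sorted(p for p in baseline if current.get(p) != baseline[p])
```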
2.6. Diskette-only infectors.
Usually the integrity checking programs are used to watch the integrity of the hard disks only. They do not try to compute checksums of the executable objects on the floppy disks, because the floppy disks are often modified, exchanged between computer systems, and so on.
Therefore, a virus that infects only floppy disks will not be detected by such an integrity program. And indeed, most of the currently available integrity checking products are unable to detect such old and well-known viruses as Brain and VirDem [Solomon].
Some people claim that a diskette-only infector is not a very viable and dangerous virus. However, the Brain virus has proved that the former is not true. This diskette-only infector was so widespread that we are still unable to eradicate it completely. As to the latter, consider a virus which infects only floppy disks, but slightly corrupts data files on the hard disks. The corruption is not likely to be discovered soon, even if the hard disk is protected by an integrity checking system - because the integrity checking programs usually do not check the data files (it would take a lot of time to check all data files, and most of them are often modified anyway). Since the file corruption occurs slowly, it is likely that the already corrupted files will be transferred to the backups, thus making the restoration of the system impossible. A virus with such a destructive payload already exists (the Nomenklatura virus). One only needs to combine its payload with a diskette-only infector, like Brain.
2.7. Infecting only modifiable objects.
There is a class of executable files which cause a lot of trouble for the integrity checking software. Those are the programs that modify their own image in the files - for instance, to store some configuration data. Although designing such programs is widely considered bad practice, self-modifying programs exist, are widely used, and are likely to keep being written in the future. Such programs include, e.g., Borland's SideKick and Turbo PASCAL, McAfee's VIRUSCAN, the program SETVER, which comes with DOS 5.0, and others.
Since these programs often modify themselves, they will trigger the integrity checking programs, because the latter are designed to detect modifications. To avoid such false positives, most of the integrity checking packages allow the user to create a set of exceptions - a list of executable objects that are to be excluded from the integrity control.
However, a virus could use this security hole, look for the most popular self-modifying programs (by searching the whole file system, or just by examining the two startup files), and infect only them. If such a self-modifying program is executed during the startup, this will provide the virus with an excellent opportunity to spread, yet it will remain undetected by the integrity checking software. Even if the regular integrity check detects the modification, the user is likely not to pay attention to the alert, since it is well known that the program which has triggered the alert is self-modifying.
What can be done against this kind of attack? Well, improving the integrity checking software cannot help very much. The best solution is not to use any self-modifying programs at all. If the users firmly decide to stop using this kind of software (just like they decided to stop using programs that apply some kind of copy protection scheme), maybe the software producers will turn to a better programming practice. Self-modification of the executable objects can and should be avoided. This will close a security hole, which can be exploited by computer viruses.
2.8. Slow viruses.
The slow viruses are a natural extension of the idea to infect only floppy disks, or only objects, which are known to often modify themselves.
These viruses probably represent the greatest danger to the integrity checking software. We call them 'slow' viruses as opposed to the fast infectors, because they are rather selective regarding the objects they decide to infect. Their spread lacks the spectacular speed of the superfast infectors like Dir II, but they can remain undiscovered for a much longer time and spread wider in the long run.
The slow viruses use one intrinsic flaw of the integrity checking software. The integrity checking programs do not detect viruses - they detect modifications. Whether the modifications are caused by a virus or not is left to the user to decide. The slow viruses choose to infect programs only in those cases where the latter are created or modified by the user. This usually occurs when a file is copied or re-compiled. If an integrity checker is run after a slow virus has infected its victim, the user will get a report that a new executable file has appeared or that an old one has been modified. However, this is unlikely to cause any suspicion, since the new (or modified) file has been copied there (or recompiled) by the user him/herself.
Such viruses already exist. The first of them was the Bulgarian virus Darth Vader. This virus infects only COM files, only when they are written to, and only when they contain a large enough block of zeroes to hold the virus body. The virus uses the fact that during the execution of the COPY command, MS-DOS copies the COM files in one pass - since they are known not to exceed 64 Kb. Therefore, the virus is certain to find the whole file in memory by intercepting the Write request and looking at where the address of the buffer to be written points. It then tries to locate a sufficiently large block of zeroes in this buffer, copies itself there and adjusts the first three bytes of the buffer to contain an instruction that will transfer control to the virus body. This is all; the virus does not even bother to write to the infected files itself. It knows that DOS will do so when executing the COPY command.
Surprisingly, the virus has not been created as an attack against the integrity checking software (although it evades all integrity checking programs very successfully). The initial idea of its author was to bypass the monitoring programs. His reasoning was that since they are monitoring the modification of the executable files, a safe way to evade them will be to infect only when the user requests modification of these files (via the WriteHandle request) him/herself. And indeed, the virus successfully evades these programs. It can even 'bypass' the diskette write protection tabs - since when the user copies a COM file to a diskette the write protection tab is removed. The virus achieves this without even having a critical error handler - it just does not need it.
Another virus of this kind is the Bulgarian virus Compiler. This one has been designed probably against the programmers, since it infects mainly when a program is re-compiled - when the size of the file (the virus infects only EXE files) is modified. However, in most environments this does not occur very often, so the virus is unlikely to spread very widely. Besides, it is a multi-partite virus, which infects the master boot record and at least this will be detected by the integrity checking software - if it is installed before the virus (i.e., on a clean system).
The last virus that successfully evades the integrity checkers is of Russian origin and is known under the name StarShip. It is rather common in Russia. It has been probably designed especially as an attack against the integrity checking software. The virus contains several interesting tricks: it installs itself in the video memory, infects the hard disk by modifying only three bytes in the partition table data, is polymorphic, multi-partite, uses the stealth technology, and so on.
A detailed description of this virus is outside the scope of this paper. It is sufficient to say that when an infected file is executed, the virus creates a fake partition in the master boot record, by modifying the parameters of the active one. When the computer is rebooted, the virus is loaded and receives control. It relocates itself to the video memory, then transfers control to the boot sector of the original DOS partition. While the virus is active in memory, it infects all executable files in drives A: and B: (the floppy disk drives) when they are created or modified.
If an integrity checker is installed on the computer before the virus attacks it, the attack will be discovered (because it modifies the master boot record). However, if the computer is already infected when the integrity program is installed, the latter will not be able to detect the further spread of the virus. And the virus spreads rather well - mainly when the executable files are copied to floppies. In fact, the virus tries to maximize the number of infected machines, instead of the number of infected executable objects - a common characteristic of the slow viruses, which is likely to be used more frequently in the future.
This particular virus does not present a significant danger to the West, because it is incompatible with most of the modern hardware and software platforms: monochrome video cards (including monochrome VGAs), MS-DOS versions higher than 3.30, large hard disk partitions (above 32 Mb) and so on. It also contains a quite visible audio-visual payload, which is unlikely to remain unnoticed for a long time.
However, all the viruses described above present a dangerous trend - ideas which, if combined, can make life very difficult for the integrity checking software.
What can be done against the slow viruses? Very little, having in mind that they exploit an intrinsic flaw of the integrity checking programs. A possible solution is a careful implementation of integrity shells as proposed in [Cohen89].
The idea is to have a resident program, which checks the integrity of the programs before they are executed. However, the program also keeps the so-called dependency paths and checks the integrity of all files the currently executed program depends on. This may include overlays called by the program, macro files, data files that it uses and so on. When the user copies a program, the integrity shell remembers that the copy depends on the original and checks the integrity of both when either of them is executed. (It also checks whether the two match.) The case of the virus that infects only when executable objects are copied to diskettes can be solved by having the integrity shell check that the copy and the original match immediately after the copying process.
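The dependency-path idea can be illustrated with a small in-memory model. It is a sketch only, not the design from [Cohen89]: the class and method names are hypothetical, file contents are passed in as byte strings instead of being read from disk, and SHA-256 stands in for whatever checksum a real integrity shell would use.

```python
import hashlib


class IntegrityShell:
    def __init__(self):
        self.baseline = {}  # name -> recorded digest
        self.depends = {}   # name -> set of names it depends on

    @staticmethod
    def digest(data):
        return hashlib.sha256(data).hexdigest()

    def register(self, name, data, depends_on=()):
        """Record the digest of a file and the files it depends on."""
        self.baseline[name] = self.digest(data)
        self.depends.setdefault(name, set()).update(depends_on)

    def closure(self, name):
        """All files reachable through dependency paths, including `name`."""
        todo, seen = [name], set()
        while todo:
            cur = todo.pop()
            if cur not in seen:
                seen.add(cur)
                todo.extend(self.depends.get(cur, ()))
        return seen

    def may_execute(self, name, files):
        """`files` maps name -> current contents; True only if all match."""
        return all(
            n in files and self.digest(files[n]) == self.baseline.get(n)
            for n in self.closure(name)
        )

    def record_copy(self, src, dst, files):
        """After a copy, require dst to match src, then link the two."""
        if files[src] != files[dst]:
            return False  # the copy was altered while being written
        self.register(dst, files[dst], depends_on=[src])
        self.depends.setdefault(src, set()).add(dst)
        return True
```

Calling record_copy immediately after the copying process corresponds to the last check described above: a file infected while it is being written no longer matches its original, so the copy is rejected before it is ever registered as trusted.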
Unfortunately, implementing the above concept can be very expensive in terms of time. If the program loading takes too long because all its dependency files have to be checked too, the user is likely to turn the protection off.
2.9. A combined attack.
Almost all the attacks described above have been 'tried' by the virus writers, by implementing them in some virus - at least to demonstrate that it is possible and 'can be done.' However, most of these viruses have been 'demonstration-only' and not able to spread widely. Let's try to imagine what can be done by just combining the different kinds of attack listed above. This (imaginary) virus will be a slow infector, so we shall name it Kuang [Gibson]. Since it combines only the currently known infection techniques, it is just a matter of time before such viruses begin to appear. The reader is invited to try to figure out him/herself how well such a virus will spread and how well prepared the line of anti-virus defense that s/he currently uses is against such viruses.
Kuang comes with an infected utility that you get from a BBS or a public archive site, from the boot sector of a Conference, Virus News and Reviews, January 1992, pp. 2-7.