DOG and Environment - Sun, Nov 24, 2024
How DOG manages the environment segment
The environment Segment
This article describes the environment block, a key-value memory store of up to 32 KB that is available to every program in DOS. As DOG is a DOS shell, it is responsible for setting up and maintaining the environment segment and the all-important environment variables it holds.
These variables are often used to set program behavior, define default flags and more. DOG uses several variables itself, and also utilizes the environment in DOGfile programs.
Structure
The environment is a relatively simple structure. Each environment variable is stored as a NUL-terminated string (or C-string) of the form KEY=VALUE. The variables are stored in memory one after another, separated only by the NUL character terminating each string. There is one final NUL byte at the end of the block signaling the end of the variables.
After the final NUL comes a WORD indicating how many additional strings follow. This is usually just 0001h, and the string is the NUL-terminated path to the program owning the environment block.
The environment block:
+-------------+  Segment aligned.
| VAR1=VALUE1 |  Variables are ASCIIZ strings separating the KEY and
| VAR2=VALUE2 |  VALUE by an '=' character.
| ...         |  Only the size of the environment block limits
| VARn=VALUEn |  the number of strings.
+-------------+
| BYTE 0      |  The variable part ends with a 0 byte.
+-------------+
| WORD        |  Number of strings to follow, typically 0001h
+-------------+
| C:\DOG.COM  |  ASCIIZ command owning the environment
+-------------+  Max 32 KB
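The layout above can be walked with a few lines of C. The following is a minimal sketch that treats the block as a plain byte buffer instead of a DOS far pointer; the helper names env_used_bytes(), env_lookup() and env_owner_path() are made up for illustration and are not DOG's own functions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: bytes used by the variable part of the block,
 * including the final 0 byte that ends the list. */
size_t env_used_bytes(const char *env)
{
    const char *p = env;
    while (*p != '\0')          /* an empty string marks the end */
        p += strlen(p) + 1;     /* skip "KEY=VALUE" and its NUL  */
    return (size_t)(p - env) + 1;
}

/* Hypothetical helper: returns the value stored for key, or NULL. */
const char *env_lookup(const char *env, const char *key)
{
    size_t klen = strlen(key);
    const char *p;
    for (p = env; *p != '\0'; p += strlen(p) + 1) {
        if (strncmp(p, key, klen) == 0 && p[klen] == '=')
            return p + klen + 1;
    }
    return NULL;
}

/* Hypothetical helper: returns the first trailing string (the path of
 * the owning program), or NULL when the count WORD is zero. */
const char *env_owner_path(const char *env)
{
    const unsigned char *tail =
        (const unsigned char *)env + env_used_bytes(env);
    unsigned count = tail[0] | ((unsigned)tail[1] << 8); /* little-endian */
    return count > 0 ? (const char *)(tail + 2) : NULL;
}
```

On real DOS the same loop would run over a far pointer built from the environment segment, which is why the sketch only reads bytes and never relies on native word access.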
Setting, modifying and deleting variables
Setting variables in DOG is done with the command SE. Internally DOG uses the same function as SE to set environment variables: setudata(), defined in alse.c. It first calculates the number of bytes in use by the whole environment (see the usedbytes() function) to see if the new value would fit.
Next DOG checks if the variable already exists. To save some string comparison time, it first checks if the nth character of the string is =, where n is the length of the given key. If it’s not =, DOG fast-forwards until it finds a 0. If it is an =, DOG compares the key with the given key.
If the search resulted in no match then the new string is appended to the end of the environment. But since the environment data doesn’t end at the 0 byte, the tail is first copied before it’s overwritten by the new variable.
If the search resulted in a match, then the function saves a pointer to the end of the value part of the environment variable. This part and the end of the environment is copied over to a temporary location before it’s overwritten by the new variable.
The rest (everything after the end of the environment variable) is then copied back to the environment memory.
Removing variables is done in DOG by calling SE with just the name of the variable. This is also implemented in the setudata() function, which, when called with a NULL value, simply overwrites the variable.
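The set, replace and remove steps above can be sketched in portable C as follows. This is not DOG's setudata(): the names env_total_bytes() and env_set() are hypothetical, the block is a plain byte buffer, and a single memmove() stands in for the copy to a temporary location that the article describes.

```c
#include <stddef.h>
#include <string.h>

/* Bytes in the whole block: variables, final 0, count WORD and the
 * trailing strings. */
static size_t env_total_bytes(const char *env)
{
    const char *p = env;
    unsigned count;
    while (*p != '\0')
        p += strlen(p) + 1;
    p++;                                        /* final 0 byte */
    count = (unsigned char)p[0] | ((unsigned)(unsigned char)p[1] << 8);
    p += 2;                                     /* count WORD */
    while (count-- > 0)
        p += strlen(p) + 1;                     /* trailing strings */
    return (size_t)(p - env);
}

/* Sets key to value, or removes key when value is NULL.
 * Returns 0 on success, -1 if the result would not fit in capacity. */
int env_set(char *env, size_t capacity, const char *key, const char *value)
{
    size_t klen = strlen(key);
    size_t total = env_total_bytes(env);
    size_t newlen = value ? klen + 1 + strlen(value) + 1 : 0;
    size_t oldlen = 0;
    char *p = env;

    /* Search: look for '=' at position klen before comparing the key. */
    while (*p != '\0') {
        size_t len = strlen(p);
        if (len > klen && p[klen] == '=' && memcmp(p, key, klen) == 0) {
            oldlen = len + 1;
            break;
        }
        p += len + 1;
    }
    /* p now points at the matching entry, or at the final 0 byte. */
    if (total - oldlen + newlen > capacity)
        return -1;

    /* Shift everything behind the old entry (or the tail) into place. */
    memmove(p + newlen, p + oldlen, total - (size_t)(p - env) - oldlen);
    if (value) {
        memcpy(p, key, klen);
        p[klen] = '=';
        strcpy(p + klen + 1, value);
    }
    return 0;
}
```

Because the shift covers the count WORD and the program path as well, the tail survives all three operations.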
Variables defined by DOG
DOG makes use of the following environment variables:
- COMSPEC - This is used to define where the DOG binary was loaded from. In future versions it will be used to handle resident/transient parts, but more on that in a later post.
- PATH - The list of directory paths where to search for commands. The paths are separated by the ; character. DOG looks first for commands which have a COM extension, next EXE and finally DOG.
- PROMPT - The string displayed indicating that DOG is waiting for a command. The default prompt is defined as:
SE PROMPT $n$_$_$b\_$n$_$_$b$_.\---.$n$_/$_$_$_,__/$n/$_$_$_/$p$_%%
And looks like:
  |\_
  | .\---.
 /   ,__/
/   /C:\DOG %
You can use these special characters:
- $$ - The $ sign
- $_ - Space
- $b - Vertical bar |
- $e - The ESC character (ASCII 27 = 1b in hex)
- $l - Left angle <
- $g - Right angle >
- $n - New line
- $r - Carriage return
- $t - Tabulator
- $p - Current drive and path
- $c - Current time
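Expanding these codes is a simple character-by-character scan. Below is a minimal sketch, assuming the current path and time are passed in as strings; expand_prompt() and its parameters are hypothetical and not taken from DOG's source. Unknown codes are silently dropped here, and only the lowercase forms listed above are recognized.

```c
#include <stddef.h>
#include <string.h>

/* Appends s to out without overflowing; *n is the current length. */
static void put(char *out, size_t outsize, size_t *n, const char *s)
{
    while (*s != '\0' && *n + 1 < outsize)
        out[(*n)++] = *s++;
}

/* Expands the $ codes in fmt into out and returns the length written.
 * cwd is used for $p and now for $c (both supplied by the caller). */
size_t expand_prompt(const char *fmt, const char *cwd, const char *now,
                     char *out, size_t outsize)
{
    size_t n = 0;
    char one[2] = {0, 0};

    if (outsize == 0)
        return 0;
    for (; *fmt != '\0'; fmt++) {
        if (*fmt != '$') {              /* ordinary character */
            one[0] = *fmt;
            put(out, outsize, &n, one);
            continue;
        }
        fmt++;
        if (*fmt == '\0')               /* lone '$' at the end */
            break;
        switch (*fmt) {
        case '$': put(out, outsize, &n, "$");    break;
        case '_': put(out, outsize, &n, " ");    break;
        case 'b': put(out, outsize, &n, "|");    break;
        case 'e': put(out, outsize, &n, "\x1b"); break;
        case 'l': put(out, outsize, &n, "<");    break;
        case 'g': put(out, outsize, &n, ">");    break;
        case 'n': put(out, outsize, &n, "\n");   break;
        case 'r': put(out, outsize, &n, "\r");   break;
        case 't': put(out, outsize, &n, "\t");   break;
        case 'p': put(out, outsize, &n, cwd);    break;
        case 'c': put(out, outsize, &n, now);    break;
        default:  break;                /* unknown code: dropped */
        }
    }
    out[n] = '\0';
    return n;
}
```

Feeding the default prompt string (with a single % at the end) and C:\DOG as the path through this function reproduces the four-line dog shown above.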
Setting up the environment
When DOG starts it checks its Program Segment Prefix (PSP) for the segment of the environment. To get the size of the allocated environment, DOG reads it from the Memory Control Block (MCB).
Program Segment Prefix (PSP)
The Program Segment Prefix, or PSP for short, is a data structure describing a running program. In DOS, running programs are identified by the PSP address in a similar manner to how running processes are identified by the PID in Linux systems. The PSP contains many interesting fields, some of which are undocumented but well known in the programming community. In FreeDOS, by its nature of being open source, there are really no undocumented fields, but some are marked as “*_fill”, indicating they’re unused. DOG doesn’t use any of these fields.
DOG uses these fields:
- Offset 0Ah - Interrupt vector for INT 22h (4 bytes, offset + segment)
- Offset 0Eh - Interrupt vector for INT 23h (4 bytes, offset + segment)
- Offset 12h - Interrupt vector for INT 24h (4 bytes, offset + segment)
- Offset 16h - Parent PSP segment
- Offset 2Ch - Segment for the environment block.
When DOG is started with the -P switch it sets interrupts 22h, 23h and 24h to point to DOG internals, and then changes the parent PSP to itself, effectively making itself its own parent.
As mentioned above, the environment segment field is used to identify DOG’s initial environment block, and when the -E switch is given DOG will write the segment of the new environment block to the PSP.
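The PSP fields DOG touches can be read and written with small offset-based accessors. This sketch works on a 256-byte buffer standing in for the PSP; the enum names and helper functions are invented for illustration and do not come from DOG or FreeDOS.

```c
#include <stdint.h>

/* PSP field offsets used by DOG, as listed above. */
enum {
    PSP_INT22  = 0x0A,  /* terminate address        */
    PSP_INT23  = 0x0E,  /* Ctrl-Break handler       */
    PSP_INT24  = 0x12,  /* critical error handler   */
    PSP_PARENT = 0x16,  /* parent PSP segment       */
    PSP_ENVSEG = 0x2C   /* environment segment      */
};

typedef struct { uint16_t off, seg; } farptr;

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));      /* little-endian */
}

static void wr16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

/* Vectors are stored as offset first, then segment. */
farptr psp_get_vector(const uint8_t *psp, int field)
{
    farptr fp;
    fp.off = rd16(psp + field);
    fp.seg = rd16(psp + field + 2);
    return fp;
}

void psp_set_vector(uint8_t *psp, int field, farptr fp)
{
    wr16(psp + field, fp.off);
    wr16(psp + field + 2, fp.seg);
}

/* What the -P switch is described as doing to the parent field. */
void psp_make_own_parent(uint8_t *psp, uint16_t self_seg)
{
    wr16(psp + PSP_PARENT, self_seg);
}

uint16_t psp_env_segment(const uint8_t *psp)
{
    return rd16(psp + PSP_ENVSEG);
}

void psp_set_env_segment(uint8_t *psp, uint16_t env_seg)
{
    wr16(psp + PSP_ENVSEG, env_seg);
}
```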
Memory Control Block (MCB)
The Memory Control Block is a small 16-byte data structure which DOS uses to keep track of memory. The structure is as follows:
+---------+
| BYTE | MCB type. If this is 'Z' then it's the last memory block, otherwise 'M'
| WORD | PSP of owner of the memory block.
| WORD | Size of the block in paragraphs. A paragraph is 16 bytes.
| 3 BYTEs | Unused / reserved
| 8 BYTEs | Program name when MCB is followed by program code. Unused otherwise.
+---------+
The MCB has a few quirks which may be good to know about. First, it’s possible that the size stored in the MCB is 0. This happens when all memory but 1 paragraph is reserved: the remaining paragraph (16 bytes) is taken up by the MCB itself, leaving 0 free bytes. Second, if the owner PSP field is 0000h, the memory controlled by the block is free for allocation. Finally, the 8 bytes reserved for the program name are only used when the MCB controls a program code section, i.e. it contains a PSP followed by program code.
Also note that you can traverse the memory chain even though it’s not a linked list. Simply take the segment of the MCB and add 1 + the size (offset 03h) to that to get the next MCB, as long as the type is ‘M’. You stop at ‘Z’.
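The traversal can be sketched over a simulated memory image, where segment s lives at byte offset s * 16. The names mcb_read(), mcb_block_bytes() and mcb_walk() are hypothetical. The block-size helper relies on the MCB sitting in the paragraph directly below the memory block it controls.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    char     type;   /* 'M', or 'Z' for the last block */
    uint16_t owner;  /* PSP of the owner, 0000h = free */
    uint16_t paras;  /* size in 16-byte paragraphs     */
} mcb_info;

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Decodes the MCB located at segment seg of the memory image. */
mcb_info mcb_read(const uint8_t *mem, uint16_t seg)
{
    const uint8_t *p = mem + (size_t)seg * 16;
    mcb_info m;
    m.type  = (char)p[0];
    m.owner = rd16(p + 1);
    m.paras = rd16(p + 3);
    return m;
}

/* Size in bytes of the block whose data starts at data_seg, read from
 * the MCB one paragraph below it. */
uint32_t mcb_block_bytes(const uint8_t *mem, uint16_t data_seg)
{
    return (uint32_t)mcb_read(mem, (uint16_t)(data_seg - 1)).paras * 16;
}

/* Walks the chain from first_mcb and sums the free paragraphs.
 * Returns the number of MCBs visited, or -1 if the chain is broken. */
int mcb_walk(const uint8_t *mem, size_t mem_paras, uint16_t first_mcb,
             uint32_t *free_paras)
{
    size_t seg = first_mcb;
    int count = 0;

    *free_paras = 0;
    while (seg < mem_paras) {
        mcb_info m = mcb_read(mem, (uint16_t)seg);
        if (m.type != 'M' && m.type != 'Z')
            return -1;                  /* not a valid MCB */
        count++;
        if (m.owner == 0)
            *free_paras += m.paras;
        if (m.type == 'Z')
            return count;               /* last block reached */
        seg += 1u + m.paras;            /* next MCB */
    }
    return -1;                          /* ran past the image */
}
```

The mem_paras bound exists only because the sketch runs on a buffer; it keeps a corrupt chain from reading outside the image.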
Initial Environment
If the config.sys file is empty or doesn’t contain a MENU then the environment passed to DOG is 0000h, meaning that DOG needs to create one from scratch.
When there is a menu, the kernel sets up a small environment block and gives it to the shell. This environment block is usually too small for practical use, typically less than 160 bytes (MS-DOS seems to give 112 bytes and FreeDOS seems to give 96 bytes).
Reallocating a new memory block
When initializing with the -E flag given, DOG creates a new memory block by calling the INT 21h/AH=48h function. If the used environment space doesn’t fit in the new size, DOG will allocate just enough memory to fit the variables. Once DOG has successfully allocated a new environment block, it copies the data from the old block to the new block. Then it frees the memory of the old block by calling the INT 21h/AH=49h function.
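INT 21h/AH=48h takes its size in paragraphs, so the byte counts have to be rounded up before the call. The sizing rule described above could look roughly like this. Both function names are hypothetical, and whether DOG counts the trailing program path as part of the used space is an assumption left open here.

```c
#include <stdint.h>

#define PARAGRAPH 16u

/* Rounds a byte count up to whole 16-byte paragraphs. */
uint16_t bytes_to_paragraphs(uint32_t bytes)
{
    return (uint16_t)((bytes + PARAGRAPH - 1) / PARAGRAPH);
}

/* requested: size asked for with -E, in bytes.
 * used: bytes currently occupied in the old block.
 * Returns the paragraph count to request from DOS. */
uint16_t env_paragraphs_to_allocate(uint32_t requested, uint32_t used)
{
    uint32_t want = requested >= used ? requested : used;
    return bytes_to_paragraphs(want);
}
```

For example, a request of 1024 bytes with 96 bytes in use gives 64 paragraphs, while a request of 64 bytes with 100 bytes in use gives 7 paragraphs, just enough to hold the existing data.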
Environment of a child process
DOG executes programs by using the INT 21h/AH=4Bh, AL=00h function. DOG sets the environment field of the EXEC parameter block to 0000h, indicating that the OS should clone the environment of DOG as an environment for the child process.
The EXEC parameter block has the following structure:
+-------+
| WORD | segment of environment to copy for child process
| | (copy caller's environment if 0000h)
| DWORD | pointer to command tail to be copied into child's PSP
| DWORD | pointer to first FCB to be copied into child's PSP
| DWORD | pointer to second FCB to be copied into child's PSP
+-------+
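The parameter block is 14 bytes long and can be assembled byte by byte, which avoids any struct padding issues. This is a sketch with made-up names (exec_block_build(), farptr); the pointer values a real shell passes would refer to its own command tail and FCB buffers.

```c
#include <stdint.h>

enum { EXEC_BLOCK_SIZE = 14 };          /* WORD + 3 DWORDs */

typedef struct { uint16_t off, seg; } farptr;

static void wr16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);         /* little-endian */
    p[1] = (uint8_t)(v >> 8);
}

/* A far pointer is stored as offset first, then segment. */
static void wr_far(uint8_t *p, farptr fp)
{
    wr16(p, fp.off);
    wr16(p + 2, fp.seg);
}

/* Fills out with an EXEC parameter block. Passing env_seg = 0 asks DOS
 * to copy the caller's environment for the child. */
void exec_block_build(uint8_t out[EXEC_BLOCK_SIZE], uint16_t env_seg,
                      farptr cmd_tail, farptr fcb1, farptr fcb2)
{
    wr16(out + 0, env_seg);
    wr_far(out + 2, cmd_tail);
    wr_far(out + 6, fcb1);
    wr_far(out + 10, fcb2);
}
```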
Alias
DOG saves aliases in a separate memory area from environment variables, but uses the same functions, so the structure of the memory block is identical to the environment block.
By default the alias block is empty and DOG allocates 2 KB of memory for it. This size can be overridden with the -A flag for DOG. Each instance of DOG keeps its own alias block, so a sub-shell doesn’t have access to the aliases set by the parent.