sendmail


28. Rules

Contents:
Overview
Tokenizing Rules
The Workspace
The Behavior of a Rule
The LHS
The RHS
Pitfalls

Rules are like little if-then clauses, existing inside rule sets, that test a pattern against an address and change the address if the two match. The process of converting one form of an address into another is called rewriting. Most rewriting requires a sequence of many rules, because an individual rule is relatively limited in what it can do. This need for many rules, combined with the sendmail program's need for succinct expressions, can make sequences of rules dauntingly cryptic.

In this chapter we dissect the components of individual rules. In the next chapter we show how groups of rules are combined to perform necessary tasks.

28.1 Overview

Rules are declared in the configuration file with the R command. Like all configuration commands, the R rule configuration command must begin a line. The general form consists of an R command followed by three parts:

Rlhs    rhs    comment
    ^      ^
    tabs   tabs

The lhs stands for left-hand side and is most commonly expressed as LHS. The rhs stands for right-hand side and is expressed as RHS. The LHS and RHS are mandatory. The third part (the comment) is optional. The three parts must be separated from each other by one or more tab characters (space characters will not work).

Space characters between the R and the LHS are optional. If there is a tab between the R and the LHS, sendmail silently uses the LHS as the RHS and the RHS becomes the comment.

The tabs leading to the comment and the comment itself are optional and may be omitted. If the RHS is absent, sendmail prints the following warning and ignores that R line:

invalid rewrite line "bad rule here" (tab expected)

This warning is printed whenever the RHS is absent, even if tabs follow the LHS. (It is usually the result of tabs being converted to spaces when text is copied from one window to another in a windowing system.)
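The field-splitting behavior described above can be modeled in a few lines of Python. This is a minimal sketch for illustration, not sendmail's actual parser: the function name split_rule is hypothetical, and the message text only imitates the warning shown earlier.

```python
import re


def split_rule(line: str):
    """Split one R line into (lhs, rhs, comment) on runs of tab characters."""
    if not line.startswith("R"):
        raise ValueError("not an R line")
    # Spaces right after the R are optional, so drop them. Tabs are kept,
    # because a tab here shifts every field one position to the right.
    body = line[1:].rstrip("\n").lstrip(" ")
    fields = re.split(r"\t+", body)
    lhs = fields[0]
    rhs = fields[1] if len(fields) > 1 else ""
    comment = "\t".join(fields[2:])
    if not rhs:
        # Mirrors the warning quoted in the text; the rule is then ignored.
        raise ValueError(f'invalid rewrite line "{body}" (tab expected)')
    return lhs, rhs, comment


print(split_rule("R$*\t$1\tpass through"))  # three tab-separated parts
print(split_rule("R\t$*\t$1"))              # leading tab: LHS is empty
```

In the second call the intended LHS lands in the RHS slot and the intended RHS becomes the comment, which is the silent misparse the text warns about. A line whose tabs were converted to spaces yields a single field and raises the error.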

28.1.1 Macros in Rules

Each noncomment part of a rule is expanded as the configuration file is read. [1] Thus any references to defined macros are replaced with the value that the macro has at that point in the configuration file. To illustrate, consider this mini-configuration file (called x.cf):

[1] Actually, the comment part is expanded too, but with no effect other than a tiny expenditure of time.

DAvalue1
R$A    $A.new
DAvalue2
R$A    $A.new

In it, $A will have the value value1 when the first R line is expanded and value2 when the second is expanded. Prove this to yourself by running sendmail in -bt rule-testing mode on that file:

% echo =S0 | /usr/lib/sendmail -bt -Cx.cf
> =S0
Rvalue1                 value1 . new 
Rvalue2                 value2 . new

Here, we use the =S command (see Section 38.4.1, "Show Rules in a Rule Set with =S") to show each rule after it has been read and expanded.

Another property of macros is that an undefined macro expands to an empty string. Consider this x.cf file:

DAvalue1
R$A    $A.$B
DAvalue2
R$A    $A.$B

and this rule-testing run of sendmail:

% echo =S0 | /usr/lib/sendmail -bt -Cx.cf
> =S0
Rvalue1                 value1 . 
Rvalue2                 value2 .

Beginning with V8.7 sendmail, macros can be either single-character or multicharacter. Both forms are expanded when the configuration file is read:

D{OURDOMAIN}us.edu
R${OURDOMAIN}    localhost.${OURDOMAIN}

Multicharacter macros may be used in the LHS and in the RHS. When the configuration file is read, the previous example is expanded to look like this:

> =S0
Rus . edu               localhost . us . edu
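The read-time expansion described in this section can be sketched in Python. This is an illustrative model under stated assumptions, not sendmail's implementation: expand_rules is a hypothetical name, only D and R lines are handled, single-character macro names are restricted to letters so that operators such as $* and $1 are left alone, and the tokenized spacing that =S displays is not reproduced.

```python
import re

# Matches ${name} or $X, where X is a single letter (a simplification).
MACRO = re.compile(r"\$(?:\{([^}]+)\}|([A-Za-z]))")
# Matches a D line that defines either a {multichar} or a one-character macro.
DEFINE = re.compile(r"D(\{[^}]+\}|.)(.*)")


def expand_rules(lines):
    """Return the R lines with macros replaced by their values at that point."""
    macros = {}
    rules = []
    for line in lines:
        if line.startswith("D"):
            m = DEFINE.match(line)
            if m:
                macros[m.group(1).strip("{}")] = m.group(2)
        elif line.startswith("R"):
            # An undefined macro expands to an empty string.
            rules.append(
                MACRO.sub(lambda m: macros.get(m.group(1) or m.group(2), ""), line)
            )
    return rules


cf = [
    "DAvalue1",
    "R$A\t$A.new",
    "DAvalue2",
    "R$A\t$A.$B",
    "D{OURDOMAIN}us.edu",
    "R${OURDOMAIN}\tlocalhost.${OURDOMAIN}",
]
for rule in expand_rules(cf):
    print(rule)
```

Because each R line is expanded as soon as it is read, redefining a macro later in the file does not affect rules that appear earlier.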

28.1.2 Rules Are Treated Like Addresses

After each side (LHS and RHS) is expanded, each is then normalized just as though it were an address. A check is made for any tabs that may have been introduced during expansion. If any are found, everything from the first tab to the end of the string is discarded. Then RFC822-style comments are removed. An RFC822 comment is anything between and including an unquoted pair of parentheses:

DAroot@my.site (Operator)
R$A  tab RHS
  v
Rroot@my.site (Operator)  tab RHS    <- expanded
  v
Rroot@my.site  tab RHS               <- RFC822 comment stripped
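The first two normalization steps, discarding text after a tab and stripping RFC822 comments, might be modeled as follows. This is a rough sketch under assumptions: normalize_side is a hypothetical name, nested comments are handled with a simple depth counter, and backslash escapes are ignored.

```python
def normalize_side(text: str) -> str:
    """Drop text from the first tab onward, then remove unquoted (comments)."""
    text = text.split("\t", 1)[0]
    out = []
    depth = 0          # nesting depth of RFC822 comments
    in_quote = False   # inside a double-quoted string?
    for ch in text:
        if ch == '"' and depth == 0:
            in_quote = not in_quote
            out.append(ch)
        elif ch == "(" and not in_quote:
            depth += 1
        elif ch == ")" and not in_quote and depth > 0:
            depth -= 1
        elif depth == 0:
            # Kept: ordinary characters, quoted parentheses, and any stray
            # right parenthesis (left for the later balance check).
            out.append(ch)
    return "".join(out).strip()


print(normalize_side("root@my.site (Operator)"))
```

Parentheses inside quotation marks are left untouched, because only an unquoted pair forms a comment.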

Finally, a check is made for balanced quotation marks, right parentheses balanced by left, and right angle brackets balanced by left. [2] If any right-hand character appears without a corresponding left-hand character, sendmail prints one of the following errors, where configfile is the name of the configuration file that is being read, num shows the line number in that file, and expression is the part of the rule that was unbalanced:

[2] The $> operator isn't counted in checking balance.

configfile: line num: expression...Unbalanced '"'
configfile: line num: expression...Unbalanced '>'
configfile: line num: expression...Unbalanced ')'

In the first case, sendmail balances the unbalanced quotation mark by appending a second quotation mark. In the other two cases, the unbalanced character is removed. For example, the file:

Rx      RHS"
Ry      RHS>
Rz      RHS)

produces these errors and rules:

% echo =S0 | /usr/lib/sendmail -bt -Cx.cf
x.cf: line 1: RHS"... Unbalanced '"'
x.cf: line 2: RHS>... Unbalanced '>'
x.cf: line 3: RHS)... Unbalanced ')'
> =S0
Rx              RHS "" 
Ry              RHS 
Rz              RHS

Note that prior to V8.7 sendmail, only an unbalanced right-hand character was checked. [3] Beginning with V8.7 sendmail, unbalanced left-hand characters are also detected, and sendmail attempts to balance them for you. Consider the following file:

[3] That is, for example, there must not be a > before the < character, and they must pair off.

Rx      "RHS
Ry      <RHS
Rz      (RHS

Here, sendmail detects and fixes the unbalanced characters but does so with warnings:

% echo =S0 | /usr/lib/sendmail -bt -Cx.cf
x.cf: line 1: "RHS... Unbalanced '"'
x.cf: line 2: <RHS... Unbalanced '<'
x.cf: line 3: (RHS... Unbalanced '('
x.cf: line 3: R line: null RHS
> =S0
Rx              "RHS" 
Ry              < RHS > 
Rz

Note that in the last example (Rz), sendmail balanced the RHS by adding a rightmost parenthesis. This caused the RHS to become an RFC822 comment, which was then deleted, resulting in the null RHS error.
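The balancing behavior shown in the two examples above can be approximated with a short Python sketch. It is an illustrative model only, assuming the V8.7 behavior described in the text: balance is a hypothetical name, backslash escapes are not handled, and the $> operator is skipped so that it is not counted.

```python
def balance(expr: str):
    """Return (repaired expression, list of warnings) for one side of a rule."""
    out, warnings = [], []
    in_quote = False
    depth = {"(": 0, "<": 0}            # currently open left-hand characters
    opener_of = {")": "(", ">": "<"}
    i = 0
    while i < len(expr):
        ch = expr[i]
        if expr.startswith("$>", i):    # the $> operator is not counted
            out.append("$>")
            i += 2
            continue
        if ch == '"':
            in_quote = not in_quote
        elif not in_quote and ch in depth:
            depth[ch] += 1
        elif not in_quote and ch in opener_of:
            opener = opener_of[ch]
            if depth[opener] == 0:
                # Right-hand character with no partner: warn and remove it.
                warnings.append(f"Unbalanced '{ch}'")
                i += 1
                continue
            depth[opener] -= 1
        out.append(ch)
        i += 1
    if in_quote:
        # Odd number of quotation marks: append one more.
        out.append('"')
        warnings.append("Unbalanced '\"'")
    for opener, closer in (("<", ">"), ("(", ")")):
        for _ in range(depth[opener]):
            # Left-hand character with no partner: append the closer.
            out.append(closer)
            warnings.append(f"Unbalanced '{opener}'")
    return "".join(out), warnings


for side in ('RHS"', "RHS>", "RHS)", '"RHS', "<RHS", "(RHS"):
    print(side, "->", balance(side))
```

For the last input the repaired text is a complete parenthesized comment, so a later comment-stripping pass would leave nothing behind, which corresponds to the null RHS error.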

If you get one of these Unbalanced errors, be sure to correct the problem at once. If you leave the faulty rule in place, sendmail will continue to run but will likely produce erroneous mail delivery and other odd problems.

28.1.2.1 Backslashes in rules

Backslash characters are used in addresses to protect certain special characters from interpretation (see Section 35.3.2, "Escape Character in the Header Field"). For example, the address blue;jay would ordinarily be interpreted as having three parts (or tokens, which we'll discuss soon). To prevent sendmail from treating this address as three parts and instead have it viewed as a single item, the special separating nature of the ; can be escaped by prefixing it with a backslash:

blue\;jay

V8 sendmail handles backslashes differently from earlier versions. Instead of stripping a backslash and setting a high bit (see below), it leaves backslashes in place:

blue\;jay      becomes   ->   blue\;jay

This causes the backslash to mask the special meaning of characters, because sendmail always recognizes the backslash in that role.

The only time that V8 sendmail strips backslashes is during local delivery, and then only if they are not inside full quotation marks. Mail to \user is delivered to user on the local machine (bypassing further aliasing) with the backslash stripped. But for mail to \user@otherhost the backslash is preserved in both the envelope and the header.

Prior to V8 sendmail, for addresses and rules (which are normalized as addresses), the backslash was removed by sendmail and the high bit of the character following the backslash was set (turned on).

blue;jay
    ^
    high bit turned on

This practice was abandoned because of the need to support international characters (such as ö, Δ, and φ). Most international character sets include many characters that have the high bit set. Escaping a character by setting its high bit is a practice that no longer works in our modern, international world.
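The collision between high-bit escaping and 8-bit character sets can be demonstrated with a small Python sketch. It only models the pre-V8 behavior as described above: pre_v8_escape is a hypothetical name, and ISO 8859-1 (Latin-1) is assumed as the example character set.

```python
def pre_v8_escape(addr: bytes) -> bytes:
    """Strip each backslash and set the high bit of the byte that follows."""
    out = bytearray()
    i = 0
    while i < len(addr):
        if addr[i] == 0x5C and i + 1 < len(addr):  # 0x5C is the backslash
            out.append(addr[i + 1] | 0x80)
            i += 2
        else:
            out.append(addr[i])
            i += 1
    return bytes(out)


escaped = pre_v8_escape(b"blue\\;jay")
# In Latin-1, the byte 0xBB is the character "»".
literal = "blue»jay".encode("latin-1")

print(escaped == literal)                       # the two are indistinguishable
print(bool("ö".encode("latin-1")[0] & 0x80))    # ö already has the high bit set
```

An escaped semicolon (0x3B with the high bit set is 0xBB) produces the same byte as a literal Latin-1 character, so the old scheme cannot tell an escaped ASCII character from a genuine 8-bit one.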

