29.10 The check_... Rule Sets

The rapid spread of the Internet has led to an increase in mail abuse. Prior to V8.8 sendmail, detecting and rejecting abusive email required that you write C language code for use in the checkcompat() routine (see Section 20.1, "How checkcompat() Works"). Beginning with V8.8 sendmail, important and useful checking and rejecting can be done from within four new rule sets:

check_mail: Validate the envelope-sender address given to the SMTP MAIL command.
check_rcpt: Validate the envelope-recipient address given to the SMTP RCPT command.
check_relay: Validate the host initiating the SMTP connection.
check_compat: Compare or contrast each envelope-sender and envelope-recipient pair of addresses just before delivery, and validate based on the result.

These rule sets are all handled in the same manner. If the rule set does not exist, the address is accepted. If the rule set exists and returns anything other than a #error delivery agent, the message is accepted. Otherwise, the message is rejected by using the mechanism described under the #error delivery agent (see Section 30.5.2, "The error Delivery Agent").
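The accept-or-reject convention shared by these rule sets can be summarized in a short Python model. This is an illustrative sketch only, not sendmail code: the function name check_decision and the representation of a rule set's result as a list of tokens are assumptions made for this example.

```python
from typing import Optional, Sequence


def check_decision(result: Optional[Sequence[str]]) -> str:
    """Model how the result of a check_* rule set is interpreted.

    result is None when the rule set is not defined at all; otherwise it
    is the rewritten workspace as a list of tokens (a hypothetical
    representation chosen for this sketch).
    """
    if result is None:
        # No such rule set in the configuration file: accept.
        return "accept"
    if list(result[:2]) == ["$#", "error"]:
        # The rule set selected the #error delivery agent: reject.
        return "reject"
    # Anything else (for example the token "ok") means accept.
    return "accept"
```

Under this model, a workspace such as ["ok"] and a missing rule set are treated the same way; only a result that begins with $#error causes a rejection.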

29.10.1 The check_mail Rule Set

The MAIL command in the SMTP dialog is used to specify the envelope-sender address:

MAIL From: <sender@host.domain>

If the check_mail rule set exists, it is called immediately after the MAIL command is read. The workspace passed to check_mail is the address following the colon in the MAIL command. That envelope-sender address may or may not be surrounded by angle brackets.

To illustrate one use for the check_mail rule set, consider the need to reject all incoming mail from the site named spamming.org. [6] One method might look like this:

[6] Also see Section 22.4.1, "Accept/Reject Connections via libwrap.a" for a discussion of how to use the TCP wrapper library from within sendmail.

Scheck_mail
R$*                   $: $>3 $1              focus on the host
R$* <@ $+. > $*       $1 <@ $2> $3           strip trailing dots
R$* <@ $+ > $*        $: $2                  isolate the host
R$* . $+ . $+         $: $2 . $3             strip subdomains
Rspamming.org         $#error $@ 5.7.1 $: "cannot accept mail from spamming.org"

Here, we force rule set 3 to preprocess the address so that any RFC822 comments will be thrown away and so that the host part of the address will be focused. We then strip any trailing dots from the hostname to prevent a trailing dot from wrongly affecting our validation. In the third line we throw away everything but the hostname. In the fourth line we throw away all but the rightmost two components of the hostname to eliminate the host part and any subdomain prefixes. What remains is just the domain name. We then compare that domain name to the name spamming.org. If they match, we reject the sender.

After this rule set is installed (and the sendmail daemon has been restarted), all mail from spamming.org will be rejected during the SMTP dialogue like this:

MAIL From: <badguy@spamming.org>
553 <badguy@spamming.org>... cannot accept mail from spamming.org
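The reduction described for the check_mail example (strip the trailing dot, isolate the host, keep the rightmost two components, compare) can be modelled in Python to make the intent easier to follow. This is a sketch of the described intent, not a reimplementation of sendmail's rewriting engine: the function names, the tuple used to stand in for a #error result, and the simplified address parsing (no RFC822 comment handling) are all assumptions of this example.

```python
def registered_domain(address: str) -> str:
    """Reduce an envelope address to the rightmost two host components."""
    addr = address.strip()
    # The address may or may not be surrounded by angle brackets.
    if addr.startswith("<") and addr.endswith(">"):
        addr = addr[1:-1]
    # Isolate the host: everything after the last "@".
    host = addr.rpartition("@")[2]
    # Strip trailing dots and compare case-insensitively.
    host = host.rstrip(".").lower()
    # Keep only the rightmost two components (drop host and subdomains).
    return ".".join(host.split(".")[-2:])


def check_mail(address: str, banned: str = "spamming.org"):
    """Return a stand-in for a #error result, or None to accept."""
    if registered_domain(address) == banned:
        return ("error", "5.7.1", "cannot accept mail from " + banned)
    return None
```

In this model, badguy@mail.spamming.org. is reduced to spamming.org and refused, while an address at any other domain yields None and is accepted.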

This is just one possible use of the check_mail rule set. Other uses might be the following:

Rejecting mail from specific users at a given site.
Looking up user@host in a database and rejecting the sender if that lookup succeeds.
Insisting that the host part of the address be canonifiable with the $[ and $] operators.

If you need to base a decision to reject mail on both the sender and the recipient, you may be able to use the check_compat rule set described below.

29.10.2 The check_rcpt Rule Set

The RCPT command in the SMTP dialogue specifies an envelope recipient's address:

RCPT To: <recipient@host.domain>

If the check_rcpt rule set exists, it is called immediately after the RCPT command is read. The workspace that is passed to check_rcpt is the address following the colon. The envelope recipient address may or may not be surrounded by angle brackets and may or may not have other RFC822 comments associated with it.

To illustrate one use for the check_rcpt rule set, consider the need to reject all incoming mail destined for the recipient named fax. One method might look like this:

Scheck_rcpt
R$*                     $: $>3 $1               focus on host
R$* <@ $~w > $*         $@ ok                   not @ourhost is okay
R$* <@ $+ > $*          $: $1                   discard host
Rfax                    $#error $@ 5.1.3 $: "cannot send mail to fax"

Here, we first call rule set 3 to focus on the host part of the address and normalize it. The second rule accepts anything that is addressed to any host but our own. That way, mail to fax@another.host will work. The third rule discards the host (our local) part of the address. In the fourth line the remaining user part is compared to the name fax. Any mail to fax is thus rejected:

RCPT To: <fax@ourhost>
553 <fax@ourhost>... cannot send mail to fax
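The decision made by the check_rcpt example can likewise be sketched in Python. The names used here are hypothetical: LOCAL_HOSTS stands in for the class of local host names, and the returned tuple stands in for a #error result. Address parsing is deliberately simplified and ignores RFC822 comments.

```python
LOCAL_HOSTS = {"ourhost"}  # assumed stand-in for the local host class


def check_rcpt(address: str, local_hosts=LOCAL_HOSTS):
    """Return a stand-in for a #error result, or None to accept."""
    addr = address.strip()
    if addr.startswith("<") and addr.endswith(">"):
        addr = addr[1:-1]
    user, at, host = addr.rpartition("@")
    if not at:
        # No host part: the whole address is the user.
        user = addr
    elif host.rstrip(".").lower() not in local_hosts:
        # Addressed to some other host: not our concern, accept.
        return None
    # Local recipient: refuse only the user named fax.
    if user.lower() == "fax":
        return ("error", "5.1.3", "cannot send mail to fax")
    return None
```

As in the rules, the host test comes first, so fax@another.host is accepted and only a local fax recipient is refused.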

Other uses for this check_rcpt rule set might include the following:

Protecting a user who has become the target of a mail attack. You could create a new account for this user and block incoming mail to the old account. In the #error message you could print a phone number that others may call to obtain the new email address.
Claiming that certain secret users are unknown. These might be the pseudo-users associated with autonomous processes.
Refusing to accept mail that is not addressed to a user who has an active account as represented by the passwd(5) file (see Section 33.8.20, user).
Looking up recipients in a database and accepting mail for them only if they are found in that database. This way, only selected users may be allowed, for example, through a firewall, though the firewall knows all about all users.
Looking up local-looking recipients in a database to see whether they have moved to a new location. If so, advise the other site of the new address with a rejection message. This is similar to the redirect FEATURE (see Section 19.6.21, FEATURE(redirect)), but operates at the RCPT level instead of sending bounced mail.
Turning off unwanted "relaying" through your machine. This requires use of the ${client_name} macro (see Section 31.10.8, ${client_name}).

29.10.3 The check_relay Rule Set

V8.8 sendmail supports two mechanisms for screening incoming SMTP connections. One is the libwrap.a mechanism (see Section 22.4.1); the other is this check_relay rule set. This rule set is used to screen incoming network connections and accept or reject them based on hostname, domain, or IP number. It is called just before the libwrap.a code and can be used if that code was omitted from your release of sendmail.

The check_relay rule set is called with a workspace that looks like this:

hostname $| IPnumber

The hostname and IP number are separated by the $| operator. The hostname is the fully qualified canonical name of the connecting host. The IPnumber is the IP number of that host in dotted-quad form without surrounding square brackets.

One way to use check_relay might be to list offensive sites in a database and to reject any connections from those sites. Consider a database that contains hostnames as its keys and descriptions of each host's offense as its values:

hostA.edu      Spamming site
hostB.com      Mail Bombing site
123.45.6       Offensive domain

Notice that the keys can be hostnames or IP addresses. Such a database might be declared in the configuration file like this:

Kbadhosts dbm -a <> /etc/badhosts

Now, each time a site connects to your running daemon, the following rule set will be called:

Scheck_relay
R$* $| $*            $: $(badhosts $1 $) $| $2             look up host name
R$*<> $| $*          $#error $@ 5.1.3 $: Sorry, $1 denied
R$* $|  $*           $: $2                                 select the IP number
R$-.$-.$-.$-         $: $(badhosts $1.$2.$3 $)             look up domain part
R$*<>                $#error $@ 5.1.3 $: Sorry, $1 denied
R$*                  $@ ok                                 otherwise okay

The first rule looks up the host part in the database. If it is found, the value (the reason for rejection) is returned with the two characters <> appended. The second rule looks for anything to the left of the $| that ends in <> and, if anything is found, issues the error: [7]

[7] Actually, the message is not printed; instead, the SMTP daemon goes into a "reject everything" mode. This prevents some SMTP implementations from retrying the connection.

Sorry, reason for reject denied

Rejected connections are handled the same way as connections rejected by the libwrap.a technique (see Section 22.4.1).

The rest of the rules do the same thing, except that they look up the first three components of the IP number. If the check_relay rule set returns anything other than a #error delivery agent, the connection is accepted.
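The two-stage lookup performed by the check_relay example (hostname first, then the first three components of the IP number) can be modelled with a dictionary in Python. This is a hedged sketch: BADHOSTS mirrors the sample database shown earlier, and the function name and the tuple standing in for a #error result are assumptions of this example. Keys are stored in lowercase here on the assumption that lookups fold the key to lowercase.

```python
BADHOSTS = {
    "hosta.edu": "Spamming site",
    "hostb.com": "Mail Bombing site",
    "123.45.6": "Offensive domain",
}


def check_relay(hostname: str, ip: str, badhosts=BADHOSTS):
    """Return a stand-in for a #error result, or None to accept."""
    # First try the connecting host's name.
    reason = badhosts.get(hostname.lower())
    if reason is None:
        # Then try the first three components of the dotted-quad IP number.
        parts = ip.split(".")
        if len(parts) == 4:
            reason = badhosts.get(".".join(parts[:3]))
    if reason is not None:
        return ("error", "5.1.3", f"Sorry, {reason} denied")
    return None
```

Like the rules it models, this sketch does not match on the domain part of a hostname or on a full four-component IP address.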

Note that the rules presented here are not nearly as complex or sophisticated as your site will likely need. They do not, for example, reject on the basis of the domain part of hostnames, nor do they reject on the basis of individual host IP addresses.

Note that such rule sets cannot be tested in rule-testing mode, because that mode interprets the expression $| (when you enter it at the > prompt) wrongly as two separate text characters instead of correctly as a single operator. To test an address that contains an embedded $| operator, we suggest that you create a translation rule set something like this:

STranslate
R$* $$| $*              $: $1 $| $2                            fake for -bt mode

This rule set changes a literal $ and | into a $| operator so that you can test rule sets such as check_relay from rule-testing mode:

ADDRESS TEST MODE (ruleset 3 NOT automatically invoked)
Enter <ruleset> <address>
> Translate,check_relay bogus.host.domain $| 123.45.67.89

Here, the comma-separated list of rule sets begins with Translate, which changes the two-character text expression "$|" into the single operator $|. The result, an address expression that is suitable for the check_relay rule set, can then be successfully tested. [8]

[8] Don't be tempted to put this rule directly into the check_relay rule set. You may someday encounter an address that has the two adjacent characters "$" and "|" as a legal part of it. Also beware of such addresses being intentionally sent just to circumvent your checks.

29.10.4 The check_compat Rule Set

Not all situations can be resolved by simply checking the recipient or sender address. Sometimes you will need to make judgments based on pairs of addresses. To handle this situation, V8.8 introduced the check_compat rule set. Unlike check_mail and check_rcpt, check_compat is called for all deliveries, not just SMTP transactions. It is called just after the check for too large a size (as defined by M=; see Section 30.4.7, M=) and just before the checkcompat() routine is called (see Section 20.1).

The check_compat rule set is called with a workspace that looks like this:

sender $| recipient

The sender and recipient addresses are separated by the $| operator. Each has undergone aliasing and ~/.forward file processing.

As one example of a way to use the check_compat rule set, consider the need to prevent a certain user (here operator) from sending mail offsite:

SGet_domain
R$*                     $: $>3 $1               focus on host
R$* <@ $+. > $*         $1 <@ $2> $3            strip trailing dots
R$* <@ $+ > $*          $: $2                   isolate the host
R$* . $+ . $+           $@ $2 . $3              strip host and subdomains

SGet_user
R$*                     $: $>3 $1               focus on host
R$* <@ $+ > $*          $@ $1                   discard host

Scheck_compat
R$* $| $*               $: $1 $|  $>Get_domain $2       fetch recipient domain
R$* $|  $=w             $@ ok                           local is okay
R$* $|  $m              $@ ok                           local is okay
R$* $|  $*              $: $>Get_user $1                fetch sender user
Roperator               $#error $@ 5.1.3 $: "operator may not mail offsite"

First we set up two subroutines patterned after the code in the previous two sections. The first reduces its workspace to just the domain part of an address. The second reduces an address to just the user part. These two subroutines are called by check_compat.

The first rule in check_compat uses the Get_domain subroutine to convert the address on the right (the recipient) into just a domain name. That right side is then compared to the local host names ($=w and $m). If the domain is a local one, delivery is allowed (we return anything but #error).

If the domain is an offsite one, we then call Get_user to fetch the user part of the address to the left (the sender). If that user is operator, delivery is denied and the message bounces.
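The pairwise decision made by the check_compat example can be sketched in Python as two helpers plus a check. All names here are hypothetical: LOCAL_DOMAINS stands in for the local names tested by $=w and $m, and the returned tuple stands in for a #error result. The sketch assumes both addresses carry a host part and ignores RFC822 comments.

```python
LOCAL_DOMAINS = {"ourhost", "our.domain"}  # assumed stand-ins for $=w and $m


def _bare(address: str) -> str:
    """Strip whitespace and optional surrounding angle brackets."""
    addr = address.strip()
    if addr.startswith("<") and addr.endswith(">"):
        addr = addr[1:-1]
    return addr


def get_domain(address: str) -> str:
    """Model of Get_domain: the rightmost two components of the host."""
    host = _bare(address).rpartition("@")[2].rstrip(".").lower()
    return ".".join(host.split(".")[-2:])


def get_user(address: str) -> str:
    """Model of Get_user: the part of the address before the host."""
    addr = _bare(address)
    user, at, _host = addr.rpartition("@")
    return (user if at else addr).lower()


def check_compat(sender: str, recipient: str, local=LOCAL_DOMAINS):
    """Return a stand-in for a #error result, or None to allow delivery."""
    if get_domain(recipient) in local:
        return None  # local recipient: always allowed
    if get_user(sender) == "operator":
        return ("error", "5.1.3", "operator may not mail offsite")
    return None
```

The recipient is examined first, so the sender is only inspected when the destination has already been found to be offsite.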

Other uses for the check_compat rule set might be the following:

Creating a class of users who, possibly for security reasons, may send mail only inside the organization, not outside it.
Screening a particular recipient to prevent that user from receiving objectionable mail from a specific source.
Screening mail based on hostname to prevent outsiders from using your host as a mail relay.

Note that such rule sets cannot be tested in rule-testing mode because that mode interprets the expression $| (when you enter it at the > prompt) wrongly as two separate text characters instead of correctly as a single operator. See Section 29.10.3, "The check_relay Rule Set" for one suggested solution to this problem.

