[ << ] | [ < ] | [ Up ] | [ > ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
In its current state (as of Anubis version 4.3) Anubis has proven to be a useful tool for processing plain text outgoing messages. However, its use with MIME messages creates several problems despite of a flexible ruleset supported by the program.
This RFC proposes a new mode of operation that should make processing of MIME messages more convenient.
In general, Anubis processes a message using a set of user-defined rules, called user program, consisting of conditional statements and actions. Both of them may operate on message body as well as on its headers. This mode of operation suites excellently for plain text messages, however it does have its drawbacks when processing multi-part messages.
To begin with, only the first part of multi-part messages is processed, the rest of message is usually passed to the MTA verbatim. Thus, this part can be processed by the user program only if it is in plain text: parts encoded by quoted-printable or, worse yet, base-64 encoding cannot be processed this way. The only way for the user to process non-plaintext multi-part messages is by using some extension procedures (usually external scripts).
A special configuration setting read-entire-body
(see section Basic Settings) is provided that forces Anubis to process the entire body
of a multi-part message (among other effects it means passing entire
body to the external scripts as well). However, it does not help solve
the problem, since no attempt is being made to decode parts of the
message, so the user is left on his own when processing such messages.
The solution proposed by this memo boils down to the following: process each part of the multi-part message as a message on its own allowing user to define different RULE sections for processing different MIME types. The following sections describe the approach in more detail.
When processing a multi part message, Anubis first determines its MIME
type. A user is allowed to define several RULE sections(9) that are supposed to handle
different MIME types. Anubis keeps a type <-> section
association table (a dispatcher table) which is used to
determine the entry point for processing of each particular part. If
the dispatcher table does not contain an entry for the given MIME
type, the contents of the part is passed verbatim. Otherwise, Anubis
decodes the part body and passes it for further processing to the
RULE section. When invoking this particular section, MIME headers
act as a message headers and MIME body acts as its body. After the code section finishes processing of the message
part, it is encoded again(10) and then
passed to the output.
MIME standards allow multi-part messages to be nested to arbitrary depth, therefore the described above process is inherently recursive. This brings following implications:
multipart/*
and message/rfc822
contents must be handled. These entries must be configurable, thus
giving final user a possibility to disable some of
them. Preferably there should exist a way of specifying new recursive
types as well.
The structure of MIME dispatcher table should allow for flexible search of user program entries depending on MIME type of the part being processed. It is important also that it allows for a default entry, i.e. an entry that will be used for processing a part whose type is not explicitely mentioned in the table. The absence of such default entry should be taken as indication that the part must be transferred verbatim.
Thus, each entry of the dispatcher table must contain at least the following members.
type
Specifies regular expressions describing MIME type this entry handles.
For the sake of clarity this memo uses shell-style regular expressions
(see glob(7)
or fnmatch(3)
). However, Anubis
implementation can use any other regular expression style it deems
appropriate.
entry point
Specifies an entry point to the code section that handles MIME
parts of given type. The entry point is either nil
, meaning default processing
(thus the default entry can be represented as ("*" . nil)
at the end of the table),
or one of predefined entry points serving for recursive
procession of message parts, or, finally, it is a code index of
a user-defined rule section.
The dispatcher table can contain several entries matching a given
MIME type. In this case, the entry point
of each of them
must be invoked in turn. For example, consider this dispatcher table:
<text/plain> ⇒plaintext
<text/x-patch> ⇒patchfile
<text/*> ⇒anytext
When processing a part of type text/plain
using this dispatcher
table, first the section named plaintext
is called, then
its output is gathered and used as input for the section named
anytext
. Such approach allows for building flexible structured
user programs.
This memo proposes addition of following configuration entities
to CONTROL
section of Anubis configuration file. These
entries may be used in both system-wide and user-specific
configuration files, the order of their priority being
determined as usual by the rule-priority
statement (see section Security Settings).
This option discards from the dispatcher table all entries gathered so far.
This option adds or modifies entries in MIME dispatcher table. Section-id
specifies the section identifier, i.e. either the name of a
user-defined rule section, or one of the keywords none
and
recurse
. In the former case, Anubis must make sure the named
section is actually defined in the configuration file and issue an
error message otherwise.
Regexp-list is whitespace-separated list of regular expressions specifying MIME types that are to be handled by section-id.
The effect of this option is that for each regular expression re
from the list regexp-list, the dispatcher table is searched for
an entry whose type
field is exactly the same as
re(11). If
such an entry is found, its entry code
field is replaced with
section-id. Otherwise, if no matching entry was found a new
one is constructed:
(re . section-id)
and appended to the end of the list.
For example:
dispatch-mime-type recurse "multipart/*" "message/rfc822" dispatch-mime-type Text "text/*" dispatch-mime-type none "*"
This example specifies that messages (or parts) with types matching
multipart/*
and message/rfc822
must be recursed into,
those of type text/*
must be processed by user-defined section
Text
and the rest of parts must be transferred verbatim. The
section Text
must be declared somewhere in the configuration
file as
BEGIN Text … END
Notice that the very first dispatch-mime-type
specifies a
built-in entry. This memo does not specify whether such a built-in
entry must be present by default, or it should be explicitely declared
as in the example above. The explicit declaration seems to have
advantage of preserving backward compatibility with versions 4.0 and
earlier of Anubis (see COMPATIBILITY CONSIDERATIONS).
Notice also that when encountering the very first
dispatch-mime-type
(or dispatch-mime-type-prepend
, see
below) statement in the user configuration file, Anubis must
remove the default entry (if any) from the existing dispatcher table.
Such entry should be added back after processing user’s CONTROL
section, unless clear-dispatch-table
has been used.
Has the same effect as dispatch-mime-type
except that the
entries are prepended to the dispatcher table.
This memo does not determine how exactly is Anubis supposed to discern
between text and binary messages. The simplest way is by using the
Content-Type
header: if it contains charset=
then it
describes a text part. Otherwise it describes a binary part. Probably
some more sophisticated methods should be implemented.
To avoid dependency on any particular charset, text parts must be decoded to UTF-8. Correspondingly, any literals used in Anubis configuration files must represent valid UTF-8 strings. However, this memo does not specify whether Anubis implementation should enforce UTF-8 strings in its configuration files.
It is possible to specify processing rules for binary MIME
parts. However, Anubis does not provide any mechanism for
binary processing, not is it supposed to provide any. This memo
maintains that the existing external-body-processor
and
guile-process
statements are quite sufficient for processing
any binary message parts.
BEGIN CONTROL dispatch-mime-type recurse "multipart/*" "message/rfc822" dispatch-mime-type plaintext "text/plain" dispatch-mime-type image "img/*" END CONTROL SECTION plaintext modify body ["now"] "then" END SECTION image external-body-processor resize-message END
This example configuration shows the idea of using
external-body-processor
statement for binary part
processing. The following version of resize-message
script uses
convert
program for reducing image size to 120x120 pixels:
#! /bin/sh TMP=$HOME/tmp/$$ cat - > $TMP convert -size 120x120 $TMP.jpg -resize 120x120 +profile '*' out-$TMP rm $TMP cat out-$TMP rm out-$TMP
In the absense of any dispatch-mime-type
statements, Anubis
should behave exactly as version 4.0 did. Specifying
clear-dispatch-table
in the user configuration file should produce the same effect. This
can be useful if system-wide configuration file contained some
dispatch-mime-type
statements.
This specification is believed to not introduce any special security considerations.
[ << ] | [ < ] | [ Up ] | [ > ] | [ >> ] |
This document was generated on January 6, 2024 using texi2html 5.0.