
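Copyright notice: this article may be freely reproduced, provided the original source, the author information and this notice are indicated by hyperlink.
Article URL: http://www.hzqbbc.com/blog/arch/2002/11/qmgr_message_ac.html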
 

November 01, 2002

An advanced analysis of qmgr_message_active_limit and default_process_limit

First the summary:
------------------

The deferred messages dominate the queue. In steady state the
"deliverable" messages just sail by and can be ignored. Size the process
limit to accommodate the "slow" messages + a reasonable estimate of active
sessions that the hardware and network will allow. Better yet use a
separate active queue, which for now requires a second Postfix that gets all
deferred mail (via fallback_relay???).

Since the queue manager knows which messages came in fresh from
"incoming/maildrop" and which from "deferred", it could keep two directory
hierarchies "active" and "reactive" and apply different process limits to
each. This would also help the folks with the solid-state disks, who do not
want deferred mail to consume any of the precious storage, and finally
"flush" would be much less tragic, as it would no longer interfere with the
delivery of fresh mail.

Author: Victor Duchovni (Victor.Duchovni@morganstanley.com)

And now the details:
--------------------

The limit on the active queue size is intended to bound the CPU and memory
cost of the queue manager scheduling algorithms. Think of the queue manager
as being similar to an OS scheduler: it has a concurrency limit for running
jobs (here active instances of "smtp" and other transports rather than
physical CPUs) and a limit for runnable jobs (at a stretch analogous to the
size of the process table). There is a difference between a running job
(actually being processed by one of your 400 allocated transport slots) and
a runnable job (a message in the active queue). The limit on the total
number of "runnable" jobs should be as large as possible without expending
unreasonable CPU cost in the scheduling algorithm. A large active queue
limit will let the queue manager deal gracefully with bursts of new mail
that briefly overwhelm the sustained output rate of the queue.

A healthy MTA is on average able to deliver mail faster than the input rate
(or it dies of congestive collapse). The active queue consists of just
arrived messages that are being tried for the first time (average occupancy
= input message rate * average time to deliver, bounce or defer) and also
deferred messages being retried (average occupancy = deferred queue size *
fraction of time deferred messages spend in active queue).

It turns out that for large sites the (previously) deferred messages typically
constitute the majority of the "running" messages and contribute most to the
active queue population. Suppose for example that your message input rate
is 100 messages per second, and you are able to process the average message
with a 1 second delay. This means that the active queue in steady state
will have 100 newly arrived messages.

On the other hand with 100 messages per second your deferred queue may have
20000 or more deferred messages. With a maximum backoff of 4000 seconds,
each deferred message "visits" the active queue every 4000 seconds or so.
Since the message was deferred once, it is likely to be deferred again,
typically due to a multi-minute timeout (say 3 minutes == 180 seconds). This
means that you typically have (180/4000)*20000 = 900 messages in the active
queue that have previously been deferred and are likely to be deferred again.
With only 400 delivery slots, these 900 messages will tie up every available
slot and the queue will stall. It is very important to make sure that
default_process_limit exceeds the expected population of reactivated
deferred mail.
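
The arithmetic in the last few paragraphs can be sketched as follows. This is only a back-of-the-envelope model: the function names are hypothetical, and the 400-slot figure is the default_process_limit from the question quoted at the end of this post.

```python
def fresh_occupancy(input_rate, delivery_time):
    """Average number of first-attempt messages in the active queue.

    input_rate: messages arriving per second
    delivery_time: average seconds to deliver, bounce or defer one message
    """
    return input_rate * delivery_time


def deferred_occupancy(deferred_size, attempt_time, backoff):
    """Average number of previously deferred messages in the active queue.

    Same approximation as the text: each deferred message spends roughly
    attempt_time out of every backoff seconds in the active queue.
    """
    return deferred_size * attempt_time / backoff


fresh = fresh_occupancy(100, 1)               # 100 msg/s, 1 s each
slow = deferred_occupancy(20000, 180, 4000)   # 3-minute timeout, 4000 s backoff
process_limit = 400                           # value from the quoted question

print(f"fresh mail in active queue:    {fresh:.0f}")
print(f"deferred mail in active queue: {slow:.0f}")
print(f"slots left for fresh mail:     {max(0, process_limit - slow):.0f}")
```

The slow population alone exceeds the process limit, so no slots are left over for mail that could actually be delivered.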

There are a couple of things you can do to improve throughput on machines
that have large quantities of deferred mail. One option (the best for really
high volume sites IMHO) is to hand off the deferred mail to a different
Postfix instance that only handles deferred mail. This keeps the
reactivated deferred mail from competing with the new mail for delivery
slots, but requires a second copy of Postfix which still needs some tuning.
I have not yet had to go to that "extreme". Here are a few simpler things
you can do:

1. If reasonable for your user community reduce "maximal_queue_lifetime"
(I use 2 days rather than 5). This should proportionally reduce the active
queue footprint of the deferred mail.

2. Increase the "maximal_backoff_time" (I use 14400s). This will again
proportionally reduce the average number of reactivated messages.

3. Use "nqmgr" and use a different (in name only, I use "relay") SMTP
transport for your "relay" domains if you forward mail for your domains via
SMTP to an internal gateway. "nqmgr" implements fairness between
transports, and you do not want to round-robin the high inbound message rate
(50% of all traffic for many sites) against all the hundreds of "slow"
domains in the active queue. This can keep inbound mail flowing even when
the active queue is congested.

4. Avoid "flush" (a.k.a. ETRN a.k.a. "sendmail -q") like the plague.
Disable ETRN for all clients except those on the "fast_flush_domains" list.
There is nothing worse than thousands of undeliverable messages in the
active queue (instant DoS). The deferred messages are deferred for a
reason, and while one wants to retry them, doing so all at once is a recipe
for disaster.

5. Size default_process_limit with the expected number of deferred
messages being retried as a baseline. Such messages spend most of their
time idle, so they don't really burn CPU; the actual output is performed by
the additional delivery slots allocated over and above the active count of
previously deferred messages.
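
To see how steps 1, 2 and 5 interact, here is a rough sketch built on the same approximation used earlier in this post. The proportionality between queue lifetime and deferred queue size is the claim from step 1, taken at face value. The helper names and the 100 spare slots are illustrative assumptions, not Postfix settings.

```python
def reactivated(deferred_size, attempt_time, backoff):
    """Approximate count of retried deferred messages in the active queue."""
    return deferred_size * attempt_time / backoff


def suggested_process_limit(slow_baseline, headroom):
    """Step 5: deferred baseline plus the slots that do the real delivery work."""
    return round(slow_baseline) + headroom


deferred_size = 20000   # example figure used earlier in the post
attempt_time = 180      # seconds lost to a typical timeout

before = reactivated(deferred_size, attempt_time, backoff=4000)

# Step 1: lifetime cut from 5 days to 2, assumed to shrink the deferred
# queue in proportion. Step 2: maximal_backoff_time raised to 14400 s.
after = reactivated(deferred_size * 2 / 5, attempt_time, backoff=14400)

print(f"slow messages before tuning: {before:.0f}")
print(f"slow messages after tuning:  {after:.0f}")
print(f"process limit with 100 spare slots: {suggested_process_limit(after, 100)}")
```

The two timer changes multiply, so the slow population in this model falls by a factor of nine. How much headroom to add on top depends on what the hardware and network can sustain.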

My own experience:
------------------

My initial configuration with the default timers and process limit (of 50)
had anywhere from 40 to 60 "slow" messages in the active queue at a time.
This was not fun to watch: a fast machine completely idle, not delivering
any mail. Scaling the queue age and backoff timers reduced the "slow"
occupancy to around 10, and raising the default_process_limit from 50 to 500
was more in line with the capabilities of the hardware (8 × 400MHz CPU
E4500).

Just as an illustration: right now (the slowest time for mail at my site)
the numbers on one of the relays are as follows:

# find `postconf -h queue_directory`/active -type f -print | wc -l
12
# netstat -an -f inet | grep SYN_SENT | wc -l
9
# find `postconf -h queue_directory`/deferred -type f -print | wc -l
733

This shows at least 9 out of 12 active delivery slots going to sites with
connections in SYN_SENT (host not responding). 9/733 is within 4% of
180/14400 (three minutes to time out a TCP connection and then another 14400
second backoff). Your mileage may vary based on OS TCP settings and
differences in the average number of A records corresponding to MX records
for the dead domains.
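
The comparison can be checked directly. A small sketch, using only the counts from the command output above and the timers mentioned in the text:

```python
syn_sent = 9       # connections stuck in SYN_SENT
deferred = 733     # files in the deferred queue
timeout = 180      # seconds to time out a TCP connection
backoff = 14400    # maximal_backoff_time in seconds

observed = syn_sent / deferred     # fraction of deferred mail being retried
predicted = timeout / backoff      # fraction predicted by the timers
relative_gap = abs(observed - predicted) / predicted

print(f"observed  {observed:.4f}")
print(f"predicted {predicted:.4f}")
print(f"relative gap {relative_gap:.1%}")
```

The gap comes out a little under 2%, comfortably inside the 4% quoted above.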

I hope this helps.

--
Viktor.
>
> This should be an easy one :-) I'm trying to tune a
> MX spooler. It has about 21,000 queued messages, all in
> various stages of delivery. I'm running
> Postfix-20010228-pl01 on it. qmgr_message_active_limit
> defaults to 10,000, and I usually have default_process_limit
> set to something like 400 (dual p3 w/ 512mb ram, so given
> the size of smtp processes, I figured 400 would be nice and
> safe).
>
> Here's the question. If postfix is pulling messages out of
> deferred and incoming, and putting them in active, how are
> 400 smtp processes (obviously not all 400 are smtp, but most
> are) going to process 10,000 in a timely manner? Shouldn't
> the active queue match the number of processes taking mail
> out of it? Please, be brutal. Tell me where my logic is
> going wrong.
>
>
> --
> Joshua Warchol
> -

Posted by hzqbbc at November 1, 2002 01:12 AM
