OS - BSD - OpenBSD - Mailinglists - misc

OpenBSD or HP-UX? An interesting discussion about running (Open)LDAP on OpenBSD or HP-UX, on one large machine or a set of smaller ones, etc.

marc.info - openbsd-misc - Compilers in OpenBSD
marc.info - openbsd-misc - Intel Core 2 (talks about processor bugs).

Pearls from misc

Date: Thu, 8 Jan 2009 16:32:41 -0800
From: "Gerald Chudyk"
To: "Peter Kay - Syllopsium"
Subject: Re: Best supported arch/workstation
Cc: "Matt KP60", misc.openbsd.org

> I've never personally had any issues with the sparc port (32 bit), but all
> those machines are old and slow.
>
Hush, my little inetra. I'm sure he wasn't really talking about you.
You are very good at what you do. And you're right: it's not your
fault I haven't found a replacement for your dead nvram battery.

Once I remind you about who you are, and what you should be doing,
you never forget anything until the next unplanned power event.

And besides, it's not how much processor power or memory you have or
how big your hard drive is that makes you attractive to me. It's the
little things we have shared together over time: dns, dhcp, sendmail,
apache. I could go on and on.

Whew...that was a close one.

Subject: 	Re: nfsv4?
Date: 	Wed, 27 Oct 2010 19:34:15 -0600
From: 	Theo de Raadt
To: 	<...>
CC: 	<...>, misc.openbsd.org
<snip>
shit which comes out of research organizations all tends to suck these
days, doesn't it.  or perhaps it always did (OSI networking, ipv6,
same same).

i have theorized in the past that the problem we face is
that an insufficient number of axe murderers are attending those kinds
of research meetings.

Load

Subject: 	Re: I don't get where the load comes from
Date: 	Wed, 01 Jun 2011 15:41:51 +0200
From: 	Benny Lofgren 


On 2011-06-01 15.12, Joel Wiramu Pauling wrote:
> Load is generally a measure of a single processor core utilization over a
> kernel dependent time range.

No it isn't. You have totally misunderstood what the load average is.

> Generally as others have pointed out being a very broad (not as in meadow,
> as in continent). Different OS's report load very differently from each
> other today.

That one's sort of correct, although I've yet to see an OS where the load
doesn't in some way refer to an *average* *count* *of* *processes*.

> Traditionally you would see a load average of 1-2 on a multicore system (I
> am talking HP-UX X client servers etc of the early 90's vintage). a Load
> average of 1 means a single core of the system is being utilized close to
> 100% of the time.

No, no, no. Absolutely *NOT*. It doesn't reflect CPU usage at all.

And it never has. The load average must be the single most misunderstood
kernel metric there has ever been in the history of unix systems.

Very simplified, it reflects the *number* *of* *processes* in a runnable
state, averaged over some time. Not necessarily processes actually on
core, mind you, but the number of processes *wanting* to run.

Now, a process can be in a runnable state for a variety of reasons, and
there is for example nothing that says it even needs to use up its
allotted time slice when actually running, but it still counts as
runnable. It can be runnable while waiting for a system resource; then it
consumes *no* CPU cycles at all, but it still counts towards the load
average.

> On dual core systems a load average of 1 should be absolutely no cause for
> concern.

I routinely see load averages of 30-40-50, upwards of 100 on some of my
systems. They run absolutely smooth and beautiful, with no noticeable lag
or delays. The processors may be near idling, they may be doing some work,
it varies, but it is nothing I can tell from the load average alone.

> Linux has moved away from reporting load average as a percentage of a single
> core time in recent days for precisely this reason, people see a load of 1
> and think their systems are esploding.
> 
> In the traditional mold todays processors should in theory get loads of 4-7
> and still be responsive...

I'm sorry to say, but your entire text is based on a misunderstanding of
what the load average really is, so the above sentences are totally
irrelevant.


Regards,
/Benny


> On 31 May 2011 19:10, Joel Carnat <...> wrote:
> 
>>> On 31 May 2011 at 08:10, Tony Abernethy wrote:
>>> Joel Carnat wrote
>>>> well, compared to my previous box, running NetBSD/xen, the same services
>>>> and showing about 0.3-0.6 of load; I thought a load of 1.21 was quite
>>>> much.
>>>
>>> Different systems will agree on the spelling of the word load.
>>> That is about as much agreement as you can expect.
>>> Does the 0.3-0.6 really mean 30-60 percent loaded?
>>
>> As far as I understood the counters on my previous nbsd box, 0.3 meant
>> that the cpu was used at 30% of its total capacity. Then, looking at the
>> sys/user counters, I'd see what kind of things the system was doing.
>>
>>> 1.21 tasks seems kinda low for a multi-tasking system.
>>
>> ok :)
> 
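
As a rough illustration of what Benny describes above -- the load average as a
decaying average of the *count* of runnable processes, not a CPU percentage --
here is a minimal C sketch of that kind of calculation. The 5-second sampling
interval, the function names and the simple exp() decay factors are assumptions
for illustration only; they are not taken from OpenBSD's kernel.

/*
 * Toy model of a 4BSD-style load average: an exponentially decaying
 * average of the run-queue length, sampled every few seconds.  The
 * input is a count of runnable processes, never a CPU-usage percentage.
 */
#include <math.h>
#include <stdio.h>

#define SAMPLE_INTERVAL 5.0           /* seconds between samples (assumed) */

static double loadavg[3];             /* 1-, 5- and 15-minute averages */
static const double periods[3] = { 60.0, 300.0, 900.0 };

/* Called once per sample with the current number of runnable processes. */
static void
update_loadavg(int nrunnable)
{
	for (int i = 0; i < 3; i++) {
		double decay = exp(-SAMPLE_INTERVAL / periods[i]);
		loadavg[i] = loadavg[i] * decay + nrunnable * (1.0 - decay);
	}
}

int
main(void)
{
	/* Pretend 3 processes stay runnable for one minute, then none do. */
	for (int t = 0; t < 12; t++)
		update_loadavg(3);
	printf("after the busy minute:  %.2f %.2f %.2f\n",
	    loadavg[0], loadavg[1], loadavg[2]);
	for (int t = 0; t < 12; t++)
		update_loadavg(0);
	printf("after a quiet minute:   %.2f %.2f %.2f\n",
	    loadavg[0], loadavg[1], loadavg[2]);
	return 0;
}

Compiled with -lm, this prints a 1-minute value climbing toward 3 while the
15-minute value barely moves, then the 1-minute value falling back once the
queue empties; none of it says anything about how hard the CPUs actually worked.
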
Subject: 	Re: I don't get where the load comes from
Date: 	Wed, 01 Jun 2011 17:26:37 +0200
From: 	Benny Lofgren


On 2011-06-01 15.53, Joel Wiramu Pauling wrote:
> On 2 June 2011 01:41, Benny Lofgren <...
> <mailto:...>> wrote:
> I agree with what you are saying, and I worded this quite badly. The
> frame I was trying to set up was "back in the day" when multi-user meant
> something (VAX/PDP) - the load average WAS tied to core utilization - as
> you would queue a job, and it would go into the queue and there would be
> lots of stuff in the queue and the load average would bump, because
> there wasn't much core to go around.

Not wanting to turn this into a pissing contest, I still have to say that
you are fundamentally wrong about this. I'm sorry, but what you are saying
simply is not correct.

I've worked in-depth on just about every unixlike architecture there is
since I started out in this business back in 1983, and on every single
one (that employed it at all) the load average concept has worked
similarly to how I described it in my previous mail. (Not always EXACTLY
alike, but the general principle has always been the same.)

The reason I'm so adamant about this is that the interpretation of the
load average metric truly is one of the longest-standing misconceptions
about the finer points of unix system administration there is, and if
this discussion thread can set just one individual straight about it
then it is worth the extra mail bandwidth. :-)

One only needs to look at all of the very confident, yet dead-wrong,
answers to the OP's question in this thread to realize that it is
indeed a confusing subject. And the importance of getting it straightened
out cannot be overstated. I've long ago lost count of the number of
times I've been called in to "fix" a problem with high system loads
only to find that the only metric used to determine that is... yes,
the load average. I wonder how much money has been wasted over the
years trying to throw hardware at what might not even have been a
problem in the first place...


Regards,
/Benny



> That hasn't been the case for a very very long time and once we entered
> the age of multi-tasking, load became unintuitive.
> 
> Point being it's an indication of something today that isn't at all
> intuitive.
> 
> Sorry for muddying the waters even more, my fuck up.
> 
> 
>     <snip>
Subject: 	Re: I don't get where the load comes from
Date: 	Wed, 01 Jun 2011 09:49:17 -0600
From: 	Theo de Raadt
To: 	Benny Lofgren


> <snip>
> 
> The reason I'm so adamant about this is that the interpretation of the
> load average metric truly is one of the longest-standing misconceptions
> about the finer points of unix system administration there is, and if
> this discussion thread can set just one individual straight about it
> then it is worth the extra mail bandwidth. :-)

100% right.  The load average calculation has not changed in 25 years.
Anyone who says otherwise hasn't got a single fact on their side.

What has changed, however, is that the kernel has more kernel threads
running (for instance, ps aguxk, and look at the first few which have
the 'K' flag set in the 'STAT' field).

Some kernels have decided to not count those threads, others do count
them.  Since these kernel threads make various decisions for when to
do their next tasks and how to context switch, the statistical
monitoring of the system which ends up creating load values can get
perturbed.

That's what this comes down to.
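
To make Theo's point concrete, here is a hypothetical sketch (the field names
are invented, not OpenBSD's): whether or not the walk over runnable processes
counts kernel threads changes the number fed into the decaying average, and so
changes the reported load.

#include <stddef.h>
#include <stdio.h>

struct proc {
	int          p_runnable;   /* ready to run right now? */
	int          p_kthread;    /* kernel thread (the 'K' ps STAT flag)? */
	struct proc *p_next;
};

static int
count_runnable(const struct proc *allproc, int count_kthreads)
{
	int n = 0;

	for (const struct proc *p = allproc; p != NULL; p = p->p_next) {
		if (!p->p_runnable)
			continue;
		if (p->p_kthread && !count_kthreads)
			continue;          /* policy: skip kernel threads */
		n++;
	}
	return n;                          /* fed into the decaying average */
}

int
main(void)
{
	struct proc k = { 1, 1, NULL };    /* a runnable kernel thread */
	struct proc u = { 1, 0, &k };      /* a runnable user process */

	printf("counting kthreads: %d, skipping them: %d\n",
	    count_runnable(&u, 1), count_runnable(&u, 0));
	return 0;
}
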

On the topic of clustering

Subject: 	Re: openbsd clusters
Date: 	Sat, 22 Dec 2012 22:43:54 -0500
From: 	Nick Holland


On 12/22/12 07:54, Friedrich Locke wrote:
...
> But for other services i don't have now what i could use. A example: i need
> a file system that must expand by adding more machine in the network in a
> simple way.

In plain English: "I'm not thinking out the design carefully, so I'm
going to rely on fancy shit to haul my ass out of the fire when the
predictable (and not so predictable) happens."

You don't need that for your problem, you need that for the solution you
came up with for your problem.  Your solution is wrong.

You know your needs will change in the future, so build the whole system
around the idea of modular storage and other scalability design features
-- not "unlimited expandable storage".

Chunk your data from the very beginning.  In the case of a mail server,
part of the user's LDAP record indicates the storage unit where it is
stored.

Yes, this is a better design.
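
A minimal sketch of the kind of routing Nick describes, with the caveat that
the names here are made up and a static table stands in for the LDAP lookup:
each user's directory record carries a field naming its storage unit, so adding
capacity means adding another unit and pointing new users at it, not growing
one big pool.

#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct user_record {
	const char *login;
	const char *mail_store;   /* hypothetical attribute naming the box */
};

static const struct user_record directory[] = {
	{ "alice", "store1.example.com" },
	{ "bob",   "store2.example.com" },
};

static const char *
store_for(const char *login)
{
	for (size_t i = 0; i < sizeof(directory) / sizeof(directory[0]); i++)
		if (strcmp(directory[i].login, login) == 0)
			return directory[i].mail_store;
	return NULL;
}

int
main(void)
{
	printf("deliver alice's mail to %s\n", store_for("alice"));
	return 0;
}

Retiring or replacing one storage unit then means moving only that unit's
users, one chunk at a time, instead of migrating a single giant pool.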

I've seen many designs where the answer was "toss it all in one pool,
let some 'advanced technology' keep my ass out of the fire."  They have
all been total shit.  Usual result: the "advanced technology" gathers
the kindling, splits the logs, lights the fire, and tosses your ass on
the pyre before you ever get around to the first "expansion".  If you
wish to argue that your "problem" is special, and requires One Big Pool
of Storage, feel free to tell me about it (off list), maybe someone's
got one.  More likely, you will be telling me about your SOLUTION which
requires one big pool, not the root problem.  (I'm not above learning
new stuff, but I'm done with assuming most people know something I don't
-- that's something that is really annoying to be wrong about, I'm finding).

Your design should incorporate (among other things):
* initial load handling.
* future load handling improvements.
* future storage upgrade.
* future storage REPLACEMENTS (you want to remove your three year old
storage module in favor of a new one ten times the size, but your six
month old one is still quite good)
* future complete solution replacements. (*)
* the simplest possible solutions that will accomplish the above within
acceptable business frameworks (i.e., not "we'll have our entire IT
staff working a major multi-day holiday because that's the only way we
can accomplish this")

Nick.


(*) if you ever wish to keep a closed source solution OUT of your
operations, this is your magic weapon to use with responsible, thinking
people.  Every closed source solution is built around the idea of
keeping you a captive customer.  But the fact is, if your business is
run well, in 50 years, it can still be around.  You will almost
certainly have to replace entire systems with competing products "some
day" -- your company's success should not be dependent upon a third
party remaining in business.  So, an exit strategy has to be part of any
good system design (even though it almost never is).  How are you going
to scrape your legacy data off your old system and install it into its
replacement?  When the APIs are proprietary, you won't...  Ask your
prospective vendor "If you go bankrupt or otherwise leave the business
next year, how will we move >OUR< data stored in your system to another
product?"  They will start with "We aren't going anywhere", which you
know they would say even if they weren't sure about getting their paychecks
next week.

'course, most people are not thinking about the long-term health of the
company, but the short-term "what can I stuff on my resume on my way out
the door before this blows up".