[smokeping-users] Packet loss calculations in Smokeping

Vern.Dias at VerizonWireless.com Vern.Dias at VerizonWireless.com
Mon Feb 12 16:51:59 CET 2007


Chris, I will attempt to clarify some of your points, but first:

I have no control over the disclaimer. It is added automatically to all
outgoing mail by the company's mail gateway.  Legal will do what they
are required to do.  I apologize in advance, but it's going to be on
anything I send.

> This is actually an RRD question, at least in part, which is not
> specific to Smokeping (Smokeping may be feeding the wrong data to
> rrdtool).

Agreed.  (Yes, I think it is).

> There are NO lost pings in the above record, i.e. all ten pings have
> values rather than being NAN.

I have never found any NaN values in the actual latency fields,
regardless of the number of lost pings.  To me this means that any
attempt at having rrdtool calculate an accurate packet loss value is
doomed from the start.  It looks to me as though rrdtool is attempting
to calculate all of the latency values itself.

> Perhaps this row which you give as an example was not really filled in
> at all, but interpolated between the previous and next rows?
> Alternatively, you do not say whether this record is a primary data
> point (PDP) or one of the archives.

Here is another example.  Notice from the time stamps that these are
PDPs:

1171260000: NaN 0.0000000000e+00 4.0000000000e-03 3.0780843433e-03
3.0780843433e-03 4.0000000000e-03 4.0000000000e-03 4.0000000000e-03
4.0000000000e-03 4.0000000000e-03 4.0000000000e-03 4.0000000000e-03
4.0000000000e-03

1171260300: NaN 9.2099200000e+00 4.0000000000e-03 3.0000000000e-03
3.0000000000e-03 4.0000000000e-03 4.0000000000e-03 4.0000000000e-03
4.0000000000e-03 4.0000000000e-03 4.0000000000e-03 4.0000000000e-03
4.0000000000e-03

1171260600: NaN 2.3430822667e+00 3.9949136000e-03 3.9949136000e-03
3.9949136000e-03 3.9949136000e-03 3.9949136000e-03 3.9949136000e-03
3.9949136000e-03 3.9949136000e-03 3.9949136000e-03 3.9949136000e-03
3.9949136000e-03

1171260900: NaN 0.0000000000e+00 4.0000000000e-03 4.0000000000e-03
4.0000000000e-03 4.0000000000e-03 4.0000000000e-03 4.0000000000e-03
4.0000000000e-03 4.0000000000e-03 4.0000000000e-03 4.0000000000e-03
4.0000000000e-03
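
For anyone who wants to check such rows for NaN latency fields
mechanically, here is a minimal sketch.  It assumes the 13 columns are,
in order, uptime, loss, median and ping1..ping10; that ordering is my
reading of the records above, and the function names are made up for
illustration.

```python
import math

# Assumed column order for one row: uptime, loss, median, ping1..ping10.
FIELDS = ["uptime", "loss", "median"] + [f"ping{i}" for i in range(1, 11)]

def parse_row(text):
    """Parse one 'timestamp: v1 v2 ...' record, even if mail wrapped it."""
    stamp, _, rest = text.partition(":")
    values = [float(tok) for tok in rest.split()]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} values, got {len(values)}")
    return int(stamp), dict(zip(FIELDS, values))

def nan_pings(row):
    """Return the names of the ping fields whose value is NaN."""
    return [name for name, value in row.items()
            if name.startswith("ping") and math.isnan(value)]

record = """1171260300: NaN 9.2099200000e+00 4.0000000000e-03 3.0000000000e-03
3.0000000000e-03 4.0000000000e-03 4.0000000000e-03 4.0000000000e-03
4.0000000000e-03 4.0000000000e-03 4.0000000000e-03 4.0000000000e-03
4.0000000000e-03"""

stamp, row = parse_row(record)
print(stamp, row["loss"], nan_pings(row))
```

For the second record this reports a loss of about 9.2 together with an
empty list of NaN ping fields, which is the mismatch I am describing.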


It seems to me that when packets are lost within a single poll, it is
up to Smokeping to feed actual data for all 10 pings to rrdtool, even
if one or more (or all) of the pings are timeouts.  I'll leave the
details of that to the developers of SmokePing; however, I would count
the number of ping requests that are lost and use that value for loss,
rather than have rrdtool try to calculate it.  Having rrdtool calculate
averages to fill in the holes within a single rrd record seems to me to
be a very bad approach for calculating loss data (although a good thing
for interface traffic data).  In the case of a completely missed poll
(due to a system restart or some such event), I would have no problem
with interpolation being used for latency values.
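
To illustrate the counting I have in mind, here is a minimal sketch.
It is not Smokeping's actual code; the function name and the sample
values are made up, and a timed-out ping is represented as None.

```python
from statistics import median

def summarise_poll(rtts):
    """Summarise one poll.

    rtts holds one entry per ping sent: the round-trip time in seconds,
    or None if that ping timed out.
    """
    answered = sorted(r for r in rtts if r is not None)
    loss = len(rtts) - len(answered)   # counted directly, never averaged
    med = median(answered) if answered else None
    return loss, med

# Ten pings, three of them timed out (made-up values).
poll = [0.004, None, 0.003, 0.004, None, 0.004, 0.004, None, 0.004, 0.003]
loss, med = summarise_poll(poll)
print(loss, med)
```

The point is that loss is always a whole number between 0 and the
number of pings sent, and the median only ever uses pings that were
actually answered.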

> Also, the loss value is not exactly 5 because of this interpolation
> and summarisation, and because RRD stores floating point numbers
> rather than exact ones.

I would have no problem with a packet loss value of 8.5 or 9.5.
However, as the example above clearly shows, we really have no idea of
the actual number of packets lost and therefore the actual packet loss
for this event.  Since we are using Smokeping data for outage reporting,
it is very important to us that the loss data be an accurate reflection
of the loss on the network.
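
As a rough illustration of how whole-number loss counts can turn into
fractional values, here is a sketch of a time-weighted average over one
step.  It is my understanding of the kind of interpolation being
discussed, not rrdtool's actual code; the function name and the numbers
are made up.

```python
def time_weighted_pdp(step_start, step_end, segments):
    """Average the segment values over one step, weighted by overlap.

    segments is a list of (start, end, value) tuples, where each value
    is taken to apply to the whole interval from start to end.
    """
    total = 0.0
    for seg_start, seg_end, value in segments:
        overlap = min(seg_end, step_end) - max(seg_start, step_start)
        if overlap > 0:
            total += value * overlap
    return total / (step_end - step_start)

# Two polls that do not line up with a 300 second step: the first lost
# 10 of 10 pings and ends 60 seconds into the step, the second lost none.
segments = [(-240, 60, 10), (60, 360, 0)]
print(time_weighted_pdp(0, 300, segments))   # 10 * 60 / 300 = 2.0
```

With these made-up numbers the stored value is 2.0, although no single
poll ever lost exactly 2 pings.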

Thanks, Vern


