Linux

UTOSC Day 1

My first day at the Utah Open Source Conference is just about done. I really need to get to bed so I don't fall asleep during my presentation tomorrow. That would be embarrassing. I only made it down at 6:30pm for the dinner and keynotes, which I'm glad I attended. Mac's talk was great and so was Paul's, although it did seem to drag on. Maybe that's because I needed to go to the bathroom. I also had a chance to visit with a few Linux newbies at my table, which is always fun. But the highlight of my day had to be meeting Harleypig. Thankfully, no speedos were involved.

Recovering After a MySQL Replication Failure

I just solved a weird MySQL replication problem and thought I would share with you all. First, the background. I have a master and slave running with one-way replication. The slave just sits by waiting for his time to shine, but otherwise doesn't do anything. Well, last week the master had a problem with the filesystem. I can't recall exactly what it was, out of space or something. It confused the heck out of the master and so it basically shut down. One of my co-workers fixed the problem and got the master running again, but the slave was in a pickle. Here is the error it was showing:

Relay_Master_Log_File: mysql-bin.031
     Slave_IO_Running: Yes
    Slave_SQL_Running: No
           Last_errno: 0
           Last_error: Query 'DELETE FROM foo WHERE bar = 1' caused different errors on master and slave.
                       Error on master: 'Got error %d from table handler' (1030), Error on slave:
                       'no error' (0). Default database: 'baz'
  Exec_master_log_pos: 118871

Because the statement had logged an error on the master but ran cleanly on the slave, the slave refused to continue. I can't say it's an altogether bad plan since data integrity is generally the main theme of a database (yes, cue the jokes about my using MySQL in the first place). The question became: how do I get the slave to start up again? "SLAVE STOP; SLAVE START" didn't have any effect.

The trick was suggested to me by a post at mysql.com which pointed out a tool new to me, mysqlbinlog. See, I figured the simplest thing would be to restart replication at the step just after the "failed" transaction, since I knew that transaction had actually succeeded. But I had no idea how the binlog counters work, so I couldn't just make up numbers. It's some kind of binary offset. Well, mysqlbinlog will show it to you.

# mysqlbinlog mysql-bin.031 -j 118871 |less

Which of course showed me this:

# at 118955
#080605 18:59:09 server id 1  log_pos 118955    Query   thread_id=3218  exec_time=0     error_code=0

So on my slave I restarted replication at offset 118955 and like magic, the slave ripped through the binlogs and caught up in practically no time at all.
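The post doesn't show the exact statement used for the restart. One common way to do it, sketched here as an assumption rather than a record of what was actually run, is CHANGE MASTER TO with the binlog file from Relay_Master_Log_File and the offset mysqlbinlog reported. The helper name restart_slave_sql is made up for illustration, and it only prints the statements so you can read them before sending them anywhere.

```shell
# Build the statements that repoint a slave at a given binlog file and offset.
# Assumption: the file name comes from SHOW SLAVE STATUS and the position from
# mysqlbinlog. Nothing is sent to a server here; the SQL is only printed.
restart_slave_sql() {
    log_file=$1
    log_pos=$2
    printf "STOP SLAVE;\n"
    printf "CHANGE MASTER TO MASTER_LOG_FILE='%s', MASTER_LOG_POS=%s;\n" "$log_file" "$log_pos"
    printf "START SLAVE;\n"
}

restart_slave_sql mysql-bin.031 118955
```

If the output looks right, it can be piped into the mysql client on the slave, for example `restart_slave_sql mysql-bin.031 118955 | mysql -u root -p`. Recent MySQL releases spell these statements differently (STOP REPLICA and friends), so check the manual for your version.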

VoIP QoS With Wondershaper

Hans and I were discussing QoS the other day, specifically regarding using Wondershaper from the LARTC. I had managed to mess mine up and I subsequently noticed a horrible turn for the worse in my VoIP calls. Wondershaper has to be adapted for use by OpenWRT and in the process I misspelled sch_ingress.o as sch_insmod.o. Too much insmodding that day, I think. The net effect was that download speeds were not shaped at all.
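For context on why a missing sch_ingress module kills download shaping: as far as I recall, the stock Wondershaper script handles the download side with an ingress qdisc plus a policing filter, roughly like the two tc commands below. This is a sketch from memory, not the OpenWRT-adapted script itself. The function name, the interface eth0 and the 800kbit rate are placeholders, and the commands are only printed, not run.

```shell
# Print (not run) the two tc commands that police inbound traffic.
# Assumptions: "dev" is your WAN interface and "downlink_kbit" is a bit below
# your measured download speed. Both are placeholders here.
ingress_commands() {
    dev=$1
    downlink_kbit=$2
    # Attach the ingress qdisc. This is the step that needs sch_ingress.
    printf 'tc qdisc add dev %s handle ffff: ingress\n' "$dev"
    # Drop anything arriving faster than the configured rate.
    printf 'tc filter add dev %s parent ffff: protocol ip prio 50 u32 match ip src 0.0.0.0/0 police rate %skbit burst 10k drop flowid :1\n' "$dev" "$downlink_kbit"
}

ingress_commands eth0 800
```

If the first command fails because the module never loaded, the filter has nothing to attach to and inbound traffic goes unpoliced, which matches the symptom described above.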

Once I got it corrected, I decided to do a few tests just to confirm that using Wondershaper actually made a difference. I'll cut to the chase for the lazy: it did. I made 45-second calls to music on hold from my softphone, Twinkle. In the background I had Wireshark running. I used the RTP analyzer in Wireshark to look at the statistics after all was said and done. I tested both versions of Wondershaper, CBQ and HTB. I had a single download running the whole time eating up all spare bandwidth.

With no shaping: 4.4% loss (95 packets), 60ms jitter
With CBQ Wondershaper: 0.2% loss (5 packets), 35ms jitter
With HTB Wondershaper: 0.3% loss (6 packets), 28ms jitter

So my unscientific conclusion is that both versions of Wondershaper work about the same and they both make a huge difference. I could easily hear the packet loss on the first call, but not so much on the other two calls.

Twinkle 1.1 Ubuntu Package

I built a Twinkle 1.1 Ubuntu package the other day. I wanted to try out the new buddy lists feature and the currently available version is only 1.0.1. This package is built on Gutsy Gibbon. I make no warranties of its successful functioning on your machine. But it does work just fine for me.

twinkle_1.1-0_i386.deb

More On Net Neutrality

Another great opinion on Net Neutrality which closely (if not exactly) mirrors my own. For those too lazy to go and read for themselves, here's a quick snippet.

We need policy to help cut a path for more competition, rather than protecting incumbents -- a Bandwidth Competition Act of 2008, not bogus net neutrality. All takers should be allowed access to poles or underground conduits. This is where neutrality should be enforced, instead of being a choke point.

As I've long said, a government bureaucracy isn't going to solve the problem. It's going to create less incentive for Internet companies (like mine, full disclosure) to even toss their hat in the ring. Try forming your own telephone system and you'll know what I mean. The rules are ridiculously complicated and it takes an army of lawyers to sort through them. Please please please don't turn the Internet into the phone system.

ZRTP

My bud Hans and I tonight tested out encrypted VoIP with ZRTP. I noticed a while back that Twinkle supports it and have wanted to test it out, but none of my desk phones support ZRTP.

It was fun. When the call terminated, Twinkle displayed a cute message about verifying the SAS (short authentication string). It was 4 characters (hprj, if you're curious) that represented our encryption key. It's the way ZRTP verifies that a man-in-the-middle attack is not underway. There was a padlock icon which we both clicked to verify that the SAS was correct. I'm not sure what if anything happened because of that, except that we both verified that our SIP phones have not been tapped by the feds.

In the SDP, ZRTP is advertised with "a=zrtp". It's not a separate protocol per se. The actual codec was selected through the normal means (we used speex/16000). Looking at the RTP data, I see a whole bunch of "AES256", "SHA256" and "DH4096". Presumably that's part of the ZRTP negotiation. I didn't delve further. What I see though is that the encrypted data is simply represented as Speex RTP, but the actual data has been scrambled so it would be meaningless to a passerby.

Based on this testing, I predict good things for ZRTP. It was quite painless to use as a caller. As long as it's enabled by default in the phone, there's really nothing else that a user has to do to use it. The SAS is short and you only have to verify it if you care. Phil Zimmermann says that you don't even have to verify the SAS every time. Just once in a while is good enough. And obviously anytime you're conducting private business (which is not the same thing as illegal business). The simple fact that ZRTP is used every time means that you can't tell whether a call is valuable or not just based on it being encrypted.

The one possible failure of ZRTP is that it doesn't hide any of the signalling data, so a spy would be able to see who you were calling. That problem would be quite hard to solve. I'm not sure of the benefit either as the cost to mask that information is much higher. You pretty much have to know all the routing information ahead of time. Even then, an eavesdropper could still see the two IP addresses involved, which will give away some amount of information. So for now, ZRTP is a good solution.

Chown

Sometime last week I happened upon a handy little shortcut when using the chown command. I mistakenly keyed in the command wrong and it turned out to work, so I investigated. Turns out that by leaving off the group name, but leaving the colon, chown will automatically use the default group of the specified user. That's so handy. What's surprising is how much I really use that trick. Why, I must save literally seconds every other day or so. That's gonna add up, baby.

Here's an example for you impatient, graphical learners:

chown tensai: file.txt

Grudge Match: scp, tar+ssh, rsync+ssh

The question came up today about relative speeds of scp, tar and rsync (the latter two using ssh as a transport mechanism). While anecdotes and rumors are great for defining security policy (think TSA), I wanted some more concrete numbers so I ran a test.

I set up a script to copy a directory 5 times from my laptop to a server on the same subnet. I routinely pull 3MB/s from that server (over wifi), so bandwidth wasn't an issue. I used /var/lib/dpkg as my source directory. It weighed in at 57MB and contained 6896 files. Because rsync will compare changes between source and destination, I made sure to nuke the directory off the server after every run.

Method:               scp  rsync+ssh   tar+ssh
Average Time:     269.75s      33.6s    24.43s
Bandwidth (Mbps):    1.69      13.57     18.66

The results are what I expected, at least as far as scp is concerned. It does not do well with large numbers of small files. It copied each file over completely before it started with the next one. Tar of course put the whole thing together and then shipped it off. Rsync read all the files first, then compared them to the server and then shipped them all in one go. Apparently there were some significant I/O savings to be had that way.

One other important item of note is that scp did not handle symlinks the way tar and rsync did. It dereferenced the symlink and copied the contents of that link rather than copying the link itself. That was a problem because I had picked some self-referential directories before I settled on /var/lib/dpkg.
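The bandwidth row in the table above is just the 57MB directory size converted to megabits (57 x 8 = 456) and divided by the average time. A small sketch of that arithmetic follows. The function name mbps is made up, and it assumes awk is available.

```shell
# Convert a transfer size in megabytes and a duration in seconds
# to megabits per second, rounded to two decimals.
mbps() {
    awk -v mb="$1" -v secs="$2" 'BEGIN { printf "%.2f\n", mb * 8 / secs }'
}

mbps 57 269.75   # scp
mbps 57 33.6     # rsync+ssh
mbps 57 24.43    # tar+ssh
```

The last digit for tar can come out one off from the table, presumably because the averages shown there are themselves rounded.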

For your reference, here are the commands I ran to test:

for i in 1 2 3 4 5; do time scp -qrp /var/lib/dpkg [server]:/tmp; ssh [server] rm -fr /tmp/dpkg; done
for i in 1 2 3 4 5; do time rsync -ae ssh /var/lib/dpkg [server]:/tmp; ssh [server] rm -fr /tmp/dpkg; done
for i in 1 2 3 4 5; do time tar -cf - /var/lib/dpkg |ssh [server] tar -C /tmp -xf - ; ssh [server] rm -fr /tmp/dpkg; done

SNMP Watch Script

We have some mail servers which occasionally get way behind on their mail queue. I wanted a good way to see the size of the queue in real time, without having to log into the web interface of the machine (it's a proprietary device). So I wrote this script which not only prints out the current value of the SNMP OID, but also tracks the value so you can see if it's increasing and if so, by how much. It could easily be adapted to any numeric SNMP variable.

watch-snmp.pl
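The linked script is Perl and isn't reproduced here. For anyone who just wants the idea, here is a minimal shell sketch of the same approach: poll one numeric OID and print the change since the last sample. It assumes the net-snmp snmpget tool and SNMP v2c. The function names, and the host, community and OID you pass in, are all placeholders and not taken from the original script.

```shell
# Print the current value and, if there is a previous sample, the change.
report_delta() {
    last=$1
    now=$2
    if [ -z "$last" ]; then
        printf '%s\n' "$now"
    else
        printf '%s (%+d)\n' "$now" "$((now - last))"
    fi
}

# Poll one numeric OID until interrupted.
# Assumption: net-snmp's snmpget is installed; -Oqv prints the bare value.
watch_oid() {
    host=$1 community=$2 oid=$3 interval=${4:-5}
    prev=
    while :; do
        cur=$(snmpget -v2c -c "$community" -Oqv "$host" "$oid") || return 1
        report_delta "$prev" "$cur"
        prev=$cur
        sleep "$interval"
    done
}
```

Something like `watch_oid mailhost.example.com public <your-queue-oid> 10` would then print a new line every 10 seconds, with the hostname and OID swapped for whatever your device exposes.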

Prefix Area Code With SER

I'm setting up a SER server for routing SIP calls to a PSTN gateway. Calls to the PSTN should always be 10 digits, just to remove any confusion, but I want to allow people to dial just 7 digits. I needed a way to look at the caller's phone number and prefix their destination number with their own area code. Hans fought this issue previously and couldn't get it to work just with SER, but instead had to exec a script. Not ideal, but it'll work. He didn't have the script at hand so I wrote my own in Perl.

First, the SER config snippet:

if (uri =~ "^sip:[0-9]{7}@") {
   exec_dset("/usr/local/ser/add_areacode.pl");
}

And the script:

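#!/usr/bin/perl

use strict;
use warnings;

my $DEFAULT_AREA_CODE = '307';
my $to   = $ARGV[0];           # request URI handed over by exec_dset
my $from = $ENV{SIP_HF_FROM};  # From header, read from the environment

# if it's not a 7 digit number, hand it back untouched
output($to) if ($to !~ m/sip:\d{7}\@/);

# if no from header. weird, but whatever.
output($to) if (!$from);

# find the caller's area code. Capturing in list context leaves $area
# undefined when the match fails, instead of trusting a stale $1.
my ($area) = $from =~ m/<sip:(\d{3})\d{7}\@/;

# no area code? just assume one
$area = $DEFAULT_AREA_CODE if (!$area);

# prefix the dialed number with the area code
$to =~ s/sip:/sip:$area/;
output($to);

# print the resulting URI for SER to pick up, then quit
sub output
{
   my $result = shift;
   print $result;
   exit 0;
}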
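
To sanity-check the rewrite rules without a running SER, the same logic can be sketched as a shell function. This is an illustration, not part of the SER setup: the function name add_area_code and the example.com addresses are made up, and the From header is passed as a second argument instead of coming from the environment.

```shell
# Prefix a 7-digit SIP URI with the caller's area code.
# $1 = destination URI, $2 = From header (may be empty).
add_area_code() {
    to=$1
    from=$2
    default_area=307
    # Anything that is not exactly 7 digits before the @ passes through.
    case $to in
        sip:[0-9][0-9][0-9][0-9][0-9][0-9][0-9]@*) ;;
        *) printf '%s\n' "$to"; return 0 ;;
    esac
    # No From header: pass through untouched, like the Perl script does.
    if [ -z "$from" ]; then
        printf '%s\n' "$to"
        return 0
    fi
    # Pull the first 3 digits out of a 10-digit caller number, if there is one.
    area=$(printf '%s\n' "$from" | sed -n 's/.*<sip:\([0-9]\{3\}\)[0-9]\{7\}@.*/\1/p')
    printf 'sip:%s%s\n' "${area:-$default_area}" "${to#sip:}"
}

add_area_code 'sip:5551234@example.com' '"Bob" <sip:8015551234@example.com>'
```

A 10-digit destination comes back unchanged, and a caller whose number has no recognizable area code gets the default.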