The Hipster PDA

I've been rocking a variant of the hipster PDA for about a week now. I've been using a small spiral quad-ruled, microperf notebook and a mechanical pencil instead of their suggestion of index cards, binder clips, and a pen.

Since moving to Japan, this has been pretty much invaluable. I can copy down any information I want (in English or Japanese), including making more useful directions out of the routes that Google Maps gives me (example). Blocks are numbered instead of streets being named here, so Google's algorithm that works so well in Western countries utterly fails to give useful Japanese directions. Since I can't print out the map anyway, I'm left to analyze it and write directions into the notebook that make sense, like "Take the right between blocks 23 and 24." That's usually more useful on the ground anyway. If I overshoot and find myself along block 24, I know that I've gone too far.

Using the hipster PDA also means that I can do random reads and writes without draining the battery. If I were out in the middle of nowhere and didn't know where my next electricity was coming from, this would be an even better thing. But, my dictionary already takes batteries; having one fewer device to charge when I get home is nice. Sometimes, the simplest answer really is the best.




In the spirit of all the *top tools, I wrote a little one for WiFi today. It's based on one I did earlier for measuring throughput using ipfw pipe counters, to see how much bandwidth things like streaming video and minecraft really take (answers: surprisingly little, and unsurprisingly a lot).

What I really need to do is abstract this out into a more generalized topify type of class, so that I can just write the data retrieval and data formatting functions, and then get a top-like auto-updating display out of it. The script may only work on xterm-compatible terminals, because it clears the screen with a hard-coded escape sequence. That is still much easier than installing and configuring the terminal control gems for ruby.

Usage: ./essidtop.rb wlan0. This will require root privileges, and you should probably change the #! line if your ruby is installed into someplace other than /usr/bin. Probably requires ruby 1.9 (because of the case statement).


#!/usr/bin/ruby

# Scan with iwlist and return a hash of
# cell number => { :mac, :essid, :channel, :snratio, :key }.
def get_stats(card)
	vals = {}

	cell = nil
	current = {
		:mac     => '??:??:??:??:??:??',
		:essid   => 'Unknown',
		:channel => '-1',
		:snratio => 0,
		:key     => '???'
	}

	%x[/usr/sbin/iwlist #{card} scan].each_line do |ln|
		ln.gsub!(/^\s+/, '')
		ln.gsub!(/\s+$/, '')

		case ln
			when /^Cell ([[:digit:]]+) - Address: ([A-Fa-f[:digit:]:]+)$/ then
				# A new cell starts here: store the previous one first.
				unless (cell.nil?) then
					vals[cell] = current
					current = {
						:mac     => '??:??:??:??:??:??',
						:essid   => 'Unknown',
						:channel => '-1',
						:snratio => 0,
						:key     => '???'
					}
				end

				cell = $1
				current[:mac] = $2

			when /^ESSID:"(.*)"$/ then
				current[:essid] = $1

			when /^Channel:([[:digit:]]+)$/ then
				current[:channel] = $1

			when /^Quality=([[:digit:]]+)\/([[:digit:]]+)\s/ then
				# Express link quality as a percentage.
				current[:snratio] = (100 * ($1.to_f / $2.to_f)).to_i

			when /^Encryption key:(on|off)/ then
				current[:key] = $1

			else
				#puts "Unmatched: '#{ln}'"
		end
	end

	# Store the last cell seen.
	unless (cell.nil?) then
		vals[cell] = current
	end

	vals
end

card = ARGV[0]

loop do
	stats = get_stats card

	# Home the cursor and clear the screen, but only on a terminal.
	if ($stdout.isatty) then
		print "\e[H\e[2J"
	end

	# foreach key in stats
	printf "Cell %-17s Ch SN Enc ESSID\n", 'Mac'
	stats.each do |k,v|
		# print out line
		printf "%-4s %-17s %02d %02d %3s %s\n", k, v[:mac], v[:channel].to_i, v[:snratio], v[:key], v[:essid]
	end

	sleep 10
end
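
The "topify" idea mentioned above might look roughly like the sketch below. Everything in it is hypothetical: the topify name, its arguments, and the stand-in data are made up for illustration, and the screen clear reuses the same xterm-style escape sequence as the script.

```ruby
# Hypothetical helper; the name and signature are illustrative only.
CLEAR_SCREEN = "\e[H\e[2J" # cursor home, then erase the display

# fetch:  callable that returns fresh data
# format: callable that turns the data into a string (or an array of lines)
def topify(fetch, format, interval: 10, iterations: nil, out: $stdout)
  count = 0
  loop do
    data = fetch.call
    out.print CLEAR_SCREEN if out.isatty
    out.puts format.call(data)
    count += 1
    # iterations is only there so the loop can be stopped in examples.
    break if iterations && count >= iterations
    sleep interval
  end
end

# Usage with stand-in data instead of a real iwlist scan.
fake_scan = -> { { '01' => 'HomeNet', '02' => 'CoffeeShop' } }
as_lines  = ->(cells) { cells.map { |cell, essid| "#{cell} #{essid}" } }
topify(fake_scan, as_lines, interval: 0, iterations: 1)
```

With something like this, essidtop would shrink to a get_stats call for the fetch half and the two printf lines for the format half.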


AI-generated News Articles

This article in the New York Times yesterday points out that machines are now generating summaries of sports games and financial data. StatSheet (now Automated Insights) has also been doing this for years, apparently. Comments of the singularity approaching abound...

My question is: is this really all that newsworthy?

Yes, it's a leap forward in AI-generated content. It reads fairly well, without the repetition and madlibs style of other generated content. But is it really all that different from what we've already been doing? Algorithmically, of course it is. But, merge-sort isn't any more an AI performing a sort than wait-sort. Sure, it's a faster (and, arguably better) algorithm, but it's still just a divide-and-conquer sorting algorithm. See also encryption: same problem; different algorithms to solve it.

What these companies have really done is taken sports and financial reporting and distilled them into an algorithm that works well in most cases. They're taking games with well defined and understood sets of rules, and analyzing past data to summarize them. It's no more approaching an AI singularity than the madlibs are. This isn't to belittle their success, but I doubt we'd be as impressed if they had these algorithms "sportscasting" checkers. I'm sure it could be done ("In a stunning 18th play move, black managed an astonishing triple-jump to retake the lead"), but it would just be an academic paper, not an industry.

So, the question of success becomes whether or not this is useful. The main two features of auto-generated news are timely reporting and SEO. Timely reporting seems to be a red herring for the sports world; if you care, you'll watch the game or listen to it on the radio. Having a summary up in under a minute is a neat trick, but it's only a side-show to the main event. Which leaves the arms race of SEO. Sure, good SEO is important, but it's only a matter of time until Google and Microsoft revise their algorithms to make these contributions less important. All it would take is making sure that content is generated in a "reasonable" amount of time for a human writer. Then the algorithm will have to have wait time added to it, and you lose the speed argument in favor of auto-generated content. At that point, this becomes solely a cost-cutting measure, and that's generally a race to the bottom. So, eventually we'll have companies pop up with madlib algorithms that are dirt cheap, and it's easy to see watching the game as a better use of time for consumers.

Get me an AI that can write a compelling Science Fiction novel and I'll start paying attention.



Documentation for Fun and Profit with AsciiDoc!

(editor's note: neither fun nor profit guaranteed)

For the last week, I've been writing documentation of various things in preparation for leaving my current job. We have one app in particular that had fairly sparse end-user documentation, and it's safe to say that I'm the only perlaphile in the office, so I started using vim to write a README file for it.

One thing led to another, and vim started syntax highlighting the file on me, so I pulled up a quick :se syntax to see what the heck was going on.

Lo and behold, I found out that vim uses AsciiDoc for its README (and .txt) syntax highlighting. This little gem makes writing documentation in plaintext pretty easy, and before I knew it, I was knee deep in constructing a man page for the scripts that I was documenting.

The basics of AsciiDoc's man syntax are simple and straightforward:

= A_PROGRAM(1) =

NAME
----
a_program - This program does something cool.

SYNOPSIS
--------
a_program [--frob]

OPTIONS
-------
--frob::
  twiddles the frobs (disable frob-twiddling with --no-frob)

BUGS
----
Does not actually twiddle frobs.


...and so on. The top line is the document title, and has to include a section number to be valid for man. Then, every section (NAME and SYNOPSIS are required, and must be first) is underlined. The :: syntax is for a definition list, which works great for documenting argument lists. It's so easy and neat to be able to write man pages in something easier than perldoc that I actually was accused of enjoying writing documentation.

The real gem today came in the form of a Makefile, though:

all: a_program.1 a_program.html
%.1: %.txt
	asciidoc -b docbook -d manpage -o - $< | docbook2man -
%.html: %.txt
	asciidoc -b html -d manpage -o $@ $<

Now, with just make, I can generate both the man pages, and HTML documentation for people who are into that sort of thing.



You Cannot Get Ye Telnet

Debugging a mail setup at work (on Windows :(), I tried unsuccessfully to find a program to just open up a TCP connection to a random host and port. But, we do have git installed, which means a full copy of bash. And that means that /dev/tcp and /dev/udp are available, despite /dev not existing.

So, I wrote telnet in bash (extra lines added by... chrome?blogger):



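#!/bin/bash

HOST=$1
PORT=$2

# Open a read/write TCP connection on file descriptor 3.
exec 3<>"/dev/tcp/$HOST/$PORT" || exit 1

# Background loop: print whatever the server sends.
while read line ; do
        echo "$line"
done <&3 &

# Foreground loop: send each typed line with a CRLF ending.
while read line ; do
        echo -en "$line\r\n" >&3
done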

It's not a perfect drop-in replacement, but now I can at least test email:

$ ./telnet.bash alt1.aspmx.l.google.com 25
220 mx.google.com ESMTP c64si6457617yhj.60
HELO localhost.localdomain
250 mx.google.com at your service
MAIL From:<aaron@xxxxx>
250 2.1.0 OK c64si6457617yhj.60
RCPT To:<aaron@xxxxx>
250 2.1.5 OK c64si6457617yhj.60
DATA
354 Go ahead c64si6457617yhj.60
Subject: test
Date: Mon, 08 Aug 2011 16:16:29 -0700
From: "Aaron Fellin" <aaron@xxxxx>
To: "Aaron Fellin" <aaron@xxxxx>

This is a test email.

I <3 bash
.
250 2.0.0 OK 1312845535 c64si6457617yhj.60
QUIT
221 2.0.0 closing connection c64si6457617yhj.60



Now where did I put that file...

Hacking on the backup scripts at work, we realized that all the scripts had hard-coded paths to the installation directory. I could have just made the changes on a new branch (actually, I did), but anything that can be done can be overdone.

Original scripts:


. /path/to/here/config

In PHP, I would have just used something along the lines of dirname(__FILE__). Bash kind-of has an analog of __FILE__ with $0, which works great, except that this script is meant to be run from cron:

ln -s /usr/local/bin/backup /etc/cron.daily/

In this instance, $0 is /etc/cron.daily/backup, which obviously isn't going to work correctly.

Turns out, there's a utility readlink for exactly this purpose. If the argument is a symbolic link, it prints just the target to stdout. Otherwise, it exits with error code 1. So, getting dirname(__FILE__) in bash makes the script start out:


. "$(dirname "$(readlink "$0" || echo "$0")")/config"

Either $0 is a symlink, and we grab the path to where it points, or it's a real file, and we just echo it. Either way, what gets passed to dirname is a real file, and so we get the correct path whether the script runs from cron, or logged into the shell.
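
For comparison, here is the same "follow the symlink, otherwise use the path as-is" logic as a small ruby sketch. The script_dir helper is a made-up name for illustration. Like the bash one-liner, it follows only one level of symlink and does not resolve a relative link target against the link's directory.

```ruby
# Hypothetical helper mirroring `readlink $0 || echo $0` followed by dirname.
def script_dir(path)
  # File.readlink raises on a regular file, so check for a symlink first.
  target = File.symlink?(path) ? File.readlink(path) : path
  File.dirname(target)
end

# $0 is the invoked script name, just as in bash.
puts script_dir($0)
```

Ruby's File.realpath would resolve chains of links and relative targets as well, at the cost of no longer matching the one-liner step for step.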



Building fusecompress on CentOS 5.5

Setting up a backup solution using rdiff-backup, I realized that we would quickly run out of space without compression. Enter fusecompress: a fuse module to transparently compress the files in the filesystem. There's no RPM available for CentOS, but the current version of the code should behave properly on CentOS 6.

First, there are some build dependencies to install. Since lzma compression in fusecompress is so breaky, I decided to compile without it. Development Tools is roughly the equivalent of Debian's build-essential.

yum groupinstall 'Development Tools'
yum install boost boost-devel boost141-iostreams
yum install fuse fuse-devel zlib-devel bzip2-devel lzo-devel

Now get the source from github and compile it; should be easy, right...

su - build
git clone git://github.com/tex/fusecompress.git
cd fusecompress
./configure --with-z --with-bz2 --with-lzo2 --without-lzma

Of course, there are complaints. CentOS 5.5 ships with a 2.6.18 kernel, and Boost v1.33.1. Fusecompress uses syscalls from the 2.6.22+ kernel, and Boost v1.44+.

The first step is to revert the commit that made the code depend on Boost v1.44:

git revert 9d5137d7d067151a9822b40e3687b0f645b33937

Then, I coded up implementations of futimens(3) and utimensat(2). These allow nanosecond control of file atime and mtime attributes. The old versions are now deprecated, and "only" have microsecond precision. These implementations just use the fact that 1ns = 1/1000µs, and drop anything finer than a microsecond.

#include <fcntl.h>
#include <sys/time.h>

/* Emulate futimens(3) on top of futimes(3), truncating to microseconds. */
int futimens(int fd, const struct timespec times[2]) {
   struct timeval tv[2];
   tv[0].tv_sec  = times[0].tv_sec;
   tv[0].tv_usec = times[0].tv_nsec / 1000;
   tv[1].tv_sec  = times[1].tv_sec;
   tv[1].tv_usec = times[1].tv_nsec / 1000;

   return futimes(fd, tv);
}

/* Emulate utimensat(2) on top of futimesat(2); flags is ignored. */
int utimensat(int dirfd, const char *pathname, const struct timespec times[2], int flags) {
   struct timeval tv[2];
   tv[0].tv_sec  = times[0].tv_sec;
   tv[0].tv_usec = times[0].tv_nsec / 1000;
   tv[1].tv_sec  = times[1].tv_sec;
   tv[1].tv_usec = times[1].tv_nsec / 1000;

   return futimesat(dirfd, pathname, tv);
}

Then some hacking of the makefiles to include my new .hpp and .o files in the outputs, and I have a working fusecompress for CentOS 5.5.



Upcasting Failure: PHP and String Insanity

One of the most awesome bugs to bite me in PHP is how it handles comparisons. For any two strings, everything's fine. For any two ints, the comparison is what you expect. But, what happens when you compare a string to an int?

if ('astring' == 0) {
        echo 'insane';
}

This happy little script will print the fact that PHP is, in a word, insane. Clearly, 0 and 'astring' are not the same thing; not even close. 'astring' isn't even falsy.

What's happening here is that PHP is casting both to an integer, instead of to a string. And the rules in PHP say that, since string 'astring' has no leading numerals, it's converted to the number 0. Clearly not what anyone coming from a strongly typed language would expect. The "correct" way is, easily enough, to use === instead of ==.

I thought the insanity stopped there, but then I ran into another gem. We have a select box in one app that allows the user to choose hours, days, weeks, and months:

    <option value="1/24">Hourly</option>
    <option value="1">Daily</option>
    <option value="7">Weekly</option>
    <option value="31">Monthly</option>

In the app, we handled this with a switch. It worked beautifully until the time came to add in support for the 'Hourly' option. Javascript kept anyone from actually selecting 'Hourly' until we were done with the backend to support it, but the client wanted to see where the option would be in the final interface.


switch ($time) {
    case 31:
        // monthly processing
        break;

    case 7:
        // weekly processing
        break;

    case 1: default:
        // daily processing
        break;

    case '1/24':
        // @TODO: add support for this
        break;
}

PHP compares the case labels using == logic, so the string '1/24' is cast to the integer 1 and matches case 1 before the '1/24' label is ever reached. This means everything was coming up daily instead of hourly. Instead, it's better to just leave everything in the switch as a string. Since $time comes in through the $_POST vars, it's a string regardless, and will compare correctly.
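
The cast behind both bugs can be imitated in ruby with an explicit String#to_i, which follows a similar leading-digits rule. This is only an analogy: loose_int_equal? is a made-up helper, and PHP's real conversion has more cases (decimals, exponents) than to_i does.

```ruby
# Ruby never converts implicitly; to_i is called by hand here only to
# imitate the string-to-integer cast described above.
def loose_int_equal?(str, int)
  str.to_i == int
end

puts loose_int_equal?('astring', 0) # no leading digits, so to_i gives 0
puts loose_int_equal?('1/24', 1)    # to_i stops at the slash and gives 1
puts loose_int_equal?('7', 7)
puts 'astring' == 0                 # ruby's own == does not coerce
```

The first three lines print true and the last prints false, which is the behavior I expected from PHP in the first place.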

But really, why does 0 == 'astring' in the first place, PHP?



searching for a value in an array in javascript

Just a little thing I happened to whip up for work the other day. I can't believe that this wasn't around in quite this format on the internets, yet.

One of the guys at work has a fondness for ruby because of how easy it makes it to deal with collections and arrays. I agree that's one of the hallmarks of a good language; there's a reason nobody stays sane long doing arrays in bash.

Another strong feature of a language is its extensibility, especially with built-in types. Ruby and javascript have this down to an art. It seems to me like Java really missed the boat with its mix of Objects and built-ins (int, float, etc.).

It's trivial to check whether a key exists in an object or array in javascript. You can exploit this to great effect to make associative arrays.

'undefined' !== typeof(arr[key])

But, for some reason, there's nothing inherent in the language to check for a value in an array. What I was really searching for was ruby's awesome Array.include? method.
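
For reference, this is the ruby behavior I was after. The list contents are made up for the example.

```ruby
list = [0, '', 'frob', nil]

puts list.include?(0)      # true
puts list.include?('0')    # false: the integer 0 is not the string '0'
puts list.include?('')     # true
puts list.include?(false)  # false: nil and false are different values
```

Array#include? compares with ==, but ruby's == does not coerce between strings and numbers, so there are no surprises with 0 or ''.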

So, I added one.

Just extend Array.prototype and you can add methods to every array object. Add them to Number.prototype, and you can do crazy things like (5).times(function() { ... });.

Note the use of === for comparisons. Not only does using it religiously keep jslint happier, it means that this will work properly everywhere, even if the array contains 0 or ''. Unfortunately, I couldn't come up with any fanciness better than searching (up to) the whole array.

/**
 * Checks whether or not key is contained in this array.
 * This function compares elements with ===.
 * @param Object key
 * @return bool
 */
Array.prototype.include = function(key) {
    for (var i = 0; i < this.length; ++i) {
        if (this[i] === key) {
            return true;
        }
    }
    return false;
};

I only wish ? were valid in identifiers so I could have named the function correctly.