Networking


As promised, a much better cron script for putting the computer to sleep. It’s actually two scripts – I split the hibernating functionality away from the “is idle” functionality. The hibernate script is simple and just includes my final result from this post. I stored this as /root/hibernate.

#!/bin/sh
#
# Hibernate this machine for WOL with unicast and magic packets.
# 2009 Nathan Blythe
#

ethtool -s eth0 wol ug
pm-hibernate

The script that does the idle checking is stored to /root/useridle:

#!/bin/sh
#
# Execute command if system is "user-idle"
# 2009 Nathan Blythe
#
# Run without arguments for usage.
#

# Filepath of temporary file used by this script.
#
FIL=/tmp/useridle.tmp


# Get the command line arguments.
#
idl=$1
cmd=$2


# Invalid arguments.
#
if [ "$idl" != "reset" ] && [ -z "$cmd" ]; then
  echo "Usage:"
  echo "  useridle [reset]"
  echo "    Reset the system idle time."
  echo
  echo "  useridle <idl> <cmd>"
  echo "    If the system has been userless for idl seconds, execute"
  echo "    cmd and reset."
  echo
  exit
fi


# Determine the current time.
#
newtime=$(date +%s)


# Reset.
#
# Write the current time to the temporary file.
#
if [ "$idl" = "reset" ]; then
  echo $newtime > $FIL
# Ordinary operation.
#
else
  # If the temporary file does not exist or at least one user is
  # logged in, write the current time to the temporary file.
  #
  if [ ! -e $FIL ] || [ "$(users)" ]; then
    echo $newtime > $FIL


  # Otherwise if it's been at least idl seconds since the time in the
  # temporary file, execute cmd and then reset the temporary file.
  #
  else
    oldtime=$(( $(cat $FIL) + $idl ))

    if [ $newtime -ge $oldtime ]; then
      $cmd
      echo $newtime > $FIL
    fi
  fi
fi

This is a simple script (most of that is just fuzz) that keeps track of the time at which users were seen on the system. When you run it, it checks to see if there are users logged in; if so, it stores the current time to a file. If there are no users logged in, it checks the file to see when was the last time users were logged in. If the desired amount of time has gone by, the provided command is executed.
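The decision at the heart of the script is one comparison, and it can be pulled out on its own. The following is only a sketch of that check; the function name idle_expired is made up for illustration and does not appear in the script above.

```shell
#!/bin/sh
# idle_expired LAST_SEEN NOW IDL
# Hypothetical helper: succeeds (exit status 0) when at least IDL seconds
# lie between LAST_SEEN and NOW, both given as seconds since the epoch.
idle_expired() {
  last_seen=$1
  now=$2
  idl=$3
  [ "$now" -ge $(( last_seen + idl )) ]
}

# Intended use, mirroring the cron job (not executed here):
#   idle_expired "$(cat /tmp/useridle.tmp)" "$(date +%s)" 180 && /root/hibernate
```

With a stored time of 1000 and an idl of 180, the check first succeeds when the current time reaches 1180.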

I use this script by adding the following lines to /etc/crontab.

@reboot         root    /root/useridle reset
*  *    * * *   root    /root/useridle 180 /root/hibernate

This runs my script with the “reset” argument when the computer starts up – this is to make sure it doesn’t hibernate immediately, since the file might have the time from before the machine was shut down. Every 60 seconds it also runs the script with the arguments “180” and my hibernate script. Thus every 60 seconds the machine checks if there haven’t been any users logged on in the past 3 minutes. If this is the case, it hibernates until it is woken by a remote login.

All that needs to be done to configure the script is to edit /etc/crontab. There I can change the first “*” on the second line to, for instance, “*/5” or “*/15” to have the script run every 5 or 15 minutes, respectively, and adjust the “180” argument. So long as the idl parameter (180 seconds, currently) is significantly longer than the period of the cron job (60 seconds, currently) it will be somewhat accurate.
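To get a feel for how accurate “somewhat accurate” is, here is a rough bit of arithmetic. It assumes cron fires exactly on schedule and that the script takes no time to run, neither of which is strictly true, so treat the numbers as approximate. The function name hibernate_window is made up for this sketch.

```shell
#!/bin/sh
# hibernate_window IDL PERIOD
# Hypothetical helper: prints the shortest and longest delay (in seconds)
# between the last user logging out and the machine hibernating.
# The stored time is the last cron tick at which users were seen, so the
# command fires on the first tick that is at least IDL seconds later.
hibernate_window() {
  idl=$1
  period=$2
  ticks=$(( (idl + period - 1) / period ))   # idl / period, rounded up
  latest=$(( ticks * period ))               # logout right after a tick
  earliest=$(( latest - period ))            # logout right before the next tick
  echo "$earliest $latest"
}

hibernate_window 180 60    # prints "120 180"
hibernate_window 900 300   # prints "600 900"
```

So with the current settings the machine should go down somewhere between 2 and 3 minutes after the last logout. The uncertainty is one cron period, which is why a long idl relative to the period makes the result proportionally tighter.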

Pretty nifty, huh? I haven’t solved the problem of SSH only waking the machine (on the first try) and having to try again to log in. I think that might just not be doable without changing how SSH attempts to connect. I may forward another port to the server so I can use some common command to boot the system. Either way, I’m happy with how it works.

I will post a summary of all this information at some point in case anybody ever wants a similar setup. It sure is power efficient, especially for doing things like a remote SSH file transfer for backups. I may forward rsync to the machine too… a backup server that (more or less) transparently boots up when needed and shuts down when finished… very nifty!


Sorry Hani, I couldn’t resist that title!

Yesterday I summarized my adventures with Linux and ACPI and WOL, culminating in a machine that would wake up from hibernation when a user initiated an SSH session from a remote (outside my apartment) network. I ended the post saying that there were two things left to do. Today I got those two things working!

First, the second thing. I said that when the computer in question, Stork, booted there was a long (1 to 3 minutes) delay. It had something to do with the network card, because it only occurred when the card was installed. Different slots and combinations of BIOS settings made no difference.

Today I noticed that the 2.6.30 kernel has made it into unstable (it must have gone in during the past few days, as when I installed the machine three days ago it wasn’t available yet). I went ahead and installed it and rebooted and sure enough, the long delay is gone. I have a suspicion this is more to do with building a new initrd, and that I could probably have solved the problem by rebuilding the ram image for my previously installed kernel. I may go back and try this, but it’s working now anyway. Boot up time on a cold-start is less than 45 seconds, and resume from hibernation shaves another 20 off that. That includes a 4 second GRUB wait, which I may trim down.

The first problem was a desire to have the machine go into hibernate automatically, as a result of no activity. I decided that I would approach this with a cron script (which may or may not be the best short-term approach). I wrote a very simple script as below.

#!/bin/sh
#
# Hibernator cron job
# 2009 Nathan Blythe
#

# Delay to give the user time to log in.
#
sleep 60

# Anybody logged in?
#
if [ -z "$(users)" ]; then
  # Put the NIC in WOL modes u and g.
  #
  ethtool -s eth2 wol ug

  # Hibernate.
  #
  pm-hibernate
fi

There you have it; doesn’t get much simpler than that. I stored the file as /root/hibernator. I added a line to /etc/crontab that looks like this:

*/5  *  *  *  *  root  /root/hibernator

This tells cron to run the script every 5 minutes. So why the sleep 60 command in the script? Well I found that when the machine resumes it has a tendency to run the cron job immediately. This means that if you don’t have a bit of a delay between when the script starts and when it actually does the deed, the user won’t have time to finish logging in. The SSH packet arrives, the machine wakes, and before they can finish entering their password, it hibernates again.

A better cron job may be in the works. Running the script every 5 minutes gets the job done, but it means that when a user logs out there could be a 5 minute delay until the machine goes down or just a few seconds, depending on the time. A better system would run every 5 minutes or so and store the time at which it saw users to a file. Then if it doesn’t see users it checks if a certain amount of time has gone by (since the time in the file). If so, then it hibernates.

Furthermore, a better script would also do some lsof or netstat magic to determine if a user is logging in at the moment (remotely anyway… local users still have to beat the clock). I may work on such a script later.
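As a rough idea of what the netstat part might look like: the sketch below reads the output of netstat -tn and reports whether any TCP connection to local port 22 is established, which would be the case while somebody is still typing their password. It assumes the usual net-tools column layout (local address in column 4, state in column 6) and that SSH is on port 22. The function name ssh_session_active is made up for illustration.

```shell
#!/bin/sh
# ssh_session_active
# Hypothetical helper: reads `netstat -tn` output on stdin and succeeds if
# at least one established TCP connection has local port 22.
ssh_session_active() {
  awk '$1 ~ /^tcp/ && $6 == "ESTABLISHED" && $4 ~ /:22$/ { found = 1 }
       END { exit !found }'
}

# Intended use near the top of the cron script (not executed here):
#   if netstat -tn | ssh_session_active; then exit 0; fi
```

Taking the netstat output on stdin rather than calling netstat inside the function keeps the check easy to try out against a saved sample.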

A new problem has come up, however. I’ve realized that when I attempt to connect remotely, the SSH packet does indeed wake the system, but then the connection is not made until SSH is tried again. The packet that wakes the system is thrown away, and for whatever reason SSH doesn’t try again. I’ll have to think of a way around this. It could be I just use some port as a “wakeup” port… for instance, port 80. Issuing an HTTP request to my network wakes up the machine, after which SSH works. I’d prefer a more smooth system, but we’ll see.

Stay tuned for a better cron script!

This is my first post (excuse me, frst psot) on this blog and so I think it’s fitting to leave the title at the WordPress default. I did change it to my preferred style of “Hello, world!” (capital H, lowercase w, comma and exclamation point), however. I’ve seen some Australians use “G’day, world!” which is silly, as your first day using a new programming language is likely to be anything but good. “Frustratin’ day, world!”, maybe.

This post is about Linux and ACPI and WOL.  Before I describe what I’m trying to do, let’s look at my network. The solid lines are wired 10/100 Ethernet connections; the dashed lines are 802.11 b, g wireless connections.

[Figure: Nate's network]

The router is standard Netgear fare running DD-WRT (which warrants a post of its own).  In this post I will italicize hostnames and IP addresses. I will bold commands, programs, and services.

Pelican is my media center computer (its monitor is a big TV, hence the broadcast television connection). Vista2 is my work computer (and does not warrant a cool bird name). Bluebird is an Acer Aspire One netbook.

My goal is to connect to Stork remotely (i.e. from the Internet) using SSH. I want the machine to be either suspended (ACPI mode S3), hibernated (ACPI mode S4) or turned off entirely. When I attempt to connect via SSH I want the computer to seamlessly resume/boot and accept the connection. When I’m done and the machine is idle I want it to return to suspend or power down.

Stork is running Debian GNU/Linux (Sid/unstable branch) with a 2.6.26 based kernel. It is currently running an SSH server; besides that it’s not doing much at all. It’s not running X.

My first goal was to get Stork suspending properly. In BIOS I enabled practically every ACPI option relating to suspend modes. I did leave the options for waking up on keyboard, mouse, etc. disabled as I didn’t want the machine to accidentally wake while I was moving things around. Booting into Debian I issued echo mem > /sys/power/state to put the system in suspend mode (S3).

It went down ok, everything powered off. Punching the power button brought it back up very quickly, except for the video card. Video cards are notorious trouble spots in ACPI implementations as they have to have their registers reprogrammed from scratch at resume; usually Linux relies on the BIOS to do this at boot (as video card makers are secretive about how it’s done), but as a resume from S3 does not use the BIOS for this purpose it’s not unusual for the video card to fail to resume properly.

After playing around with it I decided to install the pm-* suite of tools – in particular I was planning on using pm-suspend and pm-hibernate. These tools make use of the uswsusp suite (which provides s2disk, s2ram, and s2both). pm-* knows a lot of methods for putting finicky devices to sleep; without any further work on my part it seemed to handle the video card ok. Hibernate worked fine also.

At this point I could suspend or hibernate the machine by issuing either pm-suspend or pm-hibernate, and resume by pressing the power button. Very exciting!

The next step was to find a network card that supports wake-on-lan, or WOL. As it happened I found an old 8139-based card specifically labeled as a WOL device; I popped it in and Debian had no trouble loading the 8139too module to support it. I configured my router to give Stork the static IP address 192.168.1.2 and forwarded SSH to this IP.

Let’s stop and say a word about WOL. As I mentioned I disabled the options in BIOS that would allow the keyboard or mouse to wake up the sleeping computer. I did, however, enable the option that would allow a WOL compliant network card to wake the computer. What this means is that if I get Linux to configure the card properly before I suspend/hibernate, the card can wake up the computer (or turn it on from scratch) when it sees something of interest on the network.

Now, I’m using a router (as opposed to a switch) so the only thing that will possibly get through to Stork is unicast (to Stork) or broadcast packets. I don’t have a multicast network (does anyone?). The reason I’m mentioning this is because a network card that supports WOL can wake up the computer in a number of situations. Some cards support different options; here are the available ones.

  • p: Any activity at the physical level will wake the machine. I found that this means the machine wakes as soon as it goes to sleep; not very practical.
  • u: Unicast packets directed to the MAC address of the network card will wake the machine. This seems to be what I want; when a packet arrives to initiate an SSH session the card will wake the machine.
  • m: Multicast packets that include the MAC address of the network card will wake the machine. Again, I don’t have a multicast network.
  • b: Broadcast packets will wake the machine. I don’t want this, because this means things like pings and DHCP queries will wake the machine up. I only want it to wake when it is specifically addressed.
  • a: ARP requests that the network card should answer (requests to find the MAC address that goes with a specific IP address) will wake the machine. As I found out, this is exactly what I want, but unfortunately neither my card nor any other I’ve seen supports it. More on this soon.
  • g: Magic packets sent to the network card’s MAC address will wake the machine. This is the most commonly supported type of WOL. From a computer in the same subnet you run a command (supplying the MAC address of the target machine); when that special packet reaches the sleeping machine the network card wakes it up. I decided to use this type as well for testing.
  • s: Same as g, but includes a password. I don’t know of any cards that support this.
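For the curious, the magic packet used by mode g is nothing exotic: its payload is six bytes of 0xFF followed by the target MAC address repeated 16 times. The sketch below only builds that payload as a hex string to show the layout; it does not send anything. The function name magic_packet_hex and the MAC address in the example are made up.

```shell
#!/bin/sh
# magic_packet_hex MAC
# Hypothetical helper: prints the magic packet payload for MAC as hex.
# Layout: ff ff ff ff ff ff, then the 6-byte MAC repeated 16 times.
magic_packet_hex() {
  # Strip separators and lowercase the hex digits.
  mac=$(printf '%s' "$1" | tr -d ':-' | tr 'A-F' 'a-f')
  payload=ffffffffffff
  i=0
  while [ "$i" -lt 16 ]; do
    payload=$payload$mac
    i=$(( i + 1 ))
  done
  printf '%s\n' "$payload"
}

# Example with a placeholder MAC address; prints 204 hex digits (102 bytes).
magic_packet_hex 00:11:22:33:44:55
```

Because the MAC address is part of the payload itself, the sender has to be told the address up front, which is why tools for sending these packets ask for it on the command line.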

Whew, ok. So in Debian I ran ethtool eth0. It informs me that my network card supports WOL modes p, u, m, b, and g. I enabled modes u and g by issuing the command ethtool -s eth0 wol ug.
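If you want to script that check rather than read the output by eye, something like the following could work. It is a sketch that assumes ethtool prints a line of the form “Supports Wake-on: pumbg”, which is what I see here; the function name supports_wol is made up.

```shell
#!/bin/sh
# supports_wol MODE
# Hypothetical helper: reads `ethtool <iface>` output on stdin and succeeds
# if the single-letter MODE appears in the "Supports Wake-on:" line.
supports_wol() {
  mode=$1
  awk -v mode="$mode" '
    /Supports Wake-on:/ { if (index($NF, mode) > 0) ok = 1 }
    END { exit !ok }'
}

# Intended use (needs root and a real interface, so not executed here):
#   if ethtool eth0 | supports_wol u && ethtool eth0 | supports_wol g; then
#     ethtool -s eth0 wol ug
#   fi
```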

I then issued pm-suspend to put the system in S3. Walking over to Bluebird (also running Debian, but it doesn’t really matter) I issued the command ssh 192.168.1.2. Unfortunately the machine did not wake up; instead I received “no route to host” on Bluebird. I then tried sending a magic packet to Stork’s MAC address using the etherwake program. Again, no luck, Stork remained asleep. I then woke the machine up by pressing the power button.

I tried the same process after issuing pm-hibernate, but had the exact same lack of success. However, after some research online I found a suggestion to change the mode by which pm-hibernate hibernates – from “platform” to “shutdown” (whatever that means). This is accomplished by editing the file /etc/uswsusp.conf (remember, pm-* uses the uswsusp tools):

# /etc/uswsusp.conf(8) -- Configuration file for s2disk/s2both
...
shutdown method = shutdown
...

After making that change, I issued the command pm-hibernate again. The machine went down and I trotted back across the room. ssh 192.168.1.2 still wouldn’t wake the machine (again, no route to host) but now etherwake sent its magic packet and BAM – the machine turned on, booted up, and resumed from hibernation! Very cool!

So why wasn’t SSH waking the machine? Well, Stork was not in Bluebird’s ARP cache; that is to say, Bluebird did not know how to take the IP address 192.168.1.2 and come up with Stork’s MAC address (since they are on the same subnet, Bluebird talks to Stork directly by MAC address). To find this MAC address it broadcast an ARP request. However, ARP requests are broadcast packets and Stork’s network card was asleep with the rest of the system, waking only on unicast packets. Bluebird never got a reply and so couldn’t construct a route for the SSH connection. A magic packet, like the SSH packet, is sent directly to a MAC address, but that address is keyed into the etherwake program manually, thus avoiding the need for an ARP request.

Now, if the network card supported the WOL mode a, the ARP request would wake the system up. This would be perfect… except the network card does not support the WOL mode a. Oh well. If I were building a new system for this purpose I would definitely buy a network card that supports that mode.

The end result is that there is no easy way to wake Stork from Bluebird using unicast packets. There are some techniques for having the router act as a proxy ARP and spoof replies as if Stork was replying, but that’s not really what I want. I decided that if I’m in the same room I can just walk over and turn the machine on; after all, I did specify that I wanted the machine to turn on when accessing it remotely, from the Internet.

The problem isn’t solved completely, though… the router still needs to know Stork’s MAC address! When the SSH packet arrives from the Internet and is forwarded to 192.168.1.2, the router has to perform an ARP query. To remedy this I modified the router firmware’s (DD-WRT) start-up script to install a permanent entry in the ARP cache (permanent while the router is on, anyway). Now the router always knows that 192.168.1.2 is associated with Stork’s MAC address and can forward any incoming SSH packets directly to the network card.
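I haven’t reproduced the start-up script here, but a static ARP entry is typically a one-liner using arp -s from net-tools. The sketch below only validates a MAC address and prints the command that would be added; whether a given DD-WRT build ships an arp that accepts -s is an assumption you would need to check. The function name static_arp_cmd and the MAC address are placeholders.

```shell
#!/bin/sh
# static_arp_cmd IP MAC
# Hypothetical helper: prints (does not run) the command that would pin
# IP to MAC in the ARP cache, after a basic sanity check on the MAC.
static_arp_cmd() {
  ip=$1
  mac=$2
  # Accept only six colon-separated pairs of hex digits.
  if printf '%s\n' "$mac" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'; then
    echo "arp -s $ip $mac"
  else
    echo "invalid MAC address: $mac" >&2
    return 1
  fi
}

# Example with a placeholder MAC address.
static_arp_cmd 192.168.1.2 00:11:22:33:44:55
```

Like any entry added by hand, this one lives only in the running ARP cache, which is why it belongs in a script that runs at every boot.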

Interestingly, you might think that having established a static DHCP entry for Stork, which associates Stork’s MAC address with 192.168.1.2, I have already achieved this permanent association. However, the processes that control DHCP and ARP are separate and do not share this information.

I rebooted the router, put Stork’s network card into WOL modes u and g again, and once more put the machine into hibernation. This time I used Bluebird to connect to a remote machine (a server at my undergrad university) and from there attempted to SSH to my apartment. Down came the packet to my router. The router picked it up, saw that it was destined for the SSH port and redirected it to 192.168.1.2. It then checked its ARP table to see which MAC address goes with that IP address. It found the permanent entry I installed and forwarded the packet to Stork’s network card. When the packet arrived the clouds parted, angels sang, and the machine turned on. Hooray!

So to summarize, I now have a machine that can wake from hibernation when either a magic packet is sent from the local network or an SSH connection is initiated from a remote network (the Internet). There are two things left to do.

First, I want the machine to automatically go into hibernation when idle. There is an easy way to do this in Linux: the sleepd daemon. Unfortunately the sleepd daemon considers “idle” to be “no keyboard or mouse action for a set amount of time”. That’s not very practical for me since I don’t intend to have any keyboard or mouse hooked up at all! I would prefer something along the lines of “no hard drive access for a set amount of time” or “no unicast network packets for a set amount of time”. I have yet to find an existing solution, so I may have to rig something up to do this. (What we really want is a simple DBus based daemon that puts the system to sleep in a variety of configurable ways, and a bunch of little clients, each of which sends the daemon a message when it sees the system as idle according to some classification. The existing behavior would be mimicked by loading an “idle daemon” that watches the keyboard and mouse; my desired behavior would be implemented with an idle daemon that watches the hard drive, and so on. But I’m getting side-tracked here!)

The second thing I still need is something I skipped over earlier. I didn’t mention it, but since I installed the network card Debian stalls about 1 to 3 minutes on boot. This occurs when loading the kernel and so delays resuming from hibernation. Either I need to fix this or get WOL working during suspend and use that instead. I prefer hibernation because I don’t need to worry about killing the power to my network at night, so the best situation would be to fix the delay. I’m not sure why it happens (it doesn’t depend on the PCI slot or interrupts used) but I haven’t looked into it too much.

Alright, that’s it for this long, boring post about technologies practically nobody uses. But how cool will it be when I can SSH to a completely turned-off system and have it boot up just for me (and shut back down again when I’m done)? Very cool, that’s how cool.