Netconsole target recovery: my first non-trivial kernel patch

Earlier this year I got my first non-trivial kernel patch merged. It was a series of 7 commits, and it took over 4 months and 11 revisions to get it into mergeable shape. This wasn’t my first kernel contribution: I had previously landed a small patch introducing new tests, and written about the kernel development workflow.

Before getting into the details of the patch, let’s introduce Netconsole for those who are not familiar with it.

Netconsole

Netconsole is a kernel module that forwards kernel log messages (printk) over UDP to a remote machine. It is very useful for debugging kernel panics, system crashes and boot issues, especially when there is no physical or serial console access.
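
On the receiving side, anything that listens on a UDP port can collect these messages. As a rough sketch, assuming plain (non-extended) messages sent over IPv4, a minimal receiver in C could look like the following. The helper names open_log_socket, set_receive_timeout and receive_log_line are made up for illustration and are not part of any netconsole tooling.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Open a UDP socket bound to `port` on all IPv4 addresses.
 * Port 0 asks the kernel to pick a free port.
 * Returns the socket descriptor, or -1 on error.
 */
static int open_log_socket(unsigned short port)
{
	struct sockaddr_in addr;
	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_ANY);
	addr.sin_port = htons(port);

	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Make receives give up after `seconds` instead of blocking forever. */
static int set_receive_timeout(int fd, int seconds)
{
	struct timeval tv;

	tv.tv_sec = seconds;
	tv.tv_usec = 0;
	return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}

/* Receive one datagram into `buf` and NUL-terminate it.
 * Returns the payload length, or -1 on error or timeout.
 */
static ssize_t receive_log_line(int fd, char *buf, size_t size)
{
	ssize_t n;

	assert(buf != NULL && size > 0);
	n = recvfrom(fd, buf, size - 1, 0, NULL, NULL);
	if (n >= 0)
		buf[n] = '\0';
	return n;
}
```

A caller would open the socket once, for example with open_log_socket(6666) since 6666 is netconsole's documented default target port, and then loop on receive_log_line, writing each buffer to stdout or a file.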

Since Netconsole is expected to work very early in boot, and even when everything else may be failing, it sits low in the networking stack and bypasses most of it (it uses netpoll to send packets directly through the driver). Operators configure Netconsole “targets” via boot parameters (or configfs entries), specifying details such as the interface name or MAC address to use when sending UDP packets. As the netconsole module loads, targets are set up and their interfaces are brought up so that logging can flow.

Netconsole registers a netdev notifier and reacts to events on the interfaces its targets are bound to (such as NETDEV_CHANGENAME when an interface is renamed by user space, or NETDEV_UNREGISTER when one is removed). This lets it keep targets consistent as the underlying interfaces change.

Target recovery for Netconsole

The gap I wanted to address was the following. There are two ways a target can end up unusable: (1) Netconsole fails to bring up the interface while setting up a new target, or (2) an active interface is later deactivated due to a hardware issue. In either case, logging is permanently disabled and the operator has to recreate the targets (or reboot the host) once the underlying issue is resolved.

My patch series addresses this issue by reworking how deactivated targets are represented internally (in order to distinguish targets disabled by user action from those deactivated by other events) and then carefully watching for netdev events in order to detect when a deactivated target can be resumed. The main commit is here.

While working on this patch I learned quite a bit about kernel development, as well as about netconsole and the net subsystem.

Target recovery works as follows. When Netconsole receives a NETDEV_REGISTER or NETDEV_CHANGENAME event for an interface that matches a previously deactivated target, it schedules the resume action on a dedicated workqueue. Why a workqueue? Bringing the interface up can sleep and requires IRQs to be enabled. Neither is allowed inside a netdev notifier callback, so the work has to be deferred to a separate thread.

The relevant snippet from the notifier callback:

if ((event == NETDEV_REGISTER || event == NETDEV_CHANGENAME) &&
    deactivated_target_match(nt, dev))
	/* Schedule resume on a workqueue as it will attempt
	 * to UP the device, which can't be done as part of this
	 * notifier.
	 */
	queue_work(netconsole_wq, &nt->resume_wq);
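
To make the deferral idea concrete outside the kernel, here is a small single-threaded userspace model. It is only a sketch under stated assumptions: struct work_item, struct work_queue, queue_item, drain_queue, on_interface_event and resume_handler are hypothetical names, and drain_queue stands in for the worker thread that the real workqueue provides. The one kernel behavior it deliberately mirrors is that queue_work does not queue an item that is already pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct work_struct. */
struct work_item {
	void (*fn)(struct work_item *item);
	bool pending;
	struct work_item *next;
};

/* Simple FIFO of pending work items. */
struct work_queue {
	struct work_item *head;
	struct work_item *tail;
};

/* Queue an item unless it is already pending. Returns true if queued. */
static bool queue_item(struct work_queue *wq, struct work_item *item)
{
	assert(item->fn != NULL);
	if (item->pending)
		return false;

	item->pending = true;
	item->next = NULL;
	if (wq->tail)
		wq->tail->next = item;
	else
		wq->head = item;
	wq->tail = item;
	return true;
}

/* Run everything queued so far, in order. This plays the role of the
 * worker thread, where sleeping would be allowed.
 * Returns the number of items that ran.
 */
static int drain_queue(struct work_queue *wq)
{
	int ran = 0;

	while (wq->head) {
		struct work_item *item = wq->head;

		wq->head = item->next;
		if (!wq->head)
			wq->tail = NULL;
		/* Clear pending first so the handler may requeue itself. */
		item->pending = false;
		item->fn(item);
		ran++;
	}
	return ran;
}

/* Example handler: counts how many times the deferred resume ran. */
static int resume_runs;

static void resume_handler(struct work_item *item)
{
	(void)item;
	resume_runs++;
}

/* Stand-in for the notifier callback: it must not block, so all it
 * does is decide whether to queue the work.
 */
static bool on_interface_event(struct work_queue *wq, struct work_item *item,
			       bool target_matches)
{
	if (!target_matches)
		return false;
	return queue_item(wq, item);
}
```

The point of the split is that on_interface_event returns immediately, while the potentially slow handler only runs later from drain_queue.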

Targets are matched according to how they were initially bound, either by MAC address or interface name.
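
The matching rule can be illustrated with a userspace sketch. This is not the kernel's deactivated_target_match: the structs, the bound_by_mac flag and the STATE_DISABLED and STATE_ENABLED enumerators are assumptions made for the example (only STATE_DEACTIVATED appears in the code shown in this post). NAME_LEN is set to 16 to match the kernel's IFNAMSIZ.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAC_LEN 6
#define NAME_LEN 16 /* same size as the kernel's IFNAMSIZ */

/* Assumed set of states; only STATE_DEACTIVATED is taken from the post. */
enum target_state {
	STATE_DISABLED,    /* turned off by the user */
	STATE_ENABLED,     /* actively logging */
	STATE_DEACTIVATED, /* lost its interface, eligible for resume */
};

/* Hypothetical, simplified view of a netconsole target. */
struct log_target {
	enum target_state state;
	bool bound_by_mac; /* true: match on MAC, false: match on name */
	unsigned char mac[MAC_LEN];
	char name[NAME_LEN];
};

/* Hypothetical, simplified view of a network interface. */
struct iface {
	unsigned char mac[MAC_LEN];
	char name[NAME_LEN];
};

/* A target is a resume candidate only if it is deactivated and the
 * interface matches the attribute the target was originally bound by.
 */
static bool target_matches(const struct log_target *t, const struct iface *dev)
{
	assert(t != NULL && dev != NULL);

	if (t->state != STATE_DEACTIVATED)
		return false;
	if (t->bound_by_mac)
		return memcmp(t->mac, dev->mac, MAC_LEN) == 0;
	return strncmp(t->name, dev->name, NAME_LEN) == 0;
}
```

In this model a name-bound target starts matching only once the interface carries the expected name, which is why a rename event is as interesting as a registration.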

On a separate thread, the work eventually gets picked up by process_resume_target. Some preparation happens to guarantee that it is safe to attempt the resume. First, we acquire dynamic_netconsole_mutex, which protects targets from concurrent modification. Then we take the target_list_lock spinlock via spin_lock_irqsave, which also disables local IRQs. This is necessary because the same lock can be taken from interrupt context: if an interrupt fired on the same CPU while the lock was held and its handler tried to take the lock, it would deadlock. With the locks held, we re-check that the target is still deactivated, since it could have been disabled by the user while the resume was queued.

Here’s the opening of the function, up to that recheck:

/* Process work scheduled for target resume. */
static void process_resume_target(struct work_struct *work)
{
	struct netconsole_target *nt;
	unsigned long flags;

	nt = container_of(work, struct netconsole_target, resume_wq);

	dynamic_netconsole_mutex_lock();

	spin_lock_irqsave(&target_list_lock, flags);
	/* Check if target is still deactivated as it may have been disabled
	 * while resume was being scheduled.
	 */
	if (nt->state != STATE_DEACTIVATED) {
		spin_unlock_irqrestore(&target_list_lock, flags);
		goto out_unlock;
	}

The actual resume is performed by the resume_target helper, which calls netpoll_setup to bring the interface up. Since netpoll_setup can sleep and requires IRQs enabled, the spinlock is released (which re-enables IRQs) before calling it. The target is also removed from target_list first, so it can be safely manipulated outside the lock.

Continuing in the same function:

	/* resume_target is IRQ unsafe, remove target from
	 * target_list in order to resume it with IRQ enabled.
	 */
	list_del_init(&nt->list);
	spin_unlock_irqrestore(&target_list_lock, flags);

	resume_target(nt); // THIS IS WHERE netpoll_setup is called

After that, the necessary cleanup runs and the target is added back to target_list.

Finally, the spinlock is re-acquired to re-link the target:

	/* At this point the target is either enabled or disabled and
	 * was cleaned up before getting deactivated. Either way, add it
	 * back to target list.
	 */
	spin_lock_irqsave(&target_list_lock, flags);
	list_add(&nt->list, &target_list);
	spin_unlock_irqrestore(&target_list_lock, flags);

out_unlock:
	dynamic_netconsole_mutex_unlock();
}
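
The unlink, work, re-link sequence in process_resume_target can also be modeled in plain userspace C. The sketch below is a single-threaded toy, not kernel code and not real synchronization: the lock is just a flag used to assert the ordering, the list helpers imitate list_add and list_del_init, and names such as struct toy_lock, slow_resume and process_resume are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal circular doubly linked list, modeled on the kernel's list_head. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *n)
{
	n->prev = n;
	n->next = n;
}

static bool list_is_empty(const struct list_node *head)
{
	return head->next == head;
}

static void list_add_front(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink and leave the node pointing at itself, like list_del_init. */
static void list_unlink_init(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Toy lock: records whether it is held so the ordering can be checked. */
struct toy_lock {
	bool held;
};

static void toy_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void toy_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

struct resumable {
	struct list_node node;
	bool deactivated;
};

static struct list_node target_list;
static struct toy_lock target_list_lock;

/* Stand-in for the slow step; it must never run with the lock held. */
static void slow_resume(struct resumable *t)
{
	assert(!target_list_lock.held);
	t->deactivated = false;
}

/* Returns true if a resume was attempted. */
static bool process_resume(struct resumable *t)
{
	toy_acquire(&target_list_lock);
	/* Re-check under the lock: the state may have changed meanwhile. */
	if (!t->deactivated) {
		toy_release(&target_list_lock);
		return false;
	}
	/* Unlink so nobody walking the list sees a half-resumed target. */
	list_unlink_init(&t->node);
	toy_release(&target_list_lock);

	slow_resume(t);

	toy_acquire(&target_list_lock);
	list_add_front(&t->node, &target_list);
	toy_release(&target_list_lock);
	return true;
}
```

The assertion inside slow_resume is the part worth noticing: it fails if anyone reorders the function so that the slow step runs while the list lock is still held.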

Closing thoughts and acknowledgements

Landing meaningful code in the kernel has always been a goal of mine, and while achieving it feels great, the journey was just as rewarding. I’m looking forward to doing more of this in the future.

I’d like to thank Breno Leitao (netconsole maintainer) for the mentoring and all the advice given, as well as the netdev maintainers for the review!