In association with heise online

The return of ToyBox

The H: So what got ToyBox out of mothballs?

Rob Landley: Tim (Bird, Staff Engineer at Sony Network Entertainment and Architecture Group Chair, CE Workgroup of the Linux Foundation) proposed a new "Bentobox" project to extend toolbox and push the changes upstream to Google. Toolbox is to BusyBox what bionic is to uClibc: it's the in-house Google project Android currently uses and is just barely enough to launch Dalvik and run Java. Tim's plan was the complement of his "Android Mainlining Project": where that project is extending Linux by pushing Android changes upstream, Bentobox was aimed at extending Android by writing new code and pushing it upstream into Toolbox.

I considered Tim's suggested course of action futile: there's really nothing in toolbox worth extending, and Google really doesn't have a mechanism in place for interacting with external developers they haven't got a contractual relationship with. The Android developers release source (after the fact), but the source comes from a cathedral, not from a bazaar.

When Tim pointed out to me how weak toolbox is and the widespread desire to replace it, I suggested not just fixing the obvious gotchas but going beyond full POSIX/SUSv4 support and explicitly targeting a self-hosting system that could rebuild the whole OS from source natively on the phone hardware. I'm the one who brought up my mothballed ToyBox project, and I'm the one who pointed out that I could issue a new BSD license to my own copyrights. At this point Tim wandered off and did his own thing, and I got down to coding [the relicensed ToyBox].

The H: The purpose of ToyBox seems to have changed over the last few years. Who is it aimed at now?

RL: The smartphone is replacing the PC the same way the PC replaced the minicomputer and mainframe before it. You don't need the computer on your desk when you have a computer in your pocket. Existing off-the-shelf technology (USB docking stations, such as the Toshiba Dynadock) can provide a phone with a full-sized keyboard, mouse, multiple monitors, speakers, and so on. The rest is just software and Moore's Law.

Expecting PC operating systems and software to move down to the smartphone is like expecting minicomputer software to take over the PC. This is a textbook disruptive technology. The new one has a protected niche the old technology doesn't apply to, and has the capability to expand into the old technology's niche via obvious incremental improvements.

If the smartphone is to repeat the mainframe -> minicomputer -> microcomputer transition, it needs to become self-hosting, which means it needs to rebuild itself under itself from source code. Android's "no GPL in userspace" policy means it needs a BSD-licensed build environment to do this. BusyBox would be the obvious solution here, but BusyBox predates Android by years. If they're not shipping it now, it's pretty clear they're never going to. "No GPL in userspace" is the official policy of Android, and between the BusyBox licence enforcement lawsuits and GPLv2 being tarred with the same brush as GPLv3, they seem pretty serious about it. I can't say I blame them; the one thing GPLv3 accomplished was to massively undermine GPLv2.

Of course Android could just port existing BSD command line utilities to get this, but those aren't really aimed at either Linux or the embedded world, and I think I can do a better job with ToyBox.

That still leaves Android in need of a native compiler to become self-hosting, but lots of people are working on that. Apple's response to GCC going GPLv3 was to sponsor LLVM/Clang. The OpenBSD and NetBSD guys responded by reviving the old Portable C Compiler from the 1970s and turning it into a modern C99 compiler. Other companies like Qualcomm decided they liked Open64 instead. I myself was working on a fork of TinyCC for a while, and if I had spare time I'd chop off tcc's front-end and glue QEMU's TCG on as the code generator to get a tiny "qcc" compiler supporting all the targets QEMU does. There are a lot of choices there.

The H: Are there flaws in BusyBox that you think ToyBox addresses?

RL: Sure. But there were flaws in BusyBox that new versions of BusyBox addressed. That's why we came out with new versions. Almost all of the commands in BusyBox have been implemented before, most of them many times. The whole point is to do a better job.

Back when I worked on BusyBox I was constantly rewriting existing code: cleaning it up, simplifying it, making it smaller and faster and providing more features with less code. I rewrote the mount command three times before I was happy with it, and one of my early contributions was to rewrite the bunzip2 code from this to this.

When I was maintaining BusyBox there was tons of stuff I wanted to rewrite, and my goals were to make the code simple, small, fast, and full-featured, in that order. Simple was more heavily weighted than any other concern: each increase in speed and features or reduction in size had to justify the added complexity. I treated complexity as a cost, and wanted to get the best bang for the buck.

BusyBox has wandered away from that: simplicity is now less important than small size, increased speed, or added features. They've kept the size under a megabyte, but the code's full of magic symbols and #ifdefs. The entry point to the whole program is now buried in a subdirectory near the end of a large file inside the #else case of an #ifdef, and that entire main() function doesn't contain a single line of code that isn't inside one of four other #ifdef blocks.

One of my goals with ToyBox is that if you're just learning C programming, reading ToyBox should be a reasonable real-world introduction to the language.

I also want to know where to stop. A big deal with my Aboriginal Linux project is I could clearly say what it doesn't do. I think about a third of the commands in the current BusyBox really shouldn't be there.

The H: Can you describe what the major differences are between BusyBox and ToyBox for a developer?

RL: Back when I was trying to push ToyBox's code and design ideas into BusyBox, I wrote a post comparing the two. My compulsive documenting also means I wrote up a design page and a source code walkthrough for ToyBox early on in the project, and have tried to keep them updated.

Both projects are moving targets. The main difference right now is that there's a lot less of ToyBox than there is of BusyBox, although a big part of that is ToyBox isn't finished yet.

One failure mode BusyBox has sometimes suffered from is nailing an existing project on and painting the word "busybox" on it and calling it done. Things like ash, mke2fs, and fdisk have been real eyesores at times. I spent most of my BusyBox coding career cleaning up existing code, but I didn't remotely get through all of it. For ToyBox I do a lot of iterative cleanup too, but mostly because I think of a better way to do things. I really try not to let too much code I'm not happy with get into the tree in the first place.

ToyBox is code I can be proud of. If I can't be proud of it, I don't want it in the tree until I figure out how to do something I can be proud of. Of course I try not to block third party contributions just because they're not perfect, but I tend to go through and clean them up promptly. I haven't gotten much else done recently; it's all merge, review, and untangle other people's stuff right now.
