
Lessons from the Debian compromise

December 10, 2003

This article was contributed by Robert Bernier

It's been said that what doesn't kill you makes you stronger; if so, Debian must be very strong these days. The recent attack on Debian's servers is well known, and it has been well documented and explained in detail. What remains is to consider the aftermath: have lessons been learned?

Recall the sequence of events. In November, an unknown person developed an exploit for the now-famous kernel flaw and found his way into a Debian developer's machine. Although it's not known whether the attack was aimed at that machine in particular, the attacker quickly understood that this PC presented a means of accessing the Debian servers. He installed the requisite tools to take over the machine and sniff out passwords. The attacker then obtained a password that enabled him to compromise a Debian project server. In quick succession, he penetrated a number of machines spanning North America and Europe.

It must be understood that, up to this point, the attack had not been detected. The machines had been penetrated and successfully subverted, and the attacks were executed in such a manner that none of the installed security mechanisms caught the activity. So why didn't the archives get compromised? And how was the attack even discovered?

The hand-crafted kernel exploit was not perfect. According to a group of Debian contributors interviewed at a recent Linux User Group (LUG) meeting, the exploit worked on all of the Intel machines but failed against one SPARC system, which happens to be where the archives reside. Another imperfection was that the exploit generated strange messages in the log files, which led to the attack's discovery. One of the system administrators became uneasy while looking through the log files of one of his machines; he quickly understood that the messages were not normal, and the other machines were checked out in short order. This is how the attack and its point of entry (the developer's compromised machine) were discovered.

What are the lessons learned?

  • Crackers can make bad code: the existence of those log messages indicates a lack of professionalism; that sloppiness eventually led to the attack's discovery.

  • The bio-diversity of mixed environments defeats monoculture weaknesses: it's easy to criticize the fragility created by the dominance of Microsoft-centric environments, but we seem to have missed the fact that a Linux-only environment is a monoculture too. Things could have been worse were it not for the inherent differences between the Intel and SPARC architectures.

  • Good people make a difference: a sharp brain and active curiosity are a great combination. Given the time and resources, all exploits can be caught.

Has anything been learned from this event that can help us formulate a more proactive policy? The answer depends on how much we, the open source community, are willing to work to eliminate these violations. Attackers can exploit a hundred machines before they stumble over one that can really hurt us, and that's the irony: for every attack that is noticed, ten more go unseen. By increasing the diversity of our systems and the alertness of our administrators, we improve our chances of detecting and shutting down this sort of attack before it does real damage.

Index entries for this article
GuestArticles: Bernier, Robert



log checkers

Posted Dec 11, 2003 3:39 UTC (Thu) by sweikart (guest, #4276) [Link]

> Another crack imperfection was that it generated strange messages
> in the log files which led to the attack's discovery. It turns out
> that one of the system administrators became uneasy as he was
> looking through the log files of one of his machines.

Note that a simple log checking program might have resulted in
much quicker detection.

Unless the attacker was clever enough to disable outgoing mail
(and then clean the logs).  Then you would need remote logging (as
available with syslog-ng), with the log checker running on the
logging server (and the logging server needs to be the most secure
server, e.g. only accessible to a few individuals who run very
secure workstations).

-scott
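A minimal log checker of the kind suggested here could be only a few lines; the suspicious patterns and log path below are illustrative assumptions, not what Debian's admins actually monitored. In practice it would run from cron and mail any hits, ideally on a separate logging host as the comment describes.

```python
# Toy log checker: scan a syslog-style file for suspicious patterns
# and report matching lines.  The patterns are illustrative only.
import re
import sys

SUSPICIOUS = [
    re.compile(r"Oops", re.IGNORECASE),               # kernel oops messages
    re.compile(r"segfault"),                          # unexpected crashes
    re.compile(r"FAILED LOGIN|authentication failure"),
]

def check_log_lines(lines):
    """Return the lines that match any suspicious pattern."""
    return [line for line in lines
            if any(p.search(line) for p in SUSPICIOUS)]

if __name__ == "__main__":
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/syslog"
    with open(path, errors="replace") as f:
        for hit in check_log_lines(f):
            print(hit.rstrip())
```

Anything this crude would still have flagged the repeated oops messages that eventually tipped off the admins by hand.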

Lessons from the Debian compromise

Posted Dec 11, 2003 10:12 UTC (Thu) by climent (guest, #7232) [Link]

The system administrators found some machines having kernel oopses. That led to the suspicion that something was happening.

Lessons from the Debian compromise: backups.

Posted Dec 11, 2003 11:32 UTC (Thu) by hensema (guest, #980) [Link] (2 responses)

So after the fire at Twente University and the recent crack, Debian still fails to learn the most important lesson: make backups!

It cannot be stressed enough how important this is. Backups are far more important than relying on the mistakes a cracker makes.

Lessons from the Debian compromise: backups.

Posted Dec 11, 2003 20:44 UTC (Thu) by doogie (guest, #2445) [Link]

There are backups. Debian machines back up their critical config data to other backup machines.

Having backups does not mean that compromises don't exist, or that, when they do happen, they should just be ignored and a backup deployed.

Lessons from the Debian compromise: backups.

Posted Dec 13, 2003 18:57 UTC (Sat) by giraffedata (guest, #1954) [Link]

And more than one copy, too. Most people keep one copy of everything, just as protection in case a disk breaks or gets suddenly wiped out, so their backup contains the cracked version of their files.

But what you really need to protect you from a crack, or various other forms of losses, is a multilevel rotation scheme. Have one copy from last night, but also a copy from each of the last 7 nights, and one copy from each of the last 7 weeks, and one copy from each of the last 49 weeks, etc.

That way, when you find out that someone put a backdoor in your system a few weeks ago, you can restore the original files (which probably hadn't changed for years before that).
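The retention policy sketched in this comment (7 nightly copies, one weekly copy for each of the last 7 weeks, then one copy per 7-week period going back 49 weeks) is easy to express as code. The function below is a sketch of that scheme under those assumed windows; it only decides which copies to keep, leaving the actual copying and pruning to the backup system.

```python
# Multilevel backup rotation: decide which backup dates to retain.
from datetime import date, timedelta

def copies_to_keep(backups, today):
    """Given a list of backup dates, return the set worth keeping."""
    keep = set()
    backups = sorted(backups, reverse=True)
    # Last night plus each of the last 7 nights.
    keep.update(b for b in backups if 0 <= (today - b).days <= 7)
    # The newest backup in each of the last 7 weeks.
    for week in range(7):
        start = today - timedelta(days=7 * (week + 1))
        end = today - timedelta(days=7 * week)
        in_week = [b for b in backups if start <= b < end]
        if in_week:
            keep.add(max(in_week))
    # One backup per 7-week period, going back 49 weeks.
    for period in range(7):
        start = today - timedelta(weeks=7 * (period + 1))
        end = today - timedelta(weeks=7 * period)
        in_period = [b for b in backups if start <= b < end]
        if in_period:
            keep.add(max(in_period))
    return keep
```

With nightly backups, this retains a thinning trail of history: dense for the last week, weekly for the last two months, and sparse back toward a year, which is what lets you reach behind a weeks-old backdoor.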

Too weak

Posted Dec 11, 2003 12:31 UTC (Thu) by walles (guest, #954) [Link] (3 responses)

IMO this article was far too weak. "Crackers can make bad code" is not something you should rely on. I agree on the bio-diversity point. "Good people make a difference" might be true, but it is too weak. How do you get "good people" to run your system? Especially if you are a home user?

What should be learned from this is stuff that people can actually *do* something with, like the bio-diversity point: the attack was obviously somewhat contained by Debian using more than one hardware platform.

I think the real question that should be asked (I don't have the answer, unfortunately) is:

"Imagine there is an unknown, exploitable bug in the kernel's brk() implementation. What *technical measures* (other than discovering + fixing the bug) could prevent that problem from being exploited?"

Answer that, and this won't happen again.

BTW, I haven't heard anything about the Stanford Checker lately; could something like that have found the bug in the first place? If so, that program should be run every time somebody checks something into BK.

Too weak

Posted Dec 11, 2003 16:36 UTC (Thu) by RobSeace (subscriber, #4435) [Link] (2 responses)

> "Imagine there is an unknown, exploitable bug in the kernel's brk()
> implementation. What *technical measures* (other than discovering + fixing
> the bug) could prevent that problem from being exploited?"
>
> Answer that, and this won't happen again.

Forbidding all remote user access would do it... But, may be too extreme
for many... ;-) People seem to forget, it's NOT the kernel brk() bug that's
ultimately to blame here, as I see it: that was just a local exploit, which
allowed the attacker to escalate their privs once they'd already broken into
a normal user account... The REAL problem is that they broke into a normal
user account, in the first place! Do you imagine the impact of that, even
without root access, to be a minor issue?? (I mean "you" in the generic
sense here, not attacking you personally...) If this person whose account
was compromised was a developer (and, if not, why do they have an account
on those machines??), then all an attacker would NEED is their normal user
access in order to plant trojan horses in any software the developer had
access to... Plus, by pretending to be that user, they could perhaps
social-engineer others into giving them enough info to do further damage,
elsewhere...

So, all this continued focus on the kernel brk() bug really bugs me... (No
pun intended... ;-)) It's completely missing the point to lay the blame
there, and give THAT all of the focus and attention... It would be much
more important to focus on how the person got access to that user's account
in the first place... THAT is what needs to be prevented in the future;
and, that's FAR more important, IMHO, than any local-only root exploit...
If no one untrusted has remote access, then all local exploits become
totally irrelevant... And, even in the case where there were no local
exploits at all, letting anyone untrusted have remote access to a legit
user's account is STILL a very, very BAD thing... So, as I say, I think
everyone is focusing on the wrong problem in this whole mess... ;-/

brk() bug was the real problem

Posted Dec 13, 2003 19:05 UTC (Sat) by giraffedata (guest, #1954) [Link] (1 responses)

I take the opposite view. An unauthorized user being able to log into a system as a nonprivileged user is a small deal. Being able to escalate to a privileged user is a big deal.

That's because there are all kinds of legitimate reasons for having a system that untrusted people can log into as an unprivileged user. We should not therefore squander our attention on stopping people from logging in, but rather allocate it to stopping privilege escalations.

How do you figure??

Posted Dec 13, 2003 19:49 UTC (Sat) by RobSeace (subscriber, #4435) [Link]

How on Earth can you say that someone gaining access to a legit developer's
user account is "a small deal"???? And, at the same time, that going from
there to root is a "a big deal"???? I can't comprehend your perspective...

By having access to the developer's user account, the attacker can pretend
to be the developer, including doing such things as checking in code under
his/her name, communicating with others to obtain info intended only for the
developer, and basically anything and everything that legit developer could
do... Is this really "a small deal" to you???? Do you realize the damage
that could be done this way?? Think about it... Chances are, no one would
ever have found out about such a user-only attack: no system files would
have changed, no kernel oopses as warnings, nothing to tip anyone off that
anything was up... So, this person could have undetected access for as long
as they wanted, and do anything they wanted in the real developer's name,
without arousing anyone's suspicions... They could get a hidden backdoor of
some sort worked into some major bit of software which they know will be
run by everyone that uses the distro, and as soon as it gets distributed
widely, they'd have access to thousands of machines around the world, with
no one being the wiser (until they eventually slip up and get caught)...
You don't think this represents a scary scenario?? Yet, you find the mere
escalation from an authorized remote user account to root the end of the
world???? I'm totally baffled by that...

Gaining root, when you already have the above power, is just unnecessary
overkill, really... If the person who pulled this off were smart, they
would've done as I describe above, and stayed hidden, and been able to do
lots of nasty things for a LONG time, before anyone ever caught them...
By being greedy and going after root, they got caught quickly... And, what
did it gain them?? Not much... Seriously, tell me: what are you so afraid
they could've done as root, that they couldn't have accomplished far more
stealthily as the developer?? It seems to me the big danger is planting
some kind of back-door/trojan into the source, right? Why do that with
noisy root access, when you can do it with stealthy developer access, and
arouse no one's suspicions?? Sure, as root they can sniff everyone's
passwords and monitor everyone's communications, etc... But, so what?
What is the ultimate danger from doing that: that they'll be able to find a
way to poison the source code, right? Or, are you worried about some OTHER
danger that I'm not seeing??

Lessons from the Debian compromise

Posted Dec 11, 2003 13:02 UTC (Thu) by copsewood (subscriber, #199) [Link] (1 responses)

When I was involved in proving the existence of a then-unknown M$ virus, it
made sense to shut the system down and apply an integrity check on a static filesystem from a known clean boot environment. Doing this periodically will of course result in regular scheduled downtime. This may be a price which has to be paid for a more secure environment, unless those engaged in root-kit detection can somehow guarantee the integrity of a check operating from within the compromised environment. As I don't see any such guarantee as realistic, is 15-30 minutes of downtime a day something we may need to accept for a higher-integrity environment?

Lessons from the Debian compromise

Posted Dec 19, 2003 16:48 UTC (Fri) by helgehaf (guest, #10306) [Link]

There's no need to schedule downtime for this.

Take advantage of the fact that a SCSI disk is accessible from several machines at once (by connecting two host adapters).

The disk that is the "main disk" for one machine is mounted read-only for checking by another machine. The other machine always boots cleanly because
it boots from an unwriteable CD-ROM. It can tell if something bad happens to files on the "shared" disk.

Lessons from the Debian compromise

Posted Dec 11, 2003 20:40 UTC (Thu) by doogie (guest, #2445) [Link] (2 responses)

> It must be understood that up to this point the attack had not been
> detected. The machines were penetrated and had been successfully subverted.
> The attacks were executed in such a manner that none of the installed
> security mechanisms caught the activity. So why didn't the archives get
> compromised? And how was it that the attack, was even discovered?

This is not correct.

I was the one who noticed one of the machines (master) kernel oopsing. We thought it might be hardware, so a quick reboot was done.

Soon after the reboot, the oopses continued.

Then, it was discovered that another machine (murphy) was also having oopsen. Additionally, a non-Debian machine started having the same oops. At this time, the other admins were contacted (I'm just a local admin for master and murphy), and the break-in was acknowledged.

As for the intrusion detection programs not detecting anything: they did. AIDE was installed on several machines, and did report file changes. However, one of the Debian admins thought another had made a change, and the first hadn't gotten around to asking the other about it yet.

Also, it's interesting to note that not all the infected machines were having kernel oopses.

Lessons from the Debian compromise

Posted Dec 11, 2003 22:02 UTC (Thu) by wolfrider (guest, #3105) [Link] (1 responses)

This downtime is getting ridiculous though. packages.debian.org is STILL down as of this writing (Thu 2003-12-11) and nothing's coming through apt-get upgrade.

When will things be back to "normal"?

Lessons from the Debian compromise

Posted Dec 11, 2003 22:47 UTC (Thu) by jordi (guest, #14325) [Link]

Getting the services back involves auditing all the scripts that run the services. There are many services at debian.org, and there's a priority order for getting stuff fixed. The Debian admins are doing a good job, and the most critical services were restored quite quickly after the exploit was found. For example, the developers already have access to one of the most important boxes, and their accounts are unlocked in general, which means Debian's pulse, the package stream, is moving once again. And yes, this means there are package updates. You should have been getting new packages for some days already. Maybe your mirror is stale...

Use grsecurity on critical machines!

Posted Dec 12, 2003 14:58 UTC (Fri) by emk (subscriber, #1128) [Link] (2 responses)

The grsecurity patch to the Linux kernel does two highly useful things:

1) It breaks most exploits by heavily randomizing memory layouts, PIDs, and anything else it can find to randomize. It also makes quite a few things non-executable, even on Intel architectures.

2) It optionally allows you to set up advanced role-based ACLs, which allow you to ruthlessly strip privileges away from various processes on your server. In particular, you can drop unneeded capabilities from root processes, prevent fork/exec of all but a specified list of executables, and hide all but a tiny part of the filesystem.

If you use grsecurity in addition to your regular system hardening, you can make life very difficult for the crackers.

Use grsecurity on critical machines!

Posted Dec 13, 2003 7:58 UTC (Sat) by penguinroar (guest, #14460) [Link] (1 responses)

I agree with the parent poster; it's time to harden the kernel a bit to keep ahead of the crackers. I don't mean that bugs should be downplayed, but having both belt and braces is, in my opinion, a good thing. There are several implementations of hardened kernels, but I haven't seen any broad use of them yet.

Intrusion detection is a harder nut to crack, since a too-vicious one will cry wolf too much. Some kind of self-check of the kernel against a hash, only readable and written once at boot, maybe?

hardening the kernel

Posted Dec 13, 2003 19:16 UTC (Sat) by giraffedata (guest, #1954) [Link]

Some kind of self check of the kernel against a hash only readable and written once at boot maybe?

Maybe, but that wouldn't be a lesson learned from this incident. The kernel wasn't modified. (The problem is that the cracker was able to read kernel memory).


Copyright © 2003, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds