http://www.linux-mag.com/id/7313/1/

This time I'm pasting the original text in English; I imagine some volunteer will step up to translate it:

Linus reflects on 18 years of working on Linux, the developer ecosystem and his goal for Linux on the desktop.

Don Marti, Wednesday, April 22nd, 2009

Linus Torvalds has led the development of the Linux operating system since its inception nearly 20 years ago. In that time Torvalds has not only witnessed the positive cultural and economic changes brought about by Linux, but has also been a direct participant in making those changes a reality. And though many things have changed greatly since 1991, one thing remains constant: Linus is still at the helm. In this interview Torvalds looks back on the operating system he created, the impact of new hardware, and the ubiquitous OS on everything from cell phones to desktops to supercomputers.

Linux Magazine: You've been doing Linux for about 18 years now. That's not a long time by the standards of academic research, but it is a long time by the standards of the software industry. Many of the core contributors have stuck with Linux even as the industry has changed and they have changed employers. Is it good for the project to have the same people able to stick with it? Do you plan to?

Linus Torvalds: I don't think it's good for a project if it's only the same people who stick with it, and I'd be very worried about Linux if we had too much of a "core long-term people" approach. But there really are a lot of developers who are fairly recent, and most importantly there's a really long tail of people who dip their toes into kernel development, if only to send in a really small patch. Most of them will never do anything more, but some of them will eventually become major developers. And we need that.

At the same time, I think everybody is also happier with some stability. There are actually a number of people who have been around for quite a long time. People like Ted Ts'o, who showed up very early on and is still involved and still commits code. So it's not an either-or - we want to have both. And yes, I'll stick with it as long as I think I can do a good job and nobody better comes along (or, put another way: "as long as I can subvert whoever is better to work with me" ;)

And by the way, talking about changing employers: one thing I think is very healthy is how kernel developers are kernel developers first, and work for some specific company second. Most of the people work for some commercial entity that obviously tends to have its own goals, but I think we've been very good at trusting the people as people, not just as some "technical representative of the company" - and at making that clear to the companies too. The reason I bring that up is that I think it ends up being one of the strengths of open source - with neither the project nor the people being too tied to a particular company effort at any one time.

LM: Before Linux, nobody would have believed that the same kernel would be running supercomputers and cell phones. Do you think you'll always be able to maintain one codebase that works on phones and other tiny devices and on very large servers, and just let people configure it at build time?

LT: Personally, I wouldn't even say "before Linux."
For the longest time "after Linux" I told people inside SGI that they should accept the fact that they'd always have to maintain some extra patches that wouldn't be acceptable to the rest of the Linux crowd, just because nobody else cared about scaling quite that high up. So I basically promised them that I'd merge as many infrastructure patches as possible, so that their final external maintenance patch-set would be as pain-free to maintain as possible. But I didn't really expect that we'd support four-thousand-CPU configurations in the base kernel, simply because I thought it would be too invasive and cause too many problems for the common case. And the thing is, at the time I thought that, I was probably right.

But as time went on, we merged more and more of the support, and cleaned things up so that the code that supports thousands of CPUs would look fine and also compile down to something simple and efficient even if you only had a few cores. So now, of course, I'm really happy that we don't need external patches to cover the whole spectrum from small embedded machines to big thousand-node supercomputers, and I'm very proud of how well the kernel handles it. Some of the last pieces were literally merged fairly recently, because they needed a lot of cleanup and some abstraction changes in order to work well across the board. And now that I've seen how well it can be done, I'd also hate to do it any other way.

So yes, I believe we can continue to maintain a single source base for wildly different targets, ranging from cell phones to supercomputers.

Of course, one of the interesting issues is how even the low end has been growing up. Ten years ago SMP was uncommon on the desktop; these days we're looking at SMP systems even in very tiny embedded environments. So we have moved the goalposts up a bit for what we consider "small". Those cell phones tend to have way more computing power than the original PC that I started Linux on had.
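(A side note to make "compile down to something simple and efficient" concrete: the kernel gets that effect by letting configuration options change what the locking primitives expand to. The sketch below is in that spirit only - hypothetical names and plain GCC builtins, not the real definitions, which live in <linux/spinlock.h> and are considerably more involved:)

/* Sketch: CONFIG-dependent locking that vanishes at compile time.
 * Hypothetical names; not the kernel's actual code. */

#ifdef CONFIG_SMP
/* On SMP builds the lock is a real word in memory that CPUs spin on. */
typedef struct { volatile int locked; } my_spinlock_t;

static inline void my_spin_lock(my_spinlock_t *l)
{
        while (__sync_lock_test_and_set(&l->locked, 1))
                ;  /* busy-wait until the holder releases it */
}

static inline void my_spin_unlock(my_spinlock_t *l)
{
        __sync_lock_release(&l->locked);
}
#else
/* On a uniprocessor build the same API compiles away entirely, so code
 * written for thousands of CPUs costs a small machine nothing. */
typedef struct { /* empty; a GCC extension */ } my_spinlock_t;
#define my_spin_lock(l)   do { (void)(l); } while (0)
#define my_spin_unlock(l) do { (void)(l); } while (0)
#endif

The scalability work then becomes a matter of keeping the SMP side cheap and the abstractions clean, rather than forking the source tree.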
LM: You posted a very positive blog entry about your new Intel SSD. "That thing absolutely rocks." On the other hand, some of the other SSDs on the market don't, and some Linux users have pretty bad taste in hardware. Will the OS be able to get decent write performance and lifespan out of a bad SSD, or are users going to hate life if they buy the wrong one?

LT: It depends a lot on your usage case. For example, even a bad SSD can work wonderfully well as a secondary drive that gets 99.9% read activity, since even the bad ones tend to read really well, with low latency and good random-read performance. Of course, the size and price tend to make that a hard trade-off to make easily. It's not worth it for big files that you usually just stream, since rotational disks are cheaper and perfectly fine for streaming behavior. Very few among us really know the true access patterns we actually have.

And hey, even the Intel SSDs aren't perfect. If all you do is work with big files and read and write a lot of contiguous data, a regular disk will be much cheaper and bigger, and won't be any slower for those cases.

But for me, the disk always tended to be the weakest part of the system. I can make up for some of it by just adding more memory, and while caching obviously is a huge issue and hides the disk performance in 95+% of the cases, that just makes the remaining few cases even more noticeable.

Just as an example: I'm used to doing "git grep something" in my kernel tree to find where some function is used, or something similar. It takes me all of half a second, so it's basically instant. Except when I have just rebooted, or have just done enough other things that my tree isn't in cache any more (OK, so that's pretty rare, but it does happen ;). And then that half second was a minute or two with a perfectly reasonable high-end desktop SCSI drive.

So my average latency was great. If I get 0.5 seconds 99% of the time, and then very seldom have to wait a minute just because it reads all those small files off the disk, I should be happy, right? Wrong. The average may be great, but that just makes the bad cases feel even worse. I'm used to things being instantaneous, so now that minute feels really, really bad. And it really is mostly seeking - the median file size in the kernel is about 4kB, so it's reading all those directories and all those 25,000+ small files, and while the total size of it all may be just a few megabytes, because of seek times it takes half a minute.
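(The arithmetic behind that "half a minute" is worth spelling out, because it shows how completely seeks dominate. A rough back-of-the-envelope: the file count and sizes are from Linus's description, while the per-seek cost and locality factor are assumed typical figures for a desktop drive of that era:)

/* Back-of-the-envelope for the cold-cache "git grep" case above: the
 * time is dominated by disk seeks, not by the few megabytes read. */
#include <stdio.h>

int main(void)
{
        double files   = 25000; /* small files in the kernel tree      */
        double seek_ms = 8.0;   /* assumed avg seek + rotational delay */

        /* Pessimistic: every file needs its own head movement. */
        double worst_s = files * seek_ms / 1000.0;

        /* More realistic: readahead and on-disk locality mean maybe
         * one open in four needs a real seek (an assumption). */
        double likely_s = worst_s / 4.0;

        printf("worst ~%.0f s, with locality ~%.0f s\n", worst_s, likely_s);
        /* Prints: worst ~200 s, with locality ~50 s - squarely in the
         * half-a-minute-to-a-couple-of-minutes range described above. */
        return 0;
}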
Enter the Intel SSD, and the cached "git grep" still takes the same half second, but now the bad case takes me ten seconds (it used to be less, but those staging drivers really added a lot of crap; some people would blame the Intel SSD degrading, but sadly, it's all my own fault ;). So my average access time has hardly changed, and I can still tell when I'm disk-limited, but oh boy, it makes such a huge difference. Now even the slow case is no longer two orders of magnitude slower. Yes, even SSDs are slower than RAM caches, but they don't have that horrible "fall off the cliff" behavior when they have to seek around for the data.

And that's why I dislike a lot of the bad SSDs. They have an even worse "fall off the cliff" behavior. It's for a very specific case (random small writes), and people will argue that it's even less common than the case I describe above (random small reads), and that's true. It's not that common. But it's common enough that when you hit it, it just hurts all the more.
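(For what it's worth, the mechanics behind that random-small-write cliff: flash is erased in large blocks, and a controller with a weak flash translation layer turns a single small write into a read-modify-write of a whole erase block. A rough illustration with assumed, typical-looking sizes - not numbers from any particular drive:)

/* Illustrates write amplification on a naive SSD controller.
 * All sizes and timings are assumed typical values. */
#include <stdio.h>

int main(void)
{
        double write_kb       = 4;    /* one small random write       */
        double erase_block_kb = 512;  /* assumed flash erase block    */
        double erase_ms       = 2.0;  /* assumed block erase time     */
        double program_ms     = 0.2;  /* assumed per-4kB program time */

        /* A naive controller must read the whole block, erase it, and
         * reprogram all of it just to change 4kB in place. */
        double pages   = erase_block_kb / 4;
        double cost_ms = erase_ms + pages * program_ms;

        printf("amplification %.0fx, one 4kB write costs ~%.1f ms\n",
               erase_block_kb / write_kb, cost_ms);
        /* ~128x amplification and tens of milliseconds per "small"
         * write - worse than a mechanical seek, hence the cliff.    */
        return 0;
}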
This is why I don't like "throughput" measures. You do want throughput, but latency variation is what you notice most. You can get used to slow machines and try to make your workflow match the "oomph" of the hardware, but you cannot ever get used to fast machines that then occasionally are really slow. Those just drive you wild.

As an aside, that's also very noticeable in CPUs. I had the biggest complaints with Intel's "Netburst" (aka "P4") architecture for some rather similar reasons: it had absolutely great "best case" behavior, and then it had some cases that it stumbled at horribly, and which I happened to care deeply about. The P4 was like a greased bat out of hell for loads it liked, but when it started missing in its tiny L1 cache, or when you had to serialize the pipeline for locking or for system calls, it turned into something more like a CPU two or three generations older. And again - it's actually more irritating to have something that is really good at some things and then really bad at others than to have something that is just consistently middle-of-the-road.

LM: On a system level, "really good at some things and then really bad at others" sounds like a lot of the Linux-based products out there. Take a workstation and strip off some of the parts to make a dedicated cluster node or a NAS appliance or a PVR. Do you get a good general-purpose kernel by building something that works on the desktop, and letting people configure it to get customized builds for their own needs?

LT: Yes. To me, Linux on the desktop has always been the most interesting goal. The primary reason for that is simply that it's always been what I want (I've never wanted a server OS - I started out writing Linux for my own PC, not to be some file server), but also because all the interesting problems always end up being about desktop uses.

All other uses tend to be very constrained. You have one thing (or a few things) you need to do, and you can just optimize and simplify the problem for those particular issues. The desktop, in contrast, is all about a wide variety of uses. Huge variety in hardware, huge variety in software, and tons of crazy users doing things that no sane person would ever even think of doing. Except, it turns out, those crazy users may be doing odd things, but they do them for (sometimes) good reasons. So aiming for the desktop always forces you to solve a much more generic problem than any other target would have forced us to look at.

Of course, Linux then becomes extra general-purpose because it's not just meant to be a desktop OS. If we only cared about the desktop we'd never have worked on other architectures or worried about scalability to thousands of cores. So it's not sufficient to just be a desktop; you do have to look at other niches as well. But generally the desktop problems really do get you 90% of the way, and then solving scalability problems etc. is the frosting on the cake.

LM: Speaking of different platforms, what computers do you have now? Any of them non-x86, or set up as a server, media player, or other special-purpose machine?

LT: I don't tend to use a lot of computers, actually. I don't like having a "machine room", and my goal is to have just one primary workstation and do everything on that. And that one has been x86-based for the last few years (basically since I decided that there was no long-term desktop survival for PowerPC - when Apple switched away, it became clear that the only thing that could possibly challenge x86 on the all-important desktop was ARM).

I've got a few other machines (mainly laptops) and there are a couple of other machines for the family (one for Tove, one for the kids), but they are also all x86-based. I'm going to be very interested to see whether I'll grow an ARM machine this year or the next, but that will require it to be a good netbook platform, and while the potential is there, it's never quite happened yet. Other architectures tend to be available in form factors that I'm not that interested in (either rack-mountable and often very noisy, or some very embedded thing), so they've never found their way into my home as real computers.

Of course, I do have a couple of TiVos, and they run Linux, but I don't really think of them like that. I don't tinker with them - that's kind of against the point - and they are just devices. And there's the PS3, but it's more interesting for games than to use as a computer (I've got faster and better-documented regular computers, thank you).

The most interesting machines I tend to have are pre-release hardware that I can't generally talk about. For example, I had a Nehalem machine before I could talk about it, and I may or may not have another machine I can't talk about right now.

LM: Is there anything in the pipe from hardware designers that you think will have a major impact on Linux's architecture? Ben Woodard wonders about increasingly complicated memory hierarchies that go beyond just traditional caching and NUMA, as well as newer synchronization primitives such as hardware transactional memory.

LT: I don't see that being very likely, and one reason is simply that Linux supports so many different platforms, and is very good at abstracting out the details, so that 99% of all the kernel code doesn't need to care too deeply about the esoterics of hardware design.

To take the example Ben brought up: transactional memory is unlikely to cause any re-architecting, simply because we would hide it in the locking primitives. We'd likely end up seeing it as a low-cost spinlock, and we might expose it as such, using ("fastlock() / fastunlock()") for short sequences of code that could fit in a transaction. So we'd never expose transactional memory as such - because even if it were to become common, it wouldn't be ubiquitous. (In fact, since transactional memory is fundamentally very tied to various micro-architectural limits, if it actually does end up being a success and gets common, I would seriously hope that even hardware never exposes it as such, but hides it behind a higher abstraction like a "spinlock" instruction with transaction failure predictors etc. But that's a whole different discussion.)

We'll see. Maybe the hardware people will surprise me with something that really makes a huge architectural difference, but I mostly doubt it.
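(In hindsight, "hide it in the locking primitives" is essentially what later shipped as hardware lock elision. A rough sketch of how a hypothetical fastlock()/fastunlock() pair - the names are Linus's, the details are not - could sit on top of Intel's RTM intrinsics, ignoring retry policy and CPU-feature detection:)

/* Sketch of a "low-cost spinlock" over hardware transactional memory.
 * Uses Intel RTM intrinsics (compile with gcc -mrtm); a real
 * implementation needs retry limits and a runtime feature check. */
#include <immintrin.h>

typedef struct { volatile int locked; } fastlock_t;

static inline void fastlock(fastlock_t *l)
{
        if (_xbegin() == _XBEGIN_STARTED) {
                /* Speculate: if the lock looks free, just run the
                 * critical section transactionally. Reading `locked`
                 * puts it in our read-set, so a real acquirer will
                 * abort us and we stay mutually exclusive. */
                if (!l->locked)
                        return;
                _xabort(0xff);  /* lock genuinely held: bail out */
        }
        /* Transaction failed or aborted: take the lock for real. */
        while (__sync_lock_test_and_set(&l->locked, 1))
                while (l->locked)
                        ;       /* spin until it looks free */
}

static inline void fastunlock(fastlock_t *l)
{
        if (l->locked)
                __sync_lock_release(&l->locked); /* real-lock path   */
        else
                _xend();                         /* commit the txn   */
}

The point of the sketch is the one Linus makes: callers just see lock/unlock, and the transactional hardware stays an invisible fast path rather than a new programming model.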
LM: It's been almost a year since you got David Woodhouse and Paul Gortmaker signed up as embedded maintainers. How has having developers responsible for embedded changed the kernel development process?

LT: Hmm... I can't say that I personally have seen any major changes in the embedded area, but I also have to admit that if everything is working well, I wouldn't expect to see much. It's more the other side of the equation (the embedded developers) who you should ask.

The problem with the embedded space was (is?) always that they'd go off and do their "own thing", and not try to feed back their work or even talk much about their needs and their changes. And then when they were ready - often several years later - the kernel they based their work on simply wasn't relevant to mainstream kernel developers any more. And then the cycle starts all over again.

And there isn't much we can do on our side of the development - David and Paul were never meant to help me. They are about trying to help the embedded people learn how to interact with the development community. And if that ever happens (happened?), then I hopefully would never notice, since by then the embedded developers would look just like any other developers.

But if you want my honest opinion, then quite frankly, I don't think having "embedded maintainers" really ever solves the issue, and I'm actually hopeful that the whole dynamic of the embedded world will change. I think, for example, that projects like Android might be instrumental in bringing the embedded people more into the open, simply because it makes them more used to a "big picture" development model where you don't just look at your own sandbox.

And by the way, I would like to point out that we do try to do better on "our side" of the equation too. The whole "stable" vs "development" kernel split (2.4.x vs 2.5.x) was our fault, and I'll happily admit that we really made things much harder than they should have been for people who weren't core kernel developers, leaving them stuck on an irrelevant development branch. So I don't want to come off as just blaming the embedded people. They really have their reasons for going off on their own, and we historically made it very hard for them to be even remotely relevant to kernel development.

In other words, I am hoping that it's now easier for an embedded developer to stay more closely up-to-date with development kernels, and that we'll never have to see the "they are stuck at 2.2.18 and can't update to a modern kernel because everything has changed around their code since" kind of situation again.

LM: A recent development tactic the kernel has adopted is the drivers/staging subdirectory. These are the so-called "crap" device drivers - ones that mostly seem to work, and have users, but don't pass the mainstream kernel's code quality standards. Is having drivers in the kernel tree, in staging, better for getting them up to mainstream quality than waiting to bring them in until they're cleaned up?

LT: Well, the people involved (like Greg) do seem to feel it's a success, in that it does help get drivers into better shape. And I have to say, I've personally hit a few machines that had devices in them without good drivers, where the staging tree had an ugly one that worked, so I was happy.

So it saves people from at least a few of the incredibly annoying out-of-tree development efforts. When a driver is out-of-tree, it's not just that you have to fetch it separately; you have to find it first, and then it's likely a patch against some three-month-old kernel that hasn't been updated for the trivial interface changes in the meantime, yadda yadda yadda.

It's been working, from what I can tell. Do I wish we just had better drivers to begin with? Yes, along with a mountain of gold. It's not an optimal situation, but it's better than the alternatives.

LM: The Linux approach to fixing security-related bugs seems to be to just fix them in the mainstream kernel, and if a distributor needs to put out an advisory for their vendor kernel, they do. Are users getting a more or less secure kernel that way than if the upstream kernel participated in what you called the "security circus"?

LT: Hey, I'm biased. I think it's much better to be open and get the advantages of that (which very much include "faster reaction times", both because it makes people more aware of things and because that way the information can much more easily reach the right people). And it seems to be working. The kernel is doing pretty well security-wise.

That said, anybody who really wants more security should simply try to depend on multiple layers.
I think one of the biggest advantages of the various virtual machine environments (be they Java or Dalvik or JavaScript or whatever) is the security-layering aspect of them - they may or may not be "secure", but they add a layer of indirection that requires more effort to overcome.

So I think we're doing pretty well, and I obviously personally think that the Linux kernel's disdain (at least from some of us ;) for that "security theater" with all the drama is a good thing and is working. But I would always suggest that regardless of how secure you think your platform is, you should always aim to have multiple layers of security there. Anybody who believes in "absolute security" at any level is just silly and stupid.

_____________________________________________________________________________

Don Marti was an organizer of Windows Refund Day, Burn All GIFs Day, and the Free Dmitry movement. He is conference chair for OpenSource World and a freelance writer and media manipulator.

_____________________________________________________________________________

One of the links that appear in the article points to: http://www.sgi.com/developers/technology/linux/ Maybe Rafa can give some details about them.

Regards