Re: [yast-devel] Refactoring madness: chapter three
On Mon, 29 Feb 2016 17:31:46 +0100
Stefan Hundhammer <shundhammer@xxxxxxx> wrote:

On 29.02.2016 08:07, Ancor Gonzalez Sosa wrote:
[...]

Oh, my, where to begin, where to end...

Let me start by saying that there are different styles of working,
different approaches to do things, even considering that it all
revolves around an object oriented approach.

We don't need to discuss that we all want code that not only works
(this is the most basic requirement, of course), but that is also
easy to use, easy to maintain, adequately documented and testable. I
think we all agree on those points.

There seem to be different schools of thought, though, about how to
achieve that.


1) RSpec as a design validator

I like the RSpec approach to unit testing, and I consider the RSpec
tests of a class to be the detailed documentation of its behavior.

RSpec is one approach to testing, and it makes a claim - a very bold
claim - that it also serves as a substitute for a formal
specification.

While I agree that it is a test tool that does its job reasonably
well, I dispute the claim that it can serve as documentation. It just
doesn't. Try reading somebody else's RSpec code without having read
documentation of the class that it tests, without having read an
example. You will not figure out what the hell that class does. You
may find out what some fringe cases are all about, but there is just
no way to get a reasonable overview of what the class is all about. You
get swamped by a flood of details before you get to see the overall
picture.

So, IMHO RSpec fails miserably on its second claim - acting as
specification. The trouble is that it forces the developers to be a
lot more verbose during writing the tests than should be necessary;
the rationale for that is the "it's also a spec" part which IMHO just
does not materialize.

But maybe that's just me; I'd be interested in other opinions about
that.

Please notice that I am all in favour of writing and having tests.
This is a must-have to ensure covering all major use cases and a lot
of fringe cases, and it's a prerequisite for avoiding regressions in
the future. It's costly, but it will (hopefully) be a worthwhile
investment.

Well, I think it is a matter of style. For me, "examples" show how
something was originally intended to work, while unit tests show how it
really works, because they are checked regularly. I agree with you that
it depends a lot on how you write RSpec whether the behavior can be read
from it easily. For example, structuring the spec from the common cases
to the rarer ones helps a lot, as does moving stubbing out of the way and
focusing mainly on usage. So for me, RSpec is a place where code review
is really important: do the tests make sense to someone outside reading them?
So we will see if the investment pays off. I hope so, but you can never
be sure.
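
To illustrate what I mean by ordering from common to rare, here is a toy
sketch (a made-up class, not anything from yast-storage-ng): the everyday
behavior comes first, the fringe cases later, and no stubbing gets in the
way.

require "rspec"

# Toy stand-in for illustration only; not the real DiskSize from libstorage-ng.
class ToySize
  def initialize(mib)
    @mib = mib
  end

  def to_s
    @mib < 1024 ? "#{@mib} MiB" : "#{@mib / 1024} GiB"
  end
end

describe ToySize do
  # Common usage first: this is what a new reader needs to get the picture.
  context "with an everyday value" do
    it "prints the size in MiB" do
      expect(ToySize.new(512).to_s).to eq "512 MiB"
    end
  end

  # Fringe cases afterwards, so they do not bury the overview.
  context "at the unit boundary" do
    it "switches to GiB at 1024 MiB" do
      expect(ToySize.new(1024).to_s).to eq "1 GiB"
    end
  end
end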


But there is a caveat: However hard we try, there will never ever be
100% test coverage. There will always be untested scenarios. This is
not to argue against unit testing - to the contrary; it's just to
avoid believing that this will solve or avoid all possible problems.

For example, take that DiskSize class from libstorage-ng. It's a
pretty small class that can handle disk sizes in MiB, GiB, TiB etc.,
and I added an RSpec test suite pretty early while writing it. I
thought it was pretty complete and had pretty good coverage.

"Pretty" is the commanding word here. "Pretty" means "not 100%", "not
perfect". And not 100% perfect it was: Only when using that class
extensively did I find that there were still some aspects missing,
despite all that testing. For example, I got output formatted as
"1024.0 GiB" when I expected "1.0 TiB"
- the classic off-by-one error when going up to the next-higher unit
- comparing for < 1024 when it should have been <= 1024.
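
Schematically, the bug was something like this (a reconstruction of the
idea, not the actual DiskSize code; the names are made up):

UNITS = ["KiB", "MiB", "GiB", "TiB"].freeze

def to_human_string(size_k)
  size = size_k.to_f
  unit = 0
  # Buggy version: `while 1024 < size` never promotes exactly 1024.0 GiB
  # to 1.0 TiB; `while 1024 <= size` promotes at the boundary as expected.
  while 1024 <= size && unit < UNITS.size - 1
    size /= 1024
    unit += 1
  end
  format("%.1f %s", size, UNITS[unit])
end

puts to_human_string(1024**3)  # => "1.0 TiB" (was "1024.0 GiB" with the bug)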

Now, that was easy to fix, but it demonstrates what I wrote above:
However complete you think your test suite is, it will always miss
some important test cases. You learn that over time while using a
class. A test suite will always be work in progress; it will never be
complete as long as the code it tests is in use.


I once read that the ideal test coverage is 90%: with 99% you become too
confident that the tests catch everything, and below that it is too weak.
I think that is a reasonable rule; in my experience there are always some
parts where automatic tests are so complex and fragile that it is easier
to test them manually than to keep fixing the automatic ones, so the 90%
advice is a good starting point.


2) Another note about OOP design

In our new code there are some things that I consider kind of
"design smells". I can be, as always, wrong but I feel like some of
our code is misusing OOP as a mere way of encapsulating
functionality. Going back to concrete real examples, we had this:

fake_probing = Yast::Storage::FakeProbing.new
devicegraph = fake_probing.devicegraph
factory = Yast::Storage::FakeDeviceFactory.new(devicegraph)
factory.load_yaml_file(input_file)
yaml_writer = Yast::Storage::YamlWriter.new
yaml_writer.write(devicegraph, output_file)

The fact that we had a "new" every couple of lines is a design
smell to me. The adopted solution was creating class methods, so we
can do things like this.

fake_probing = Yast::Storage::FakeProbing.new
devicegraph = fake_probing.devicegraph
Yast::Storage::FakeDeviceFactory.load_yaml_file(devicegraph, input_file)
Yast::Storage::YamlWriter.write(devicegraph, output_file)

Just a cosmetic fix for an issue that I consider deeper: we are
writing good old procedural code dressed up as OO code. Every class
has the sole responsibility of implementing one step in a process
and only one public method to execute that step (manipulating a set
of I/O arguments).

I agree (and as a matter of fact, I even wrote that in that new
functions' description) that it's just cosmetic to use a class method
(or, in Ruby lingo, "singleton" method) to avoid the one-trick-pony
use.

I do not agree that this in any way violates the object-oriented
approach - to the contrary.

(more about this below)

I agree that in Ruby this is not so clear-cut, because a class is itself
an object and can have its own state. So class methods can be perfectly
object-oriented, and it is easy to later add new behavior to the factory,
e.g. a file processing cache, if needed.
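
Roughly like this (YamlFactory and its cache are made up here, not the
real FakeDeviceFactory API):

require "yaml"

# Made-up sketch: a class method that keeps class-level state (a cache).
class YamlFactory
  @cache = {}

  class << self
    # The class itself is an object in Ruby, so it can carry state and
    # still present a single-call interface to its users.
    def load_yaml_file(devicegraph, input_file)
      content = (@cache[input_file] ||= YAML.load_file(input_file))
      new(devicegraph).build_from(content)
    end
  end

  def initialize(devicegraph)
    @devicegraph = devicegraph
  end

  def build_from(content)
    # ...populate @devicegraph from the parsed YAML...
  end
end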



This is just one example of solutions that are perfectly correct but look
quite unfamiliar to a Ruby developer's eyes and are quite hard to
express as a set of RSpec behavior-oriented specs. Not saying they
are wrong, just that they don't look like "the ruby way". Like this
alternative which (although not perfect) is a much more "rubyist"
way of writing the same:

factory = Yast::Storage::DevicegraphFactory.new
devicegraph = factory.create_from_file(input_file)
File.open(output_file, 'w') {|f| f.write devicegraph.to_yaml }

Again, this is an approach I don't like at all because it focuses on
the wrong thing: All that stuff is not about creating device graphs.
We are not in the business of making or selling device graphs. Device
graphs are just a tool for what we really intend to do. Concentrating
on an irrelevant (or, to put it less harshly, behind-the-scenes) tool
is most misleading for anybody looking at that code.

Worse, putting the burden of file handling on the caller IMHO is not
desirable; why would I as a caller want to bother with that file? I
want a function that does that for me, including error handling and
whatnot.

I agree, and I would prefer it the other way around: using IO streams,
which is more flexible and also allows streaming, for example. So
something like

device_graph = DeviceGraph.deserialize(input_file)
device_graph.serialize(output_file)

where input_file and output_file can also be streams, not only file name
strings.
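
To sketch the idea a bit more (still made-up code, not an existing API):

require "yaml"

class DeviceGraph
  # Accept either a file name or anything that responds to #read
  # (File, StringIO, a socket, ...).
  def self.deserialize(source)
    content = source.respond_to?(:read) ? source.read : File.read(source)
    new(YAML.safe_load(content))
  end

  def initialize(data)
    @data = data
  end

  # Accept either a file name or anything that responds to #write.
  def serialize(target)
    if target.respond_to?(:write)
      target.write(@data.to_yaml)
    else
      File.write(target, @data.to_yaml)
    end
  end
end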


Worse still, you need to obey a call protocol when you handle it like
that; you introduce boilerplate code that everybody will have to copy
verbatim from some other place where it works.

Back in the early 1990s when I started programming X11 and OSF/Motif,
we had to write tons of boilerplate code. Change any little thing,
and you are doomed; it fails in some crazy way, and you'll never know
why. And you couldn't even comment out stuff to find out when it
starts to fail; every little bit of that boilerplate was needed to
make it work.

I agree that boilerplate should be reduced, and Ruby is quite good at
that; usually only a bit of syntactic sugar is needed.


Been there. Done that. Got the f*ing T-shirt. As a matter of fact,
got a whole dressing cabinet full of those T-shirts. Want any? ;-)
("I was sent to software never-never land, and all I got is this
lousy T-shirt").


Back to the point I promised earlier: object-oriented programming.

What constitutes an object, a class?

- Some data (however minor) it keeps around to do its thing

Data should be hidden, a.k.a. "tell, do not ask".


- Some service it provides based on those data


That is actually the most important part: the actions it provides. Think
of a class as an electrician: you are not interested in what pliers, what
kind of tape or cables (s)he uses, as long as the job gets done and it
works, and we should look at objects the same way. Tell him what you
want, and only if you feel brave and really need it, tell him what he
should use.

- Encapsulating internal stuff to keep the outside independent of
implementation details


I would put it on a higher level: a class provides its own level of
abstraction, keeping you away from the lower-level stuff.

- Optionally, being able to inherit from that class and to reimplement
(override) certain aspects of how it works.

And now we have those "one-trick-ponies", those classes that perform
only one job, the classes that can in the most common use case be
hidden in a single function. Is that a contradiction to the above?

The usual problem with those functions is that they need a lot of data,
and it should usually not be the job of such a function to keep that data
consistent. But of course, if such a method just delegates from the high
level to the low level, it is fine.

So a method like

def Handyman.fix_toilet(toilet)
  toilet.ensure_water_flowing
  toilet.ensure_flushing_working
  toilet.ensure_shit_flowing
end

is good, as it just provides the high-level fix and delegates to the
separate low-level fixes.

Now a bad example:

def Handyman.fix_toilet(toilet)
  if !toilet.water_coming?
    if toilet.input_pipe.clean?
      toilet.input_pipe = InputPipe.new
    else
      toilet.input_pipe.clean
    end
  end

  if !toilet.lever_working?
    toilet.lever = toilet.lever.class.new
  end

  ...
end

What is wrong is that if you e.g. introduce a dry toilet, then you have
to change all toilet users, not just the Handyman.

With the first example, a dry toilet can make ensure_water_flowing and
ensure_flushing_working trivially succeed, as there is no lever and no
water that has to flow. But in the second example, a dry toilet answers
water_coming? with false, because no water comes in a dry toilet. There
are also no input pipes, so the whole logic is now broken. And what is
worse, if there are other callers written like this, they all have to be
adapted as well.

So that is the risk of single-function classes: you depend too much on
specific data instead of just saying what you want.
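
Something like this (the DryToilet is of course made up):

# A dry toilet that keeps the first Handyman.fix_toilet working unchanged.
class DryToilet
  def ensure_water_flowing
    # nothing to do: there is no water in a dry toilet
  end

  def ensure_flushing_working
    # check the sawdust dispenser instead of a lever...
  end

  def ensure_shit_flowing
    # ...
  end
end

# Handyman.fix_toilet(DryToilet.new) from the first example still works:
# we only *tell* the toilet what we want. The second example breaks,
# because it *asks* for #input_pipe and #lever, which a DryToilet does
# not have - and every caller written that way has to be adapted.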


Well, IMHO no. For one thing, it's not really only that one thing it
does; that one thing is just the most common use case. But if such a
class is written well, it also gives you access to other methods that
perform just one aspect of its main task, or that let you perform
that main task in different ways, or that let you specify additional
parameters to that main task or to any of the sub-tasks.

Encapsulation is the main thing. IMHO it's the fine art of OOP to do
most things with useful parameters without having to write tons of
that boilerplate code mentioned above. Only when you want the
nonstandard use case should you have to specify parameters, or at
least most of them (there is not always a good default; filenames are
a prime example).

I agree that reasonable defaults help a lot. Even a filename can have a
default if it is e.g. a log or a standard cache location, but not always.
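
For example, roughly like this (the class and the default location are
invented for the sketch):

# Sketch of a default that can still be overridden.
class ReportWriter
  DEFAULT_PATH = "/var/log/YaST2/report.log".freeze

  # Common case: no parameters needed. Nonstandard case: pass a path.
  def write(text, path: DEFAULT_PATH)
    File.open(path, "a") { |f| f.puts(text) }
  end
end

ReportWriter.new.write("all good")                   # default location
ReportWriter.new.write("all good", path: "/tmp/out") # special setup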


Take the YaST UI, for example. By default, you don't have to specify
anything; just get the widget factory from the UI, and then you can
merrily begin creating widgets. No boilerplate code needed at all.
But if you want a special setup, use different init calls and then
get the widget factory and create widgets. To the outside, this looks
like procedural code, but it's really not. It carries its state
inside, and it uses reasonable defaults for the most common use cases.
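
Schematically, the pattern is something like this (a toy sketch, not the
real libyui / YaST UI calls):

# Toy module illustrating the pattern: state lives inside, defaults cover
# the common case, a special setup is possible but never required.
module FakeUI
  Factory = Struct.new(:options) do
    def create_label(text)
      { type: :label, text: text, options: options }
    end
  end

  def self.init(options)              # only needed for a special setup
    @factory = Factory.new(options)
  end

  def self.widget_factory
    @factory ||= Factory.new({})      # lazily created with sane defaults
  end
end

label = FakeUI.widget_factory.create_label("Hello")  # no boilerplate at all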

And this is similar to the classes in yast-storage-ng. It might not
always show, but this is the philosophy behind it. The defaults
should always be reasonable, but still it leaves you room to tweak a
lot of things. And, more importantly, it hides the gory details from
the outside. In many cases, however, those gory details are
accessible enough for the caller or for derived classes.

For example, we had this discussion on IRC last week that Josef
wanted minimalistic YAML files for those device trees and
auto-generate partitions and file systems because his use case just
doesn't care about most of that. While this would not be desirable
for the normal case IMHO, I pointed out to him that it's trivial to
add that functionality by inheriting from the standard
FakeDeviceFactory (which can read YAML files) and reimplement one or
two methods to add that functionality. This is a lot harder with
purely procedural code.

A lot of things are happening behind the scenes in those classes, and
you shouldn't need to know about that when you just want to use them.
But you can if you want, and there is support for nontrivial use
cases. But the main use case is and should be simple. This is an
important design goal IMHO.


I can only agree that the main goal of good design is an API that is easy
to use and makes the common use cases simple to achieve.


3) Write every piece of code as something to maintain

This is more related to Scrum than to Ruby or OOP. Since we are
using Scrum to create YaST, I expect most PBIs to result in a YaST
client that can be invoked using /sbin/yast2 or in a set of RSpecs
that will be executed in every "rake test:unit" and used as reference
documentation in the future.

By all means, yes.


I consider tests in the "examples" directory to be dead code.

I pointed that out to you before: Those examples are not tests. They
are examples. They serve as documentation to show how a class is
used, and they are an important development tool to work on that
stuff, to have easy access to those classes without going through the
hassle of starting an installation (possibly with creating a package
and a driver update disk with that package and integrating that DUD
into an installation ISO and starting an installation in a virtual
machine).

I wrote those examples mostly for myself as development tools. If you
take them away, all you will achieve is that I will simply no longer
share them and keep them to myself. But since we are a team, I
figured that I'd like to share all development tools to make life
easier.

Life as a YaST developer is hard enough with all those fragile and
over-complex tools we have to use on an everyday basis. We can very
well use some simple tools to make life easier.

I know I repeat myself, but those examples are not intended as a
substitute for unit tests. We need unit tests on all levels we can
get. But we also need simple examples to have a realistic chance to
work with our own code.

So they are more manual tests than examples? The usual problem is that
everything needs someone to maintain it, otherwise it quickly becomes
obsolete and broken. Check how many tools are in yast2-devtools and how
many of them no longer work, just because nobody maintains them. So if it
is needed and someone uses and maintains it, that is fine; just do not
expect it to work forever unless it is regularly checked that it still
works.



> I consider that moving code from /test or /src/clients into
> /src/examples is the first step to have it broken with nobody ever
> noticing.

Anything in src/clients that is not simply invokable by itself is
dead code indeed. If I need yet another external binary to invoke
something written in a scripting language, something is very wrong.
Maybe we should reconsider that approach and try to get rid of that
additional binary.

Well, it is probably a goal for yet another Hack Week or workshop to make
`ruby src/clients/foo` work, doing some reasonable UI initialization.



I have a similar feeling about code used for debugging.

As long as a piece of code is not 100% perfect (which will never
happen), there will be the need for debugging code. And since
debugging code tends to solve a temporary problem that is only valid
for a very limited time, this is very much a moving target.

In such a case the debugging code should be removed before code review
and merge.


If...

> It should be "in
the right place" design-wise. Don't think in the debugging code as
something temporary. If you are needing it today, you will probably
need it (or something very similar) also in the future. Make it
part of your design...

...this is the claim, I'd no longer call it debugging code. Then it
becomes part of the interface, and it needs the same amount of
testing and maintenance, too.

E.g. if it is logging, I don't think it needs that much testing and
maintenance.


But look at it from another side: If we take the "agile" approach
seriously, things like debugging code are not graven in stone. It
will change as development progresses. What is useful debugging code
today might be leftover cruft tomorrow, because tomorrow we'll have a lot
better tools to replace it.

For example, that "dump_disks" function we talked about lately was
just a crude approach to get any useful output, and yes, it was
pretty much misplaced where it was at first, but it was there, it was
available, and it served its purpose - until the next development
step. We can now use much more useful YAML output instead because we
have that nice new class that does that for us.

This is what being agile is all about: Do things, improve them,
iterate until the stakeholders are satisfied (which always means
"good enough", never "perfect").

Well, maybe it sometimes helps a bit to stop and think whether it is
really a single-purpose, specific task, or whether it could be reused by
more methods or help with another approach. E.g. regarding that
dump_disks: it can help to think whether it is just a debug tool to get
some disk dump, or whether it can be reused e.g. for AutoYaST, as
standard serialization, etc.

I worry that I have seen this "just do it somehow and it will work"
approach too often in the past, and the result is the old bootloader or
the network module, which somehow just work but have a lot of problems.



and use mechanisms that would be useful when that code is
executed as an RSpec test or yast client.

q.e.d.


CU

Josef
--
To unsubscribe, e-mail: yast-devel+unsubscribe@xxxxxxxxxxxx
To contact the owner, e-mail: yast-devel+owner@xxxxxxxxxxxx
