[yast-devel] Refactoring madness: chapter two
This is a follow-up on my highly opinionated mail about software design. Be warned, it's much longer than chapter one. Sorry for that. This mail contains concrete examples from the recent yast2-storage rewrite in order to touch some solid ground (simplistic improvised examples didn't work very well in chapter 1). Please, please, PLEASE, don't take it as an attack to the old code or their writers. Once again, this is just as an explained dump of my very own mental checklist of things I care when writing or reviewing code. Meant as a starting point for discussion... but not via mailing list, since we want to get to a conclusion at some point. :-) 1) Responsibility distribution We all agree this is a key aspect of OOP design. But we seem to usually disagree on the result. Single responsibility principle is a nice guideline here... when not taken too far. Going for examples, the original SpaceMaker had the obvious responsibility of resizing and deleting partitions to make space, but also of holding the list of free slots (something I moved to Devicegraph) and of deciding/knowing what was the target size (:desired vs :minimum) for the following steps (something I moved one level up to the Proposal class). So the resulting SpaceMaker was more dumb... while other classes became... well, less dumb. We also had a separate DiskAnalyzer class that other classes queried to know stuff like what's the list of existing windows partitions, which media is being used for installation (we don't want to install on top of ourselves) and that kind of stuff. I "hided" the DiskAnalyzer class behind Devicegraph. So now other classes query those questions directly to the devicegraph object. After all those changes, somebody could argue that Devicegraph responsibility became to broad, against the single responsibility principle. For me it makes sense that devicegraph knows its own list of free slots or which one of its own disks are being used for installation. Simply because it makes sense(TM) and I "feel" it's its responsibility. I want my API to hide the fact that some of those calculations are delegated to a DiskAnalyzer class. The main advantage is that the "users" of Devicegraph don't have to know several other classes just to ask things about the system layout... that leads me to the next point. 2) Keep the dependencies between your classes to a minimum. Another topic in which one could thing that we are going to agree easily... but was not the case. The best thing I can do in that regard is just recommending the chapter titled "Managing dependencies" from the Sandi Metz book[1]. She simply put into words my view about the topic. Including a fairly obvious but still useful checklist on when an object has a dependency. That happens when the object knows: - The name of another class - The name of a message that it intends to send to someone other than self. - The arguments that a message requires and/or the order of those arguments Everytime I see that in a class, my dependencies alarm rings and I start asking myself: Does it make sense? How can avoid that? I really try hard to fight every dependency until I accept it must be there. And lately I find myself using dependency injection more and more. 3) Don't let singleton objects pollute your APIs In some situations, the singleton pattern is a necessary evil. But please restrict the evilness as much as possible. The original code had StorageManager.instance.probed (or similar stuff) in too many places. Several classes (DiskAnalyzer, PartitionCreator, SpaceMaker and StorageProposal) knew were to find the probed devicegraph by the holy grace of Singleton. The modified code hides that singleton behind the Devicegraph.probed class method and reads the probed devicegraph only once. Devicegraph is the only class that knows there is a singleton StorageManager. The devicegraph returned by it is then simply passed as argument to any other class needing it. Thus, DiskAnalyzer (and others) doesn't know anymore about the existence of StorageManager.instance, it just receive a devicegraph to analyze (the fact that this devicegraph is always the probed one is not DiskAnalyzer's business). The example in (4) has also to do with singleton objects. 4) Messages-driven design I can also become quite picky about who sends messages to who (like I did in [2]). In fact, that is an example of two thing I consider wrong. One is sending the messages to the wrong receiver. Another one is pollution caused by singleton. You send messages to an instance of FakeProbing to produce an effect on StorageManager.instance.probed, but there is no reference to StorageManager during that interaction. It just works because FakeProbing and the caller both know about StorageManager being singleton. Talking about messages, emitters, receivers and and all that can look too philosophical. But for me is the key factor when defining an API (I even draw UML interaction diagrams sometimes!). And for me, ending up with a sane API is the key reason for having code reviews. 5) A minor one: don't abuse well known names That's something I do all the time, but I'm trying to improve that. It really hurts the reviewer mind when something is called "factory", "singleton", "strategy", "presenter" and so on but doesn't follow the corresponding Factory, Strategy, WhatsNot patterns. To stop doing that when writing code is one of my new year's resolutions (because I suffer it when reading other's people code). 6) Single responsibility principle... again In (1) I said I can be considered slightly relaxed when applying the single responsibility principle to objects, at least to a purist eyes. I must clarify that I really observe that principle when writing methods within a class, so I end up with a lot of small methods that rely on each other. Not sure if that makes my code easier or more difficult to follow. It depends on the reader, I guess. Well, I think this is enough for a second mail. I guess I have already covered my most important reasons to "slow down" the delivery of already working code in the yast2-storage rewrite. Cheers. [1] http://www.sandimetz.com/products [2] https://github.com/yast/yast-storage/pull/193#discussion_r53643452 -- Ancor González Sosa YaST Team at SUSE Linux GmbH -- To unsubscribe, e-mail: yast-devel+unsubscribe@opensuse.org To contact the owner, e-mail: yast-devel+owner@opensuse.org
On Wed, 24 Feb 2016 15:29:28 +0100 Ancor Gonzalez Sosa <ancor@suse.de> wrote:
This is a follow-up on my highly opinionated mail about software design. Be warned, it's much longer than chapter one. Sorry for that.
This mail contains concrete examples from the recent yast2-storage rewrite in order to touch some solid ground (simplistic improvised examples didn't work very well in chapter 1). Please, please, PLEASE, don't take it as an attack to the old code or their writers.
Once again, this is just as an explained dump of my very own mental checklist of things I care when writing or reviewing code. Meant as a starting point for discussion... but not via mailing list, since we want to get to a conclusion at some point. :-)
Well, consider my response as someone outside of storage team, just external view, who have only rough overview of new libstorage.
1) Responsibility distribution
We all agree this is a key aspect of OOP design. But we seem to usually disagree on the result. Single responsibility principle is a nice guideline here... when not taken too far.
Going for examples, the original SpaceMaker had the obvious responsibility of resizing and deleting partitions to make space, but also of holding the list of free slots (something I moved to Devicegraph) and of deciding/knowing what was the target size (:desired vs :minimum) for the following steps (something I moved one level up to the Proposal class). So the resulting SpaceMaker was more dumb... while other classes became... well, less dumb.
We also had a separate DiskAnalyzer class that other classes queried to know stuff like what's the list of existing windows partitions, which media is being used for installation (we don't want to install on top of ourselves) and that kind of stuff. I "hided" the DiskAnalyzer class behind Devicegraph. So now other classes query those questions directly to the devicegraph object.
After all those changes, somebody could argue that Devicegraph responsibility became to broad, against the single responsibility principle. For me it makes sense that devicegraph knows its own list of free slots or which one of its own disks are being used for installation. Simply because it makes sense(TM) and I "feel" it's its responsibility. I want my API to hide the fact that some of those calculations are delegated to a DiskAnalyzer class. The main advantage is that the "users" of Devicegraph don't have to know several other classes just to ask things about the system layout... that leads me to the next point.
For me DeviceGraph is the most obvious risk point for being "god" object. So from your examples. Free slots is something that can be part of proposal as there is not much other users. It can be also computed dynamically outside of DeviceGraph class. Regarding disk used in installation I think it make more sense to have object Installation, that knows it and keep DeviceGraph really only graph of devices without other knowledge ( like that there is something like installation as libstorage can be used also outside of installation or e.g. for kiwi which work with teoretic disk only, so you can later specialize Installation for more use cases ). So too sum it I agree with simple responsibility purpose and in some case more then one responsibility is fine unless class became too big, but I really disagree with mixing different abstraction levels in one class and for me holding devices, searching ( or caching ) free slots and knowing what disk use installation as very different abstraction level.
2) Keep the dependencies between your classes to a minimum.
Another topic in which one could thing that we are going to agree easily... but was not the case.
The best thing I can do in that regard is just recommending the chapter titled "Managing dependencies" from the Sandi Metz book[1]. She simply put into words my view about the topic. Including a fairly obvious but still useful checklist on when an object has a dependency. That happens when the object knows:
- The name of another class - The name of a message that it intends to send to someone other than self. - The arguments that a message requires and/or the order of those arguments
Everytime I see that in a class, my dependencies alarm rings and I start asking myself: Does it make sense? How can avoid that? I really try hard to fight every dependency until I accept it must be there. And lately I find myself using dependency injection more and more.
I can just agree. It also really helps with testing ( in C++ even more then in ruby )
3) Don't let singleton objects pollute your APIs
In some situations, the singleton pattern is a necessary evil. But please restrict the evilness as much as possible. The original code had StorageManager.instance.probed (or similar stuff) in too many places. Several classes (DiskAnalyzer, PartitionCreator, SpaceMaker and StorageProposal) knew were to find the probed devicegraph by the holy grace of Singleton.
The modified code hides that singleton behind the Devicegraph.probed class method and reads the probed devicegraph only once. Devicegraph is the only class that knows there is a singleton StorageManager. The devicegraph returned by it is then simply passed as argument to any other class needing it. Thus, DiskAnalyzer (and others) doesn't know anymore about the existence of StorageManager.instance, it just receive a devicegraph to analyze (the fact that this devicegraph is always the probed one is not DiskAnalyzer's business).
The example in (4) has also to do with singleton objects.
Also fully agree. Singleton is for me something that have to be ensured it is really single. The best example is logger, where you want to get it in one file, without race conditions and also changed globally, never having own specific logger. Even often used configuration in yast cause troubles as you have system one, user modified, proposed one, etc. And we need it as e.g. config mode works on proposed one, installation works on modified configuration etc. so you need pass it around.
4) Messages-driven design
I can also become quite picky about who sends messages to who (like I did in [2]). In fact, that is an example of two thing I consider wrong. One is sending the messages to the wrong receiver. Another one is pollution caused by singleton. You send messages to an instance of FakeProbing to produce an effect on StorageManager.instance.probed, but there is no reference to StorageManager during that interaction. It just works because FakeProbing and the caller both know about StorageManager being singleton.
Talking about messages, emitters, receivers and and all that can look too philosophical. But for me is the key factor when defining an API (I even draw UML interaction diagrams sometimes!). And for me, ending up with a sane API is the key reason for having code reviews.
few principles regarding messages is "Tell, do not ask", maximum - 1 level of abstraction ( so never skip any level of abstraction ) and also try to keep communication interface minimum working. I agree that code reviews should be mainly about API and code quality. Overlooks is something that test suite should catch.
5) A minor one: don't abuse well known names
That's something I do all the time, but I'm trying to improve that. It really hurts the reviewer mind when something is called "factory", "singleton", "strategy", "presenter" and so on but doesn't follow the corresponding Factory, Strategy, WhatsNot patterns. To stop doing that when writing code is one of my new year's resolutions (because I suffer it when reading other's people code).
Agreed, it just cause confusion.
6) Single responsibility principle... again
In (1) I said I can be considered slightly relaxed when applying the single responsibility principle to objects, at least to a purist eyes.
I must clarify that I really observe that principle when writing methods within a class, so I end up with a lot of small methods that rely on each other. Not sure if that makes my code easier or more difficult to follow. It depends on the reader, I guess.
For me it is very important for reading code as it should reflect level of abstraction. So public method say it top level speech what it do and used methods use another and so on and the lowest one. It is easier for me to read something like def read read_service_status read_config read_proc_value end then something like def read @service = Yast::Service.Read(MY_SERVICE) @data = Yast::SCR.Read(path(.foo.bar)) @help_data = @data["foo"] ? true : false # and parsing proc logic end just my 2c Josef
Well, I think this is enough for a second mail. I guess I have already covered my most important reasons to "slow down" the delivery of already working code in the yast2-storage rewrite.
Cheers.
[1] http://www.sandimetz.com/products [2] https://github.com/yast/yast-storage/pull/193#discussion_r53643452
-- To unsubscribe, e-mail: yast-devel+unsubscribe@opensuse.org To contact the owner, e-mail: yast-devel+owner@opensuse.org
On 02/24/2016 05:19 PM, Josef Reidinger wrote:
On Wed, 24 Feb 2016 15:29:28 +0100 Ancor Gonzalez Sosa <ancor@suse.de> wrote:
1) Responsibility distribution
We all agree this is a key aspect of OOP design. But we seem to usually disagree on the result. Single responsibility principle is a nice guideline here... when not taken too far.
Going for examples, the original SpaceMaker had the obvious responsibility of resizing and deleting partitions to make space, but also of holding the list of free slots (something I moved to Devicegraph) and of deciding/knowing what was the target size (:desired vs :minimum) for the following steps (something I moved one level up to the Proposal class). So the resulting SpaceMaker was more dumb... while other classes became... well, less dumb.
We also had a separate DiskAnalyzer class that other classes queried to know stuff like what's the list of existing windows partitions, which media is being used for installation (we don't want to install on top of ourselves) and that kind of stuff. I "hided" the DiskAnalyzer class behind Devicegraph. So now other classes query those questions directly to the devicegraph object.
After all those changes, somebody could argue that Devicegraph responsibility became to broad, against the single responsibility principle. For me it makes sense that devicegraph knows its own list of free slots or which one of its own disks are being used for installation. Simply because it makes sense(TM) and I "feel" it's its responsibility. I want my API to hide the fact that some of those calculations are delegated to a DiskAnalyzer class. The main advantage is that the "users" of Devicegraph don't have to know several other classes just to ask things about the system layout... that leads me to the next point.
For me DeviceGraph is the most obvious risk point for being "god" object. So from your examples. Free slots is something that can be part of proposal as there is not much other users. It can be also computed dynamically outside of DeviceGraph class.
I forgot to add that it is actually something specific to the proposal, because those new methods are indeed added to devicegraph using a refinement that is only used during proposal-related code. So: a_devicegraph.candidate_spaces #=> NoMethodError module Example using Proposal::RefinedDevicegrah a_devicegraph.candidate_spaces #=> A valid array of spaces end
Regarding disk used in installation I think it make more sense to have object Installation, that knows it and keep DeviceGraph really only graph of devices without other knowledge ( like that there is something like installation as libstorage can be used also outside of installation or e.g. for kiwi which work with teoretic disk only, so you can later specialize Installation for more use cases ).
Hmmm, that could make sense... but at some point in entered in "just finish this" mode and stopped thinking further ahead.
So too sum it I agree with simple responsibility purpose and in some case more then one responsibility is fine unless class became too big, but I really disagree with mixing different abstraction levels in one class and for me holding devices, searching ( or caching ) free slots and knowing what disk use installation as very different abstraction level. -- Ancor González Sosa YaST Team at SUSE Linux GmbH -- To unsubscribe, e-mail: yast-devel+unsubscribe@opensuse.org To contact the owner, e-mail: yast-devel+owner@opensuse.org
participants (2)
-
Ancor Gonzalez Sosa
-
Josef Reidinger