I'm interested in your claim that extending classes from the outside makes for code that is easier to test. I've often heard that the experience from languages where this practice is prevalent (Ruby and JavaScript, specifically) that this makes code harder to understand and sometimes causes unfortunate complications, especially when you mix several libraries that extend core classes in subtly incompatible ways.
Python, with which I'm more personally familiar, allows you to both extend classes (although it doesn't provide pretty syntax for it, and the community generally frowns upon "monkey-patching") and to replace static methods (as well as global functions) temporarily during the run of a particular test. The second practice is considerably more common, despite having certain shortcomings (e.g. lack of thread-safety).
Incidentally, both extending and replacing methods of an existing class fall under the definition of "monkey-patching", and both are strongly frowned upon for uses in production code, while standards for test code are somewhat more lax. I suspect it's because the environment during the test run is more controlled and errors have a lower potential cost.
I would be interested in your thoughts about temporarily replacing static methods as a testing techinque in languages that allow it.
I disagree with the static method / global method advice. To quote Scott Meyers, "Strive for class interfaces that are minimal and complete". To me, that means that if a function is a class member, it is either a virtual function, or it operates directly on private data. For example, toCamelCase isn't virtual, and doesn't operate on private data. While it does operate solely on the provided string, it can do so through the public interface.
If there were an access tag like "super-public", that would be fine too. Until then, I will stick with statics, so long as they don't touch global state.
I mostly agree with your idea about static methods. But what about operations that depend of both arguments equally? For example Math.min or Math.max. Should we really have a 5.min(3)? It sounds counter intuitive to me. I think it is better to have a static method called min(3,5). In an ideal language, this would be a global function. Notice that since it has no global state, no testability is lost.
While I generally agree with your comments about the static methods, I'm still not sure where methods like say sum(Integer...) would belong. Or if we can generalize this and ask were functions applied to specific type collection would belong.
The ability to extend standard classes is a cool feature when you work by your self, or in small team where you control all your code. Imagine all Java developers in Google (you probably have thousands of them) start using Ruby, and everyone start injecting new utility methods in String class. I don't think you will appreciate that.
Nice refactoring in the second example, but do you really want a User to authenticate itself? Do you always trust your users? You mentioned serialization. How about if I give the following user across a wire: class AlwaysLoginUser extends User { Cookie login(Ldap ldap) { // don't care about ldap return new Cookie(user); } }
I think that authentication logic should be centralized and not spread out in User subclasses. If you want or need to make authentication code more OOP you can have different authenticators based on user type.
I really don't buy the "code lives with data" thing. With this approach, after a while you get the octopus of a User class, aware of everything going on in your application -- authentication, the fact it's a web app, transaction management, parsing of itself from xml, marshalling itself into xml... where do you stop applying this "code lives with data" principle? And, actually, the classic separation of concerns, together with testability-oriented thinking, produce a lot of nice small classes, logically self-contained, easy to understand, use, and, yes, reuse.
Sure, if this is a tiny singular occurrence in your app (say, the only place in a tiny app authentication is ever dealt with), go ahead and put this method as a convenience. But surely we're not talking about such cases?
High quality post. For the User/Ldap dependency you can also consider extracting an interface: Cookie login(AuthAdapter adapter) { ... } Ldap implements AuthAdapter { ... } to make User dependent only on an interface, which will untangle the dependencies graph a bit.
As you pointed out, the question of whether User should depend on LDAP is not an easy one.
I would imagine that in most case the problem is that the User class comes from some framework, or is generated from the database schema, or whatever, and you simply can't put the login method at the User level anyway. Would subclassing solve the trick "cleanly" in those case ? What happens when you need to add a new kind of authentication mechanism ?
Also, suppose your User class *is* actually part of a library. Do you impose dependency on LDAP ? Do you absctract LDAP behind some kind of IAuthenticator interface, with an LDAPAuthenticator and an OpenIDAuthenticator or something ?
I agree with yours comments for this example. You cited that in OO code and data live together. But there are other OO principles that could affect this one. For example SRP (Single Responsability Principle) which says that a class should have only one responsibility and is related with cohesion. With this in mind you could have the two following classes:
User - responsability: it should contain and validate the user data
Autenticator - responsability: it should authenticate the logins attempts.
I once did a refactoring that is just the opposite one showed here.I extracted a Authenticator class from a User class.
The problem is: when to choose one or another aproach? The answer I think it depends on the context. In your example I probably would do the same thing. Let me show a another context in which the login will return true if the user can and false otherwise :
/* Please it only an example to give an idea. It may contain errors. It is not done with TDD. :) */ Class User { String name; String adress; Date birthDate; //Other user data ... String login; final int MAX_LOGIN_ATTEMPTS = 3; final long ONE_SECOND = 1000; final long MINUTE = 60*ONE_SECOND; final long TIME_TO_BLOCK_FAILED_LOGIN = 5*ONE_MINUTE; string password; int numberOfLoginAttempts; boolean blockedToLogin = false; ...
public boolean canVote(){ return getAge() >= 16 //Brazil }
public boolean canDrive(){ return getAge() >= 18 } // various other uses methods ...
public boolean login(String password){ updateBlockedStatus();
if (blockedToLogin) throw new LoginBlockedExpetion("User is blocked to login. Try again later");
if (this.password.equals(password)) return true; else numberOfLoginAttempts++; if (exceededNumberOfLoginAttempts()){ blockedToLogin = true; LoginBlockedExpetion("You exceeded the number of attempts to login. Try again later"); else LoginInvalidException("login/password is incorret. Try again.") } }
private void updateBlockedStatus(){ if (!blockedToLogin) return; if (isTimeToUnblock()) blockedToLogin = false; }
In the code above you can see two blocks of variables and functions. One block is concerned with data and functions related to user: name; adress; canVote(); canDrive()..., and other block with variables and functions related with autentication: MAX_LOGIN_ATTEMPTS; numberOfLoginAttempts;login(); isTimeToUnblock(); updateBlockedStatus()...
In this case the User class has low cohesion. I think that, in this case, it worth to create a new class Autentication and extract everything related to login to it. In doing it we have two small classes with focused responsabilities and high cohesion. It is good to maintainability time.
Again, in your example I would do the same thing as you show. I am only showing another example where is the case to do just the opposite. It is more for unwary readers.
As a final note, design is not a easy task but some years ago I found TDD and since then I became a big fan of it. Growing the design evolutionarily by creating tests before even the code exists and refactor to improve the code and do this all the time in baby steps help me so much in such work.
So in the example probably I would start simple and would grow the design organically to achieve the current needs in the simplest way that I can think. Remembering a Eistein quote:
"Make everything as simple as possible, but not simpler."
PS- sorry any mistake. English is not my first language.
Wouldn't the last change make it more difficult to promote a User to SuperUser?, I mean, wouldn't the viability of this change be dependant on stuff like this?
I cannot agree more. Static state is terrible! People often say that it is not the case, but once it bites you in the behind, you will never think the same again.
I have a question, what would be a possible solution to a global class that returns some state when a get() (yes, static) is called on it? I would love to refactor the code in such away that get() is no longer needed. Any advice would be greatly appreciated... this has been annoying me for a long long time.
I disagree with you in that I think static methods are just fine as long as they do not maintain global state. This is mainly due to the DRY principle.
Given a function of two differently typed arguments, let's call them X and Y, it would be bad design to implement X(Y) and Y(X) in different places in your code as whenever X(Y) had to be updated, so would Y(X). This may seem like a corner case but many variations on this theme frequently occur in large projects.
Doesn't the Database example depend whether or not cache is considered an object or a data structure? Bob Martin'll say that if cache is a data structure then there is no demeter violation, i.e., it's supposed to expose its internals.
That User/SuperUser/AgentUser thing is beautiful in theory, but hard to achieve in practice.
What if you can change the type of a user, from 'agent' to 'super'? Most languages won't let you do it. ORM frameworks (Hibernate) won't let you do it. At least not in a nice, clean way. Also, what about the 'composition over inheritance' principle?
About the static methods rant, how to avoid an explosion in the number of methods in basic classes, like string and number? String.toJDOMElementTree() would be a candidate?
I don't disagree with you, in that people could do much, much better.
It's just funny. People discuss design principles as if they were a cohesive, coherent, single body of absolute truth, when in reality most of them are disconnected, contradictory, context-specific rules, that are all correct individually, but nearly impossible to be fully followed together.
For example, how to maximize encapsulation of a class, without violating the 'S' in SOLID? Or, how to maximize reuse (DRY) without raising coupling, while keeping it simple?
What I'm trying to say is, to design is to make choices and to balance trade-offs. And accepting the limitations of your environment. As Eric Evans said in his book, don't fight your tools, don't fight your frameworks.
I would like to congratulate you for inspiring folks like me to understand the value of writing testable code. I really liked the way you have presented in your blog. I also had a chance to explore your new cloud technology (getAngular).
I understand that getAngular is still in its Beta. I would like to check with you if we could do any business logics like what we do using our programming languages. For instance I was just trying to simulate a simple invoice functionality using getAngular. I had no problems in simple operations such as adding updating and searching an invoice indent. Now, I have business logic where I should not be creating an invoice request if the quantity of existing invoice requests per customer and per calendar year reaches a limit. Is this achievable in getAngular?
I'm interested in your claim that extending classes from the outside makes for code that is easier to test. I've often heard that the experience from languages where this practice is prevalent (Ruby and JavaScript, specifically) that this makes code harder to understand and sometimes causes unfortunate complications, especially when you mix several libraries that extend core classes in subtly incompatible ways.
ReplyDeletePython, with which I'm more personally familiar, allows you to both extend classes (although it doesn't provide pretty syntax for it, and the community generally frowns upon "monkey-patching") and to replace static methods (as well as global functions) temporarily during the run of a particular test. The second practice is considerably more common, despite having certain shortcomings (e.g. lack of thread-safety).
Incidentally, both extending and replacing methods of an existing class fall under the definition of "monkey-patching", and both are strongly frowned upon for uses in production code, while standards for test code are somewhat more lax. I suspect it's because the environment during the test run is more controlled and errors have a lower potential cost.
I would be interested in your thoughts about temporarily replacing static methods as a testing techinque in languages that allow it.
I disagree with the static method / global method advice. To quote Scott Meyers, "Strive for class interfaces that are minimal and complete". To me, that means that if a function is a class member, it is either a virtual function, or it operates directly on private data. For example, toCamelCase isn't virtual, and doesn't operate on private data. While it does operate solely on the provided string, it can do so through the public interface.
ReplyDeleteIf there were an access tag like "super-public", that would be fine too. Until then, I will stick with statics, so long as they don't touch global state.
I mostly agree with your idea about static methods. But what about operations that depend of both arguments equally? For example Math.min or Math.max. Should we really have a 5.min(3)? It sounds counter intuitive to me. I think it is better to have a static method called min(3,5). In an ideal language, this would be a global function. Notice that since it has no global state, no testability is lost.
ReplyDeleteWhile I generally agree with your comments about the static methods, I'm still not sure where methods like say sum(Integer...) would belong. Or if we can generalize this and ask were functions applied to specific type collection would belong.
ReplyDeleteWhat about value objects and code?
ReplyDeleteThe ability to extend standard classes is a cool feature when you work by your self, or in small team where you control all your code. Imagine all Java developers in Google (you probably have thousands of them) start using Ruby, and everyone start injecting new utility methods in String class. I don't think you will appreciate that.
ReplyDeleteNice refactoring in the second example, but do you really want a User to authenticate itself? Do you always trust your users? You mentioned serialization.
How about if I give the following user across a wire:
class AlwaysLoginUser extends User {
Cookie login(Ldap ldap) {
// don't care about ldap
return new Cookie(user);
}
}
I think that authentication logic should be centralized and not spread out in User subclasses. If you want or need to make authentication code more OOP you can have different authenticators based on user type.
I really don't buy the "code lives with data" thing. With this approach, after a while you get the octopus of a User class, aware of everything going on in your application -- authentication, the fact it's a web app, transaction management, parsing of itself from xml, marshalling itself into xml... where do you stop applying this "code lives with data" principle? And, actually, the classic separation of concerns, together with testability-oriented thinking, produce a lot of nice small classes, logically self-contained, easy to understand, use, and, yes, reuse.
ReplyDeleteSure, if this is a tiny singular occurrence in your app (say, the only place in a tiny app authentication is ever dealt with), go ahead and put this method as a convenience. But surely we're not talking about such cases?
High quality post. For the User/Ldap dependency you can also consider extracting an interface:
ReplyDeleteCookie login(AuthAdapter adapter) { ... }
Ldap implements AuthAdapter { ... }
to make User dependent only on an interface, which will untangle the dependencies graph a bit.
Damn good! Thanks!
ReplyDeleteAs you pointed out, the question of whether User should depend on LDAP is not an easy one.
ReplyDeleteI would imagine that in most case the problem is that the User class comes from some framework, or is generated from the database schema, or whatever, and you simply can't put the login method at the User level anyway. Would subclassing solve the trick "cleanly" in those case ? What happens when you need to add a new kind of authentication mechanism ?
Also, suppose your User class *is* actually part of a library. Do you impose dependency on LDAP ? Do you absctract LDAP behind some kind of IAuthenticator interface, with an LDAPAuthenticator and an OpenIDAuthenticator or something ?
Anyway, thanks for the articles !
I agree with yours comments for this example. You cited that in OO code and data live together. But there are other OO principles that could affect this one. For example SRP (Single Responsability Principle) which says that a class should have only one responsibility and is related with cohesion. With this in mind you could have the two following classes:
ReplyDeleteUser - responsability: it should contain and validate the user data
Autenticator - responsability: it should authenticate the logins attempts.
I once did a refactoring that is just the opposite one showed here.I extracted a Authenticator class from a User class.
The problem is: when to choose one or another aproach? The answer I think it depends on the context. In your example I probably would do the same thing. Let me show a another context in which the login will return true if the user can and false otherwise :
/*
Please it only an example to give an idea. It may contain errors. It is not done with TDD. :)
*/
Class User {
String name;
String adress;
Date birthDate;
//Other user data
...
String login;
final int MAX_LOGIN_ATTEMPTS = 3;
final long ONE_SECOND = 1000;
final long MINUTE = 60*ONE_SECOND;
final long TIME_TO_BLOCK_FAILED_LOGIN = 5*ONE_MINUTE;
string password;
int numberOfLoginAttempts;
boolean blockedToLogin = false;
...
public boolean canVote(){
return getAge() >= 16 //Brazil
}
public boolean canDrive(){
return getAge() >= 18
}
// various other uses methods
...
public boolean login(String password){
updateBlockedStatus();
if (blockedToLogin) throw new LoginBlockedExpetion("User is blocked to login. Try again later");
if (this.password.equals(password))
return true;
else
numberOfLoginAttempts++;
if (exceededNumberOfLoginAttempts()){
blockedToLogin = true;
LoginBlockedExpetion("You exceeded the number of attempts to login. Try again later");
else
LoginInvalidException("login/password is incorret. Try again.")
}
}
private void updateBlockedStatus(){
if (!blockedToLogin) return;
if (isTimeToUnblock())
blockedToLogin = false;
}
private boolean isTimeToUnblock(){
return getTimeElapsedSinceBlocked() >= TIME_TO_BLOCK_FAILED_LOGIN;
}
In the code above you can see two blocks of variables and functions. One block is concerned with data and functions related to user: name; adress; canVote(); canDrive()..., and other block with variables and functions related with autentication: MAX_LOGIN_ATTEMPTS; numberOfLoginAttempts;login(); isTimeToUnblock(); updateBlockedStatus()...
In this case the User class has low cohesion. I think that, in this case, it worth to create a new class Autentication and extract everything related to login to it. In doing it we have two small classes with focused responsabilities and high cohesion. It is good to maintainability time.
Again, in your example I would do the same thing as you show. I am only showing another example where is the case to do just the opposite. It is more for unwary readers.
As a final note, design is not a easy task but some years ago I found TDD and since then I became a big fan of it. Growing the design evolutionarily by creating tests before even the code exists and refactor to improve the code and do this all the time in baby steps help me so much in such work.
So in the example probably I would start simple and would grow the design organically to achieve the current needs in the simplest way that I can think. Remembering a Eistein quote:
"Make everything as simple as possible, but not simpler."
PS- sorry any mistake. English is not my first language.
Let me inform you!
ReplyDeleteThis sign is not zero zero, but Olaf Olsen!
If that answer enough for you?
If not forget my page forever; OK!?
THX
Typo:
ReplyDeleteweather => wether
Wouldn't the last change make it more difficult to promote a User to SuperUser?, I mean, wouldn't the viability of this change be dependant on stuff like this?
ReplyDeleteI cannot agree more. Static state is terrible! People often say that it is not the case, but once it bites you in the behind, you will never think the same again.
ReplyDeleteI have a question, what would be a possible solution to a global class that returns some state when a get() (yes, static) is called on it? I would love to refactor the code in such away that get() is no longer needed. Any advice would be greatly appreciated... this has been annoying me for a long long time.
I disagree with you in that I think static methods are just fine as long as they do not maintain global state. This is mainly due to the DRY principle.
ReplyDeleteGiven a function of two differently typed arguments, let's call them X and Y, it would be bad design to implement X(Y) and Y(X) in different places in your code as whenever X(Y) had to be updated, so would Y(X). This may seem like a corner case but many variations on this theme frequently occur in large projects.
Doesn't the Database example depend whether or not cache is considered an object or a data structure? Bob Martin'll say that if cache is a data structure then there is no demeter violation, i.e., it's supposed to expose its internals.
ReplyDeleteHow can we reconcile: "OO says that code and data live together" with: "objects export behavior, not data. An object has hidden data and exposed behavior. Data structures, on the other hand, have exposed data, and no behavior"
That User/SuperUser/AgentUser thing is beautiful in theory, but hard to achieve in practice.
ReplyDeleteWhat if you can change the type of a user, from 'agent' to 'super'? Most languages won't let you do it. ORM frameworks (Hibernate) won't let you do it. At least not in a nice, clean way. Also, what about the 'composition over inheritance' principle?
About the static methods rant, how to avoid an explosion in the number of methods in basic classes, like string and number? String.toJDOMElementTree() would be a candidate?
I don't disagree with you, in that people could do much, much better.
It's just funny. People discuss design principles as if they were a cohesive, coherent, single body of absolute truth, when in reality most of them are disconnected, contradictory, context-specific rules, that are all correct individually, but nearly impossible to be fully followed together.
For example, how to maximize encapsulation of a class, without violating the 'S' in SOLID? Or, how to maximize reuse (DRY) without raising coupling, while keeping it simple?
What I'm trying to say is, to design is to make choices and to balance trade-offs. And accepting the limitations of your environment. As Eric Evans said in his book, don't fight your tools, don't fight your frameworks.
But, you should always try to do your best! :)
Hi Misko,
ReplyDeleteI would like to congratulate you for inspiring folks like me to understand the value of writing testable code. I really liked the way you have presented in your blog. I also had a chance to explore your new cloud technology (getAngular).
I understand that getAngular is still in its Beta. I would like to check with you if we could do any business logics like what we do using our programming languages. For instance I was just trying to simulate a simple invoice functionality using getAngular. I had no problems in simple operations such as adding updating and searching an invoice indent. Now, I have business logic where I should not be creating an invoice request if the quantity of existing invoice requests per customer and per calendar year reaches a limit. Is this achievable in getAngular?
Thanks,
Chandra