[Bug 1078111] New: Tryton Server: Impact mitigation for DDoS attack
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111

Bug ID: 1078111
Summary: Tryton Server: Impact mitigation for DDoS attack
Classification: openSUSE
Product: openSUSE Distribution
Version: Leap 42.3
Hardware: Other
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Security
Assignee: security-team@suse.de
Reporter: axel.braun@gmx.de
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---

Dear Security Team,

I would like to point you to this discussion: https://groups.google.com/forum/?hl=de#!topic/tryton-dev/4fWWO0C5hvA

For the openSUSE packages I have implemented the patch that was discussed in https://bugs.tryton.org/issue5375 but was not adopted by the Tryton maintainer. He now asks for the patch to be removed from the openSUSE packages, as it '...weakens the protection of the user password against brute force attack.'

I would kindly ask for your advice on the above topic, and a recommendation on what to do.

--
You are receiving this mail because:
You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111#c1
Andreas Stieger
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111#c3
Luis Falcon
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111#c6
Nicolas Évrard
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111#c8
--- Comment #8 from Nicolas Évrard
Hello Nicolas,
Hello Matthias,
We also have to keep in mind that trytond can run in multiple environments: Linux, BSD, and there might even be a few people using Windows; of course this doesn't concern openSUSE but it concerns us.
I understand. This is easily forgotten but I know myself how difficult cross platform development can be.
Would you accept a configurable approach? Keep the default as it is but allow users or integrators that run Linux to select a different behaviour that avoids the discussed effect on the database?
Well, for us the main issue with the patch applied is the removal of the brute force protection. And according to us, this protection is more important than the protection against a potential DDOS (because against DDOS you have to act on multiple levels thus somehow we are less concerned about that). And of course, if SUSE wants to add a patch that protects its users better without hindering the base protection that we give, this is fine for us too :).
A solution to mitigate the growth of the LoginAttempt table might be to keep track of the IP making the attempt and keeping at most X attempts from the same IP. In fact after some research it seems that it is the solution that drupal chose:
If it works for Drupal then it may be the right thing to do. A quick research suggests there are also some pitfalls here like multiple users sharing the same IP or using proxies to circumvent the protection.
Cédric created a bug regarding this issue: https://bugs.tryton.org/issue7110 When we discussed it we talked about a configuration parameter that would allow to define the size of the subnet that would be banned. The number of failed attempts will also be a configuration parameter. In fact providing a patch for this issue will only reduce slightly the attack surface but I am afraid that against DDOS trytond can not handle everything by itself.
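To make the idea above concrete, here is a minimal sketch of counting failed attempts per subnet, with both the subnet size and the attempt limit as settings. It is not the code attached to issue7110: the names `SUBNET_PREFIX_V4`, `SUBNET_PREFIX_V6`, `MAX_FAILED_ATTEMPTS`, `subnet_of` and `SubnetAttemptTracker` are hypothetical, and the counters live in memory only, so the sketch does not address several trytond instances sharing state.

```python
import ipaddress

# Hypothetical settings; the thread only says that both the subnet size and
# the number of failed attempts would be configuration parameters.
SUBNET_PREFIX_V4 = 24
SUBNET_PREFIX_V6 = 64
MAX_FAILED_ATTEMPTS = 5


def subnet_of(address, prefix_v4=SUBNET_PREFIX_V4, prefix_v6=SUBNET_PREFIX_V6):
    """Return the network that `address` falls into for the given prefix sizes."""
    ip = ipaddress.ip_address(address)
    prefix = prefix_v4 if ip.version == 4 else prefix_v6
    # strict=False lets us pass a host address and get its enclosing network.
    return ipaddress.ip_network(f"{ip}/{prefix}", strict=False)


class SubnetAttemptTracker:
    """Keeps at most `max_attempts` failed attempts per subnet (in memory)."""

    def __init__(self, max_attempts=MAX_FAILED_ATTEMPTS):
        self.max_attempts = max_attempts
        self._failed = {}

    def record_failure(self, address):
        key = subnet_of(address)
        # Cap the counter so the stored state per subnet stays bounded.
        self._failed[key] = min(self._failed.get(key, 0) + 1, self.max_attempts)

    def is_banned(self, address):
        # .get() avoids creating an entry for addresses that never failed.
        return self._failed.get(subnet_of(address), 0) >= self.max_attempts
```

With a /24 prefix, failures from 192.0.2.17 also count against 192.0.2.200, which is exactly the shared-IP pitfall mentioned earlier: legitimate users behind the same subnet are banned together with the attacker.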
I'm sure all of you can work out a solution that addresses both concerns.
I sure hope that we can. Thank you very much for stepping into this debate; it helps to have the opinion of other (somewhat less concerned) developers.
I think both parties can agree upon that there is an attack surface here but also that it can't be fixed so simply (with regard to the proposed patch). While OS level DoS protection is certainly best practice, an improved implementation would benefit tryton and serve as a defense in depth measure.
We all agree on that I think. But putting the cursor on the right level of protection is difficult and a subject of heated debate as you saw.
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111#c9
Axel Braun
(In reply to Matthias Gerstner from comment #7)
We also have to keep in mind that trytond can run in multiple environments: Linux, BSD, and there might even be a few people using Windows; of course this doesn't concern openSUSE but it concerns us.
I understand. This is easily forgotten but I know myself how difficult cross platform development can be.
Would you accept a configurable approach? Keep the default as it is but allow users or integrators that run Linux to select a different behaviour that avoids the discussed effect on the database?
Well, for us the main issue with the patch applied is the removal of the brute force protection. And according to us, this protection is more important than the protection against a potential DDOS (because against DDOS you have to act on multiple levels thus somehow we are less concerned about that).
Here we had the agreement that we keep the existing functionality of failed login timeout of 2^(failed_logins - 1) as default, and add a parameter that allows the admin to override this with a default of n seconds.
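As an illustration of the two behaviours described in the paragraph above, the following sketch computes the wait after a failed login: the exponential default of 2^(failed_logins - 1) seconds, and an optional fixed override. This is not trytond's code; the function name `failed_login_delay` and the parameter `fixed_timeout` are made up for the example.

```python
def failed_login_delay(failed_logins, fixed_timeout=None):
    """Seconds to wait after `failed_logins` consecutive failed logins.

    `fixed_timeout` is a hypothetical admin override; when it is None the
    exponential rule 2 ** (failed_logins - 1) from the discussion applies.
    """
    if failed_logins <= 0:
        return 0
    if fixed_timeout is not None:
        return fixed_timeout
    return 2 ** (failed_logins - 1)


# Exponential default for the first six failures.
exponential = [failed_login_delay(n) for n in range(1, 7)]

# Fixed override of 3 seconds for the same six failures.
fixed = [failed_login_delay(n, fixed_timeout=3) for n in range(1, 7)]
```

Under the exponential rule the tenth failure alone costs 512 seconds, while a fixed 3-second override costs 30 seconds for all ten attempts together. That gap is the difference in brute-force protection the two sides are arguing about.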
And of course, if SUSE wants to add a patch that protects its users better without hindering the base protection that we give, this is fine for us too :).
A solution to mitigate the growth of the LoginAttempt table might be to keep track of the IP making the attempt and keeping at most X attempts from the same IP. In fact after some research it seems that it is the solution that drupal chose:
If it works for Drupal then it may be the right thing to do. A quick research suggests there are also some pitfalls here like multiple users sharing the same IP or using proxies to circumvent the protection.
Drupal is probably one of the few examples that allows an anonymous write to the database, which is in general the wrong approach - see comment 4. I have not seen this in one of the big ERP implementations like SAP or Oracle, and I bet those (Oracle) DB experts know their job. This is the main performance obstacle that is seen in the current implementation, and one we should get rid of. For sure x-platform development makes life harder, but all of the mentioned systems have logging facilities, be it systemd, /var/log/messages or c:\TEMP\trytond.log. Most likely the different ways are already handled in the Python system abstraction layer.
Cédric created a bug regarding this issue: https://bugs.tryton.org/issue7110
When we discussed it we talked about a configuration parameter that would allow to define the size of the subnet that would be banned. The number of failed attempts will also be a configuration parameter.
That looks exactly like https://bugs.tryton.org/msg24643 and https://bugs.tryton.org/issue5375 , with which the discussion started. So we should close 7110 as duplicate, reopen 5375, and in this bug remove the DB logging, enable the exponential brute-force protection and add the configuration parameter.
In fact providing a patch for this issue will only reduce slightly the attack surface but I am afraid that against DDOS trytond can not handle everything by itself.
True. But we should target a minimal impact in case the system-measures fail
I'm sure all of you can work out a solution that addresses both concerns.
I sure hope that we can. Thank you very much for stepping into this debate; it helps to have the opinion of other (somewhat less concerned) developers.
I think both parties can agree upon that there is an attack surface here but also that it can't be fixed so simply (with regard to the proposed patch). While OS level DoS protection is certainly best practice, an improved implementation would benefit tryton and serve as a defense in depth measure.
We all agree on that I think. But putting the cursor on the right level of protection is difficult and a subject of heated debate as you saw.
Indeed, it was very helpful to get the external view of an independent security expert. Thanks again!
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111#c10
--- Comment #10 from Luis Falcon
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111#c11
Nicolas Évrard
Hi all,
thanks for the contributions to this topic. I feel we have reached a common understanding, let me try to summarize
Hello Axel, I thought we had reached a consensus, but after reading your message and Luis' message I realized we hadn't. Maybe I was not clear enough about what would be an acceptable solution.
We also have to keep in mind that trytond can run in multiple environments: Linux, BSD, and there might even be a few people using Windows; of course this doesn't concern openSUSE but it concerns us.
I understand. This is easily forgotten but I know myself how difficult cross platform development can be.
Would you accept a configurable approach? Keep the default as it is but allow users or integrators that run Linux to select a different behaviour that avoids the discussed effect on the database?
Well, for us the main issue with the patch applied is the removal of the brute force protection. And according to us, this protection is more important than the protection against a potential DDOS (because against DDOS you have to act on multiple levels thus somehow we are less concerned about that).
Here we had the agreement that we keep the existing functionality of failed login timeout of 2^(failed_logins - 1) as default, and add a parameter that allows the admin to override this with a default of n seconds.
That's not what I meant. We won't accept a patch that hinders the brute force protection. Adding a configuration to use a default of n seconds and thus bypass the current mechanism is a patch that we won't accept.
And of course, if SUSE wants to add a patch that protects its users better without hindering the base protection that we give, this is fine for us too :).
A solution to mitigate the growth of the LoginAttempt table might be to keep track of the IP making the attempt and keeping at most X attempts from the same IP. In fact after some research it seems that it is the solution that drupal chose:
If it works for Drupal then it may be the right thing to do. A quick research suggests there are also some pitfalls here like multiple users sharing the same IP or using proxies to circumvent the protection.
Drupal is probably one of the few examples that allows an anonymous write to the database, which is in general the wrong approach - see comment 4.
I have not seen this in one of the big ERP implementations like SAP or Oracle, and I bet those (Oracle) DB experts know their job. This is the main performance obstacle that is seen in the current implementation, and one we should get rid of.
If you can come up with a mechanism that is both cross platform and able to scale with multiple instances on different machines, then I think we will welcome it.
For sure x-platform development makes life harder, but all of the mentioned systems have logging facilities, be it systemd, /var/log/messages or c:\TEMP\trytond.log. Most likely the different ways are already handled in the Python system abstraction layer.
Of course they do, but there is no code in Tryton for this. Moreover there is the use case of multiple Tryton instances on different computers.
Cédric created a bug regarding this issue: https://bugs.tryton.org/issue7110
When we discussed it we talked about a configuration parameter that would allow to define the size of the subnet that would be banned. The number of failed attempts will also be a configuration parameter.
That looks exactly like https://bugs.tryton.org/msg24643
Not at all. What you're proposing is an option to disable the exponential wait and instead use a fixed amount of time for each login. This is not the same as a configuration parameter that would define the number of failed attempts from the same subnet. Secondly, the patch makes it possible to guess existing logins. So even if we agreed on your proposal, the patch would have to be rewritten.
So we should close 7110 as duplicate
In fact I think it should be marked as 'deferred' as it's the job of the Tryton Foundation to decide what should be on the website. Until we've reached an agreement in the foundation we should keep it like that.
In fact providing a patch for this issue will only reduce slightly the attack surface but I am afraid that against DDOS trytond can not handle everything by itself.
True. But we should target a minimal impact in case the system-measures fail
Choices have to be made: we would rather be subject to a coordinated attack than lose a password to a brute force attack (I know some of my customers' passwords; it is scary as hell how oblivious to security they sometimes are). This is because we are the ones managing the user password, whereas the network stack and all the other infrastructure means of minimizing the effect of a DDOS are not our responsibility. Thus the principal security focus must be put on what we can change.
I'm sure all of you can work out a solution that addresses both concerns.
And now, we've reached the deadlock again ;).
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111#c13
--- Comment #13 from Luis Falcon
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111#c14
--- Comment #14 from Nicolas Évrard
Dear Mathias, dear all
Let me re-enforce the fact that today we have a security problem, with an open and widely known vulnerability on the tryton server.
IMO, we should take a two-stage approach:
** Stage 1 (present): Needs immediate action **
Mitigating the _current_ vulnerability of flooding the DB engine with anonymous writes (see comment 3 for details https://bugzilla.opensuse.org/show_bug.cgi?id=1078111#c3)
I agree that this vulnerability should be better handled (and we do that there has been patches limiting the behaviour you describe) But if there was a simple way to do it, we wouldn't be talking about it right now.
A simple patch was written with the following functionality:
a) Avoids anonymous writes to the DB, and b) Imposes a default 3-second penalty to any failed login attempt, mitigating also the brute-force attack.
IMO, we take care of a) and b).
Mathias, to answer one of your question, it implements a simple, yet effective password brute-force protection in place without allowing anonymous writing on the DB. That said, I am positive it can be improved / optimized in upstream.
It will need to be improved since you're not waiting at all when the login is successful. A clever attacker will know that if its process waits 0.5 seconds then it's probably not the right password, thus he can drop the connection and start with another one. Thus your patch must wait unconditionally 3 seconds. So even for successful login attempt users will have to wait 3 seconds. This is a regression for people that know their password (which should fortunately be the usual case). That's why I still prefer the exponential wait way.
This can be implemented in scenarios with multiple application servers and different operating systems. I know SAP, and SAP deals with authentication, password quality control, auditing and logging with multiple application servers, allowing mixing operating systems in the same SID, without letting the DB being flooded by anonymous writes. A similar solution is feasible in Tryton too.
Could you please describe the solution SAP is using in the multiple application server case? I have some ideas about it but all the ideas I have are always involving other systems complicating deployment, IPC calls or something. About the use case, I know one company that is doing it. Maybe others are doing it also (I think about semilimes) but they didn't share the details of their infrastructure with us. Storing the login attempts per machine is a subject of discussion, because then (assuming they reply using a round-robin algorithm) you're dividing the waiting time by the number of machines answering requests. On the other hand the exponential will always grow faster than the linear function, so it might be seen as OK; I think we should discuss this further.
What we can not, and will not accept is to leave the current system security compromised as it is today, just because a discussion on which brute force attack protection is better among two existing and _valid_ alternatives. So the patch will remain in place until we get a satisfactory solution from upstream.
I must say that you can do whatever you want in GNU Health. It's your project you're free to apply all the patches you want. Another solution would be to have two packages in openSUSE (I am used to debian so I will use their terminology):
- A virtual package trytond-server
- Two packages:
  - GNU Health trytond: a package providing trytond-server with Luis' patch
  - Vanilla trytond: a package also providing trytond-server without Luis' patch
- Tryton modules should depend on trytond-server
This is a burden for Axel because he will have to create this the first time but I guess that once it's done the process can be completely automated (IIRC the talks you gave in Buenos Aires).
In addition, and as a side note, we can not cover the sun with one finger. For example, I find contradictory rejecting a solution that mitigates a vulnerability because it uses a different -yet valid- approach to brute force protection, while allowing things such as
"123456789" as a valid password in Tryton. This is ranked number 2 most used password (http://www.telegraph.co.uk/technology/2017/01/16/worlds-common-passwords-revealed-using/)
I proposed using cracklib to harden passwords (https://bugs.tryton.org/msg24736) , but it was not taken. I must admit it gets a bit frustrating after proposing features / patches and not seeing them implemented. Luckily, this is Free Software, so we can apply them in GNU Health, when consensus with upstream is not reached.
I don't like rules on passwords and I was against the inclusion of the patches to define rules on them (it's a bit more complicated than that as I was not against everything but as you can see my ideas are not always taken into account either). There is plenty of literature that explains why rules for passwords are bad. Yet we have configuration options that allow defining the strength of the password, and by tweaking them you can prevent users from using simple passwords (one of them allows defining a file with the list of disallowed passwords). About your frustration, I can understand it. But again: bloating Tryton with options is not in our design philosophy (another reason why I hated the password rules), so having an option to activate PAM (because it does not work on OpenBSD and Windows and we value cross-platform support) was not in our plan. I am sorry that it generated frustration but we also have our freedom to decide what should go inside core Tryton or not.
** Stage 2 **
Improving / hardening overall security in Tryton.
Of course, I agree that we should discuss security issue and have open-minded talks about the solutions. But I don't think this bug is the right place to do so (and we already digress way too much).
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111#c15
--- Comment #15 from Luis Falcon
I agree that this vulnerability should be better handled (and we do that there has been patches limiting the behaviour you describe) But if there was a simple way to do it, we wouldn't be talking about it right now.
A simple patch was written with the following functionality:
a) Avoids anonymous writes to the DB, and b) Imposes a default 3-second penalty to any failed login attempt, mitigating also the brute-force attack.
IMO, we take care of a) and b).
Mathias, to answer one of your question, it implements a simple, yet effective password brute-force protection in place without allowing anonymous writing on the DB. That said, I am positive it can be improved / optimized in upstream.
It will need to be improved since you're not waiting at all when the login is successful. A clever attacker will know that if its process waits 0.5 seconds then it's probably not the right password, thus he can drop the connection and start with another one. Thus your patch must wait unconditionally 3 seconds. So even for successful login attempt users will have to wait 3 seconds. This is a regression for people that know their password (which should fortunately be the usual case).
That's why I still prefer the exponential wait way.
Agree, although both the exponential and the constant approaches are deterministic. We can use a more non-deterministic approach to failed logins, using random delays with sane ranges. Something like the interval [1,5] secs can be done by replacing the default 3 sec sleep with this line: time.sleep(random.randint(1, 5)). We could use the same "failed_login_timeout" as upper limit or change the param name to "failed_login_max_delay". With this approach, we should have everything (or mostly) in place: no anonymous writes and a random delay function.
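A slightly fuller sketch of that random-delay proposal follows, with the upper bound passed in as a parameter. It is not the actual patch: the function names are invented for illustration, `max_delay` stands in for whatever the "failed_login_max_delay" setting would provide, and the `sleep` argument exists only so the behaviour can be exercised without really waiting.

```python
import random
import time


def random_failed_login_delay(max_delay=5, min_delay=1, rng=random):
    """Pick a whole number of seconds in [min_delay, max_delay], both inclusive."""
    if max_delay < min_delay:
        raise ValueError("max_delay must be >= min_delay")
    return rng.randint(min_delay, max_delay)


def penalize_failed_login(max_delay=5, sleep=time.sleep):
    """Sleep for a random delay after a failed login and return that delay."""
    delay = random_failed_login_delay(max_delay)
    sleep(delay)
    return delay
```

Note that a uniform draw from [1,5] averages 3 seconds, so the expected cost per failed attempt is the same as with the fixed 3-second sleep; only the predictability changes, and the delay still does not grow with the number of failures.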
This can be implemented in scenarios with multiple application servers and different operating systems. I know SAP, and SAP deals with authentication, password quality control, auditing and logging with multiple application servers, allowing mixing operating systems in the same SID, without letting the DB being flooded by anonymous writes. A similar solution is feasible in Tryton too.
Could you please describe the solution SAP is using in the multiple application server case? I have some ideas about it but all the ideas I have are always involving other systems complicating deployment, IPC calls or something.
It has to do with the way the application servers deal with the DB, Central Instance and Message Server. Of course, the SAP Kernel is available in binary form only, but the communication paths and parameters are pretty clear. We can discuss / debate this in "stage 2", in another venue not specific to just security.
About the use case, I know one company that is doing it. Maybe others are doing it also (I think about semilimes) but they didn't share the details of their infrastructure with us.
Storing the login attempts per machine is a subject of discussion, because then (assuming they reply using a round-robin algorithm) you're dividing the waiting time by the number of machines answering requests. On the other hand the exponential will always grow faster than the linear function, so it might be seen as OK; I think we should discuss this further.
What we can not, and will not accept is to leave the current system security compromised as it is today, just because a discussion on which brute force attack protection is better among two existing and _valid_ alternatives. So the patch will remain in place until we get a satisfactory solution from upstream.
I must say that you can do whatever you want in GNU Health. It's your project you're free to apply all the patches you want.
Another solution would be to have two packages in openSUSE (I am used to debian so I will use their terminology):
- A virtual package trytond-server
- Two packages:
  - GNU Health trytond: a package providing trytond-server with Luis' patch
  - Vanilla trytond: a package also providing trytond-server without Luis' patch
- Tryton modules should depend on trytond-server
I think we should use, whenever possible, the default tryton server. It will make life easier for everyone. Of course, if there is no consensus, then it could be an alternative.
This is a burden for Axel because he will have to create this the first time but I guess that once it's done the process can be completely automated (IIRC the talks you gave in Buenos Aires).
In addition, and as a side note, we can not cover the sun with one finger. For example, I find contradictory rejecting a solution that mitigates a vulnerability because it uses a different -yet valid- approach to brute force protection, while allowing things such as
"123456789" as a valid password in Tryton. This is ranked number 2 most used password (http://www.telegraph.co.uk/technology/2017/01/16/worlds-common-passwords-revealed-using/)
I proposed using cracklib to harden passwords (https://bugs.tryton.org/msg24736) , but it was not taken. I must admit it gets a bit frustrating after proposing features / patches and not seeing them implemented. Luckily, this is Free Software, so we can apply them in GNU Health, when consensus with upstream is not reached.
I don't like rules on passwords and I was against the inclusion of the patches to define rules on them (it's a bit more complicated than that as I was not against everything but as you can see my ideas are not always taken into account either). There is plenty of literature that explains why rules for passwords are bad.
Yet we have configuration options that allow defining the strength of the password, and by tweaking them you can prevent users from using simple passwords (one of them allows defining a file with the list of disallowed passwords).
About your frustration, I can understand it. But again: bloating Tryton with options is not in our design philosophy (another reason why I hated the password rules), so having an option to activate PAM (because it does not work on OpenBSD and Windows and we value cross-platform support) was not in our plan. I am sorry that it generated frustration but we also have our freedom to decide what should go inside core Tryton or not.
Make it modular. We can integrate it in those operating systems that support PAM (GNU/Linux and most *NIX), which probably cover 99% of the spectrum. For those that don't, we can look for alternatives. But we can talk about this and all the rest in "stage 2", in another place.
** Stage 2 **
Improving / hardening overall security in Tryton.
Of course, I agree that we should discuss security issue and have open-minded talks about the solutions. But I don't think this bug is the right place to do so (and we already digress way too much).
We agree on this one.
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111#c18
--- Comment #18 from Nicolas Évrard
Regarding trytond and the cross platform support it indeed sounds like a plugin architecture would be the way to go here in the long term. It could allow users to develop platform specific modules that use native features without the core project being cluttered.
In fact since the LoginAttempt Model is a usual Tryton model, GNU Health could override this object in their core module, override the "add" method in order not to write in the database at all and voilà … nothing goes in the database. IIRC we already suggested this solution.
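The shape of that suggestion can be sketched in plain Python. The two classes below are stand-ins written for this example, not trytond code: `LoginAttempt` only mimics a model that stores one row per failed login, `_rows` plays the role of the database table, and `NoWriteLoginAttempt` is a hypothetical name for the override. In a real module the override would be registered through Tryton's module mechanism, which is not shown here.

```python
class LoginAttempt:
    """Stand-in for the stock model: every failed login is stored."""

    _rows = []  # plays the role of the database table in this sketch

    @classmethod
    def add(cls, login):
        cls._rows.append(login)

    @classmethod
    def count(cls, login):
        return cls._rows.count(login)


class NoWriteLoginAttempt(LoginAttempt):
    """Override in the spirit of the suggestion: add() stores nothing."""

    @classmethod
    def add(cls, login):
        # Deliberately a no-op, so an anonymous client cannot grow the table.
        pass
```

The consequence is visible in the sketch: with `add()` disabled, `count()` stays at zero, so any delay that is derived from the stored attempt count disappears as well. Whoever applies such an override therefore has to supply brute-force protection by some other means.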
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111#c19
--- Comment #19 from Nicolas Évrard
(In reply to Matthias Gerstner from comment #16)
Regarding trytond and the cross platform support it indeed sounds like a plugin architecture would be the way to go here in the long term. It could allow users to develop platform specific modules that use native features without the core project being cluttered.
In fact since the LoginAttempt Model is a usual Tryton model, GNU Health could override this object in their core module, override the "add" method in order not to write in the database at all and voilà … nothing goes in the database. IIRC we already suggested this solution.
I made a simple patch implementing this suggestion: https://codereview.appspot.com/335550043/
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111#c20
--- Comment #20 from Luis Falcon
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111#c21
Korbinian Preisler
Indeed the discussion somehow circled back to the beginning :-/
Nicolas made his position quite clear. So I think the gist is that an approach needs to be found that keeps the brute force password protection in place while also avoiding the unwarranted database writes.
For me this is the point that nailed it down. If we can find a way to implement this point properly, then the issue could be solved to everybody's satisfaction. When implementing this, the multi-instance/multi-platform scenario needs to be taken into account. IMHO, as the current patch fixes one vulnerability but also weakens the brute force password protection, I agree with Matthias and Nicolas that it would be better to drop the current version of the patch and implement another one. Maybe a patch could be written that fixes the vulnerability caused by the anonymous database entries without weakening the brute force password protection. I think that this way of handling the issue would respect both sides in an acceptable manner. Just my 2ct.
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111
http://bugzilla.opensuse.org/show_bug.cgi?id=1078111#c22
--- Comment #22 from Luis Falcon