Of hammers and screwdrivers: When not to use Hystrix

tl;dr

Check whether the power of Hystrix (or any other tool) fits your requirements. If it feels kind of weird, it probably is.

the players

Hystrix and its friends (like Eureka & Co.) are part of the shiny new world of ‘companies hand out their great tools for free!’. And indeed they are - not only because they are production-tested, real-world tools but also because of the effort the community puts into documentation and community management - thanks :) On the other hand, one should always keep in mind: those tools are made not only by that very company but also for it. Thus libraries like Hystrix might not be designed as general-purpose solutions but to solve more or less specific issues of the company or team that created them.

the mission

The given scenario is the following:

send data to an external service of one or multiple banks
aggregate the results
return that aggregation

That’s all there is to it. Nothing fancy, right? (Yay - a microservice it is.) That is actually one of the reasons why we chose that very service to gather experience with Hystrix. The other one (which triggered the whole story in the first place) is the fluctuating stability of the services we are talking to: some show an increasing error rate under high load, others are simply unreachable during maintenance. Thus we wanted to use Hystrix to detect those cases and react to them.

the promising start

As the project is small, we hoped to implement Hystrix quickly, so we just went for it and tried it hands-on. The very first steps in implementing Hystrix are fast indeed. We added the Spring Boot dependency to our Gradle build config, added some annotations and voilà: we had an up-and-running circuit breaker! Some additional configuration of timeouts, fallbacks and individual command keys for the different external resources and that’s it, right? Right?! We came to know: Nope.

the fail

At first we tried to implement individual command keys. We wanted to use one key per external service so that we could track each service and react to it individually. The first issue we stumbled upon is that Java annotation attributes cannot be dynamic (read: set at runtime). One has to provide constant values, which meant one method per API. To encapsulate this we tried the following:

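import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;

public class MyClass {

  // Single entry point: dispatches to one annotated method per provider.
  public Object doSomething(Object input, String provider) {
    switch (provider) {
      case "providerA":
        return doSomethingPrivateA(input);
      case "providerB":
        return doSomethingPrivateB(input);
      default:
        // Without this branch the method does not compile (missing return).
        throw new IllegalArgumentException("Unknown provider: " + provider);
    }
  }

  // Intended: one command key per provider. foo(...) stands for the call to provider A.
  @HystrixCommand(commandKey = "providerACommandKey")
  private Object doSomethingPrivateA(Object input) {
    return foo(input);
  }

  // bar(...) stands for the call to provider B.
  @HystrixCommand(commandKey = "providerBCommandKey")
  private Object doSomethingPrivateB(Object input) {
    return bar(input);
  }
}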

We wondered why our tests failed … and noticed the Hystrix command annotations were ignored (at ‘doSomethingPrivateA’ and ‘doSomethingPrivateB’). After some ~~google-ing~~ analysis we found out this is not possible in a straightforward way using annotations. The technical concept behind this is having a proxy that just wraps calls from the outside. Method calls within that instance are thus not visible to the proxy and therefore method annotations become meaningless. Of course there are workarounds but the design smell wasn’t getting any better by looking at those.
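The proxy effect can be reproduced without Hystrix or Spring. The following is a minimal sketch in which a plain JDK dynamic proxy stands in for the AOP proxy; `BankClient`, `PlainBankClient` and `SelfInvocationDemo` are hypothetical names made up for this illustration. The proxy only records the call that arrives from the outside, while the call the object makes on itself goes unnoticed.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the annotated service; not Hystrix or Spring code.
interface BankClient {
    String send(String input, String provider);
    String sendToProviderA(String input);
}

class PlainBankClient implements BankClient {
    @Override
    public String send(String input, String provider) {
        // Call on `this`: it goes straight to the target and bypasses the proxy.
        return sendToProviderA(input);
    }

    @Override
    public String sendToProviderA(String input) {
        return "A:" + input;
    }
}

class SelfInvocationDemo {
    // Wraps the target in a proxy that records every method name it sees.
    static BankClient wrap(BankClient target, List<String> intercepted) {
        InvocationHandler handler = (proxy, method, methodArgs) -> {
            intercepted.add(method.getName()); // a circuit breaker would hook in here
            return method.invoke(target, methodArgs);
        };
        return (BankClient) Proxy.newProxyInstance(
                BankClient.class.getClassLoader(),
                new Class<?>[] {BankClient.class},
                handler);
    }

    // Makes one call from the outside and returns what the proxy observed.
    static List<String> run() {
        List<String> intercepted = new ArrayList<>();
        BankClient client = wrap(new PlainBankClient(), intercepted);
        client.send("payload", "providerA");
        return intercepted;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [send]
    }
}
```

Only `send` shows up in the list, although `sendToProviderA` was executed as well. An annotation on the inner method would have been just as invisible to the wrapper.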

To make things work we tried implementing one class per provider, each with its own annotated method. This felt pretty wrong because we ended up with duplicated code like this:

public class MySubClassA extends MyBaseClass {

  @HystrixCommand(commandKey = "providerACommandKey")
  public Object doSomething(Object input) {
    return super.doSomething(input);
  }
}

One would have to write another copy of this for every single different command.

the second fail

Ok, there is another way of implementing Hystrix: the command API. With this approach one has to write some more code and even rewrite existing code, as the calling method has to be altered (this is why we went for annotations in the first place). As if that smell (refactoring just for this use case always smells) weren’t enough, we noticed there is no built-in way to determine command keys dynamically. Of course there are workarounds for this issue as well, but again: the design smell wasn’t getting any better.
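To illustrate what "determining the key at runtime" means, here is a minimal sketch that is deliberately independent of Hystrix. It keeps one very naive breaker per provider name in a map. `NaiveBreaker`, `BreakerRegistry` and the failure threshold are assumptions made up for this example; the breaker opens after a fixed number of consecutive failures and never closes again, which is far simpler than what Hystrix does.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical, deliberately naive breaker. Not Hystrix code.
class NaiveBreaker {
    private final int threshold;
    private final AtomicInteger consecutiveFailures = new AtomicInteger();

    NaiveBreaker(int threshold) {
        this.threshold = threshold;
    }

    boolean isOpen() {
        return consecutiveFailures.get() >= threshold;
    }

    <T> T call(Supplier<T> action) {
        if (isOpen()) {
            throw new IllegalStateException("circuit open");
        }
        try {
            T result = action.get();
            consecutiveFailures.set(0); // a success resets the counter
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures.incrementAndGet();
            throw e;
        }
    }
}

class BreakerRegistry {
    private final Map<String, NaiveBreaker> breakers = new ConcurrentHashMap<>();
    private final int threshold;

    BreakerRegistry(int threshold) {
        this.threshold = threshold;
    }

    // The key is an ordinary runtime value, so no annotation constant is involved.
    NaiveBreaker forProvider(String provider) {
        return breakers.computeIfAbsent(provider, key -> new NaiveBreaker(threshold));
    }
}

class BreakerRegistryDemo {
    public static void main(String[] args) {
        BreakerRegistry registry = new BreakerRegistry(2);
        Supplier<String> failing = () -> {
            throw new UnsupportedOperationException("provider down");
        };
        for (int i = 0; i < 2; i++) {
            try {
                registry.forProvider("providerA").call(failing);
            } catch (UnsupportedOperationException expected) {
                // the failure was counted by the breaker for providerA
            }
        }
        System.out.println("providerA open: " + registry.forProvider("providerA").isOpen()); // true
        System.out.println("providerB open: " + registry.forProvider("providerB").isOpen()); // false
    }
}
```

Each provider gets its own state without a dedicated class or method per provider. Whether wrapping Hystrix commands in a similar registry is worth the extra indirection is exactly the kind of trade-off discussed here.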

the break

At this point we went back to the drawing board and asked ourselves: Is the gain worth the pain? Does the advantage of Hystrix compensate for ugly code? While asking this we also realized we hadn’t yet implemented a meaningful fallback method. And we noticed there is none in our use case. In the financial sector one wants very accurate data - no data is still better than wrong data. There were no alternative services to address as a fallback either. These architectural issues, combined with how few Hystrix features we could put to meaningful use, led us to drop this approach.
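What "no fallback" means for the aggregation can be sketched in a few lines of plain Java. This is an illustration under assumptions, not our production code: `Aggregation`, `NoFallbackAggregator` and the provider functions are hypothetical. A failing provider is reported as failed instead of being replaced by a default value.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical result holder: successful answers and failures are kept apart.
class Aggregation {
    final Map<String, String> results = new LinkedHashMap<>();
    final Map<String, String> failures = new LinkedHashMap<>();
}

class NoFallbackAggregator {
    // Sends the input to every provider and aggregates what comes back.
    static Aggregation sendToAll(String input, Map<String, Function<String, String>> providers) {
        Aggregation aggregation = new Aggregation();
        providers.forEach((name, call) -> {
            try {
                aggregation.results.put(name, call.apply(input));
            } catch (RuntimeException e) {
                // No substitute value: the failure is recorded, not papered over.
                aggregation.failures.put(name, e.getMessage());
            }
        });
        return aggregation;
    }

    public static void main(String[] args) {
        Map<String, Function<String, String>> providers = new LinkedHashMap<>();
        providers.put("providerA", request -> "offer for " + request);
        providers.put("providerB", request -> {
            throw new IllegalStateException("maintenance");
        });

        Aggregation aggregation = sendToAll("request-1", providers);
        System.out.println(aggregation.results);  // {providerA=offer for request-1}
        System.out.println(aggregation.failures); // {providerB=maintenance}
    }
}
```

With nothing sensible to return for `providerB`, the only remaining job is to notice that it failed, and that is a monitoring concern rather than a fallback concern.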

the conclusion

Hystrix is a great tool and quick to get started with - but a tool nonetheless. As tools are designed for a specific purpose, they don’t fit every use case there is. Of course one may drive a screw in with a hammer, but that requires a lot of energy - and thus you probably won’t do it too often (nor want to look at the results after hammering in some screws …)

So how did we achieve our desired result? We ended up defining a Splunk query and alert (in about 10 minutes). Wasted time? Nope - we have learned a lot about Hystrix and are using it in more appropriate places.

We would be more than happy to hear your war stories about this topic in the comments :)

Marcus Pflanz
