Skip to content

Conversation

@jasonayre
Copy link

This adds support for setting up a middleware stack while defining routes, and using that middleware stack for a particular route.

As I began looking at ActionSubscriber to replace my existing async pubsub library, I came across the routing / middleware, which seems really cool. Problem is, middleware wouldn't really get me much in it's current form with what I'm looking to do as it's global. With this PR, each route can have it's own set of middleware, which is forked from ActionSubscriber.config.middleware. - I borrowed the idea from the Phoenix router and it's pipeline/plugs feature. http://www.phoenixframework.org/docs/routing

Use case example: parsing payload and to lookup the resource, and pass it along in it's loaded form. (while allowing non resourceful types of messages to be sent through other areas).

#assume you publish message in an ActiveRecord callback with the following format:
{
  :resource => self.attributes,
  :changes => self.changes,
  :resource_type => self.class.base_class.name
}

class HasChanges
  def initialize(app)
    @app = app
  end

  def call(env)
    env["changes"] = Changes.new(env["payload"]["changes"])
    @app.call(env)
  end
end

class HasResource
  def initialize(app)
    @app = app
  end

  def call(env)
    attributes = env["payload"]["resource"]
    model = env["payload"]["resource_type"].constantize
    env["resource"] = model.find(attributes["id"])
    @app.call(env)
  end
end

::ActionSubscriber.draw_routes do
  stack :resourceful do
    use HasChanges
    use HasResource
  end

  route ::Census::SurveysSubscriber, :updated, :durable => true, :exchange => :events, :stack => :resourceful
  route ::Census::SurveysSubscriber, :created, :durable => true, :exchange => :events, :stack => :resourceful
end

The main changes are that the middleware for the particular route gets passed in to the env, which will be executed rather than ActionSubscriber.config.middleware (but that's not a requirement if it can be looked up another way). Also the middleware stacks are now duplicated from the existing default middleware so that each route has it's own. The last large change is that I added hash accessors to env object so it behaves more like rack, which makes what you can do via the middlewares much more powerful. Tested and working with my local app except march hare/jruby.

@mmmries
Copy link
Contributor

mmmries commented May 17, 2016

@jasonayre thanks for the PR ❤️ 💙 💚 💛 ❤️ 💙 💚 💛 ❤️ 💙 💚 💛 ❤️ 💙 💚 💛 ❤️ 💙 💚 💛

We had a debate about this idea a while ago and I lost it back then. Can you provide a few examples of where you think this functionality would be most useful?

I'm curious about your decision to fork the global middleware rather than using two separate middlewares? In Phoenix you have your endpoint.ex file which defines the global plugs for your whole app, and then you have the router where you can create little sub-stacks of plugs and compose those for certain groups of routes. When I thought about this previously I wondered about keeping the global shared middleware stack and then allowing each route to choose its own additional middleware items. So messages would first pass through the global middleware and then get passed to the route-specific middleware. Just curious to get your thoughts on the tradeoffs between these two approaches.

@jasonayre
Copy link
Author

@mmmries

Regarding the forking behavior, I figured most people would want the default behavior from the config to be the starting point of the stack, when composing a stack. (I did with my app at least). But also because if you don't want the default behavior, you could always clear out the middleware stack when configuring to start fresh. I'm not married to it though.

With what you are saying about the global/route specific state, technically that is exactly how it's working when the message is passed through, but the implementation is different since the global middleware is duped and the other middlewares are appended onto it -- The reason for this was mainly I didn't want to mess anything up with the runner class, and the routing of the message being called at the end. For example, I assume you mean that rather than do (in the enque_env method)

env.middleware.call(env)

(^ Where middleware represents the entire middleware chain), do something like

env.middleware.call(::ActionSubscriber.config.middleware.call(env))

(^ where env.middleware is just whatever middlewares were appended to the stack.)

-- Which I agree would be cleaner to do that way, but if I use the existing runner, then the message routing would be called before it was sent through the route specific middleware stack. (unless I am missing something) - So the tl;dr; for ^ is:
In order to preserve the routing of the message as the last piece of the stack, by using it with the middleware runner. But maybe there is a better way? I'm happy to explore it further if I am able to convince that the per route middleware feature is super awesome for extensibility of the library in general :)

Moving on to examples:

A more tangible example can be found here (which the way it's structured, is only possible with the stack concept)
https://github.com/jasonayre/resource_subscriber

Basically, it just allows you to work with the record in its loaded form, and gives you a resource ivar on env representing the loaded form, when initialized. -- Of course, this only applies to subscribers where I am dealing with an active record resource (and in a couple of others, I am just passing messages), so that's why I like the idea of having different middleware stacks, since the active record model messages I am publishing are all published in the same format
{attributes:, record_type:, changes:}, essentially.

The other benefit of using a stack in this instance, is, lets say you (MX) want to use my remote_subscriber library. Since you are working with ActiveRemote models, the loading of the resource behavior changes. -- By using a middleware stack you can.

class LoadActiveRemoteResource
  def initialize(app)
    @app = app
  end

  def call(env)
    guid = env["payload"]["resource"]["guid"]
    remote_model = #do something to map guid type to active remote model
    env["resource"] = remote_model.find_by_guid(guid)
    @app.call(env)
  end
end

ActiveSupport.on_load(:resource_subscriber_resourceful_stack) do
  replace(::ResourceSubscriber::Middlewares::Resourceful, ::LoadActiveRemoteResource)
end

Since the parsing of the changes is done in a separate middleware, that part is kept in tact, only the loading of the resource part changes. After implementing the resource_subscriber stuff using the middlewares rather than mixing in the behavior into the subscriber, I REALLY like this pattern, at least as someone trying to extend the library, as it lets you very cleanly insert/replace/remove components at will, and gives a very clean api for doing so.

Also a quick note but topic outside the scope of this PR, IMO the same sort of behavior should be enabled for publishing, which would enable for super clean rules/extensibility.

I.E. you set up a publishing stack somewhere, and then the publishing of the message is run through the middleware chain. I.e. say we're in a active record model

ActionSubscriber::Publisher.define_stack(:resourceful)
  use ResourceSubscriber::Middlewares::InferDestinationFromModelName
  use ResourceSubscriber::Middlewares::SerializeModelAttributes
  use ResourceSubscriber::Middlewares::SerializeChanges
  use ActionSubscriber::Middleware::Encoder
  use ActionSubscriber::Middleware::ActiveRecord::ConnectionManagement
end

after_commit :publish_create_event, :on => :create do
  ActionSubscriber.publish_with_stack(self, :resourceful)
end

That way I am able to create a publishing stack which builds up a message structure in a certain way, and then on the other side of things, process the message using a middleware stack, which processes the message in a certain way, which is a huge win for avoiding monkeypatching, and sets up a really clean interface for plugging/unplugging components.

Thoughts? FYI: I'm totally good to make changes if there are implementation detail issues (such as the route forking) or whatever, but before I put in anymore work into it, I'd like to know the red/green light consensus from the maintainers regarding this feature. I can tell you that as someone trying to extend ActionSubscriber (in the form of the library above, where I originally was just including all the behavior into the subscriber directly), this adds a huge layer of flexibility to extending the library (without removing any functionality, and isn't much code to do so)

@brianstien @liveh2o , @brettallred (looks like Nuvi using as well?)

@jasonayre
Copy link
Author

@mmmries I have some additional thoughts after converting over all my existing subscribers to AS subscribers, and writings specs against them. I think the stacks could easily be more composeable, and the forking behavior could be ditched entirely with one small design change:

TL;DR;

The router middleware should not be a middleware, it doesn't route, it executes

Reasons:

  1. Its behavior is inconsistent with the other middlewares, as the rest of them return env when called. The router middleware returns nil when called
  2. The application is handing control of the publishing to the middleware itself, when really the middleware should only be telling the application how to do it. IMO its overstepping its bounds by performing the execution. It's going, oh hey, thanks for the message, I know everyone else you talked to gave you a response back, but nah, I do what I want.

So I don't know if there was a reason it was done that way or if it will break anything for you guys by moving it, but if that is the case (and the point of the router being a middleware was to say, not call the execution with the callbacks), then I would suggest that the actual method to call on the subscriber, could be set on env, and router would just be responsible for setting the method to execute on the subscriber instance, not for actually executing it. (or could set runner_class, and instantiate the runner class with env, which would separate the execution from the processing, or whatever.)

That would remove the order dependency from the router and allow for easy composition of middlewares/stacks, and make it easier to test against if you want to run the message through the middlewares when testing. (which you can't right now without hax.)

My $,02. Thoughts?

PS: As I started testing my subscribers, I wanted to run the message through the middleware stack as if it had been published. Can't do that currently, because of ^. So an example of some really hacky code that would be much less hacky by separation of execution from the middleware stack:

RSpec.shared_context "subscriber_helpers", :type => :subscriber do
  PROPERTIES_DEFAULTS = {
    :action => :created,
    :channel => ::ActionSubscriber::RSpec::FakeChannel.new,
    :content_type => "application/json",
    :delivery_tag => "XYZ",
    :exchange => "events",
    :headers => {},
    :message_id => "MSG-123",
    :routing_key => "amigo.user.created",
    :queue => "test.amigo.user.created"
  }.freeze

  class FakeRouter
    def initialize(app)
      @app = app
    end

    def call(env)
      env
    end
  end

  class FakeMiddleware < ::Middleware::Runner
    def initialize(stack)
      stack << FakeRouter
      super(stack)
    end
  end

  def mock_resource_subscriber(resource, opts = {})
    encoded_payload = ::ResourceSubscriber::Message.new(resource).to_json
    subscriber_class = opts.fetch(:subscriber) { described_class }
    properties = PROPERTIES_DEFAULTS.merge(opts.slice(:action,
                                                      :channel,
                                                      :content_type,
                                                      :delivery_tag,
                                                      :exchange,
                                                      :message_id,
                                                      :routing_key))

    route = ::ActionSubscriber.__send__("route_set").routes.detect{|r|
      (r.subscriber == subscriber_class) && (r.action == properties[:action])
    }

    properties[:middleware] = route.middleware
    route.middleware.instance_variable_set(:@runner_class, ::FakeMiddleware)
    env = ::ActionSubscriber::Middleware::Env.new(subscriber_class, encoded_payload, properties)

    return subscriber_class.new(env.middleware.call(env))
  end
end

(side note, stacks should probably be stored not on the router)

Once again, happy to take a stab at any of this but not if there is no intent to make such a change.

class Router
def self.draw_routes(&block)
router = self.new
::ActiveSupport.run_load_hooks(:action_subscriber_routes, router)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't fork the middleware stack and instead use a tiered middleware where we first spin through the global middleware and then run through a route's middleware stack at the end of the global middleware, we wouldn't need to run any load hooks here. We also wouldn't need to make our own stack forking mechanism (because we wouldn't need it).

I understand that would mean you couldn't change a middleware in the global stack for a specific route, but in my mind, that's exactly how I'd expect a middleware stack to work. It's a singleton-esque pattern. The main reason I point this out is because I recently went through load hook hell while work on ActionSubscriber to get a few bugs fixed, and seeing a second load hook make me nervous.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, this was added more for the ability of extensibility, i.e. in my gem which uses a middleware stack which adds a certain behavior, I can go

::ActionSubscriber.on_load(: action_subscriber_routes) do
  stack :whatever do
    use A
    use B
  end
end

etc. But the implementation could be changed if the router is removed from the stack and stacks were composable, i.e.

  class MyPluginsMiddlewareStack < ActionSubscriber::MiddlewareStack
    use A
    use B
  end

  ActionSubscriber.config.middleware_stacks << MyPluginsMiddlewareStack

For example

@mmmries
Copy link
Contributor

mmmries commented May 30, 2016

Here are the TL/DR;s of my thoughts so far:

The Router/Executor shouldn't be enforced as the end of the middleware

👍 The middleware stack (in its entirety) should be composable. I think this decision was made from an abundance of caution, but we can easily add a runtime assertion to make sure there is a runner/executor at the end.

The Router/Executor should not return nil

Meh. The big difference between a rack middleware and this middleware is that pub/sub will always be asynchronous so there isn't really a concept of a response. I would be okay with returning the env, but I don't know any use cases where the middleware would want to change something on the way back out of the stack.

Route-specific Middlewares Instead of callbacks

😍 ❤️ Squeee! I strongly prefer using middleware instead of having ActiveRemote::Base implement before and after callbacks. But there is an intentional choice to try to mirror the behavior of Rails controllers to make this pub/sub look and feel familiar for Rails developers.

@mmmries
Copy link
Contributor

mmmries commented May 30, 2016

I wonder if the right plan of attack from here is to start with a PR that removes the custom runner/executor part of the middleware, and change it to just use a standard middleware (keeping it global initially).

Then in a second PR we could add route-specific middlewares that get automatically composed onto the end of the global middleware. What do you think?

@jasonayre
Copy link
Author

@mmmries could of follow up comments:

The Router/Executor should not return nil

That giant hack of a mocking_resource_subscriber method above is the use case for this. I.e., I want to run the subscriber through the middleware stack when testing, and give the env to the subscriber as it will receive it in the real world:

  class FakeMiddleware < ::Middleware::Runner
    def initialize(stack)
      stack << FakeRouter
      super(stack)
    end
  end

  route.middleware.instance_variable_set(:@runner_class, ::FakeMiddleware)
  env = ::ActionSubscriber::Middleware::Env.new(subscriber_class, encoded_payload, properties)
  return subscriber_class.new(env.middleware.call(env))

If it returns nil at the end, you cant run it through the middleware independently, so it makes for much easier testing by not returning nil I think. (the second part of the problem is of course that it would try to route the message without ripping it out as per above).

Route-specific Middlewares Instead of callbacks

To be clear I'm not suggesting callbacks are removed, just that the routing of the message (which runs the callbacks), is moved outside of the middleware, and the callbacks are run as part of routing the message, not within a middleware directly.

@jasonayre
Copy link
Author

@mmmries and regarding plan of attack 👍 sounds good

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants