-
Notifications
You must be signed in to change notification settings - Fork 420
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for custom attribute methods. #49
Add support for custom attribute methods. #49
Conversation
An alternative that requires fewer changes and is more performant (because you don't need to instance a serializer per object) can be implemented with this API: class MovieSerializer
include FastJsonapi::ObjectSerializer
attributes :name, :release_year, :title_with_year
def self.title_with_year(record)
"#{record.name} (#{record.release_year})"
end
end |
@@ -61,8 +61,9 @@ def hash_for_collection | |||
data = [] | |||
included = [] | |||
@resource.each do |record| | |||
data << self.class.record_hash(record) | |||
included.concat self.class.get_included_records(record, @includes, @known_included_objects) if @includes.present? | |||
serializer = self.class.new(record) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how does initializing an object with every iteration affect performance on large collections?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sharing numbers in current dev vs this branch:
dev:
Serialize to JSON string 250 records
Serializer Records Time
AMS serializer 250 83.96 ms
Fast serializer 250 3.4 ms
Serialize to Ruby Hash 250 records
Serializer Records Time
AMS serializer 250 77.04 ms
Fast serializer 250 2.73 ms
Serialize to JSON string 1000 records
Serializer Records Time
AMS serializer 1000 303.68 ms
Fast serializer 1000 9.94 ms
Serialize to Ruby Hash 1000 records
Serializer Records Time
AMS serializer 1000 300.06 ms
Fast serializer 1000 12.39 ms
this branch:
Serialize to JSON string 250 records
Serializer Records Time
AMS serializer 250 78.39 ms
Fast serializer 250 3.71 ms
Serialize to Ruby Hash 250 records
Serializer Records Time
AMS serializer 250 75.98 ms
Fast serializer 250 2.74 ms
Serialize to JSON string 1000 records
Serializer Records Time
AMS serializer 1000 307.92 ms
Fast serializer 1000 10.63 ms
Serialize to Ruby Hash 1000 records
Serializer Records Time
AMS serializer 1000 279.43 ms
Fast serializer 1000 8.98 ms
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More benchmarks (with includes and meta):
dev:
Serialize to JSON string 250 with includes and meta
Serializer Records Time
AMS serializer 250 112.38 ms
Fast serializer 250 4.62 ms
Serialize to Ruby Hash 250 with includes and meta
Serializer Records Time
AMS serializer 250 101.24 ms
Fast serializer 250 4.35 ms
Serialize to JSON string 1000 with includes and meta
Serializer Records Time
AMS serializer 1000 422.2 ms
Fast serializer 1000 20.34 ms
Serialize to Ruby Hash 1000 with includes and meta
Serializer Records Time
AMS serializer 1000 394.01 ms
Fast serializer 1000 18.07 ms
this branch:
Serialize to JSON string 250 with includes and meta
Serializer Records Time
AMS serializer 250 104.63 ms
Fast serializer 250 4.96 ms
Serialize to Ruby Hash 250 with includes and meta
Serializer Records Time
AMS serializer 250 101.09 ms
Fast serializer 250 4.3 ms
Serialize to JSON string 1000 with includes and meta
Serializer Records Time
AMS serializer 1000 401.42 ms
Fast serializer 1000 21.19 ms
Serialize to Ruby Hash 1000 with includes and meta
Serializer Records Time
AMS serializer 1000 389.07 ms
Fast serializer 1000 21.17 ms
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is some cost, and incidentally, the performance cost using new instances on collections is something we actively debated about during development. On my machine the results are slightly different, and while the speed is impacted, it's still "performant" IMHO.
I do like the idea of @christophersansone's approach though. And I'm wondering if we can combine that approach with the instance method definition. The class could inspect for instance methods and convert them to procs using ruby goodness. Then, serialization would not require new instances and we could maintain the instance method syntax.
This is something I have not explored, and it may add more complexity than it warrants. It may also add some level confusion for developers on this project since an instance method would not be acting as an instance method...
For giggles, here is a comparison in branches bumped to 10,000 records. I don't know why you would serialize 10,000 records following the JSON API spec, so it's a pretty contrived example, but it does illustrate the cost a little better.
dev:
Serialize to JSON string 10000 with includes and meta
Serializer Records Time
AMS serializer 10000 4791.46 ms
Fast serializer 10000 234.48 ms
Serialize to Ruby Hash 10000 with includes and meta
Serializer Records Time
AMS serializer 10000 4484.9 ms
Fast serializer 10000 193.3 ms
this branch:
Serialize to JSON string 10000 with includes and meta
Serializer Records Time
AMS serializer 10000 4741.36 ms
Fast serializer 10000 360.68 ms
Serialize to Ruby Hash 10000 with includes and meta
Serializer Records Time
AMS serializer 10000 4529.02 ms
Fast serializer 10000 214.24 ms
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also wonder if it's better to take a performance hit by merging this in, and then iterating to a more performant approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may also add some level confusion for developers on this project since an instance method would not be acting as an instance method
+1 on that. We would definitely have to take into account attribute methods that call other methods (e.g. if you have a complex method to calculate an attribute that calls other methods). A developer would expect that to just work, so just cherry-picking certain instance methods could be trouble. And using instance variables would not behave as expected, either. The serializer could potentially have some assertions for the developer to help avoid some common pitfalls. But if we can avoid all of this by just having a block with the attribute definition, that works.
A couple different options for those who would really want a separate method (though they would have to be class methods):
attribute :title_with_year { |r| MovieSerializer.title_with_year(r) }
def self.title_with_year(record)
...
end
or
attribute :title_with_year, class_method: :title_with_year
def self.title_with_year(record)
...
end
Or we just have a generator function (or recommend that they do it themselves). Or we just provide the capability and offer some suggestions in the readme...
def self.class_attribute(attr)
attribute(attr) { |record| self.class.send(attr, record) }
end
class_attribute :title_with_year
Lots of options to avoid serializer instantiation. My opinion would be to start with the simplest, most performant solution and then iterate on it based on feedback or limitations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@christophersansone @grossadamm I like the simplicity of the Proc approach but in my opinion, there are many cases where it's convenient to break the calculation of attributes into multiple methods and I'm not sure if we can achieve that using Procs. e.g:
class FeedItemSerializer
# ignoring other attributes
attributes :cooked_count, :likes_count, :liked_photo_comments_count
# ignoring other associations
has_many :liked_photo_comments
def liked_photo_comments
return [] unless triggered_by_like?
photo_comments_liked_by_author.recent.limit(4)
end
def liked_photo_comments_count
return unless triggered_by_like?
photo_comments_liked_by_author.count
end
def cooked_count
approved_photo_comments.count
end
def likes_count
recipe_likes.count
end
private
def triggered_by_like?
# ...
end
def photo_comments_liked_by_author
# ...
end
def recipe_likes
# ...
end
def approved_photo_comments
# ...
end
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another case we rely on method attributes and goodies of OOP is in our fragment caching (used to cache attributes in serializers where isn't desirable to cache entire objects)
# Using Rails.cache for the example, the logic of our Caching System is a bit more complex.
class ApplicationSerializer < ActiveModel::Serializer
def self.expire_cache_fragment!(name)
Rails.cache.delete("ams_fragment_cache_#{name}")
end
protected
def cache_fragment(name)
Rails.cache.fetch("ams_fragment_cache_#{name}") do
yield
end
end
end
And then we can use the cache_fragment
method in all the subclasses of ApplicationSerializer, e.g:
class MovieSerializer < ApplicationSerializer
attributes :background_actors
def background_actors
cache_fragment("background_actors") do
# Heavy computation to serialize all background actors ...
end
end
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@guilleiguaran @grossadamm Agreed that complex calculations could be cumbersome with the block syntax, and the AMS syntax is quite nice for doing this. But is it worth the performance hit for every single serialization, regardless of whether this extra syntactic sugar is needed? My opinion is that we should not force some performance penalty at all times, but potentially allow the developer to opt into it as needed.
One solution is to simply use class methods, passing in the record each time. This example assumes attribute block syntax and nothing else, but we can design something better if the concept is a good one:
attribute :title_with_year { |r| MovieSerializer.title_with_year(r) }
def self.title_with_year(record)
...
end
If that is not sufficient, then consider having a separate decorator class that can be instantiated.
attribute :title_with_year { |r| MovieDecorator.new(r).title_with_year }
We could also implement the ability to define a decorator in the serializer, which would instantiate an instance of the decorator for each record, e.g.:
class MovieSerializer
include FastJsonapi::ObjectSerializer
decorator MovieDecorator
attribute :title_with_year
end
class MovieDecorator
include FastJsonapi::ObjectDecorator
def title_with_year
...
end
end
The decorator would basically define an initialize
with the record, and it would implement method_missing
to send all undefined methods to the record.
Lots of options here, just trying to come up with some ideas that work but don't impose any performance loss or unnecessary memory consumption by default.
@guilleiguaran Thanks for taking a crack at this. I was about to build an implementation myself! I wonder if this can be a lot simpler to implement and potentially faster to render if we just used a block or Proc. I was thinking of the following: class MovieSerializer
include FastJsonapi::ObjectSerializer
attributes :name, :release_year
attribute :title_with_year do |record|
"#{record.name} (#{record.release_year})"
end
end Then the serializer can just store the block and execute it with each record iteration. |
As someone who uses AMS a lot, I am not only jumping for joy at the speed boosts that this project promises, but also how it can better handle memory usage as the size and number of documents grow (see rails-api/active_model_serializers#2051). I have not done any memory benchmarking between the two, but a quick code review seems to indicate that memory consumption will not be an issue in the current fast_jsonapi implementation due to the lack of object instantiation. I would be cautious about an implementation that requires instantiation of a new serializer object for each record. It can become problematic over time and with a lot of serialization happening in a real production environment, and there can be a ton of GC thrashing going on as a result. Better to avoid the situation all together. |
@christophersansone thanks for your feedback, I understand your concern and that was the reason I proposed another alternative myself in #49 (comment) I have added the dev branch:
this branch:
The difference appears to be That said, I'll be happy with whatever implementation the authors choose. |
@guilleiguaran Thanks for providing some actual data around my assertion about memory concerns. Yes, 3KB does not seem like much, but I would imagine that the growth would be fairly linear, until the GC kicks in. It could be a problem in high traffic situations. And my understanding of Ruby internals is weak, but it is possible that as more functionality is added to the serializer code base, the more memory each object instantiation will take. The reality is that it takes time, effort, and memory to instantiate and clean up Ruby objects, and if there is an opportunity to design an API that avoids instantiation without sacrificing code readability, it is worth serious consideration... especially for a gem whose name starts with the word |
@guilleiguaran, @christophersansone Thank you for making pull requests to support this feature. In my view the best developer experience is the one AMS has for custom attributes. It's simple and easy to understand and it would be familiar syntax to everyone using this gem. So the best case scenario is if we can get the AMS developer experience with 0-10% performance degradation. I am not sure if it's possible, but i want to give that a shot. Any ideas of how we can achieve that? |
@shishirmk Thanks! My concern with the "10% performance budget" is that this project is so young, and decisions we make today that cause performance degradation could have an increasing impact as the code base grows. I imagine AMS did not start out as slow as it is today, but design decisions made early on have made it virtually impossible to undo without causing breaking API changes. There are three high-level possibilities here:
Another concern about instance-level serialization in general is the introduction of naming collisions between attributes and the methods inside the serializer, e.g. There are several options here, each with their own set of trade-offs. I would like to propose one last idea that I believe strikes a nice balance: class MovieSerializer
include FastJsonapi::ObjectSerializer
attributes :name, :release_year, :title_with_year
module HelperMethods
def title_with_year
...
end
end
end Upon loading the class, it can detect if the
Several options here. It's awesome to see that the community is rallying around this project so quickly, and that we are getting a wide set of opinions with this feature. |
@shishirmk I am with @christophersansone on the point about the 10% budget. We definitely should look to solve it without performance regression. This discussion is great, we should think about strategies we have not thought of yet. |
This gives me an idea, let's assume that we have this serializer: class MovieSerializer
include FastJsonapi::ObjectSerializer
attributes :name, :release_year, :title_with_year
def title_with_year
# ...
end
end and copy/pasting most of the @christophersansone's idea: Upon loading the class, it can detect if instance methods have been defined, and if so, adjust the serialization strategy to instantiate a "decorator" for each record, containing those methods. What do you think? |
@guilleiguaran I imagine it could do this sort of detection and select an optimal strategy, such as class-based serialization vs. instance-based serialization, or reusable instances vs. one-time-use instances. My understanding is that Ruby does not allow cherry-picking methods from one class and insert them into another (decorator) class, so I think this syntax would limit us to instantiating the actual serializer. It is definitely an option. |
@guilleiguaran @christophersansone what if we used the serializer instance itself as the decorator for the record? I realize this breaks some design principles. But still something to think about. |
@shishirmk Just to clarify, are you asking what if we did the serialization at the class level but instantiated instances of the serializer solely for the purpose of "decorating"? It's an option. One side effect would be that these decorator methods would have to be |
@shishirmk @christophersansone I'll work in a proof of concept implementation of this now to see how it goes. |
@christophersansone @guilleiguaran We could move some of the class methods and make them instance methods. I don't think that would have a significant impact on performance. I think it's best if we don't create more than one instance of the serializer class or any other class per call to |
In general I would be wary of a solution that provides an AMS-like syntax... but breaks some canonical Ruby syntax, such as the use of instance variables, private and protected methods, etc. Fundamentally, I believe whatever solution that is agreed upon should embrace the architectural differences between the two projects, even if it means a slightly different syntax. AMS is designed so that a serializer instance will serialize a single record, so its attribute definition syntax can assume instance methods, an I have another idea. There are a lot of opinions flying around about what the right solution could be. What if FastJsonapi were instead less opinionated about how to implement this? Instead, we could just offer some primitive hooks and some examples in the documentation. For example: class MovieSerializer
include FastJsonapi::ObjectSerializer
attributes :name, :release_year, :title_with_year
# override to convert the record into something that accepts the defined attributes and relationships
def self.serializable_object(record)
# default implementation: "return record"
return MovieDecorator.new(record)
end
# override to change how a particular attribute is serialized
def self.serialize_attribute(record, attribute_name, options)
# default implementation: "return record.send(attribute_name)"
end
end This way, (a) it's completely customizable, and (b) we can better allow some good patterns to emerge with some real-world use. And we can ensure that FastJsonapi remains fast out of the box. In fact, for those who want a more AMS style approach, they could either implement it themselves, or we could provide another Similarly, this would allow others to build modules to invoke a particular syntax. We could include them in this gem, or just let them live as standalone gems. This kind of approach would allow FastJsonapi to be pluggable and solve things other than just how attribute methods should be declared. |
@christophersansone @shishirmk I've been exploring the option of using the serializer class itself as decorator, I've implemented it in this commit: guilleiguaran@7c10d5e I've done a comparison with 10,000 records (with fewer records the difference is insignificant) to illustrate the cost of the change: dev branch:
feature_custom_attr_serializer_v2 branch:
wdyt? |
@guilleiguaran @shishirmk Thanks, super work! I would say that, if we definitely want to support the exact AMS syntax, at the expense of other more performant options that are less AMS-like, then this looks great. It's a huge help that it detects whether or not separate instances need to be created. I added a couple comments in the code worth reviewing, but other than that, 👍 from me. The only difference to AMS is that the custom attribute methods must be public, because they are being called externally. I don't think that's a big deal though. We can even build that check into the detection and throw an exception with a helpful error message, e.g. I do think that, if And thanks for the perf metrics! Would you be willing to provide a set of metrics both with and without custom attributes? Having a separate perf test for the custom attributes may be helpful too... we expect it to be less than 25x, and that's okay, but it would help to know what it is and be able to check for performance regressions in the future. |
@christophersansone thank you very much for your review, I've applied your suggestions/fixes in guilleiguaran@b46d75d and now there isn't any difference in performance between Running again perf tests with 10,000 records: dev branch:
feature_custom_attr_serializer_v2 branch:
I'll add perf tests for the custom attributes and open a PR later today if other folks in this thread are happy with this solution. |
@guilleiguaran @christophersansone You guys have been awesome. Thank you for trying out all the different ideas we came up with. I think custom attributes is a very important feature a lot of people have asked for it. I would like to have it in the upcoming 1.1.0 release which we are planning to release next week. Keeping in mind the time we have, performance and ease of use I am planning to merge in #54. We can continue to refine the feature after we release 1.1.0 |
Just want to say... this was incredible to read through. Big brains here. Great work and thanks for the feature!! |
Is this ready to merge in? Looks like everything is passing other than some conflicts to fix. |
Isn't this redundant now that #54 is merged in? |
@axsuul as @shishirmk #54 is merged but we will continue exploring the idea of supporting custom attributes through methods in next release since blocks aren't suitable for all situations. |
We have diverged too much so closing this |
This adds support for custom attribute methods as suggested in #5.
This required substantial changes in the API, moving the SerializationCore methods from class to instance level making it more similar to AMS API.