-
Notifications
You must be signed in to change notification settings - Fork 217
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ParseJsonProcessor initial implementation #1688
ParseJsonProcessor initial implementation #1688
Conversation
Signed-off-by: Finn Roblin <[email protected]>
Signed-off-by: Finn Roblin <[email protected]>
* | ||
* @return The name of the destination field. | ||
*/ | ||
public String getDestination() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better for the default destination
(Event root) to be /
instead of null
. Then we avoid some null values and my opinion is that /
is more declarative. What do other people think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we used /
as the root what would the expected behavior behavior be if someone had multiple /
s in the destination? One might expect it to create a nested field similar to directory syntax .
ex: Would destination: /foo/bar
become foo.bar: <parsed_json>
where bar
is a nested field?
That might be useful, but I don't think that's what we're going for here so maybe /
would be misleading. I do think your point about a more declarative default is valid though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now the behavior if someone had multiple /
in the destination (/foo/bar
) is that the parsed json will be written to the most nested field in the destination. So if the json is {"key":"value"}
and the destination is (/foo/bar
), event.get("/foo/bar", Map.class) == {"key":"value"}
and event.get("/foo/bar/key", String.class) == "value"
. Right now the parse_json
processor doesn't flatten nested fields (so /foo/bar
is not flattened to foo.bar
), it just puts any nested json into the event without changing it.
I do see how /
is misleading. It's also misleading for another reason: calling event.put("/", value)
will not put value
into the event root because the JacksonEvent
implementation trims leading slashes. So it would not write value
to the event at all.
I think we can stick with a null
default and be more declarative by mentioning that destination is null -> write to root
in documentation.
final Event event = record.getData(); | ||
final String message = event.get(source, String.class); | ||
try { | ||
TypeReference<HashMap<String, Object>> hashMapTypeReference = new TypeReference<HashMap<String, Object>>() {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: final
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching! I'll add final
in the next revision or the next parse_json
PR to avoid triggering re-review.
* | ||
* @return The name of the destination field. | ||
*/ | ||
public String getDestination() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we used /
as the root what would the expected behavior behavior be if someone had multiple /
s in the destination? One might expect it to create a nested field similar to directory syntax .
ex: Would destination: /foo/bar
become foo.bar: <parsed_json>
where bar
is a nested field?
That might be useful, but I don't think that's what we're going for here so maybe /
would be misleading. I do think your point about a more declarative default is valid though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall the new ParseJsonProcessor looks good, thanks for adding it!
implementation 'com.fasterxml.jackson.core:jackson-databind' | ||
} | ||
|
||
test { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend setting a minimum test coverage to maintain our high coding standards. Here is an example for another plugin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the recommendation!
I set the threshold to 96% but it's slightly higher in reality because there aren't any unit tests for the prepareForShutdown
, isReadyForShutdown
, and shutdown
methods. These methods are needed to extend AbstractProcessor
but they're empty because parse_json
doesn't have any special behavior for shutdown. The implemented line and branch coverage are 100%.
@BeforeEach | ||
void setup() { | ||
ParseJsonProcessorConfig defaultConfig = new ParseJsonProcessorConfig(); | ||
lenient().when(processorConfig.getSource()).thenReturn(defaultConfig.getSource()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: lenient()
can hide unexpected function calls in tests.
Signed-off-by: Finn Roblin <[email protected]>
Codecov Report
@@ Coverage Diff @@
## main #1688 +/- ##
============================================
- Coverage 93.58% 93.41% -0.17%
- Complexity 1304 1307 +3
============================================
Files 170 172 +2
Lines 3804 3829 +25
Branches 303 303
============================================
+ Hits 3560 3577 +17
- Misses 173 181 +8
Partials 71 71
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
@NotEmpty | ||
@NotNull |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
group 'org.opensearch.dataprepper' | ||
version '2.0.0-SNAPSHOT' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove these 2 lines. it's not required
testImplementation 'org.junit.jupiter:junit-jupiter-api:5.8.1' | ||
testRuntimeOnly 'org.junit.jupiter:junit-jupiter-engine:5.8.1' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can remove these dependencies. I think these are used from parent project.
Signed-off-by: Finn Roblin <[email protected]>
…espace-only sources Signed-off-by: Finn Roblin <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks Finn!
* Initial implementation of parse_json processor Signed-off-by: Finn Roblin <[email protected]>
Description
Starting point to implement the
parse_json
processor. This PR adds a configuration class, unit tests, and implementation of core features of theparse_json
processor. With these changes the processor can deserialize JSON strings and will deserialize all nested fields. A JSON pointer feature to parse only a part of the JSON still needs to be added. The behavior of deserialized JSON Arrays still needs to be tested and some modifications to the logic might be necessary.Issues Resolved
Step towards resolving #831
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.