-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CAR offset writer #290
CAR offset writer #290
Conversation
9c6c1cc
to
9c7956a
Compare
|
||
var _ io.ReadSeeker = (*CarReaderSeeker)(nil) | ||
|
||
func NewCarReaderSeeker(ctx context.Context, payloadCid cid.Cid, bstore blockstore.Blockstore, size uint64) *CarReaderSeeker { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are you confident you'll know the overall size of the car at construction time here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are for the Estuary use case - if we don't know it then it makes seeking from the end infeasible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you need to seek from the end? the current read-seeker in this library disallows seeking from end since it doesn't know total length in advance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
http.ServeContent() gets the size of the file by seeking to the end of it. I couldn't see a way to get around that without copying a whole bunch of code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potentially we could do something like providing the size of the file as an option to CarReaderSeeker, and if it's not set we disallow seeking from the end?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense. I guess worth including then
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dirkmc Some comments from a first review
go func() { | ||
err := c.cow.Write(writeCtx, pw, uint64(c.offset)) | ||
if err != nil && !xerrors.Is(err, context.Canceled) { | ||
pw.CloseWithError(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should return an error even if the context is cancelled as this error gets propogated to the read side and if we don't pass an error on context cancellation, the read side will wrongly interpret it as an EOF.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a little convoluted but:
- Read and Seek should not be called concurrently
- The context is only cancelled by Seek
- Therefore any reads will have finished when the context is cancelled
- Therefore there should not be a read in progress anyway. We just want to discard any bytes that have been written.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not necessary that the context is only cancelled by Seek. You also pass in a parent context in there which would be cancelled by the creator of ReadSeeker
. Therefore, context cancellation errors on write should be propagated to the reader.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^ I think I agree with this, you're accepting a parent context and it's making its way down here, the cancellation could be for any number of reasons very far from this API, so having that bubble back up would seem to be the appropriate behaviour
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh... actually now I'm looking at Seek()
it's not that simple is it? we're explicitly using the cancel on the new context created for this in the case of a seek that comes in while we're writing. So the cancellation could come from internal, in Seek()
or external from parentCtx
.
But, the cancellation from Seek()
shouldn't ever show up from the CloseWithError()
here since the reader is discarded there. So it shouldn't matter if we pass it on since there's no check for it. But since it is passed on with a standard Read()
, then the parentCtx
cancellation error will propagate, but will currently be missed.
So it's still correct to pass the error on regardless of where it comes from; as long as we keep it clear in the API docs that this isn't threadsafe (which you've done, but we need to retain when we add more docs here).
What are the next steps for getting this merged? |
The two main things I can see are:
Something else I'd like to discuss is the naming - The |
8171360
to
3d96944
Compare
3d96944
to
89de813
Compare
@masih : can you please allocate some time to talk with @rvagg and figure out whether we go with this approach or #291 . I'd like to get this resolved given the complications it caused when resolving GHSA-9x4h-8wgm-8xfg . Basically, lets not have Boost depending on a custom branch of go-care :) |
This has been raised by @hannahhoward as the possible cause of the problems Estuary is having with making deals and getting consistent CommP. A reason that might be plausible is that those problems seem to be mainly (but not exclusively) for larger deals, which maybe are more prone to needing resumption in HTTP transfers?
|
2022-10-04 IPLD triage conversation: @rvagg is going to write up the current state and next steps. |
Closing this now that Boost is using a standard release and not a custom branch off this repo. Boost contains CarOffsetWriter locally, so this functionality is available there. @willscott and I will work to get #291 sorted, in some form, that provides a viable alternative to this. That ball's in my court atm, there's a bit of a limbo state in #327 that I dropped the ball on because I started to question whether the complexity to make it all work properly was worth it. So that's my TODO. |
This PR introduces two classes: