ReadableFileHandleProtocol.readToEnd can fail to read the complete contents of file sizes less than the single shot read limit #2769
base: main
… shot read limit since readChunk is not guaranteed to read all the requested bytes
PRBs are failing due to the change to make the … Before we look into that, though, I think this would be a good opportunity to discuss how we could improve the API here if breaking changes were on the table. To illustrate the API problem I was describing in the PR description:

// read to the end of a file using an unbounded chunk range
for try await chunk in handle.readChunks(in: ..., chunkLength: .bytes(128)) {
    bytes.writeImmutableBuffer(chunk)
}
// then, call `readToEnd` on the same file
var contents = try await handle.readToEnd(maximumSizeAllowed: .bytes(1024 * 1024))

If someone unfamiliar with the project read this, I think it would be reasonable for them to think that …

var contents = try await handle.readToEnd(fromAbsoluteOffset: 0, maximumSizeAllowed: .bytes(1024 * 1024))

So the value of … This is then made even more confusing by the behavior when the file is a FIFO: in that case, an offset of zero means that we should begin reading from the current position (since seeking is impossible).
Thanks for opening this PR @rpecka! First of all, I'd like it if we could separate this into two PRs: the issue with reading chunks is very different from the issue of potentially reading short, so these should be addressed separately.

W.r.t. the issue with reading chunks, I don't think the user should be passing in an optional range here. Instead, I think we should detect whether the file being read is a FIFO and then call the appropriate …
FWIW: some of the PRBs are failing because you're using syntax which isn't available in older Swift versions (5.8) which we still support.
let chunkLength: ByteCount = if !forceChunkedRead, readSize <= singleShotReadLimit {
    .bytes(Int64(readSize))
We still support Swift 5.8, so you can't use this syntax. You'll need to declare let chunkLength: ByteCount and then assign to it.
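A minimal sketch of the declare-then-assign form suggested above. The `ByteCount` struct here is a hypothetical stand-in so the snippet compiles on its own, and `selectChunkLength` and `defaultChunkLength` are illustrative names: the else-branch of the original expression is not visible in the diff, so the fallback is an assumption.

```swift
// Hypothetical stand-in for the real ByteCount type, so the snippet is self-contained.
struct ByteCount: Equatable {
    var value: Int64
    static func bytes(_ count: Int64) -> ByteCount { ByteCount(value: count) }
}

// Illustrative helper; `defaultChunkLength` is an assumed fallback for the chunked path.
func selectChunkLength(
    readSize: Int,
    singleShotReadLimit: Int,
    forceChunkedRead: Bool,
    defaultChunkLength: ByteCount
) -> ByteCount {
    // Declare first, then assign exactly once on every path. Definite
    // initialization accepts this on Swift 5.8, whereas using `if` as an
    // expression on the right-hand side needs Swift 5.9.
    let chunkLength: ByteCount
    if !forceChunkedRead, readSize <= singleShotReadLimit {
        chunkLength = .bytes(Int64(readSize))
    } else {
        chunkLength = defaultChunkLength
    }
    return chunkLength
}
```

The compiler still checks that `chunkLength` is assigned on every branch, so the constant keeps its `let` semantics.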
Resolved an issue where ReadableFileHandleProtocol.readToEnd could fail to read the contents of files smaller than the single shot read limit (64 MiB).

Motivation:
If readToEnd detects that the file in question is smaller than the single shot read limit, it reads the file using a single call to readChunk. However, there is no guarantee that readChunk will return the entire requested chunk. If it returns fewer bytes, readToEnd returns only the result of the first read and does not execute any follow-up reads.

Modifications:
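The general shape of the fix can be sketched as a loop that keeps reading until the requested length is reached or the source reports end of file. This is an illustration only, not the code in this PR: `readFully` and its closure parameter are hypothetical names, and the closure stands in for any read primitive that may return fewer bytes than requested.

```swift
// Hypothetical helper: `readChunk` models a read that may come back short.
// An empty result is treated as end of file.
func readFully(
    length: Int,
    readChunk: (_ offset: Int, _ maxLength: Int) throws -> [UInt8]
) rethrows -> [UInt8] {
    var result: [UInt8] = []
    result.reserveCapacity(length)
    while result.count < length {
        // Ask only for what is still missing, starting where the last read ended.
        let chunk = try readChunk(result.count, length - result.count)
        if chunk.isEmpty { break }  // EOF before `length` bytes were available
        result.append(contentsOf: chunk)
    }
    return result
}

// Simulated short reads: this source never returns more than 3 bytes per call.
let file = Array("hello, short reads".utf8)
let contents = readFully(length: file.count) { offset, maxLength in
    let end = min(offset + min(maxLength, 3), file.count)
    return Array(file[offset..<end])
}
```

A single call to the closure would return only the first three bytes here, which mirrors the truncated result described in the motivation.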
I separated this into two sections (two commits) because I found another issue that I had to resolve in order to fix the chunking problem.
First Commit
This is what is required to fix the missing chunk reads, but it causes testReadFileAsChunks to fail because handle.readChunks(in: ..., chunkLength: .bytes(128)) moves the file access position to the end. The subsequent handle.readToEnd(maximumSizeAllowed: .bytes(1024 * 1024)) then reads zero bytes since the file is fully read, so we get a precondition failure when we run contents.moveReaderIndex(forwardBy: 100), because we're trying to move the reader index to 100 in a buffer of length zero.

The problem is that when we initialize a FileChunks object, if the range is set to 0..<Int.max, we use the .entireFile chunk range. This causes BufferedStream to use a ProducerState with a nil range, which means that no seeking is done when reading chunks. It looks like this behavior is intended for the case where we want to read an unseekable file, but it's being inadvertently triggered when we request a chunked read of a whole file.

TLDR: If we do any chunked read of a file and then try to do a chunked read of the entire file, the second read will begin where the first one left off instead of moving the pointer to the beginning of the file, despite the caller requesting a range starting at index zero.
Second Commit
ChunkRange to have two modes:
current: reads from whatever the underlying file handle's offset currently is.
specified: reads from the specified range.
ReadableFileHandleProtocol.readChunks. This will trigger the use of ChunkRange.current.
ReadableFileHandleProtocol.readToEnd when reading an unseekable file.
testWriteAndReadUnseekableFile: I think that this test was incorrect and there's no reason that we should not be able to read the contents of a fifo that we just wrote to.

General Comment
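One possible shape for the two modes is sketched below. It is an assumption, not the definition used in this PR: the real ChunkRange may carry different payloads, and `startOffset` and `chunkRange(for:isSeekable:)` are hypothetical helpers added only to show how a caller might choose between the modes.

```swift
// Hypothetical shape for the two modes; the actual type in the PR may differ.
enum ChunkRange: Equatable {
    case current                  // keep reading from the handle's current offset
    case specified(Range<Int64>)  // read exactly this byte range

    // Absolute offset to seek to before the first read, or nil when no seek should happen.
    var startOffset: Int64? {
        switch self {
        case .current: return nil
        case .specified(let range): return range.lowerBound
        }
    }
}

// Hypothetical helper: honor the requested range on seekable files,
// and fall back to the current position on unseekable ones such as FIFOs.
func chunkRange(for requested: Range<Int64>, isSeekable: Bool) -> ChunkRange {
    isSeekable ? .specified(requested) : .current
}
```

Keeping the "no seek" case explicit means a request for a range starting at zero is no longer conflated with "read from wherever the handle currently is".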
Part of the reason I think this is happening is that the readToEnd function is a bit counterintuitive: it has a default parameter of 0 for fromAbsoluteOffset. When it's called using the default, it's not clear to the caller that it's going to go back to offset zero before reading (if the file is not a fifo). Maybe this should be changed to a nil default?

Result:
readToEnd should now return the full file contents when the file size is lower than the single shot read limit but readChunk does not return the entire requested chunk.