Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing error messages in bsdtar on EOF #2205

Open
stoeckmann opened this issue May 21, 2024 · 4 comments
Open

Missing error messages in bsdtar on EOF #2205

stoeckmann opened this issue May 21, 2024 · 4 comments

Comments

@stoeckmann
Copy link
Contributor

There are multiple occurrences in code where error messages are not set, which result in (null) output:

$ base64 -d <<< N3q8rycc | bsdtar -t
bsdtar: (null)
bsdtar: Error exit delayed from previous errors.

I'm not sure which approach is the best one:

The individual source code files could be adjusted, but even then different error codes are already in place. ARCHIVE_ERRNO_FILE_FORMAT seems to be used more often than ARCHIVE_ERRNO_MISC. So this would imply that unifying these files should be done first before making things worse.

Alternatively, required archive_set_error call could be integrated into __archive_read_filter_ahead so all cases get at least a bare minimum error message. This has the drawback that perfectly valid short reads (during bidding) would set an error message even though no real error exists. These could be removed manually again, but this just shifts the burden from regular readers to bidders.

If every source code file was to be individually adjusted, I would start doing this for each format individually because otherwise, the PR would be huge and hard to verify, especially if example files should be created to prove/highlight the issue. Also, it would take quite some time.

So let me know what you think!

@evelikov
Copy link
Collaborator

ARCHIVE_ERRNO_FILE_FORMAT and ARCHIVE_ERRNO_MISC are defined as Unrecognized or invalid file format and Unknown or unclassified error respectively. Which makes me wonder, if any libarchive users will be affected by such unification?

Looking at the implementation:

  • ARCHIVE_ERRNO_FILE_FORMAT is EFTYPE or EILSEQ or EINVAL
  • ARCHIVE_ERRNO_MISC is always -1

Of which, EFTYPE is mentioned in the glibc manuals referring to chmod. Although it's not mentioned in chmod(2), chmod(3p) or anywhere else on my Arch Linux box. EILSEQ on the other hand is POSIX.1, C99 standard (also available on Windows), defined as Invalid or incomplete multibyte or wide character.

Perhaps we can remove all ARCHIVE_ERRNO, with ARCHIVE_ERRNO_PROGRAMMER becoming assert(3), the rest EINVAL. Although it should probably be deferred to API v4.

Tl:Dr: Silently changing between ARCHIVE_ERRNO_ might cause subtle breakage. Can we nuke them with v4?

@evelikov
Copy link
Collaborator

Just some food for thought - I don't know the historical reason behind the libarchive specific errno(s).

@kientzle
Copy link
Contributor

I can provide some history here:

  • Libarchive used to abort() on programmer errors. This was changed because it caused problems for people creating bindings to other languages (such as Python) where they wanted to throw corresponding errors in such cases.

  • I don't remember now why libarchive prefers EFTYPE to EILSEQ to EINVAL. It looks like it used only EILSEQ until July 2004, then the other options got added as part of what looks an effort to improve portability (so presumably I was experimenting with ports to other platforms that didn't have EILSEQ?).

I think this could be changed as part of v4, but I would want ARCHIVE_ERRNO_PROGRAMMER to be different from ARCHIVE_ERRNO_FILE_FORMAT. Certainly there are still a lot of places that don't set errors when they should. I'm not sure if there's a really good way to systematically fix that. Perhaps the top level functions in archive_read.c could check for a missing error value and set some default whenever returning an error code?

@evelikov
Copy link
Collaborator

All the instances of ERRNO_PROGRAMMER that I've seen are effectively guarding against semi-broken format/filter(s) in libarchive itself. For example: look through libarchive/archive_read_append_filter.c.

Wouldn't it be better to, make those asserts() and add any commissions in the test-suite? It's what other projects do. Not to mention that both ERRNO_PROGRAMMER and ERRNO_FILE_FORMAT could be numerically identical in some cases/platforms.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants