PE: parse rich header and refactor DOS stub parser #406

kkent030315 · 2024-04-13T00:58:38Z

This PR adds Rich header parsing, as requested in issue #400.

Added Rich header parsing (goblin::pe::header::RichHeader and goblin::pe::header::RichMetadata) when the header is present, with tests for the success, failure, and corrupted cases.
The DOS stub is now a variable-length slice reference and is explicitly isolated from the DOS header.
- The DOS stub is not guaranteed to be 64 bytes long; some linkers (Borland C++) and PE packers generate an application-specific DOS stub (see "dos stub length restricts to the first 64 bytes" #422); and
- The DOS stub includes the DOS header by some definitions. The DOS header and the DOS stub are sometimes treated as separate and sometimes as one unit. This is confusing, and there is no official statement. By design, separating the header from the stub is preferable: concatenating them on the user side is far easier than splitting a combined value, which would force the user to work with the raw buffer.
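
To illustrate the variable-length stub described above, here is a minimal sketch, not the code in this PR. It assumes the stub is everything between the end of the 64-byte DOS header and the offset stored in `e_lfanew` at 0x3c. The helper name `dos_stub_bytes` and both constants are hypothetical and exist only for this example.

```rust
/// Offset of `e_lfanew` (the pointer to the PE signature) inside the DOS header.
/// Local constant for this sketch only.
const PE_POINTER_OFFSET: usize = 0x3c;
/// Size of the fixed-length DOS header (`IMAGE_DOS_HEADER`).
const DOS_HEADER_SIZE: usize = 0x40;

/// Hypothetical helper: borrow the bytes between the DOS header and the PE
/// signature. Returns `None` instead of panicking on malformed input.
fn dos_stub_bytes(image: &[u8]) -> Option<&[u8]> {
    // Read the little-endian u32 stored at 0x3c.
    let raw = image.get(PE_POINTER_OFFSET..PE_POINTER_OFFSET + 4)?;
    let pe_pointer = u32::from_le_bytes([raw[0], raw[1], raw[2], raw[3]]) as usize;
    // `get` yields `None` when the range is inverted (pointer inside the DOS
    // header) or runs past the end of the buffer.
    image.get(DOS_HEADER_SIZE..pe_pointer)
}
```

Because the result borrows from `image`, nothing is copied, and a caller who wants the header and stub as one blob can still slice `&image[..pe_pointer]` themselves.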

I took the constant bytes in the test code from mthiesen/link-patcher (MIT). If this could be a licensing problem, I am happy to make my own specimens for testing.

kkent030315 · 2024-04-13T01:05:03Z

I'm a bit tired; I will fix the CI later today or tomorrow.

m4b

unfortunately there are some breaking changes here, but some seem unavoidable; primarily I'd like to see DosStub have its parse method moved into an impl DosStub, and the &[u8] reverted back to the DosStub type that was there before. otherwise it's looking ok!

m4b · 2024-05-20T06:24:54Z

@kkent030315 gentle ping; would definitely like to see this go in, but I had requested some changes :)

kkent030315 · 2024-10-10T07:45:19Z

@m4b Hey, sorry for going quiet. I had to deal with my life, you know. Anyway, a few changes have been applied to fix the code, and the `cargo test` tests pass. Would you like to check this out?

The current code in this PR is what I have been using as a backend parser helper for my own binary rewriting framework, which can partition any PE64 binary object by object (aka struct by struct, byte by byte, etc.). The Rich header code works perfectly with 1000 unique PE64 binaries from around the world, regardless of compiler toolchain (including but not limited to MSVC (link), Clang (LLD), and MinGW-GCC; NOTE: only the MSVC linker inserts a Rich header), so I believe it is quite stable for now, although the coding style issues you pointed out still need to be dealt with.

I'm going to deal with my other PRs ASAP. I'd like to contribute more to goblin, as there are too many things to get rid of, and some insufficient features.

m4b · 2024-10-22T06:36:47Z

I'm sorry, I knew there was another PR I needed to check before releasing 9.0. I will give this a review, but it probably won't be till the weekend. Please feel free to ping me then if you haven't gotten any feedback. And thanks for your contributions! :)

m4b · 2024-10-22T06:38:51Z

Also I hope everything in your life is good; don't feel the need to push yourself either, your life comes first :)

kkent030315 · 2024-10-25T12:17:41Z

@m4b Thanks for paying attention to this! I think there are lots of things to discuss here. But I think breaking changes are inevitable for this PR; how about introducing this feature in 0.10?

m4b

Ok this is the last time, I think we're almost done here; as I mention in another comment, tl;dr I did not realize the heart of the issue is that the DosStub is variable and bounded by pe_pointer_offset; your changes look correct, assuming that statement is correct :) that being said, we should go back to your original intuition, as I mention, and the bytes should not be allocated/owned, but slices, so we have to pass a lifetime param to Header, etc. This is fine, this pattern is common in other parts of goblin, e.g., mach-o has a lot of that.

For bonus points, we can keep Copy, i'm pretty sure, on all the structs, if you do impl Iterator<Item=Result<RichMetadata>> for a method on RichHeader to avoid having a Vec inside of it.

After that, I think this is ready to go, though do note it is a breaking change, a pretty good one, but I think it's worthwhile.

Oh and you have the honor of deleting the comment about adding RichHeader to the Header field, since you implemented it :)
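
The suggestion above, exposing entries through `impl Iterator<Item=Result<RichMetadata>>` so that `RichHeader` needs no `Vec` and can stay `Copy`, can be sketched roughly as follows. This is an illustration under stated assumptions, not goblin's implementation: the struct layouts, the field names, the method name `metadatas`, and the `&'static str` error type are all hypothetical. It also assumes the usual Rich header entry layout of two little-endian dwords (comp id, use count), each XORed with the key.

```rust
/// Hypothetical decoded entry; field names are assumptions for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
struct RichMetadata {
    build: u16,
    product: u16,
    use_count: u32,
}

/// Hypothetical borrowing view over the XOR-encoded entries (8 bytes each).
/// Holding a slice instead of a `Vec` lets the struct derive `Copy`.
#[derive(Debug, Clone, Copy)]
struct RichHeader<'a> {
    key: u32,
    encoded: &'a [u8],
}

impl<'a> RichHeader<'a> {
    /// Decode entries lazily; a truncated trailing entry yields an `Err`.
    fn metadatas(&self) -> impl Iterator<Item = Result<RichMetadata, &'static str>> + 'a {
        // Copy both fields out so the iterator does not borrow `self`.
        let key = self.key;
        let encoded = self.encoded;
        encoded.chunks(8).map(move |chunk| {
            if chunk.len() != 8 {
                return Err("truncated rich metadata entry");
            }
            let comp_id = u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]) ^ key;
            let use_count = u32::from_le_bytes([chunk[4], chunk[5], chunk[6], chunk[7]]) ^ key;
            Ok(RichMetadata {
                build: (comp_id & 0xffff) as u16, // low word of the comp id
                product: (comp_id >> 16) as u16,  // high word of the comp id
                use_count,
            })
        })
    }
}
```

A caller who does want an owned list can still write `header.metadatas().collect::<Result<Vec<_>, _>>()`, so the allocation becomes opt-in rather than part of the header type.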

kkent030315 · 2024-10-26T14:07:37Z

@m4b Thank you very much for the review! I think it's almost perfect now. Would you be able to take a look at the fixes?

kkent030315 · 2024-10-26T14:23:32Z

Maybe something to discuss here #406 (comment):

We can explicitly isolate the DOS stub and the Rich header data with a somewhat more complex implementation, but it makes no sense to me to do that.

I can definitely do that work if we should.

kkent030315 · 2024-10-26T16:34:56Z

Self-review is done. Looks perfect to me right now.

m4b · 2024-10-27T00:40:25Z

@kkent030315 apologies I wanted to merge the parse_without_dos patch since it was before this, and was waiting a while; so you'll have to rebase, and add a rich_header parameter to the _impl function and pass a None for parse_without_dos is my guess; in meantime I'll do a final review but this looks ready to go just cursorily glancing at it

I appreciate your patience!

m4b · 2024-10-27T04:35:55Z

So I just tried doing this rebase, and it's very annoying/tedious due to all the individual commits; i suggest squashing your patch down to a single commit first, and then rebasing on master to make it easiest.

kkent030315 · 2024-10-27T08:32:48Z

How about now? I resolved all conflicts with upstream, and it should be possible to squash-merge without problems.

kkent030315 · 2024-10-27T09:20:50Z

> apologies I wanted to merge the parse_without_dos patch since it was before this, and was waiting a while; so you'll have to rebase, and add a rich_header parameter to the _impl function and pass a None for parse_without_dos is my guess; in meantime I'll do a final review but this looks ready to go just cursorily glancing at it

In the current implementation, the Rich header parser is always independent of the DOS stub, since it uses the hardcoded offset of the end of the DOS stub. The Rich header parser therefore still works even when Header::parse_without_dos leaves the DOS stub at its default (which contains no Rich header). I added a test for this situation:

    #[test]
    fn parse_without_dos() {
        let header = Header::parse_without_dos(&BORLAND_PE32_VALID_NO_RICH_HEADER).unwrap();
        assert_eq!(header.dos_stub, DosStub::default());
        assert!(header.rich_header.is_none());

        // DOS stub is default but rich parser still works
        let header = Header::parse_without_dos(&CORRECT_RICH_HEADER).unwrap();
        assert_eq!(header.dos_stub, DosStub::default());
        assert!(header.rich_header.is_some());
    }

I don't know whether Header::parse_without_dos should parse the Rich header or not. Do you have any thoughts?

m4b

Let's get feedback from them whether they want the rich data, otherwise we can leave as is; i have to run outside right now but i'll likely merge this tonight, thanks for all your great work!

m4b · 2024-10-27T18:29:23Z

@ideeockus do you have any opinion on the rich header question?

m4b · 2024-10-28T01:36:13Z

src/pe/header.rs

-pub struct DosStub(pub [u8; 0x40]);
-impl Default for DosStub {
+pub struct DosStub<'a> {
+    pub data: &'a [u8],


I'm going to change this to private and have a method bytes() -> &'a [u8] in a separate patch; that way, if we ever want to switch this to something like Cow<'a, [u8]> and have an owned version to allow Header<'a>, it will be easier
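
A minimal sketch of the accessor shape described in this comment, assuming a private `data` field behind a `bytes()` method. The constructor name `new` and the derives are assumptions made for this example, not what the follow-up patch necessarily contains.

```rust
/// Sketch only: the field is private, so its representation can change later
/// (for example to `Cow<'a, [u8]>`) without touching callers of `bytes()`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct DosStub<'a> {
    data: &'a [u8],
}

impl<'a> DosStub<'a> {
    /// Hypothetical constructor for this example.
    pub fn new(data: &'a [u8]) -> Self {
        DosStub { data }
    }

    /// Returns the stub bytes with the full `'a` lifetime, so the slice can
    /// outlive the `DosStub` value itself.
    pub fn bytes(&self) -> &'a [u8] {
        self.data
    }
}
```

One caveat on the design: if the field did become a `Cow`, an owned variant could no longer hand out a `&'a [u8]`, so the accessor's return lifetime would likely need to be tied to `&self` at that point.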

m4b · 2024-10-28T01:38:10Z

this was an epic PR @kkent030315 thanks for your patience!

PE: parse rich header

b3714de

kkent030315 added 3 commits April 13, 2024 10:17

Fix comments

9088827

Fix no method named to_string found for reference &'static str in the current scope

d10f92a

Explicitly isolate the Rich header from DOS stub

61515b0

kkent030315 marked this pull request as draft April 13, 2024 19:30

Separate build and struct; revert DOS stub change

e29c121

m4b marked this pull request as ready for review April 15, 2024 04:25

m4b requested changes Apr 15, 2024

Fix rich header parser for stability

7274230

NOTE: all `cargo test` passed

kkent030315 added 5 commits October 10, 2024 16:58

Lifetime reverts; implicit lifetime for now and deep copy

998b6f1

Make DosStub implicit lifetime aswell as 998b6f1

b1b841c

static to const, of vector datas in tests

0e7c377

Move ptr to symbol table assert before gwrite borrow

7bcc51b

Fixed pe pointer issue, added DOS explanation, add Borland tests

7dc98ad

kkent030315 force-pushed the richh branch from 7234670 to 7dc98ad Compare October 10, 2024 12:32

impl Pread, Pwrite for DosStub

275f0fa

kkent030315 mentioned this pull request Oct 21, 2024

PE: parse thread local storage - TLS data #404

Merged

kkent030315 mentioned this pull request Oct 25, 2024

dos stub length restricts to the first 64 bytes #422

Closed

kkent030315 added 3 commits October 25, 2024 22:23

Merge remote-tracking branch 'upstream/master' into richh

a39eb99

Fix doc

d204a9b

impl TryIntoCtx for DosStub

67e5e0f

m4b requested changes Oct 26, 2024

kkent030315 added 2 commits October 26, 2024 20:56

Revert to PE_POINTER_OFFSET

892341e

m4b#406 (comment)

Remove #[repr(C)] from DosStub

21bcc0c

m4b#406 (comment)

kkent030315 added 5 commits October 26, 2024 22:06

Documented marker subtract

5116ab9

m4b#406 (comment)

panic-free rich header padding size calculation

c539b53

m4b#406 (comment)

more stricter dos stub offset check

a878275

m4b#406 (comment)

improve rich key discovery strategy and syntax

c309f67

m4b#406 (comment)

Add rich offset checks for buffer scopes

c5e7e83

m4b#406 (comment) m4b#406 (comment)

kkent030315 added 4 commits October 26, 2024 23:08

Use core::mem instead of std::mem

2cf7861

DosStub has explicit lifetime and dynamic size now

d22797c

Fix doc comment for dos program

ff8e282

impl Copy for pe::Header

72a449b

kkent030315 added 3 commits October 27, 2024 01:16

Remove unused imports

56653d4

panic-free DanS marker discovery

b307a5b

Revert unnecessary changes

00c5d78

Kindly deleting comment

bddc811

kkent030315 added 3 commits October 27, 2024 17:14

Merge branch 'master' into richh

49bcd5b

Merge with upstream

65ad6f0

Fix tests

8b3757e

Add test for parse_without_dos

f8b390f

m4b approved these changes Oct 27, 2024

m4b reviewed Oct 28, 2024

m4b merged commit f43bc30 into m4b:master Oct 28, 2024
6 checks passed

kkent030315 mentioned this pull request Nov 3, 2024

Tracking issue for 0.10 #434

Open

11 tasks
