Fix integer overflow undefined behaviour #177

Closed
wants to merge 1 commit into from

nielsdos

Even though delta's type is PCRE2_SIZE, every operand on the right-hand side of its computation is an int, so the multiplication itself is carried out in int.
This means that there is a potential integer overflow. The if below then checks whether a computation equivalent to delta, done with casts, is larger than INT_MAX: this is the overflow check.
However, since signed integer overflow is undefined behaviour, the compiler may assume it never happens. The overflow check can therefore be assumed to always be false, despite the casts: as far as the compiler is concerned, if the multiplication does not overflow for ints, the INT64_OR_DOUBLE version cannot exceed INT_MAX either.

I found this issue using the stack static analysis tool. I verified that the compiler actually optimizes away the check by looking at the IR.

Fix it by computing delta with the casts first and using that same delta in the if.
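The fix described above can be sketched in isolation as follows. This is a minimal illustration, not the PCRE2 source: `compute_delta` is a hypothetical helper, `LINK_SIZE` is hard-coded to 2 for the example, and `int64_t` stands in for INT64_OR_DOUBLE under the assumption that a 64-bit integer type is available.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value for illustration only; PCRE2 sets LINK_SIZE at build time. */
#define LINK_SIZE 2

/* Hypothetical helper, not PCRE2 code. Returns 1 and stores the product in
   *delta when replicate * (1 + LINK_SIZE) fits in an int; returns 0 otherwise. */
static int compute_delta(int replicate, size_t *delta)
{
  /* Widen both operands before multiplying, so the product is computed in
     64 bits and cannot overflow for any pair of int inputs. */
  int64_t wide = (int64_t)replicate * (int64_t)(1 + LINK_SIZE);

  /* The range check uses the same widened value that becomes delta. */
  if (wide < 0 || wide > (int64_t)INT_MAX) return 0;

  *delta = (size_t)wide;
  return 1;
}
```

Because the only multiplication is the widened one, there is no separate int-typed product from which the compiler could infer that the check is redundant.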

@carenas
Contributor

carenas commented Dec 28, 2022

LINK_SIZE is a very small integer (usually 2), so the check (as mentioned in the comment above) is just being extra paranoid.

I suspect, though, that if your compiler is optimizing away that check, it might be because of a compiler bug, or because you should be configuring your pcre2 to use double (as a 64bit int) instead.

if ((INT64_OR_DOUBLE)replicate*
(INT64_OR_DOUBLE)(1 + LINK_SIZE) >
(INT64_OR_DOUBLE)INT_MAX ||
INT64_OR_DOUBLE delta = (INT64_OR_DOUBLE)replicate*(INT64_OR_DOUBLE)(1 + LINK_SIZE);
Contributor

if a broken int64 cast is being used here, this is likely to make the problem worse by potentially halving the width of delta.

it might be better IMHO to only cast "replicate" to ensure the multiplication is not being done with plain ints
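A rough sketch of the "cast only replicate" idea follows. It assumes PCRE2_SIZE is `size_t`, that `replicate` is non-negative, and that `LINK_SIZE` is 2; `widened_product` is a hypothetical name used only for this example.

```c
#include <limits.h>
#include <stddef.h>

/* Assumed value for illustration only; PCRE2 sets LINK_SIZE at build time. */
#define LINK_SIZE 2

/* Hypothetical helper, not PCRE2 code. Only the left operand is cast: the
   usual arithmetic conversions then convert (1 + LINK_SIZE) to size_t too,
   so the multiplication is unsigned and never performed in plain int. */
static size_t widened_product(int replicate)
{
  return (size_t)replicate * (1 + LINK_SIZE);
}
```

Unsigned multiplication wraps instead of invoking undefined behaviour, so the result keeps the full width of `size_t` and the compiler has no overflow assumption to exploit.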

Contributor

@carenas carenas Dec 28, 2022

a maybe cleaner way to approach this sort of check has been pushed (not tested) to the following branch.

note that similar logic is used in several other places, so not sure why the static analyzer used didn't pick up on those.

also, details on the system/compiler that was affected (so a fix could be tested) would be ideal.

Author

Your approach is indeed cleaner.
System details: Linux 6.1.3, Clang 14.0.6

Contributor

could you confirm the check is indeed getting removed? I still can't reproduce that with the same compiler, as shown by the following simplified code:

$ cat i.c
#include <stdio.h>
#include <limits.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
	int a = atoi(argv[1]);
	
	size_t m;
	       
	m = a * 2;
	if ((long)a * 2 > INT_MAX)
		printf("OVERFLOW\n");
	printf("%zu\n", m);
	return 0;
}
$ clang --version
Debian clang version 14.0.6
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
$ clang -O3 -o i i.c
$ ./i 2147483647
OVERFLOW
18446744073709551614

Author

That example gives the same result on my system, so this PR was probably a mistake.

carenas added a commit to carenas/pcre2 that referenced this pull request Dec 28, 2022
Break the check in two to better protect from overflow in systems
that have a 32bit PCRE2_SIZE.

As a side effect, make the overflow check happen before the actual
computation, to avoid compilers optimizing it out under the assumption
that an integer overflow can't have happened because it would be
undefined behaviour.

Fixes: PCRE2Project#177
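The idea in this commit message, splitting the check in two and testing before the multiplication, might look roughly like the sketch below. It is not the code from the referenced commit: `add_replicated_length` and its `limit` parameter are hypothetical, and `LINK_SIZE` is fixed at 2 for the example.

```c
#include <limits.h>
#include <stddef.h>

/* Assumed value for illustration only; PCRE2 sets LINK_SIZE at build time. */
#define LINK_SIZE 2

/* Hypothetical helper, not PCRE2 code. Adds replicate * (1 + LINK_SIZE) to
   *length and returns 1, or returns 0 and leaves *length unchanged if either
   check fails. */
static int add_replicated_length(int replicate, size_t limit, size_t *length)
{
  const int unit = 1 + LINK_SIZE;

  /* Check 1: done with a division before any multiplication takes place,
     so no overflowing expression is ever evaluated. */
  if (replicate < 0 || replicate > INT_MAX / unit) return 0;

  /* Safe now: the product is known to be at most INT_MAX. */
  size_t delta = (size_t)replicate * (size_t)unit;

  /* Check 2: the running length must stay within the caller's limit. */
  if (*length > limit || limit - *length < delta) return 0;

  *length += delta;
  return 1;
}
```

Since the first check precedes the multiplication, it cannot be removed on the grounds that a later int product "did not overflow", and it behaves the same whether `size_t` is 32 or 64 bits wide.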
@PhilipHazel
Collaborator

What is the status of this, please? Should I implement the one-line patch in the conversation above?

@carenas
Contributor

carenas commented Jan 25, 2023

This PR could be closed, but I think we should make some related improvements anyway, as I still think that there might be some truth to the original static analyzer report.

@PhilipHazel
Collaborator

Can we close this now?

@PhilipHazel
Collaborator

OK, I'm closing this.
