Potential problem with L2P1 msg_data consumption #100

Open
fei-g wants to merge 4 commits into base: openpiton-dev


fei-g
Contributor

@fei-g fei-g commented Mar 25, 2021

Problem description:

Some noc1 requests entering L2 pipe1 carry msg_data (e.g. nc_store, atomics, interrupt forward), but the pipeline does not always consume that data immediately. When such a request is pushed into the mshr, its msg_data is not stored in the mshr array; it is left in the noc1 buffer. If another request then enters L2 pipe1 and also tries to consume msg_data, it reads out the msg_data of the previous request.
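The hazard can be illustrated with a small behavioral model. This is only a sketch, not the RTL: the names `Request`, `run_pipe`, and `data_buf` are hypothetical, and the model assumes that msg_data words sit in a FIFO and that a consuming request always reads the head of that FIFO.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    name: str
    data: Optional[int]   # msg_data travelling with the header, if any
    consumes_now: bool    # False models "pushed into the mshr first"


def run_pipe(requests):
    """Return {request name: data word the pipe actually read}."""
    data_buf = deque()    # assumed stand-in for the msg_data held in the noc1 buffer
    deferred = []         # stand-in for mshr entries; note there is no data field
    consumed = {}
    for req in requests:
        if req.data is None:
            continue      # requests without msg_data are not affected
        data_buf.append(req.data)
        if req.consumes_now:
            # Reads the head of the buffer, which may belong to an older request.
            consumed[req.name] = data_buf.popleft()
        else:
            deferred.append(req.name)
    for name in deferred:
        # The deferred request is replayed later and reads whatever is left.
        consumed[name] = data_buf.popleft()
    return consumed


result = run_pipe([
    Request("nc_store", 0xAAAA, consumes_now=False),
    Request("amo", 0xBBBB, consumes_now=True),
])
```

In this model the later request (`amo`) reads the nc_store's data word, and the replayed nc_store then reads the data that belonged to `amo`.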

Is this problem real and triggerable?

Only two request types both carry msg_data and can be pushed into the mshr, leaving their msg_data pending in the buffer: atomic operations and nc stores.

atomic operation

An atomic operation is divided into two internal requests in L2: xx_P1 and xx_P2. In phase1 it invalidates all sharers if the line is in S/M state; in phase2 it reads msg_data and does the arithmetic computation. L2 stalls the first stage between phase1 and phase2, and won't ack the msg header until it reaches phase2. Thus no new requests can be processed by the pipe while the msg_data of an atomic operation is pending. A request in the mshr can still be recovered between phase1 and phase2, though, and that request may even be an nc_store. But in that case it was the nc_store that arrived at L2 first and left its msg_data pending; that is discussed in the next case.

So the conclusion is: because the msg header is acked late, new operations are already prevented from being consumed while the msg_data of an atomic operation is pending. @morenes also wrote tests that issue many consecutive atomic operations from multiple threads to try to trigger this problem, and we did not observe any failure. This gives some confidence that the atomic case is safe.
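The ordering argument for atomics can be sketched as an event log. This is a simplified model under stated assumptions: `simulate_atomic` and the `amo` name prefix are hypothetical, and the optional `mshr_replays` argument only marks where recovered mshr requests could slot in between the two phases.

```python
def simulate_atomic(buffer_headers, mshr_replays=()):
    """Return a list of (request, action) events.

    buffer_headers: request names in arrival order at the pipe input;
                    names starting with "amo" are treated as atomics.
    mshr_replays:   names of mshr requests recovered between P1 and P2.
    """
    pending = list(buffer_headers)
    log = []
    while pending:
        head = pending[0]
        if head.startswith("amo"):
            log.append((head, "P1: invalidate sharers, header not acked"))
            # Stage 1 is stalled here, so pending[1:] cannot enter the pipe.
            for name in mshr_replays:
                log.append((name, "recovered from mshr"))
            log.append((head, "P2: read msg_data, compute, ack header"))
        else:
            log.append((head, "processed, header acked"))
        pending.pop(0)
    return log


log = simulate_atomic(["amo_add", "nc_store"])
order = [name for name, _ in log]
```

Here `nc_store` only appears in the log after both phases of `amo_add`, which is the property the late header ack is meant to provide.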

non-cacheable store

If the target line of the nc_store is already in S/M state, L2 first invalidates all the sharers and pushes the nc_store into the mshr. At this point the msg_data is pending, and things can go wrong if the pipe receives another request that also carries msg_data. However, with the Ariane core this case cannot be triggered, because the non-cacheable region is fixed and a line in the nc space is never stored in L2.

The plan is either to build a test with the sparc core that sends an nc_store to a previously cacheable address, or to use another device to send an nc_store to a cacheable region.

This fix

The idea of this fix is to track when msg_data is pending. It raises a flag, msg_data_pending, when a request that is supposed to consume msg_data does not do so and is instead pushed into the mshr. While that flag is set, the pipeline does not accept a new request that also carries msg_data.
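The intended effect of the flag can be sketched with a small behavioral model. This is not the RTL change itself: `run_pipe_with_fix` and its data structures are hypothetical, and the model assumes the deferred request is replayed from the mshr as soon as the pipe input is blocked, whereas real hardware would wait for the invalidations to complete.

```python
from collections import deque


def run_pipe_with_fix(requests):
    """requests: list of (name, data, consumes_now) tuples in arrival order.

    Returns (consumed, stalled): the data word each request read, and the
    names of requests that were held back by msg_data_pending.
    """
    waiting = deque(requests)   # headers waiting at the pipe input
    data_buf = deque()          # assumed stand-in for msg_data in the input buffer
    mshr = []                   # deferred request names (no data stored)
    consumed = {}
    stalled = []
    msg_data_pending = False

    while waiting or mshr:
        head_blocked = bool(waiting) and msg_data_pending and waiting[0][1] is not None
        if waiting and not head_blocked:
            name, data, consumes_now = waiting.popleft()
            if data is None:
                continue        # requests without msg_data pass through
            data_buf.append(data)
            if consumes_now:
                consumed[name] = data_buf.popleft()
            else:
                mshr.append(name)
                msg_data_pending = True   # data left behind in the buffer
        else:
            if waiting and waiting[0][0] not in stalled:
                stalled.append(waiting[0][0])
            # Replay the deferred request; it now reads its own data.
            name = mshr.pop(0)
            consumed[name] = data_buf.popleft()
            msg_data_pending = False
    return consumed, stalled


consumed, stalled = run_pipe_with_fix([
    ("nc_store", 0xAAAA, False),
    ("amo", 0xBBBB, True),
])
```

In this model the second data-carrying request is held until the deferred one has consumed its data, so each request reads its own msg_data, at the cost of one stall.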

It has been verified that this fix does not break the system. Further tests are still needed to confirm that the problem can actually be triggered and that this fix resolves it.

fei-g added 4 commits March 24, 2021 17:24
When an op goes into mshr in L2P1, we didn't store the msg_data
in mshr, thus the data left in l2_pipe1_buf_in may be consumed
by later ops. This fix stalls the pipe when msg_data is waiting
to be consumed in the buf_in.
Conflicts:
	piton/design/chip/tile/l2/rtl/l2_pipe1_ctrl.v.pyv