feature: Allow forcing processing an epoch #2318

jkitman · 2023-04-24T20:38:05Z

Note that we upload the entire serialized epoch, which can be downloaded from the API. That is easier to obtain and doesn't require additional P2P calls (we don't really know which peer to query).

codecov · 2023-04-24T20:51:14Z

Codecov Report

Patch coverage: 29.75% and project coverage change: -1.14 ⚠️

Comparison is base (88bec9e) 59.87% compared to head (4aec216) 58.74%.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #2318      +/-   ##
==========================================
- Coverage   59.87%   58.74%   -1.14%     
==========================================
  Files         158      159       +1     
  Lines       33438    34224     +786     
==========================================
+ Hits        20022    20105      +83     
- Misses      13416    14119     +703

Impacted Files	Coverage Δ
fedimint-bin-tests/src/main.rs	`0.10% <0.00%> (-0.01%)`	⬇️
fedimint-cli/src/lib.rs	`5.24% <0.00%> (-0.26%)`	⬇️
fedimint-core/src/admin_client.rs	`86.36% <0.00%> (-10.25%)`	⬇️
fedimint-core/src/encoding/mod.rs	`88.54% <0.00%> (-2.15%)`	⬇️
fedimint-server/src/consensus/mod.rs	`88.03% <75.00%> (+1.15%)`	⬆️
fedimint-server/src/net/api.rs	`91.54% <91.30%> (-0.05%)`	⬇️
fedimint-server/src/consensus/server.rs	`94.94% <92.30%> (-0.35%)`	⬇️

... and 37 files with indirect coverage changes

dpc · 2023-04-24T21:34:23Z

LGTM

How will the flow look from the cli? Do we need another PR with fedimint-cli command to fetch the epoch from one peer, and then force the other to chew it? I guess the fetch part might be there already.

elsirion · 2023-04-25T15:17:48Z

fedimint-server/src/consensus/server.rs

+    async fn force_process_epoch(&mut self, outcome: EpochOutcome) {
+        let convert = ConsensusOutcomeConversion::from(outcome).0;
+        match self.process_outcome(convert).await {
+            Ok(_) => {}
+            Err(err) => warn!("Unable to force process epoch {:?}", err),
+        }
+    }
+


Don't we also need to tell HBBFT that it can accept the new epoch now? And do we need to ask peers to trigger more epochs to rejoin? Or is it assumed that after that API call the guardian node has to be restarted?

Realized we already handle that here https://github.com/fedimint/fedimint/blob/master/fedimint-server/src/consensus/server.rs#L334-L335

If an epoch is in the past, it won't be processed; if it's in the future, we'll download the missing epochs.

Added a test for it though.
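As a rough illustration of the past/future handling described above, here is a minimal sketch. It is not the code at the linked lines: `ForcedEpochAction`, `classify_forced_epoch`, and the plain `u64` epoch counters are hypothetical names chosen for this example only.

```rust
use std::cmp::Ordering;

/// What a node might do with a forced epoch outcome, relative to its own progress.
/// Hypothetical type for illustration; not part of fedimint-server.
#[derive(Debug, PartialEq, Eq)]
enum ForcedEpochAction {
    /// The epoch was already processed, so it is ignored.
    SkipPast,
    /// The epoch is exactly the next expected one and can be processed directly.
    ProcessNow,
    /// The epoch is ahead of us; the epochs in `from..=to` must be fetched first.
    DownloadMissing { from: u64, to: u64 },
}

/// `next_epoch` is the first epoch this node has not processed yet.
fn classify_forced_epoch(next_epoch: u64, forced_epoch: u64) -> ForcedEpochAction {
    match forced_epoch.cmp(&next_epoch) {
        Ordering::Less => ForcedEpochAction::SkipPast,
        Ordering::Equal => ForcedEpochAction::ProcessNow,
        Ordering::Greater => ForcedEpochAction::DownloadMissing {
            from: next_epoch,
            to: forced_epoch - 1,
        },
    }
}
```

With `next_epoch = 5`, a forced epoch 3 is skipped, epoch 5 is processed immediately, and epoch 8 first requires epochs 5 through 7.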

I don't think this fixes the problem; this comment wasn't about downloading old epochs.

Let's suppose f servers are permanently lost and the remaining t (A, B, C) are trying to restart, but one of them (A) is at the wrong epoch:

A, B, C start HBBFT at their respective last epochs

A, B, C send rejoin requests and generate HBBFT messages

A sends messages for epoch n

B, C send messages for epoch n+1

No consensus is found on the epoch, so A stays on epoch n

No progress is made since only 2 of the required 3 are sending contributions for the correct epoch

A forces the processing of epoch n+1, but does not tell HBBFT about it, so no new messages are generated for epoch n+1 and HBBFT remains stuck

I think this is solved by restarting fedimintd afterwards, but ideally there'd be a more elegant solution.
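The stall in the scenario above can be modelled with a small sketch. It assumes a simplified view in which each peer tracks two counters separately, the epoch its database has reached and the epoch its consensus layer is generating messages for. `Peer`, `force_process_to`, and `epoch_with_quorum` are hypothetical names; this is not HBBFT or fedimint code.

```rust
use std::collections::HashMap;

/// Simplified peer state; both fields are assumptions made for this example.
#[derive(Clone, Copy, Debug)]
struct Peer {
    /// Epoch the node's database has advanced to.
    db_epoch: u64,
    /// Epoch the node's consensus layer is currently sending messages for.
    msg_epoch: u64,
}

impl Peer {
    /// Advance the database to `target`. The consensus layer only follows
    /// when `notify_consensus` is true.
    fn force_process_to(&mut self, target: u64, notify_consensus: bool) {
        self.db_epoch = target;
        if notify_consensus {
            self.msg_epoch = target;
        }
    }
}

/// Returns the highest epoch that at least `threshold` peers contribute to, if any.
fn epoch_with_quorum(msg_epochs: &[u64], threshold: usize) -> Option<u64> {
    let mut counts: HashMap<u64, usize> = HashMap::new();
    for &epoch in msg_epochs {
        *counts.entry(epoch).or_insert(0) += 1;
    }
    counts
        .into_iter()
        .filter(|&(_, count)| count >= threshold)
        .map(|(epoch, _)| epoch)
        .max()
}
```

With A at epoch n and B, C at n+1, `epoch_with_quorum` over the three message epochs with a threshold of 3 yields `None`. Calling `force_process_to(n + 1, false)` on A changes only `db_epoch`, so the result is still `None`; only with `notify_consensus` set to `true` (or, in practice, after a restart) do all three peers contribute to the same epoch.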

elsirion · 2023-04-25T19:27:48Z

integrationtests/tests/tests.rs

+        bitcoin.mine_blocks(100).await;
+        fed.run_consensus_epochs(1).await;
+
+        // We cannot process a past each and reverse the block height


Did you mean epoch?

jkitman · 2023-04-25T21:57:11Z

@elsirion Right, forgot about that point. See updated code. I need a way to simulate a server restart more easily I think.

douglaz · 2023-04-25T22:19:13Z

I tried running on this branch:

❯ fedimint-cli --password pass0  api status --peer-id 0
{
  "error": "CliError",
  "kind": "GeneralFailure",
  "message": "RPC call failed: ErrorObject { code: ServerError(401), message: \"Request missing required authorization\", data: None }",
  "raw_error": "RPC call failed: ErrorObject { code: ServerError(401), message: \"Request missing required authorization\", data: None }"
}

Was this supposed to work?

jkitman · 2023-04-25T23:10:30Z

@douglaz I've switched to using env vars instead of params to make it easier to call the admin API.

Try setting FM_SALT_PATH, FM_PASSWORD, and FM_OUR_ID
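For illustration, a caller could resolve these three variables along the following lines. This is a sketch under assumptions, not the fedimint-cli implementation: the `AdminEnv` struct, the `admin_env_from` helper, the interpretation of each variable (salt file path, password, numeric peer id), and the `u16` id type are all guesses based on the variable names.

```rust
/// Admin settings resolved from the environment.
/// Hypothetical struct; field meanings are inferred from the variable names.
#[derive(Debug, PartialEq, Eq)]
struct AdminEnv {
    salt_path: String,
    password: String,
    our_id: u16,
}

/// Build `AdminEnv` from any key lookup, which keeps the logic testable
/// without touching the real process environment.
fn admin_env_from<F>(lookup: F) -> Result<AdminEnv, String>
where
    F: Fn(&str) -> Option<String>,
{
    // Report the first missing variable by name.
    let get = |key: &str| lookup(key).ok_or_else(|| format!("missing env var {key}"));

    let salt_path = get("FM_SALT_PATH")?;
    let password = get("FM_PASSWORD")?;
    let our_id = get("FM_OUR_ID")?
        .parse::<u16>()
        .map_err(|e| format!("FM_OUR_ID is not a valid peer id: {e}"))?;

    Ok(AdminEnv { salt_path, password, our_id })
}
```

Against the real environment this would be called as `admin_env_from(|key| std::env::var(key).ok())`.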

elsirion

Looks good now :)

jkitman added 2 commits April 24, 2023 00:54

feature: Allow forcing processing an epoch

76cf027

feature: CLI and tests for forcing processing an epoch

4aec216

jkitman marked this pull request as ready for review April 24, 2023 20:50

jkitman requested review from a team as code owners April 24, 2023 20:50

elsirion reviewed Apr 25, 2023

jkitman force-pushed the process-until branch from e1bed6d to caeadc8 Compare April 25, 2023 17:16

jkitman enabled auto-merge (squash) April 25, 2023 17:40

elsirion reviewed Apr 25, 2023

jkitman force-pushed the process-until branch from caeadc8 to 4aec216 Compare April 25, 2023 21:56

elsirion approved these changes Apr 26, 2023

jkitman merged commit d70b0cc into fedimint:master Apr 26, 2023

elsirion mentioned this pull request Jul 18, 2023

Falling out of consensus quick-fix #2779

Closed
