multi: log additional information for local force closures#7787
multi: log additional information for local force closures#7787shaurya947 wants to merge 6 commits intolightningnetwork:masterfrom
Conversation
A local force close could be due to one of the following reasons: - User initiated - Link failure (automatic) - On-chain HTLC conditions such as timeout (automatic) This commit adds a new struct that can encapsulate these different reasons, and logs that struct to the channel's ArbitratorLog.
In this commit we use the functional optional for specifying the local force close reason that we built in the previous commit. The option can be for user-initiated FC, or link failure or FC, or just logging the ChainActionMap in the case of chainTrigger.
During the local force close flow, at the time of actually marking the channel as closed, we need to fetch the local force close info from the channel's arbitrator log and write it into the persistent closed channels bucket in the DB. This commit achieves that by creating a new exported struct inside the channeldb package, populates the struct, and writes it to the DB.
We add a new proto message to populate the local force close info in the ClosedChannels response and generate the new RPC files. We also add the info to the response.
1c67ef3 to
08ca9fc
Compare
|
Hey @saubyk this is mostly ready for review (I just need to investigate some failing tests) but I've fixed all the lint issues and added a release notes entry for 0.17.1 per the milestone you assigned. Let me know whom we can ask for reviews 🙂 |
|
Hi @shaurya947 great work so far. thanks. Will start assigning reviewers when we start the dev cycle for 17.1 |
|
looking forward to this PR ! |
guggero
left a comment
There was a problem hiding this comment.
Great idea for approaching this quite complex task. Very nice work, looks pretty far along.
Main comments are around using TLV to serialize the force close info and to share the code between the arbitrator and the channel DB if possible.
| b := new(bytes.Buffer) | ||
|
|
||
| // first byte is userInitiated bool | ||
| err = binary.Write(b, endian, info.userInitiated) |
There was a problem hiding this comment.
We should use TLV for any new data structure that we save to the DB. That will make it way easier to extend this info structure as well. See the example in fetchChanInfo() in channeldb/channel.go.
| return err | ||
| } | ||
|
|
||
| // next bytes are actual map data |
There was a problem hiding this comment.
nit: all inline comments should also be full sentences with proper capitalization and punctuation.
| func UserInitiatedForceClose(ca *ChannelArbitrator) { | ||
| // Ignore errors since logging force close info is not a critical part | ||
| // of the force close flow. | ||
| _ = ca.log.LogLocalForceCloseInfo(localForceCloseInfo{ |
There was a problem hiding this comment.
Maybe we should at least log any errors?
| // of ForceCloseContract(). We don't export this type directly, but rather the | ||
| // concrete functions implementing this type, so that callers can choose the | ||
| // concrete function that best describes their use case for force closing. | ||
| type forceCloseContractOption func(*ChannelArbitrator) |
There was a problem hiding this comment.
Hmm, this doesn't look like the normal option pattern, as it's just a function type. Maybe just call it forceCloseContractCallback instead?
| } | ||
|
|
||
| log.Infof("Attempting to force close ChannelPoint(%v)", chanPoint) | ||
| opt(arbitrator) |
There was a problem hiding this comment.
Nice design with the callback 👍
| // next state, as we still need to broadcast the commitment | ||
| // transaction. | ||
| case chainTrigger: | ||
| // Ignore errors since logging force close info is not a |
There was a problem hiding this comment.
Add a named callback for this as well? Then we have all the code in one place.
| // this may be extended in the future. For example, we may decide to log HTLC | ||
| // information that doesn't automatically result in a force close, but leaves | ||
| // the channel in a state whereby the user has no option but to force close. | ||
| type LocalForceCloseInfo struct { |
There was a problem hiding this comment.
Could we use this in the contractcourt directly? I see that the HtlcActions isn't exactly the same type, but maybe we could convert? Otherwise we need encode/decode code in two places.
If we share this struct, it can have a public Encode() and Decode() method.
| } | ||
|
|
||
| // Write whether the local force close info is present. | ||
| if err := WriteElements(w, cs.LocalFCInfo != nil); err != nil { |
There was a problem hiding this comment.
This won't work without changes to the WriteElement and ReadElement functions in channeldb/codec.go.
There was a problem hiding this comment.
Why do you say this? bool is handled by WriteElement.
There was a problem hiding this comment.
Oh, I misread it as just WriteElements(w, cs.LocalFCInfo), missed the != nil part. So maybe extract to a bool to make it more obvious?
There was a problem hiding this comment.
That is the general pattern of writing it looking at the function.
| for action, htlcs := range localFCInfo.HtlcActions { | ||
| htlcHashes := make([][]byte, len(htlcs)) | ||
| for i, htlc := range htlcs { | ||
| htlcHashes[i] = htlc.RHash[:] |
There was a problem hiding this comment.
We need to use copy() here to avoid running into the famous golang loop variable issues.
|
|
||
| message ListHTLCHash { | ||
| // Hash of HTLC | ||
| repeated bytes hash = 1; |
There was a problem hiding this comment.
nit: should be plural since it's a slice of hashes.
Thanks @Chinwendu20 |
|
|
||
| // htlcActions is a non-empty map iff the force close was automatically | ||
| // initiated by an on-chain trigger such as HTLC timeout. | ||
| htlcActions ChainActionMap |
There was a problem hiding this comment.
Hello @guggero would be a good idea to just indicate that the closure was due to ahtlcAction and we just put the particular htlc action responsible for the closure. Would there be need for the actual htlc?
There was a problem hiding this comment.
I guess it would be nice to be able to correlate the closure resulting from an HTLC with the actual HTLC in the channel (to see why it timed out, what its CLTV delta was and so on).
|
@shaurya947, remember to re-request review from reviewers when ready |
|
Continued here: #8072 |
|
Replaced by #8072. |
Change Description
Addresses #7696 by logging the reason for a local force close in the DB, and echoing it within the ClosedChannels response. Since initiating a force close and marking the channel as closed happen at two distinct parts of the code, we utilize the ChannelArbitrator log to persist the necessary information between these two moments in time. Once we mark the channel as closed, we log that info in the closed channels bucket. In this PR we address 3 possible reasons of local force closure: user-initiated, link failure, and HTLC/chainTrigger-related.
Steps to Test
itests have been updated to test for all 3 conditions, but manual testing can be done by simulating any of the 3 local force close situations and then checking the channel information in the ClosedChannels response.
Pull Request Checklist
Testing
Code Style and Documentation
[skip ci]in the commit message for small changes.📝 Please see our Contribution Guidelines for further guidance.