multi: log additional information for local force closures#8072
multi: log additional information for local force closures#8072Chinwendu20 wants to merge 7 commits intolightningnetwork:masterfrom
Conversation
hieblmi
left a comment
There was a problem hiding this comment.
Hi, thank you for working on extending force closure information. I've left some comments.
|
cc: @ziggie1984 for review |
|
@Chinwendu20 can you please fix all the linter issues first: |
|
I am aware there is a lint issue this is because the PR is dependent on this #8057 so I added a replace directive for tlv temporarily when a new tag for tlv is released and it is bumped in lnd, I will remove it. This is just for review sake. I do not know if there are any other lint issues, make lint or golangci-lint run times out on my system even when using multiple cores but I see no lint issues on the CI the workflow has not been approved, I would soon fix the file conflicts as well but hopefully that should not get in the way of reviews. |
There was a problem hiding this comment.
Took a look at the PR (first commit only for now), good work 🙌
Definitely a ConceptAck.
In terms of the approach I think we can do even better especially in terms of future enhancements.
Overall I think we can already combine the infos when a force-close happens meaning that we could always add the HTLC action map, why restrict it to closures which are explictily triggered by the chaintrigger. Meaning that why not add the detailed state of the HTLCs when the user or any linkfailure triggered the FC?
Now when we combine different infos, we see clearly that the current approach would always require a new DB transaction, this is something we should improve especially when we wanna try to add even more infos in the future. We should add all the necessary infos and commit it to the db when we mark the channel as closed.
Another point which we should add to the FC Info is the broadcast height of the FC, because that would fill the HTLC action map with more life under the condition that we add more infos of the HTLC in the summary other than just the Hash (mainly the timeout blockheights)
Will provide a comment how I envision the design of the datastruct shorty.
Let me know what you think.
channeldb/channel.go
Outdated
There was a problem hiding this comment.
we should use ChainAction instead of string here, which is way more space efficient.
There was a problem hiding this comment.
Is it the ChainAction type in contractcourt package? The channeldb package is already imported there, so circular dependence.
There was a problem hiding this comment.
Understand, so use the a byte (uint8) instead and convert to the string type when delivering to the user in the rpcserver ?
There was a problem hiding this comment.
We can solve the circular dependence by making the ChainAction a common subpackage that is pulled into both scopes. I agree that we shouldn't be using strings when there is a finitely enumerable set of keys.
There was a problem hiding this comment.
Note that this is an outdated version of this PR and why do we need a separate package just because we want to import the chainAction type.
channeldb/channel.go
Outdated
There was a problem hiding this comment.
Nit: naming actionStream => tlvStream
channeldb/channel.go
Outdated
There was a problem hiding this comment.
thanks for catching this.
channeldb/channel.go
Outdated
There was a problem hiding this comment.
Nit actionStream => tlvStream
contractcourt/briefcase.go
Outdated
channeldb/channel.go
Outdated
There was a problem hiding this comment.
I think we should also add this customized encode decoders to the FuzzSuite as well, or what do you think @morehouse ?
There was a problem hiding this comment.
This would be great, though we currently have no fuzz tests for channeldb. Writing those tests is probably a bigger project to tackle separately, rather than holding up this PR.
See #8045 for some progress on channeldb fuzzing.
contractcourt/chain_arbitrator.go
Outdated
contractcourt/chain_arbitrator.go
Outdated
go.mod
Outdated
There was a problem hiding this comment.
Can you separate the temporary mod changes in a separate commit.
|
I thought about something like this: => add the OptionalForceCloseInfo to the ChanArbitrator Log infos when the channel close transaction is broadcasted and provide the information not only when channels are completely closed but also in pendingclose like for example channels which are in This approach would reduce the db operations to only one when we combine the ForceClose infos. Let me know what you think or whether this is helpful, if you need a more detailed picture I can provide it as well. |
Do you mean when the channel is marked as open? @ziggie1984 |
I updated the comment, was referring to the broadcasting of the channel close transaction. |
FYI you can create a PR on your fork of the lnd repo in github and then the CI tests will run without you needing permission every time |
Are you saying that beyond indicating that a force close was caused by a user using |
Exactly why not record this info as well, when it's there, it could also paint a more detailed picture when a link_error happens for whatever reason ? Currently when the user manually force closes or the link fails the gained info is very low , adding the htlc action map could reveal some potential problems in hindsight? Moreover it would also give use the possibility to enhance the information we show when channels are in a |
I would see about implementing this. |
Please confirm if this same as the |
Not quite, this is the height when the channel close transaction was actually confirmed in the blockchain, I was referring to the height the close tx was broadcasted to the network, because this is the height we are most interested in when we would analyse the HTLC Action Map in the feature. The spent can sometimes happen several blocks later, especially for STATIC_REMOTE Channels. |
Could you please explain more about this? How would we always require a new DB transaction? Adding your new suggestion of including htlcs and broadcast height to this struct. I would add a new struct perhaps named, type ForceCloseInfoSummary {
HTLCActionMap map[string][]HTLC
BroadcastHeight int64
}type OptionalForceCloseInfo struct {
userInitiated *bool
linkFailureError *string
chainAction *bool
summary ForceCloseInfoSummary
}Of
(For
This is the design that I am looking at what is your thought? |
|
I do not 100% understand your proposal do you have maybe already something to look at in your local repo? Happy to look into your approach if you have something up. I thought I layout my thoughts in more detail in the following (maybe it helps): So I contemplated about this PR maybe some thoughts which might be helpful:
I tend to go with 2, because then we can reuse the serialization and deserialization already implemented ? Based on choice 2, part of the code would look like this: channeldb.go briefcase.go Then call this function with the appropriate Infos where appropriate: Example: channel_arbitrator.go chain_arbitrator.go and the calls in for example rpcserver.go: |
|
Thanks a lot for the explanation @ziggie1984
I agree with this except from the structure of the struct, we would still have to indicate if a force close was due to chain actions.
It is not clear to me how this would be used, since there would only be one
One of the things I was trying to pass by you in my previous comment is that we might me adding a function here as well to log htlc due to userTrigger.
I was trying to understand what you were saying about this current approach needing multiple DB transactions when we include htlcActions for userTrigger and broadcast height as part of the info. I was just pointing out the number of times we would be logging into the localInfo store:
3.Log the broadcast height for closets. All these happen at different sites and that is why we cannot log multiple parts of the info at the same time. |
|
Ok understand. So how do you wanna make sure you do not overwrite Infos you saved already into the logs in a prior state ? Because the current approach flushes the I think you mentioned before that there are possibilities were you can log more then one info at a time, for example the chainActions + broadcastHeight (it's not exactly the height were the tx is published, but the point in time were we decided to switch to the from the default value.) Thats why I came up with this design: Only an idea from my side, feel free to follow your ideas. Yeah I think the chainAction boolean makes sense besides the other ones. |
I think I would explore your idea of creating a different key for each info.Then see how it goes.
chainActions Bool +_ HTLC map
Thanks a lot,I think I would start working on the PR now then get more feedback from you. I just wanted to be sure that am not completely off track. Thanks a lot. |
contractcourt/channel_arbitrator.go
Outdated
There was a problem hiding this comment.
So this one actually seems to behave the way I would expect. Why do we have two functions named the same thing that do conceptually differeint things
https://github.com/lightningnetwork/lnd/pull/8072/files#r1417996512
There was a problem hiding this comment.
They are in two different packages
lntest/harness_assertion.go
Outdated
There was a problem hiding this comment.
Please make this name more descriptive. It seems that it is asserting that the ChainActions match the supplied slice. Maybe call it AsserChainActionsMatch
lntest/harness_assertion.go
Outdated
There was a problem hiding this comment.
Please make this describe what it is actually asserting. Maybe AssertCloseSummaryContainsLocalFCInsights
rpcserver.go
Outdated
rpcserver.go
Outdated
rpcserver.go
Outdated
|
Upon a second review I think the approach seems fine for the most part. I do want to echo @ziggie1984 's concern that given we are changing the Left some comments on specific code but I think this is pretty close to good to go. |
Thanks for the review, I think a migration would be needed if we want to backfill the local force close reason for existing channels but I think in this case, there would be no way to get the reason for a local force close for channels that existed before this change. So I guess that field would just be empty for such channels. |
8b5f8bd to
86ff79c
Compare
LocalForceCloseInitiator type indicates what initiated the local force close. It could be: - User initiated - Initiated due to link failure errors - chain trigger initiated This commit adds the LocalForceCloseInitiator type and logs it in tlv format to the channel's ArbitratorLog.
In this commit we use the functional optional for specifying the local force close initiator that we built in the previous commit. The option can be for user-initiated FC, or link failure or chain trigger FC.
Signed-off-by: Ononiwu Maureen <amaka013@gmail.com>
This commit logs the Local Force Close insights when the close transaction commit has been broadcasted and persists it in the channel's bucket. The `OpenChannel` and `CloseChannelSummary` is also updated with the LocalForceCloseInsights field so that the insights can be fetched during a call to fetch `WaitingCloseChannels`, `PendingCloseChannels` and `ClosedChannels`. Signed-off-by: Ononiwu Maureen <amaka013@gmail.com>
We add a new proto message to populate the local force close insights in the ClosedChannels, pendingCloseChannels and waitingCloseChannels response and generate the new RPC files.
86ff79c to
65e7d6a
Compare
|
I have updated this PR. |
yyforyongyu
left a comment
There was a problem hiding this comment.
Thanks for the PR! I think the info should not be stored in openChannelBucket since not every channel is force closed, plus that bucket will be deleted anyway once the FC process is finished. Instead, it should either be stored in historicalChannelBucket or closedChannelBucket. Another option is store it in ContractReport since we would fetch the contracts via arbitratorPopulateForceCloseResp for all force closed channels,
Line 3866 in 0df507e
I think we need a pros and cons analysis to decide which approach is better.
Aside from that, I'd recommend reading this guideline first as there are some places violating our code formatting rules.
And what is the plan for logging the remote force close? That being said, I'm not sure how much this #7696 would help - for instance, if an FC happened, can the data saved here be more helpful than looking at the debug logs?
|
|
||
| // LocalForceCloseInitiator is a type that gives information about what led to a | ||
| // channel being force closed locally. | ||
| type LocalForceCloseInitiator string |
There was a problem hiding this comment.
We should use uint8 plus a String method to make this more space-efficient, something like
Lines 49 to 64 in 0df507e
There was a problem hiding this comment.
Also this should be called ForceCloseInitiator?
There was a problem hiding this comment.
Also, my intial approach was to do just as you have stated concerning making it unit8 instead of a string but that would mean, creating a LocalForceCloseInitiator type for each of the LinkFailureError messages. Making the initiator a string, I can easily convert any link failure error message to the LocalForceCloseInitiator on the go instead of duplicating all the LinkFailure error message type the channel.go file.
There was a problem hiding this comment.
| } | ||
|
|
||
| // HtlcActionDecoder is a custom TLV decoder for the htlcAction record. | ||
| func HtlcActionDecoder(r io.Reader, val interface{}, buf *[8]byte, |
There was a problem hiding this comment.
Why do we use interface value here? I think these encoder/decoder methods are only used for HTLC?
There was a problem hiding this comment.
This is where that function is used: https://github.com/lightningnetwork/lnd/blob/b22d72d44d5c71f732f15e5a796f5de7fb39a451/channeldb/channel.go#L3498-L3503C5
Also take a look at the function parameters for that function:
Lines 186 to 187 in 0df507e
This is the function signature for Decoder type:
Line 28 in 0df507e
| // critical part of the force close flow. | ||
| ChainTriggerForceClose(chainActions)(c) | ||
| fallthrough | ||
| nextState = StateBroadcastCommit |
There was a problem hiding this comment.
Because we do not want this function called for https://github.com/lightningnetwork/lnd/pull/8072/files/c60ea993c2a31fe62c3e481cd40618455bf60afd..b22d72d44d5c71f732f15e5a796f5de7fb39a451#diff-a0b8064876b1b1d6085fa7ffdbfd38c81cb06c1ca3f34a08dbaacba203cda3ebL969-L970 on line 970 called for chain trigger force close. Chain action would be recorded twice in that case.
| // broadcasted when moving the channel to state CoopBroadcasted. | ||
| coopCloseTxKey = []byte("coop-closing-tx-key") | ||
|
|
||
| // localFCInsightsKe points to the insights gotten about a local |
There was a problem hiding this comment.
| // localFCInsightsKe points to the insights gotten about a local | |
| // localFCInsightsKey points to the insights gotten about a local |
channeldb/channel.go
Outdated
There was a problem hiding this comment.
In that case we should call this ForceCloseInsights and ForceCloseInitiator since we gonna also save the info for remote force close?
channeldb/channel.go
Outdated
There was a problem hiding this comment.
Mostly like due to old practice - we've switched to use varint as it's more space-efficient.
| channel *OpenChannel) error { | ||
|
|
||
| insightBytes := chanBucket.Get(localFCInsightsKey) | ||
| if insightBytes == nil { |
There was a problem hiding this comment.
How could this be nil? Since it can only end up here when it has ChanStatusCommitBroadcasted? Unless it's a remote FC?
There was a problem hiding this comment.
I think even if it is a remote FC the channel would not have the ChanStatusCommitBroadcasted bit in the status. While fetching historical buckets we also use this function and then the ChanStatusCommitBroadcasted bit would be in the status but we do not store the force close insight in this bucket so it would be nil. I would update the comment in this line to reflect that.
|
|
||
| // attachLocalForceCloseInsights fetches the local force close insights | ||
| // previously stored when commitment was broadcast. | ||
| func attachLocalForceCloseInsights(chanBucket kvdb.RBucket, |
There was a problem hiding this comment.
Looks like this can be a method on OpenChannel?
There was a problem hiding this comment.
I could do that but I was just following the function signature of all the other functions in the function that call this and they are not methods though they modify the OpenChannel.
| if cs.LocalFCInsights != nil { | ||
| if err := serializeLocalForceCloseInsights(w, | ||
| cs.LocalFCInsights); err != nil { | ||
| log.Warnf("Error serializing channel close "+ |
There was a problem hiding this comment.
Also these errors from deserialization and serialization should be returned instead of logged.
contractcourt/chain_arbitrator.go
Outdated
There was a problem hiding this comment.
I also don't think it's the right approach - 1) it's not really an option, but an extra step in ForceCloseContract where we save the info 2) this can be expanded indefinitely, instead we should limit the function scope here. I think you could define a ForceCloseReason struct here and let other callers use it in ForceCloseContract. Just a rough thought, need to think about it more
Thanks for the review! This was initially stored only in the closedChannel bucket but this review comment (#8072 (comment)) stated that the force close reason should be included for "waiting close channels" which are essentially still in the openChannel bucket. I am aware that the bucket would be deleted once the FC process is complete that is why when the channel is marked as closed then it is stored along with the channel CloseSummary: https://github.com/lightningnetwork/lnd/pull/8072/files#diff-793204e241c4f89dc7184266600f1aca57cb8e3970ab714f0c65fba1a6826066L3647-L3648 So the force close reason is displayed for waiting close channels, pending close channels and close channels, just like the review comment that I linked suggested.
You mean rethinking the entire approach?
Thanks I would look into it.
Reading through the conversation of the issue linked to this PR, you would see that the original author suggested that, that could be done in follow up PR.: #7696 (comment) |
Are you saying we just display the insights in the debug logs instead of saving them? I think the point is to make it available during rpcserver calls but you can advice on what would be more useful, @saubyk what do you think? |
@yyforyongyu looking at the debug logs is not an option every user has or is technically inclined to do, making it difficult to discern why the channel was closed. If we have this information available in the database and a rpc to pull that, it can be easily utilized by an application to inform users of the possible reason for force closures. |
|
@hieblmi: review reminder |
|
!lightninglabs-deploy mute |
|
Closing this pr due to inactivity. |
Change Description
Follow up to: #7787
Fixes: #7696