fuzz-tests: Add a test for the gossipd-connectd interface#8423
fuzz-tests: Add a test for the gossipd-connectd interface#8423Chand-ra wants to merge 2 commits intoElementsProject:masterfrom
Conversation
|
The test results in the following LeakSanitizer error when run on its corpus: |
morehouse
left a comment
There was a problem hiding this comment.
High level review: I like the target. Needs some rework so we consider exit to be a crash.
Haven't studied create_gossip_msg yet.
tests/fuzz/fuzz-gossipd-connectd.c
Outdated
| struct node_id id = node_id(privkey_from_index(tal_count(peer_ids))); | ||
| tal_arr_expand(&peer_ids, id); | ||
|
|
||
| msg = towire_gossipd_new_peer(tmpctx, &id, false); |
There was a problem hiding this comment.
Setting gossip_queries_feature = false will limit some of the code paths that can be executed, especially all the seeker stuff. If there's a good reason to do this, please add a comment here that explains why.
tests/fuzz/fuzz-gossipd-connectd.c
Outdated
| switch (fromwire_u8(&data, &size) % 5) | ||
| { | ||
| case 0: | ||
| gossip_msg = create_gossip_msg(tmpctx, &data, &size, WIRE_CHANNEL_ANNOUNCEMENT); | ||
| break; | ||
| case 1: | ||
| gossip_msg = create_gossip_msg(tmpctx, &data, &size, WIRE_CHANNEL_UPDATE); | ||
| break; | ||
| case 2: | ||
| gossip_msg = create_gossip_msg(tmpctx, &data, &size, WIRE_NODE_ANNOUNCEMENT); | ||
| break; | ||
| case 3: | ||
| gossip_msg = create_gossip_msg(tmpctx, &data, &size, WIRE_REPLY_CHANNEL_RANGE); | ||
| break; | ||
| case 4: | ||
| gossip_msg = create_gossip_msg(tmpctx, &data, &size, WIRE_REPLY_SHORT_CHANNEL_IDS_END); | ||
| break; | ||
| default: | ||
| break; | ||
| } |
There was a problem hiding this comment.
Nit: perhaps this switch could go inside create_gossip_msg, so there's one less parameter to pass.
|
|
||
| cleanup: | ||
| if (daemon) | ||
| tal_free(daemon->connectd); |
There was a problem hiding this comment.
Is the explicit tal_free necessary? Naively it looks like clean_tmpctx should take care of this already.
There was a problem hiding this comment.
This is to circumvent the dangling allocation error fixed in #8424 .
tests/fuzz/fuzz-gossipd-connectd.c
Outdated
| if (setjmp(exit_jmp) != 0) | ||
| goto cleanup; |
There was a problem hiding this comment.
For gossipd, we actually shouldn't consider exit to be normal. If gossipd exits, the entire CLN node shuts down, which would be a DoS vulnerability.
There was a problem hiding this comment.
This path can only be triggered by status_failed() and towire_warningfmt().
I can get rid of the mock for towire_warningfmt() by including common/wire_error.o for linking. As for status_failed(), I don't think triggering it would mean a vulnerability, it's used pretty liberally throughout gossipd/gossipd.c to report a bad message from the peer. Maybe exit()'s behavior manifests differently in a live node than it does in our fuzzer (where it aborts the entire process).
There was a problem hiding this comment.
I don't see any reason for us to quit the fuzz target for towire_warning_fmt.
status_failed means "print error and exit", and should never happen in the wild. If it does, it's a DoS vulnerability.
There was a problem hiding this comment.
I don't see any reason for us to quit the fuzz target for
towire_warning_fmt.
Right, I think I didn't word that clearly. Right now we're mocking the behavior of towire_warning_fmt() which we don't need to, by simply including common/wire_error.o for linking along with the other artifacts. I've already done this in the latest push and it doesn't seem to result in any crash.
status_failed means "print error and exit", and should never happen in the wild. If it does, it's a DoS vulnerability.
Oh okay, makes sense.
I looked closer at this. The accused line of code does a naked We might need to manually delete all elements from the gossmap at the end of each iteration. |
This makes the target a bit messier but does seem to manage to evade the issue at hand. |
448e689 to
4721eb2
Compare
|
I was fixing some of the other issues with this target when it ran into yet another memory leak: This time, the leak occurs when we try to send a |
Then we also need to manually free that map at the end of each iteration. |
4721eb2 to
da8afec
Compare
morehouse
left a comment
There was a problem hiding this comment.
Have you been able to run this fuzz target? I tried and hit a bunch of crashes immediately.
Only reproduces when I use multiple workers, and seems multiple workers are trying to touch the same file (yikes).
fuzz-gossipd-connectd: gossip_store_compact: rename failed: No such file or dire
ctory (version v25.05-2-gda8afec-modded)
==174417== ERROR: libFuzzer: fuzz target exited
#7 0x00000057ce3c in status_failed common/status.c:208:2
#8 0x000000526e9c in gossip_store_compact gossipd/gossip_store.c:360:3
#9 0x000000526e9c in gossip_store_new gossipd/gossip_store.c:404:11
#10 0x0000005ef290 in setup_gossmap tests/fuzz/../../gossipd/gossmap_manage.c:453:11
#11 0x0000005eed07 in gossmap_manage_new tests/fuzz/../../gossipd/gossmap_manage.c:485:7
#12 0x0000005f8381 in new_daemon tests/fuzz/fuzz-gossipd-connectd.c:128:15
#13 0x0000005f8381 in run tests/fuzz/fuzz-gossipd-connectd.c:472:26
I'll review further once all shallow crashes are fixed.
tests/fuzz/fuzz-gossipd-connectd.c
Outdated
| if (setjmp(exit_jmp) != 0) | ||
| goto cleanup; |
There was a problem hiding this comment.
The jmp_buf is now not needed. Please remove.
tests/fuzz/fuzz-gossipd-connectd.c
Outdated
| channel_flags = 1; | ||
| } | ||
|
|
||
| timestamp = time_now().ts.tv_sec - 3600; |
There was a problem hiding this comment.
The peer could send any timestamp. It doesn't have to be recent. Perhaps we should let the fuzzer decide the timestamp.
There was a problem hiding this comment.
Right, that's how I designed the fuzzer initially:
timestamp = fromwire_u32(cursor, max);
and how the NODE_ANNOUNCEMENT message is crafted, but the fuzzer was unable to get past this check so I swapped that out for the current setup. Maybe something like
timestamp = time_now().ts.tv_sec - fromwire_u16(cursor, max);
would be better?
There was a problem hiding this comment.
There's an additional problem with using time_now -- it could make reproducibility difficult. Perhaps we should be using dev_gossip_time instead.
I would prefer all timestamps to be possible. I think it would be better to do something like 75% of the time choose a valid one but 25% of the time the fuzzer can choose whatever random values it wants.
Yeah, we discussed this over our meeting a while ago, and we decided upon creating a new file for each fuzz run. I tried to do that but the file name needs to be statically defined using The next best thing I could muster up was resetting the |
da8afec to
ab7dc45
Compare
tests/fuzz/fuzz-gossipd-connectd.c
Outdated
| channel_flags = 1; | ||
| } | ||
|
|
||
| timestamp = time_now().ts.tv_sec - - fromwire_u16(cursor, max); |
There was a problem hiding this comment.
Does this even compile? Please fix.
tests/fuzz/fuzz-gossipd-connectd.c
Outdated
| channel_flags = 1; | ||
| } | ||
|
|
||
| timestamp = time_now().ts.tv_sec - 3600; |
There was a problem hiding this comment.
There's an additional problem with using time_now -- it could make reproducibility difficult. Perhaps we should be using dev_gossip_time instead.
I would prefer all timestamps to be possible. I think it would be better to do something like 75% of the time choose a valid one but 25% of the time the fuzzer can choose whatever random values it wants.
Ugh, that's tricky. It kind of seems like this fuzz target would do better with an out-of-process fuzzing engine. We can try to hack around the issue though. Maybe we could redefine |
In C, only adjacent string literals are concatenated at compile time, so the |
Changelon-None: `connectd_req()` in `gossipd/gossipd.c` is responsible for handling gossip messages from peers handed to it by `connectd`. Add a stateful test simulating its behaviour.
Add a minimal input set as a seed corpus for the newly introduced test. This leads to discovery of interesting code paths faster.
ab7dc45 to
8dd0ec8
Compare
For now we could just change that location to use |
I ran the target for quite some time and it doesn't seem to be able to discover any newer vulnerabilities. |
|
I'm deferring this one, as it will need significant rework for the rebase. In particular, we are going to have a way of overriding entropy, so this should use that. |
Changelog-None:
connectd_req()ingossipd/gossipd.cis responsible for handling gossip messages from peers handed to it byconnectd. Add a stateful test simulating its behaviour.Checklist
Before submitting the PR, ensure the following tasks are completed. If an item is not applicable to your PR, please mark it as checked:
CC: @morehouse