Application should abort on panic in any of the threads #213

mickvandijke · 2023-03-03T18:00:53Z

Currently, when there is a panic in any of our spawned tokio tasks, the application will continue to run. Only the specific task from where the panic originated will be aborted.

I believe that we should change this behaviour to abort the main runtime on any panic. Whether it is from the main runtime or from a spawned task.

The problem with the current behaviour is mainly that the app will continue to run even if one or multiple servers (HttpTracker, Api, UdpTracker) yield a panic when trying to start them.

Relevant discussion

josecelano · 2023-04-04T12:55:42Z

There is another related case. WHen you start the tracker in private more and you enable the UDP tracker in the configuration, you get this message:

Loading configuration from config file ./config.toml
2023-04-04T13:20:49.719450708+01:00 [torrust_tracker::bootstrap::logging][INFO] logging initialized.
2023-04-04T13:20:49.720843820+01:00 [torrust_tracker::app][WARN] Could not start UDP tracker on: 0.0.0.0:6969 while in Private. UDP is not safe for private trackers!
2023-04-04T13:30:49.722860551+01:00 [torrust_tracker::bootstrap::jobs::torrent_cleanup][INFO] Cleaning up torrents..

It is only a warning, but the UDP does not start at all. I think we should force the user to either disable the UDP tracker or change the tracker mode. What do you think @WarmBeer?

nyacat · 2023-05-01T15:38:14Z

Maybe try this in release profile, it should abort once something panic

[profile.release]
opt-level = 3
lto = true
panic = "abort"

mickvandijke · 2023-05-02T09:59:45Z

There is another related case. WHen you start the tracker in private more and you enable the UDP tracker in the configuration, you get this message:
Loading configuration from config file ./config.toml
2023-04-04T13:20:49.719450708+01:00 [torrust_tracker::bootstrap::logging][INFO] logging initialized.
2023-04-04T13:20:49.720843820+01:00 [torrust_tracker::app][WARN] Could not start UDP tracker on: 0.0.0.0:6969 while in Private. UDP is not safe for private trackers!
2023-04-04T13:30:49.722860551+01:00 [torrust_tracker::bootstrap::jobs::torrent_cleanup][INFO] Cleaning up torrents..
It is only a warning, but the UDP does not start at all. I think we should force the user to either disable the UDP tracker or change the tracker mode. What do you think @WarmBeer?

Hey @josecelano , Sorry for the extremely late reply. I never saw this message 😅.

Should we just replace the warn! with a panic!?

mickvandijke · 2023-05-02T10:13:26Z

Maybe try this in release profile, it should abort once something panic
[profile.release]
opt-level = 3
lto = true
panic = "abort"

Thank you for your suggestion @nyacat ! I have done a bit of testing and it does indeed seem to abort the main application if there is a panic in any of the Tokio tasks :).

What do you think of using this @josecelano ?

josecelano · 2023-05-02T10:24:05Z

There is another related case. WHen you start the tracker in private more and you enable the UDP tracker in the configuration, you get this message:
Loading configuration from config file ./config.toml
2023-04-04T13:20:49.719450708+01:00 [torrust_tracker::bootstrap::logging][INFO] logging initialized.
2023-04-04T13:20:49.720843820+01:00 [torrust_tracker::app][WARN] Could not start UDP tracker on: 0.0.0.0:6969 while in Private. UDP is not safe for private trackers!
2023-04-04T13:30:49.722860551+01:00 [torrust_tracker::bootstrap::jobs::torrent_cleanup][INFO] Cleaning up torrents..
It is only a warning, but the UDP does not start at all. I think we should force the user to either disable the UDP tracker or change the tracker mode. What do you think @WarmBeer?
Hey @josecelano , Sorry for the extremely late reply. I never saw this message sweat_smile.

Should we just replace the warn! with a panic!?

Hi @WarmBeer, for that particular case, I think we only need to panic instead of warning:

    // Start the UDP blocks
    for udp_tracker_config in &config.udp_trackers {
        if !udp_tracker_config.enabled {
            continue;
        }

        if tracker.is_private() {
            warn!(
                "Could not start UDP tracker on: {} while in {:?}. UDP is not safe for private trackers!",
                udp_tracker_config.bind_address, config.mode
            );
        } else {
            jobs.push(udp_tracker::start_job(udp_tracker_config, tracker.clone()));
        }
    }

because that code is executed by the main thread.

For the other cases, when we spawn a new task with Tokio I guess we should check the JoinHandle, it depends on what the spawned task returns. If it panics, we should maybe panic also the main thread, for example:

#[tokio::main]
async fn main() -> io::Result<()> {
    let join_handle: task::JoinHandle<Result<i32, io::Error>> = tokio::spawn(async {
        panic!("boom");
    });

    let err = join_handle.await.unwrap_err();
    assert!(err.is_panic());
    Ok(())
}

And @nyacat, regarding the panic = "abort" option, it seems it could work:

https://users.rust-lang.org/t/what-happens-when-an-async-task-panics/68668

but I would prefer to catch it and handle it in the main thread.

Right now, I think we prefer the "all or nothing" approach, which means we would stop all the services if we cannot run one of them. But in the future, that could depend on configuration settings, or we could restart some services from an admin panel.

What I would do is review all the "Unrecoverable Errors" (with panic or error) and handle them on the main tread. The current implementation could just panic again. For example:

Main thread
Setup UDP tracker
Missing required configuration or already used port -> panic or error
Catch the error in the main bootstrapping and panic.

I think that's basically the original @WarmBeer's proposal

cgbosse · 2024-01-16T16:07:48Z

notes from discussion:
needs to be defined better and see whether it is split into several smaller ones.

josecelano · 2024-01-16T16:12:03Z

Relates to: #618

Hi @WarmBeer maybe in some cases we should not abort. For example, if the app is processing a request and that request is the one producing the panic.

Maybe, we should specify concrete examples and open a new issue for each of them.

mickvandijke added the Bug Incorrect Behavior label Mar 3, 2023

mickvandijke mentioned this issue Mar 3, 2023

Refactor: move shared test functionality to a dev-dependency #170

Closed

1 task

josecelano mentioned this issue Mar 12, 2023

Warn when API cannot be started on HTTPs #161

Closed

da2ce7 added Security Publicly Connected to Security - User - Enjoyable to Use our Software labels Oct 10, 2023

josecelano mentioned this issue Nov 27, 2023

Fix container HEALTHCHECK when port 0 is used in confguration #524

Closed

cgbosse added this to Torrust Solution Jan 8, 2024

cgbosse moved this to BUG in Torrust Solution Jan 8, 2024

cgbosse added this to the v3.0.0 milestone Jan 16, 2024

torrust locked and limited conversation to collaborators Jan 16, 2024

cgbosse converted this issue into discussion #619 Jan 16, 2024

github-project-automation bot moved this from BUG & Security to Done in Torrust Solution Jan 16, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

This issue was moved to a discussion.

Application should abort on panic in any of the threads #213

Application should abort on panic in any of the threads #213

mickvandijke commented Mar 3, 2023 •

edited

Loading

josecelano commented Apr 4, 2023

nyacat commented May 1, 2023 •

edited

Loading

mickvandijke commented May 2, 2023

mickvandijke commented May 2, 2023

josecelano commented May 2, 2023

cgbosse commented Jan 16, 2024

josecelano commented Jan 16, 2024

This issue was moved to a discussion.

This issue was moved to a discussion.

Application should abort on panic in any of the threads #213

Application should abort on panic in any of the threads #213

Comments

mickvandijke commented Mar 3, 2023 • edited Loading

josecelano commented Apr 4, 2023

nyacat commented May 1, 2023 • edited Loading

mickvandijke commented May 2, 2023

mickvandijke commented May 2, 2023

josecelano commented May 2, 2023

cgbosse commented Jan 16, 2024

josecelano commented Jan 16, 2024

This issue was moved to a discussion.

mickvandijke commented Mar 3, 2023 •

edited

Loading

nyacat commented May 1, 2023 •

edited

Loading