Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Application should abort on panic in any of the threads #213

Closed
mickvandijke opened this issue Mar 3, 2023 · 7 comments
Closed

Application should abort on panic in any of the threads #213

mickvandijke opened this issue Mar 3, 2023 · 7 comments
Labels
- User - Enjoyable to Use our Software Bug Incorrect Behavior Security Publicly Connected to Security
Milestone

Comments

@mickvandijke
Copy link
Member

mickvandijke commented Mar 3, 2023

Currently, when there is a panic in any of our spawned tokio tasks, the application will continue to run. Only the specific task from where the panic originated will be aborted.

I believe that we should change this behaviour to abort the main runtime on any panic. Whether it is from the main runtime or from a spawned task.

The problem with the current behaviour is mainly that the app will continue to run even if one or multiple servers (HttpTracker, Api, UdpTracker) yield a panic when trying to start them.

Relevant discussion

@josecelano
Copy link
Member

There is another related case. WHen you start the tracker in private more and you enable the UDP tracker in the configuration, you get this message:

Loading configuration from config file ./config.toml
2023-04-04T13:20:49.719450708+01:00 [torrust_tracker::bootstrap::logging][INFO] logging initialized.
2023-04-04T13:20:49.720843820+01:00 [torrust_tracker::app][WARN] Could not start UDP tracker on: 0.0.0.0:6969 while in Private. UDP is not safe for private trackers!
2023-04-04T13:30:49.722860551+01:00 [torrust_tracker::bootstrap::jobs::torrent_cleanup][INFO] Cleaning up torrents..

It is only a warning, but the UDP does not start at all. I think we should force the user to either disable the UDP tracker or change the tracker mode. What do you think @WarmBeer?

@nyacat
Copy link

nyacat commented May 1, 2023

Maybe try this in release profile, it should abort once something panic

[profile.release]
opt-level = 3
lto = true
panic = "abort"

@mickvandijke
Copy link
Member Author

There is another related case. WHen you start the tracker in private more and you enable the UDP tracker in the configuration, you get this message:

Loading configuration from config file ./config.toml
2023-04-04T13:20:49.719450708+01:00 [torrust_tracker::bootstrap::logging][INFO] logging initialized.
2023-04-04T13:20:49.720843820+01:00 [torrust_tracker::app][WARN] Could not start UDP tracker on: 0.0.0.0:6969 while in Private. UDP is not safe for private trackers!
2023-04-04T13:30:49.722860551+01:00 [torrust_tracker::bootstrap::jobs::torrent_cleanup][INFO] Cleaning up torrents..

It is only a warning, but the UDP does not start at all. I think we should force the user to either disable the UDP tracker or change the tracker mode. What do you think @WarmBeer?

Hey @josecelano , Sorry for the extremely late reply. I never saw this message 😅.

Should we just replace the warn! with a panic!?

@mickvandijke
Copy link
Member Author

Maybe try this in release profile, it should abort once something panic

[profile.release]
opt-level = 3
lto = true
panic = "abort"

Thank you for your suggestion @nyacat ! I have done a bit of testing and it does indeed seem to abort the main application if there is a panic in any of the Tokio tasks :).

What do you think of using this @josecelano ?

@josecelano
Copy link
Member

There is another related case. WHen you start the tracker in private more and you enable the UDP tracker in the configuration, you get this message:

Loading configuration from config file ./config.toml
2023-04-04T13:20:49.719450708+01:00 [torrust_tracker::bootstrap::logging][INFO] logging initialized.
2023-04-04T13:20:49.720843820+01:00 [torrust_tracker::app][WARN] Could not start UDP tracker on: 0.0.0.0:6969 while in Private. UDP is not safe for private trackers!
2023-04-04T13:30:49.722860551+01:00 [torrust_tracker::bootstrap::jobs::torrent_cleanup][INFO] Cleaning up torrents..

It is only a warning, but the UDP does not start at all. I think we should force the user to either disable the UDP tracker or change the tracker mode. What do you think @WarmBeer?

Hey @josecelano , Sorry for the extremely late reply. I never saw this message sweat_smile.

Should we just replace the warn! with a panic!?

Hi @WarmBeer, for that particular case, I think we only need to panic instead of warning:

    // Start the UDP blocks
    for udp_tracker_config in &config.udp_trackers {
        if !udp_tracker_config.enabled {
            continue;
        }

        if tracker.is_private() {
            warn!(
                "Could not start UDP tracker on: {} while in {:?}. UDP is not safe for private trackers!",
                udp_tracker_config.bind_address, config.mode
            );
        } else {
            jobs.push(udp_tracker::start_job(udp_tracker_config, tracker.clone()));
        }
    }

because that code is executed by the main thread.

For the other cases, when we spawn a new task with Tokio I guess we should check the JoinHandle, it depends on what the spawned task returns. If it panics, we should maybe panic also the main thread, for example:

#[tokio::main]
async fn main() -> io::Result<()> {
    let join_handle: task::JoinHandle<Result<i32, io::Error>> = tokio::spawn(async {
        panic!("boom");
    });

    let err = join_handle.await.unwrap_err();
    assert!(err.is_panic());
    Ok(())
}

And @nyacat, regarding the panic = "abort" option, it seems it could work:

https://users.rust-lang.org/t/what-happens-when-an-async-task-panics/68668

but I would prefer to catch it and handle it in the main thread.

Right now, I think we prefer the "all or nothing" approach, which means we would stop all the services if we cannot run one of them. But in the future, that could depend on configuration settings, or we could restart some services from an admin panel.

What I would do is review all the "Unrecoverable Errors" (with panic or error) and handle them on the main tread. The current implementation could just panic again. For example:

  1. Main thread
  2. Setup UDP tracker
  3. Missing required configuration or already used port -> panic or error
  4. Catch the error in the main bootstrapping and panic.

I think that's basically the original @WarmBeer's proposal

@da2ce7 da2ce7 added Security Publicly Connected to Security - User - Enjoyable to Use our Software labels Oct 10, 2023
@cgbosse cgbosse moved this to BUG in Torrust Solution Jan 8, 2024
@cgbosse cgbosse added this to the v3.0.0 milestone Jan 16, 2024
@cgbosse
Copy link
Member

cgbosse commented Jan 16, 2024

notes from discussion:
needs to be defined better and see whether it is split into several smaller ones.

@josecelano
Copy link
Member

Relates to: #618

Hi @WarmBeer maybe in some cases we should not abort. For example, if the app is processing a request and that request is the one producing the panic.

Maybe, we should specify concrete examples and open a new issue for each of them.

@torrust torrust locked and limited conversation to collaborators Jan 16, 2024
@cgbosse cgbosse converted this issue into discussion #619 Jan 16, 2024
@github-project-automation github-project-automation bot moved this from BUG & Security to Done in Torrust Solution Jan 16, 2024

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
- User - Enjoyable to Use our Software Bug Incorrect Behavior Security Publicly Connected to Security
Projects
Status: Done
Development

No branches or pull requests

5 participants