feature-request: trivial coordinator heartbeat router #58

maxgruber19 · 2025-02-03T14:41:51Z

I'd like to have a very simple router that just sends an HTTP GET to the Trino coordinator to learn about its state.

We currently use a custom pythonscript router that sends a GET request to https://trino-coordinator-default/v1/info and checks that the "starting" field is false. This gives us a very basic "queuing procedure", but clients die after a couple of seconds when they don't get feedback from the LB, because it is stuck in its routing loop. I'll attach a basic example below. Of course, this approach limits routing to a single cluster instead of multiple clusters dynamically.

The behavior I'd like to propose is that trino-lb sends back "QUEUED_IN_TRINO_LB" as long as it is waiting for the coordinator to come back up. Unfortunately I have no clue about Rust, so I don't feel ready to propose code myself.

If there already is something like that I'm very curious to know.

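import time
from typing import Optional

import requests


def isCoordinatorReady():
  # Any connection, TLS or timeout error counts as "not ready".
  try:
    response = requests.get(
      "https://trino-coordinator-default.mesh-platform-core.svc.cluster.local:8443/v1/info",
      verify="/etc/secret-provisioner-tls/ca.crt",
      timeout=5,  # without a timeout a hung connection would block the router indefinitely
    )
  except Exception:
    return False

  # Ready once the coordinator answers 200 and reports "starting": false.
  return response.status_code == 200 and not response.json()['starting']


def targetClusterGroup(query: str, headers: dict[str, str]) -> Optional[str]:
  # Blocks until the coordinator is up; clients time out if this takes too long.
  while not isCoordinatorReady():
    time.sleep(10)
  return "my-single-cluster"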


maxgruber19 · 2025-02-05T10:23:47Z

I thought about this again and came up with a simpler solution: make routingFallback optional. When a pythonscript router returns None or throws an exception, the LB could treat all clusters as non-routable and fall back to the "queued_in_trino_lb" state, similar to an empty collection of Trino clusters. Maybe that makes the changes much easier?
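
To illustrate, here is a minimal sketch of what the router script could look like if the LB behaved as proposed. It assumes that a None return would be answered with the queued state, which is the proposal and not confirmed existing behavior. The shortened URL, the 2-second timeout and the extra `probe` parameter (only there so the function can be exercised without a live coordinator) are assumptions for illustration. The sketch uses only the standard library instead of requests.

```python
import json
import ssl
import urllib.request
from typing import Callable, Optional

# Hypothetical values for illustration; adjust to the actual deployment.
INFO_URL = "https://trino-coordinator-default:8443/v1/info"
CA_BUNDLE = "/etc/secret-provisioner-tls/ca.crt"
CLUSTER_GROUP = "my-single-cluster"


def is_coordinator_ready(timeout_seconds: float = 2.0) -> bool:
  """Probe /v1/info once; any error or a non-2xx answer counts as not ready."""
  try:
    context = ssl.create_default_context(cafile=CA_BUNDLE)
    with urllib.request.urlopen(INFO_URL, timeout=timeout_seconds, context=context) as response:
      body = json.load(response)
    return not body["starting"]
  except (OSError, ValueError, KeyError):
    # OSError covers connection, TLS, timeout and HTTP errors raised by urllib.
    return False


def targetClusterGroup(
  query: str,
  headers: dict[str, str],
  probe: Callable[[], bool] = is_coordinator_ready,
) -> Optional[str]:
  # No sleep loop: answer immediately and let the LB decide what to do with None.
  return CLUSTER_GROUP if probe() else None
```

With this shape the script never blocks for longer than one probe, so the client would get a response right away instead of running into its own timeout.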


lfrancke added the customer-request label Feb 19, 2025

sbernauer added a commit that referenced this issue Feb 19, 2025

fix: Adopt http path parameters according to axum 0.8

04b484e

Fixup of #58

sbernauer mentioned this issue Feb 19, 2025

fix: Adopt http path parameters according to axum 0.8 #60

Merged

github-merge-queue bot pushed a commit that referenced this issue Feb 20, 2025

fix: Adopt http path parameters according to axum 0.8 (#60)

ac08608

Fixup of #58

sbernauer mentioned this issue Feb 24, 2025

feat: Add a new cluster state "Unhealthy" #63

Open
