Matching object IDs between environment graph and instance segmentation #153

Open
Yuchen0112 opened this issue Feb 27, 2025 · 0 comments

Hi, I'm trying to overlay object IDs on a scene image using the instance segmentation (seg_inst) and the environment graph returned by comm.environment_graph() from the simulator. The scene image and the segmentation mask are shown below:

Image

Image

For example, the sofa node from the environment graph looks like this:
{ "id": 368, "category": "Furniture", "class_name": "sofa", "properties": ["SURFACES", "SITTABLE", "LIEABLE", "MOVABLE"], "states": []}
which has an ID of 368.
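
A minimal sketch of looking up a node by its id, assuming the graph is a dict with a "nodes" list shaped like the sofa entry above. The names index_nodes_by_id and sg_example are illustrative only and not part of VirtualHome.

```python
from typing import Any


def index_nodes_by_id(graph: dict[str, Any]) -> dict[int, dict[str, Any]]:
    """Map each node's integer id to its node dict for constant-time lookup."""
    return {node["id"]: node for node in graph["nodes"]}


# Hypothetical graph holding only the sofa node quoted above.
sg_example = {
    "nodes": [
        {
            "id": 368,
            "category": "Furniture",
            "class_name": "sofa",
            "properties": ["SURFACES", "SITTABLE", "LIEABLE", "MOVABLE"],
            "states": [],
        }
    ]
}

nodes_by_id = index_nodes_by_id(sg_example)
print(nodes_by_id[368]["class_name"])  # sofa
print(nodes_by_id.get(40))             # None: no node carries the id 40
```

Building the dictionary once avoids scanning the whole node list for every label in the mask.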

Here is the code I wrote for overlaying object IDs with the scene image rgb_file, instance segmentation seg_file, and the environment graph sg:

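import cv2
import numpy as np
from PIL import Image, ImageDraw, ImageFont

# rgb_file / seg_file: paths to the rendered frame and its seg_inst mask
# sg: environment graph dict with a "nodes" list
rgb_image = cv2.imread(rgb_file)
rgb_image = cv2.cvtColor(rgb_image, cv2.COLOR_BGR2RGB)
# Reads the mask as a single 8-bit channel (values 0..255)
seg_mask = cv2.imread(seg_file, cv2.IMREAD_GRAYSCALE)

image_pil = Image.fromarray(rgb_image)
draw = ImageDraw.Draw(image_pil)
# Assumption: font is not defined in the snippet; a default font is used here
font = ImageFont.load_default()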
unique_labels = np.unique(seg_mask)

for label in unique_labels:
    if label == 0:  # Skip background
        continue
    obj_info = next((obj for obj in sg["nodes"]
                     if obj["id"] == int(label)), None)
    if obj_info is None:
        continue
    y_coords, x_coords = np.where(seg_mask == label)
    if len(x_coords) == 0 or len(y_coords) == 0:
        continue

    center_x = int(np.mean(x_coords))
    center_y = int(np.mean(y_coords))
    text = f"{label}"
    bbox = draw.textbbox((center_x, center_y), text, font=font)
    box_width = bbox[2] - bbox[0]
    box_height = bbox[3] - bbox[1]
    draw.rectangle([center_x - box_width//2 - 2,
                    center_y - box_height//2 - 2,
                    center_x + box_width//2 + 2,
                    center_y + box_height//2 + 2],
                   fill='white', outline='black')
    draw.text((center_x, center_y), text,
              fill='black', font=font, anchor="mm")

image_pil.save("label.png")

The output is shown in the following image. However, the labels returned by np.unique(seg_mask) do not match the object IDs in the environment graph. For example, the sofa is labeled 40 instead of 368 (its id in the environment graph); see the lower center of the image.

Image

I assume the values in the segmentation mask are assigned arbitrarily: each channel of an 8-bit image only ranges from 0 to 255, while object IDs in the environment graph can be greater than 255. How can I get the corresponding object IDs from the segmentation mask? Thanks very much!
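
To make the 0 to 255 concern concrete, here is a hedged sketch of translating a 3-channel mask into object IDs through a color lookup table. It assumes each instance is drawn in a distinct RGB color and that some color-to-ID table is available; how the simulator actually encodes instances is not confirmed here. The names pack_rgb, ids_from_color_mask, demo_mask, and demo_color_to_id are all hypothetical, and the data is synthetic.

```python
import numpy as np


def pack_rgb(mask_rgb: np.ndarray) -> np.ndarray:
    """Pack an (H, W, 3) uint8 array into an (H, W) array of 24-bit integers."""
    m = mask_rgb.astype(np.uint32)
    return (m[..., 0] << 16) | (m[..., 1] << 8) | m[..., 2]


def ids_from_color_mask(mask_rgb: np.ndarray,
                        color_to_id: dict[tuple[int, int, int], int]) -> np.ndarray:
    """Replace each pixel color with its object id; unknown colors become -1."""
    packed = pack_rgb(mask_rgb)
    id_map = np.full(packed.shape, -1, dtype=np.int64)
    for (r, g, b), obj_id in color_to_id.items():
        key = (r << 16) | (g << 8) | b
        id_map[packed == key] = obj_id
    return id_map


# Synthetic 2x2 mask: two pixels of one color, one black pixel, one other color.
demo_mask = np.array(
    [[[10, 20, 30], [10, 20, 30]],
     [[0, 0, 0], [200, 5, 5]]],
    dtype=np.uint8,
)
# Hypothetical color-to-ID table; real values would have to come from the simulator.
demo_color_to_id = {(10, 20, 30): 368, (200, 5, 5): 12}

demo_ids = ids_from_color_mask(demo_mask, demo_color_to_id)
print(demo_ids.tolist())  # [[368, 368], [-1, 12]]
```

With a real mask, the file would need to be read in color, for example with cv2.imread(seg_file, cv2.IMREAD_COLOR) followed by a BGR to RGB conversion, since a grayscale read reduces each pixel to a single 8-bit value before any lookup can happen.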

P.S. The versions of VirtualHome and Unity simulator I use are both v2.3.0. The code is implemented in Python 3.10.15 and executed on Ubuntu 20.
