Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lldb] Fix "in function" detection in "thread until" #123622

Merged
merged 9 commits into from
Feb 21, 2025
Merged

Conversation

labath
Copy link
Collaborator

@labath labath commented Jan 20, 2025

The implementation has an optimization which detects the range of line table entries covered by the function and then only searches for a matching line between them.

This optimization was interfering with the logic for detecting whether a line belongs to the function because the first call to FindLineEntry was made with exact=false, which meant that if the function did not contain any exact matches, we would just pick the closest line number in that range, even if it was very far away.

This patch fixes that by first attempting an inexact search across the entire line table, and then use the (potentially inexact) result of that for searching within the function. This makes the optimization a less effective, but I don't think we can differentiate between a line that belongs to the function (but doesn't have any code) and a line outside the function without that.

The patch also avoids the use of (deprecated) Function::GetAddressRange by iterating over the GetAddressRanges result to find the full range of line entries for the function.

The implementation has an optimization which detects the range of line
table entries covered by the function and then only searches for a
matching line between them.

This optimization was interfering with the logic for detecting whether a
line belongs to the function because the first call to FindLineEntry was
made with exact=false, which meant that if the function did not contain
any exact matches, we would just pick the closest line number in that
range, even if it was very far away.

This patch fixes that by first attempting an inexact search across the
entire line table, and then use the (potentially inexact) result of that
for searching within the function. This makes the optimization a less
effective, but I don't think we can differentiate between a line that
belongs to the function (but doesn't have any code) and a line outside
the function without that.

The patch also avoids the use of (deprecated) Function::GetAddressRange
by iterating over the GetAddressRanges result to find the full range of
line entries for the function.
@labath labath requested a review from jimingham January 20, 2025 14:32
@labath labath requested a review from JDevlieghere as a code owner January 20, 2025 14:32
@llvmbot llvmbot added the lldb label Jan 20, 2025
@llvmbot
Copy link
Member

llvmbot commented Jan 20, 2025

@llvm/pr-subscribers-lldb

Author: Pavel Labath (labath)

Changes

The implementation has an optimization which detects the range of line table entries covered by the function and then only searches for a matching line between them.

This optimization was interfering with the logic for detecting whether a line belongs to the function because the first call to FindLineEntry was made with exact=false, which meant that if the function did not contain any exact matches, we would just pick the closest line number in that range, even if it was very far away.

This patch fixes that by first attempting an inexact search across the entire line table, and then use the (potentially inexact) result of that for searching within the function. This makes the optimization a less effective, but I don't think we can differentiate between a line that belongs to the function (but doesn't have any code) and a line outside the function without that.

The patch also avoids the use of (deprecated) Function::GetAddressRange by iterating over the GetAddressRanges result to find the full range of line entries for the function.


Full diff: https://github.com/llvm/llvm-project/pull/123622.diff

2 Files Affected:

  • (modified) lldb/source/Commands/CommandObjectThread.cpp (+30-27)
  • (modified) lldb/test/API/functionalities/thread/step_until/TestStepUntil.py (+35-3)
diff --git a/lldb/source/Commands/CommandObjectThread.cpp b/lldb/source/Commands/CommandObjectThread.cpp
index 4e2c4c1126bc3f..829abb8c5839bb 100644
--- a/lldb/source/Commands/CommandObjectThread.cpp
+++ b/lldb/source/Commands/CommandObjectThread.cpp
@@ -959,7 +959,6 @@ class CommandObjectThreadUntil : public CommandObjectParsed {
         }
 
         LineEntry function_start;
-        uint32_t index_ptr = 0, end_ptr = UINT32_MAX;
         std::vector<addr_t> address_list;
 
         // Find the beginning & end index of the function, but first make
@@ -970,19 +969,22 @@ class CommandObjectThreadUntil : public CommandObjectParsed {
           return;
         }
 
-        AddressRange fun_addr_range = sc.function->GetAddressRange();
-        Address fun_start_addr = fun_addr_range.GetBaseAddress();
-        line_table->FindLineEntryByAddress(fun_start_addr, function_start,
-                                           &index_ptr);
 
-        Address fun_end_addr(fun_start_addr.GetSection(),
-                             fun_start_addr.GetOffset() +
-                                 fun_addr_range.GetByteSize());
+        uint32_t lowest_func_idx = UINT32_MAX;
+        uint32_t highest_func_idx = 0;
+        for (AddressRange range : sc.function->GetAddressRanges()) {
+          uint32_t idx;
+          LineEntry unused;
+          Address addr = range.GetBaseAddress();
+          if (line_table->FindLineEntryByAddress(addr, unused, &idx))
+            lowest_func_idx = std::min(lowest_func_idx, idx);
 
-        bool all_in_function = true;
+          addr.Slide(range.GetByteSize());
+          if (line_table->FindLineEntryByAddress(addr, unused, &idx))
+            highest_func_idx = std::max(highest_func_idx, idx);
+        }
 
-        line_table->FindLineEntryByAddress(fun_end_addr, function_start,
-                                           &end_ptr);
+        bool found_something = false;
 
         // Since not all source lines will contribute code, check if we are
         // setting the breakpoint on the exact line number or the nearest
@@ -991,14 +993,15 @@ class CommandObjectThreadUntil : public CommandObjectParsed {
         for (uint32_t line_number : line_numbers) {
           LineEntry line_entry;
           bool exact = false;
-          uint32_t start_idx_ptr = index_ptr;
-          start_idx_ptr = sc.comp_unit->FindLineEntry(
-              index_ptr, line_number, nullptr, exact, &line_entry);
-          if (start_idx_ptr != UINT32_MAX)
-            line_number = line_entry.line;
+          if (sc.comp_unit->FindLineEntry(0, line_number, nullptr, exact,
+                                          &line_entry) == UINT32_MAX)
+            continue;
+
+          found_something = true;
+          line_number = line_entry.line;
           exact = true;
-          start_idx_ptr = index_ptr;
-          while (start_idx_ptr <= end_ptr) {
+          uint32_t start_idx_ptr = lowest_func_idx;
+          while (start_idx_ptr <= highest_func_idx) {
             start_idx_ptr = sc.comp_unit->FindLineEntry(
                 start_idx_ptr, line_number, nullptr, exact, &line_entry);
             if (start_idx_ptr == UINT32_MAX)
@@ -1007,29 +1010,29 @@ class CommandObjectThreadUntil : public CommandObjectParsed {
             addr_t address =
                 line_entry.range.GetBaseAddress().GetLoadAddress(target);
             if (address != LLDB_INVALID_ADDRESS) {
-              if (fun_addr_range.ContainsLoadAddress(address, target))
+              AddressRange unused;
+              if (sc.function->GetRangeContainingLoadAddress(address, *target,
+                                                             unused))
                 address_list.push_back(address);
-              else
-                all_in_function = false;
             }
             start_idx_ptr++;
           }
         }
 
         for (lldb::addr_t address : m_options.m_until_addrs) {
-          if (fun_addr_range.ContainsLoadAddress(address, target))
+          AddressRange unused;
+          if (sc.function->GetRangeContainingLoadAddress(address, *target,
+                                                         unused))
             address_list.push_back(address);
-          else
-            all_in_function = false;
         }
 
         if (address_list.empty()) {
-          if (all_in_function)
+          if (found_something)
             result.AppendErrorWithFormat(
-                "No line entries matching until target.\n");
+                "Until target outside of the current function.\n");
           else
             result.AppendErrorWithFormat(
-                "Until target outside of the current function.\n");
+                "No line entries matching until target.\n");
 
           return;
         }
diff --git a/lldb/test/API/functionalities/thread/step_until/TestStepUntil.py b/lldb/test/API/functionalities/thread/step_until/TestStepUntil.py
index 1fbb404aeae58b..9e2588dc61c007 100644
--- a/lldb/test/API/functionalities/thread/step_until/TestStepUntil.py
+++ b/lldb/test/API/functionalities/thread/step_until/TestStepUntil.py
@@ -16,9 +16,18 @@ def setUp(self):
         self.less_than_two = line_number("main.c", "Less than 2")
         self.greater_than_two = line_number("main.c", "Greater than or equal to 2.")
         self.back_out_in_main = line_number("main.c", "Back out in main")
+        self.in_foo = line_number("main.c", "In foo")
+
+    def _build_dict_for_discontinuity(self):
+        return dict(
+            CFLAGS_EXTRAS="-funique-basic-block-section-names "
+            + "-ffunction-sections -fbasic-block-sections=list="
+            + self.getSourcePath("function.list"),
+            LD_EXTRAS="-Wl,--script=" + self.getSourcePath("symbol.order"),
+        )
 
-    def common_setup(self, args):
-        self.build()
+    def _common_setup(self, build_dict, args):
+        self.build(dictionary=build_dict)
         exe = self.getBuildArtifact("a.out")
 
         target = self.dbg.CreateTarget(exe)
@@ -45,7 +54,7 @@ def common_setup(self, args):
         return thread
 
     def do_until(self, args, until_lines, expected_linenum):
-        thread = self.common_setup(args)
+        thread = self._common_setup(None, args)
 
         cmd_interp = self.dbg.GetCommandInterpreter()
         ret_obj = lldb.SBCommandReturnObject()
@@ -88,3 +97,26 @@ def test_missing_one(self):
         self.do_until(
             ["foo", "bar", "baz"], [self.less_than_two], self.back_out_in_main
         )
+
+    @no_debug_info_test
+    def test_bad_line(self):
+        """Test that we get an error if attempting to step outside the current
+        function"""
+        thread = self._common_setup(None, None)
+        self.expect(f"thread until {self.in_foo}",
+                    substrs=["Until target outside of the current function"],
+                             error=True)
+
+    @no_debug_info_test
+    @skipIf(oslist=lldbplatformutil.getDarwinOSTriples() + ["windows"])
+    @skipIf(archs=no_match(["x86_64", "aarch64"]))
+    def test_bad_line_discontinuous(self):
+        """Test that we get an error if attempting to step outside the current
+        function -- and the function is discontinuous"""
+        self.build(dictionary=self._build_dict_for_discontinuity())
+        _, _, thread, _ = lldbutil.run_to_source_breakpoint(
+            self, "At the start", lldb.SBFileSpec(self.main_source)
+        )
+        self.expect(f"thread until {self.in_foo}",
+                    substrs=["Until target outside of the current function"],
+                             error=True)

Copy link

github-actions bot commented Jan 20, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link

github-actions bot commented Jan 20, 2025

✅ With the latest revision this PR passed the Python code formatter.

@labath
Copy link
Collaborator Author

labath commented Jan 29, 2025

ping @jimingham

if (line_table->FindLineEntryByAddress(addr, unused, &idx))
lowest_func_idx = std::min(lowest_func_idx, idx);

addr.Slide(range.GetByteSize() - 1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This deserves a comment.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Magical -1's...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the new version might be even better (or at least, more similar to what was happening with the single range). Let me know what you think.

address_list.push_back(address);
else
all_in_function = false;
}

if (address_list.empty()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you only care here about found_something and address_list.size > 0 could you short-circuit this search when you find the first address that matches the line?

Copy link
Collaborator Author

@labath labath Jan 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. address_list being empty is the "error" case. In case of success, the full list of addresses is used for stepping (line 1040)

@labath
Copy link
Collaborator Author

labath commented Feb 12, 2025

ping @jimingham

Copy link
Collaborator

@jimingham jimingham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

I left the question of whether this high-low index finding algorithm would be useful elsewhere (it could be a method on Function from what you have & return).

If you don't think that's likely to be reusable, then this is fine as is.

fun_addr_range.GetByteSize());

bool all_in_function = true;
uint32_t lowest_func_idx = UINT32_MAX;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this be useful anywhere else? It's asking a Function what its highest and lowest indexes are in the linetable from its CU.

@labath
Copy link
Collaborator Author

labath commented Feb 14, 2025

Good question. I think I could use it in immediately in #126526 if I make the result exact (return the precise set of ranges instead of the high/low water mark). Let me see how that would look like.

labath added a commit to labath/llvm-project that referenced this pull request Feb 17, 2025
The motivation is llvm#123622 and the fact that is hard to fine the last
line entry in a given range. `FindLineEntryByAddress(range_end-1)` is
the best we have, but it's not ideal because it has a magic -1 and that
it relies on there existing a line entry at that address (generally, it
should be there, but if for some case it isn't, we might end up ignoring
the entries that are there (or -- like my incorrect fix in llvm#123622 did
-- iterating through the entire line table).

What we really want is to get the last entry that exists in the given
range. Or, equivalently (and more STL-like) the first entry after that
range. This is what these functions do. I've used the STL names since
they do pretty much exactly what the standard functions do (the main
head-scratcher comes from the fact that our entries represent ranges
rather than single values).

The functions can also be used to simplify the maze of `if` statements
in `FindLineEntryByAddress`, but I'm keeping that as a separate patch.
For now, I'm just adding some unit testing for that function to gain
more confidence that the patch does not change the function behavior.
labath added a commit to labath/llvm-project that referenced this pull request Feb 17, 2025
The motivation is llvm#123622 and the fact that is hard to fine the last
line entry in a given range. `FindLineEntryByAddress(range_end-1)` is
the best we have, but it's not ideal because it has a magic -1 and that
it relies on there existing a line entry at that address (generally, it
should be there, but if for some case it isn't, we might end up ignoring
the entries that are there (or -- like my incorrect fix in llvm#123622 did
-- iterating through the entire line table).

What we really want is to get the last entry that exists in the given
range. Or, equivalently (and more STL-like) the first entry after that
range. This is what these functions do. I've used the STL names since
they do pretty much exactly what the standard functions do (the main
head-scratcher comes from the fact that our entries represent ranges
rather than single values).

The functions can also be used to simplify the maze of `if` statements
in `FindLineEntryByAddress`, but I'm keeping that as a separate patch.
For now, I'm just adding some unit testing for that function to gain
more confidence that the patch does not change the function behavior.
labath added a commit to labath/llvm-project that referenced this pull request Feb 17, 2025
The motivation is llvm#123622 and the fact that is hard to fine the last
line entry in a given range. `FindLineEntryByAddress(range_end-1)` is
the best we have, but it's not ideal because it has a magic -1 and that
it relies on there existing a line entry at that address (generally, it
should be there, but if for some case it isn't, we might end up ignoring
the entries that are there (or -- like my incorrect fix in llvm#123622 did
-- iterating through the entire line table).

What we really want is to get the last entry that exists in the given
range. Or, equivalently (and more STL-like) the first entry after that
range. This is what these functions do. I've used the STL names since
they do pretty much exactly what the standard functions do (the main
head-scratcher comes from the fact that our entries represent ranges
rather than single values).

The functions can also be used to simplify the maze of `if` statements
in `FindLineEntryByAddress`, but I'm keeping that as a separate patch.
For now, I'm just adding some unit testing for that function to gain
more confidence that the patch does not change the function behavior.
labath added a commit that referenced this pull request Feb 19, 2025
The motivation is #123622 and the fact that is hard to fine the last
line entry in a given range. `FindLineEntryByAddress(range_end-1)` is
the best we have, but it's not ideal because it has a magic -1 and that
it relies on there existing a line entry at that address (generally, it
should be there, but if for some case it isn't, we might end up ignoring
the entries that are there (or -- like my incorrect fix in #123622 did
-- iterating through the entire line table).

What we really want is to get the last entry that exists in the given
range. Or, equivalently (and more STL-like) the first entry after that
range. This is what these functions do. I've used the STL names since
they do pretty much exactly what the standard functions do (the main
head-scratcher comes from the fact that our entries represent ranges
rather than single values).

The functions can also be used to simplify the maze of `if` statements
in `FindLineEntryByAddress`, but I'm keeping that as a separate patch.
For now, I'm just adding some unit testing for that function to gain
more confidence that the patch does not change the function behavior.

---------

Co-authored-by: Jonas Devlieghere <[email protected]>
Prakhar-Dixit pushed a commit to Prakhar-Dixit/llvm-project that referenced this pull request Feb 19, 2025
The motivation is llvm#123622 and the fact that is hard to fine the last
line entry in a given range. `FindLineEntryByAddress(range_end-1)` is
the best we have, but it's not ideal because it has a magic -1 and that
it relies on there existing a line entry at that address (generally, it
should be there, but if for some case it isn't, we might end up ignoring
the entries that are there (or -- like my incorrect fix in llvm#123622 did
-- iterating through the entire line table).

What we really want is to get the last entry that exists in the given
range. Or, equivalently (and more STL-like) the first entry after that
range. This is what these functions do. I've used the STL names since
they do pretty much exactly what the standard functions do (the main
head-scratcher comes from the fact that our entries represent ranges
rather than single values).

The functions can also be used to simplify the maze of `if` statements
in `FindLineEntryByAddress`, but I'm keeping that as a separate patch.
For now, I'm just adding some unit testing for that function to gain
more confidence that the patch does not change the function behavior.

---------

Co-authored-by: Jonas Devlieghere <[email protected]>
@labath labath merged commit 917ed99 into llvm:main Feb 21, 2025
7 checks passed
@labath labath deleted the until branch February 21, 2025 12:17
joaosaffran pushed a commit to joaosaffran/llvm-project that referenced this pull request Feb 27, 2025
The implementation has an optimization which detects the range of line
table entries covered by the function and then only searches for a
matching line between them.

This optimization was interfering with the logic for detecting whether a
line belongs to the function because the first call to FindLineEntry was
made with exact=false, which meant that if the function did not contain
any exact matches, we would just pick the closest line number in that
range, even if it was very far away.

This patch fixes that by first attempting an inexact search across the
entire line table, and then use the (potentially inexact) result of that
for searching within the function. This makes the optimization a less
effective, but I don't think we can differentiate between a line that
belongs to the function (but doesn't have any code) and a line outside
the function without that.

The patch also avoids the use of (deprecated) Function::GetAddressRange
by iterating over the GetAddressRanges result to find the full range of
line entries for the function.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants