Replies: 6 comments 16 replies
-
dbgeng -> lm
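For the dynamic side, the usual dbgeng/WinDbg commands for listing modules and managing symbols look something like the following (standard WinDbg syntax; the local cache path `C:\symbols` is just a placeholder):

```text
$$ list loaded modules
lm
$$ point the symbol path at a local cache plus Microsoft's public symbol server
.sympath srv*C:\symbols*https://msdl.microsoft.com/download/symbols
$$ force-reload symbols for all modules (expensive)
.reload /f
```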
-
also, for programs you have loaded or are loading, you can automate the downloading of the PDBs.
-
OK, let me try to be a bit clearer.... There are two areas we could be talking about, and possibly also a crossover between the two: the dynamic space associated with the debugger and a running target, and the static space associated with the Ghidra project and whatever programs you have loaded.

Loading symbols into the dynamic space, typically for use in the debugger, is a function of the underlying API. This functionality may or may not be exported to the CLI. The dbgeng/dbgmodel APIs have built-in support for refreshing symbols, which is exposed in the CLI. Refreshing all the symbols is a VERY expensive task, so we do not automate that on a per-event basis, although that could be done relatively straightforwardly. We also generally try to implement functionality that has general utility and some common way of implementing it across debugger sets. For example, breakpoints are common to the Windows, Linux, and macOS debuggers and are implemented in similar ways. Downloading symbols is a little different, and you've already highlighted the issues here, e.g. using lldb on Windows is not a common use case, and supporting symbol downloads through some GUI interface would be highly idiosyncratic. x64dbg supports debugging of x32/x64 executables on Windows only, so its developers have a lot more flexibility regarding functions tailored to a single, very narrowly-defined environment. They do a great job in that environment, but their use cases differ considerably from ours.

Loading symbols into the static space is typically done at load time, although there are cases when you might want to re-provision the symbols at some point after. The reason is that the symbols directly and positively influence the analysis of the binary; applying them after the fact produces less satisfying results. This is also a VERY expensive operation, particularly if you wish the symbols to re-shape the analysis. If you decide to bulk-download hundreds of DLLs and apply the results to the existing project, you are queueing up a boatload of work, and your results will also suffer from the fact that the symbol information was not available for the original analysis.

All that said, if you think applying symbols to the static analysis after the fact from the debugger has some utility, feel free to put in a ticket. In doing so, you'll need to put together a detailed description of your use case and why you think this solution is on par with or supplants the existing solutions. We have weekly reviews to discuss the merits of every ticket, and, if we think it has merit, we will prioritize it and add it to our work queues. Alternatively, you could write a script for your own use and/or submit it for public consumption if it proves useful to you.
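On the "write a script" suggestion: Ghidra ships a headless launcher, `analyzeHeadless`, which can batch-import binaries into a project. As a sketch (not an endorsement of bulk-importing hundreds of DLLs), here is a small generator that emits one import command per module path; the project directory and name are placeholders, and the quoting is POSIX-style:

```python
# Sketch: emit Ghidra analyzeHeadless invocations to batch-import modules
# into an existing project. Project path/name are placeholders for your setup.
import shlex

def headless_import_cmds(modules,
                         project_dir="/path/to/projects",
                         project_name="MyProject",
                         headless="analyzeHeadless"):
    """Yield one analyzeHeadless command line per module path."""
    for mod in modules:
        yield " ".join([headless,
                        shlex.quote(project_dir),
                        shlex.quote(project_name),
                        "-import", shlex.quote(mod)])

for cmd in headless_import_cmds(["/tmp/example.dll"]):
    print(cmd)
```

You could feed this from a module list exported from the debugger, then run the resulting commands in a shell; each import triggers analysis (and PDB application, if configured), so expect it to be slow for large lists, as noted above.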
-
Right, the two processes are essentially distinct. The overlap occurs in the mapping from the dynamic view to the static view, i.e. symbols are not moved from the debugger to the static view or vice versa, but, if you're at an offset in the debugger and the view is tracking accurately to the static view, obviously symbols applied to the static view should inform your understanding. There are some exceptions to this in that the Dynamic View shares an interface with the Static View. For example, you can drag & drop a structure from the DataType Manager onto the Dynamic Listing, and that should work. Bear in mind, generally speaking, the two views are drawing from the same source, i.e. the symbols generated at compile time, whether archived, online, or local.

So, the question I guess I would put to you is: what exactly is your use case, i.e. what do you want to do that you cannot do with the current tool set? More specifically, is there something you're trying to do that requires loading symbols in the debugger and applying them to the project?
-
OK, getting somewhere now! I think I understand your use case and the feature you want. I'm going to try to summarize it, just to make sure we're on the same page, suggest a couple possible solutions, and perhaps argue that this may not be the best approach. The problem, if I'm understanding it correctly, is that the debugger has (a) an accurate list of loaded modules during the target's execution, (b) the full paths to those modules, and (c) a call stack with runtime addresses in those modules. You would like to load these modules into the current project to (a) aid understanding, and (b) enable search. In particular, while you could easily use the "Import File" function or cut&paste the path from the debugger module list into the Import function for one or two modules, doing that for hundreds of modules is an unpleasant prospect. Hopefully, I haven't misrepresented your description, but obviously let me know if I have. So, solutions:
This last point is super-important and bears more discussion. I can't think of any common reverse engineering scenario where you would want to import the entire list. Most of these modules are kernel modules and highly unlikely to be relevant to any analysis you're undertaking of the target. (If you're interested in understanding the kernel, that's fine, but there are much better ways to do it.) The list you posted above, for example, is entirely composed of kernel DLLs, except for ntdll.dll, which is basically the gate to the syscall mechanism. If you need to understand functions in ntdll or the system calls behind them, you're much better off googling them than disassembling the constituent code. The same could be said for the call stack, in general. It's very unlikely that the top half of the call stack will be worth exploring in the context of any current execution. I know you expressed wanting to know "what is really happening inside each (imported) function", but that's generally not a sane strategy for RE, if you'll pardon my saying so.

Two other small points. First, you mentioned having to map all the modules by hand. I'm not sure I understand what you're referring to here. The debugger matches running targets to static programs based on names and some heuristics. 90% of the time (or better) that's an accurate match. Discrepancies occur when a program was renamed after compilation or renamed on disk (or renamed on import into Ghidra). Those are the cases where "Map Module" is needed, but they shouldn't be the norm. These matches will hold even if you decide to import a module while the target is running. Second, you mentioned wanting to search using the project. There are times this makes sense, but again not the norm. Searching memory (even all of loaded memory) in the debugger is much, much more efficient than searching multiple programs in the database. (I'm actually not even sure you can do searches across the project database - am pretty sure the search feature operates per program. Will check tomorrow.) A case for using the static listing might be wanting to compose a complex search or searching for things only found in the static listing. Do you have a particular search in mind?

OK, so apologies for the long-winded rant, but I think we're moving the ball forward here. I would probably lean towards option four, extending the selection range, as this seems closest to your original request, relatively straightforward, and possibly useful in other contexts. Let me know what you think.
-
@nsadeveloper789 and I are leaning towards at least option 4. I'm not sure you'd need anything more if 4 were implemented, i.e. if multi-entry selection were enabled (and the ability to use that correctly on the back end), then you could select the entire table contents or any subset, right-click load, and all of those would be imported (with PDBs, assuming that was set up correctly in the tool options). Option 3 would be a bonus for batch processing.

Regarding PDB mismatches, it looks like windbg is confused by the fact that you have multiple copies of ntdll.dll, i.e. one in your local python path and one in system32. Haven't seen that before, so, sadly, no guesses on a fix. And I would never expect the decompilation to be perfect without some effort on the user's part, symbols or no. Worth working through the decompiler-related sections of the GhidraClass for pointers. Re M$ documentation, heard, although for straight API descriptions they're not that horrible. There are other options as well. The Windows Internals books are a good starting point for understanding the kernel.

IsDebuggerPresent - set the breakpoint. That's the obvious first-shot approach. If the program has more complicated anti-RE, try setting a hardware breakpoint. If that doesn't work, well, time to break out a copy of "Crackproof Your Software" and equivalents.

One last note, which I should have thought of earlier: are you debugging with "dbgmodel" checked in the startup dialog? If so, maybe try unselecting that - might improve the single-stepping experience.
-
x64dbg has this nice symbols table where I can right click -> download pdb for all modules. It also automatically shows full paths to them. I wonder what would be the correct way to mimic this in ghidra at the moment? This seems really effortless!