Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added secondary selectors to ARM opinion file for correctly identifyi… #2066

Closed
wants to merge 1 commit into from

Conversation

WorksButNotTested
Copy link
Contributor

Fixes to identification of ARMBE8 binaries.

@WorksButNotTested
Copy link
Contributor Author

Fix for #1187

@mumbel
Copy link
Contributor

mumbel commented Jul 3, 2020

that looks like its only ABI 4,5 so should probably have at least 3 entries: bit 26 unset ; bit 22 and 26 set and 23 unset; bit 23 and bit 26 set and 22 unset. That comment is a little overkill, you could accomplish the same in a one liner like <-- e_flags parse EF_ARM_BE8 and EF_ARM_LE8 -->

@WorksButNotTested
Copy link
Contributor Author

Thanks for the quick feedback.

Although it seems unlikely you'd encounter an ELF with EF_ARM_BE8 set if it wasn't EF_ARM_EABI_VER4 or EF_ARM_EABI_VER5, and equally it wouldn't make sense to see one with both
EF_ARM_BE8 and EF_ARM_LE8 set it certainly won't do any harm to exclude them from being treated as ARM BE8 (V8LEInstruction).

If I understand, we have a total of 8 possible combinations of the 3 flags, but I think we can split into 4 cases:
V8LEInstruction = EF_ARM_EABI_VER4 set, EF_ARM_BE8 set and EF_ARM_LE8 clear.
v8 #1 = EF_ARM_EABI_VER4 clear
v8 #2 = EF_ARM_BE8 clear
v8 #3 = EF_ARM_LE8 set

This should simply each of the negative cases to consider only a single flag.

This has the property that v4 and later will consider the EF_ARM_BE8 flag to be valid and that if both EF_ARM_BE8 and EF_ARM_LE8 are set (ambiguous) then the binary will not be treated as V8LEInstruction.

Does this seem right? Or did I miss something subtle in the 3 cases you had above?

I copied the comment style from here. Happy to truncate as you see fit however.

@mumbel
Copy link
Contributor

mumbel commented Jul 3, 2020

I really only see the 3 cases:

  1. It's not 4 or 5. Don't read either BE or LE
  2. It's 4 or 5. It's BE (not LE)
  3. It's 4 or 5. It's LE (not BE)

If it can't be both and has to be one or the other (unless that's not true), your 2 an 3 read the same to me. If you check for BE8 clear, then you're assuming it's 4 or 5 and LE8 set. If you check for LE8 set, then you're assuming 4 or 5 and BE8 clear.

@WorksButNotTested
Copy link
Contributor Author

Ok. No problems. So we treat the BE8 and LE8 flags as manually exclusive? Logically they should be, and I'm pretty sure the binary would be invalid if both are set. All the little endian binaries I've seen don't set either flag. I've never seen the LE8 flag set.

So I should just drop v8 #3 from my list above and I'm all good? It might mean that binaries with both flags set are treated as be8, but then they're junk anyway, and neither setting is more right than the other.

@mumbel
Copy link
Contributor

mumbel commented Jul 3, 2020

Yeah, just thinking it would be nicer to manage all 3 bits in all those cases than your current commit; though, like you brought up with the odds of seeing anything that looked weird, I guess making a change from the current commit isn't even necessary, but just breaks it down in case (just depending on your view and how loose you want the matches to be).

All the little endian binaries I've seen don't set either flag.

but.. this, yeah it would need that 4 case (yeah i was reading it as mutually exclusive which it sounds like thats not true)... okay, I think never mind about my original comment, I wouldn't change anything at this point

I'm all good

Just to be clear I'm not anyone authoritative, just another contributor reading your PR offering thoughts.

@WorksButNotTested
Copy link
Contributor Author

Maybe it's best to leave it as is for now, and see what others think? Not that my way is right, just that it's perhaps the most simplistic for now. I'm not particularly precious about exactly what the rules are, as long as it can correctly identify BE8 images then I'm happy. Perhaps someone else will have come across this sort of issue before from other architectures and have a strong opinion on how it should be done.

Authoritive or not, it's always useful to get a second set of eyes on things. How does the review process normally work? Are PRs usually merged fairly quickly? How frequently are new versions released to incorporate the changes?

Thanks for taking the time to help me out.

@mumbel
Copy link
Contributor

mumbel commented Jul 3, 2020

https://github.com/NationalSecurityAgency/ghidra/blob/master/CONTRIBUTING.md

Would start with reading through this, but that's tough to say. I think with all software dev, it just comes down to a lot of variables... how much review it takes (at some point this PR will get tagged and assigned to a collaborator), do they have the cycles to look at it (and do they prioritize that), is it a feature or bugfix (how bad?) and what are the odds of regressions. I've had PRs merged hours later (I think) vs months (though those were processor modules).

Coming off US holidays and end of calendar year (i.e. lots of PTO to use before you forfeit) into COVID and now summer; and general increase in popularity/usage, the backlog for PRs has grown. They seem back to work in more of a full capacity in the last week or so (at least in regards to github). Their release cadence has been pretty good though https://ghidra-sre.org/releaseNotes_9.1.2.html

@ghidra1
Copy link
Collaborator

ghidra1 commented Jul 14, 2020

Since EF_ARM_BE8 corresponds to v8LEInstruction, do we need an additional variant to address EF_ARM_LE8 (e.g., v8BEInstruction; big-endian instructions with little-endian data) ?

<constraint primary="40"   processor="ARM"                      size="32" variant="v8"
        secondary= "0b .... .... 00.. .... .... .... .... ...."/>
<constraint primary="40"   processor="ARM"                      size="32" endian="big" variant="v8LEInstruction"
        secondary= "0b .... .... 10.. .... .... .... .... ...."/> <!-- BE8 -->
<constraint primary="40"   processor="ARM"                      size="32" endian="little" variant="v8BEInstruction"
        secondary= "0b .... .... 01.. .... .... .... .... ...."/> <!-- LE8 -->

I also added some additional constraints for the BE8 and LE8 cases.

@WorksButNotTested
Copy link
Contributor Author

I haven't come across a combination where instructions are big endian and data little endian. The only other variant I'm aware of is BE32, where everything is big endian. As far as I'm aware this is also deprecated.

I figured as long as the PR doesn't introduce any regressions, subsequent PRs can address any other uncommon missing supported options?

@ghidra1
Copy link
Collaborator

ghidra1 commented Jul 14, 2020

I believe my additional endian constraint is automatic and not needed, I was on the fence about constraining the LE8 bit to 0 but left it as is. Pruned comments a bit and merged.

@ghidra1 ghidra1 closed this Jul 14, 2020
@WorksButNotTested
Copy link
Contributor Author

Brilliant. Thanks for taking the time to merge these changes. They'll help a great deal for my use cases. Do you have a schedule for the next release which might incorporate them?

@ghidra1 ghidra1 added this to the 9.2 milestone Jul 15, 2020
@ghidra1
Copy link
Collaborator

ghidra1 commented Jul 15, 2020

Ghidra 9.2, although no date has been set.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants