M68000 OperandTypes not being set properly #4355

Closed
alex-bellon opened this issue Jun 18, 2022 · 5 comments

@alex-bellon
Contributor

alex-bellon commented Jun 18, 2022

Describe the bug
For instructions where one of the operands is a register, the OperandTypes are not set consistently.

To Reproduce
Steps to reproduce the behavior:

  1. In a Ghidra script, iterate through the instructions, and then through the operands of each instruction.
  2. For each operand, print the raw bitstring value of its OperandType, along with the type names that the set bits correspond to.

Expected behavior
Unless I am misunderstanding how these flags work, operands that are using the same addressing modes (e.g. offset from the address in a register) should have the same resulting operand types.

Screenshots

Here, the second operand is an address, but it is not marked as one (output from my own script, attached):

instruction: move.b (0x070a079c).l,(0x4,A5)
    operand: [0x4, A5]
object type: <type 'ghidra.program.model.scalar.Scalar'>
object type: <type 'ghidra.program.model.lang.Register'>
  bitstring: 10000000000000000000000
operandtype: DYNAMIC

But here another operand uses the same addressing mode and is marked as ADDRESS in addition to DYNAMIC:

instruction: movea.l (-0x380,A6),A5
    operand: [-0x380, A6]
object type: <type 'ghidra.program.model.scalar.Scalar'>
object type: <type 'ghidra.program.model.lang.Register'>
  bitstring: 10000000010000000000000
operandtype: DYNAMIC, ADDRESS

Here is another example, where in one case a register is properly flagged as REGISTER,

instruction: move.l #0x288,D0
    operand: [D0]
object type: <type 'ghidra.program.model.lang.Register'>
  bitstring: 1000000000
operandtype: REGISTER

but here a register is not

instruction: mulu.l #0x406,D0
    operand: [D0]
object type: <type 'ghidra.program.model.lang.Register'>
  bitstring: 10000000000000000000000
operandtype: DYNAMIC

Attachments
My (hacky) Ghidra script to print out this information:

# Print the operand types for each operand
#
# @author balex
# @category M68K

from ghidra import *

# from https://ghidra.re/ghidra_docs/api/constant-values.html#ghidra.program.model.lang.OperandType
operand_type = {
  0: "READ",
  1: "WRITE",
  2: "INDIRECT",
  3: "IMMEDIATE",
  4: "RELATIVE",
  5: "IMPLICIT",
  6: "CODE",
  7: "DATA",
  8: "PORT",
  9: "REGISTER",
  10: "LIST",
  11: "FLAG",
  12: "TEXT",
  13: "ADDRESS",
  14: "SCALAR",
  15: "BIT",
  16: "BYTE",
  17: "WORD",
  18: "QUADWORD",
  19: "SIGNED",
  20: "FLOAT",
  21: "COP",
  22: "DYNAMIC"
}

def get_types(value):
    bitstr = "{0:b}".format(value)
    length = len(bitstr)
    print("  bitstring: " + bitstr)

    types = list()
    for i, bit in enumerate(bitstr):
        pos = length - i - 1
        if int(bit): types.append(operand_type[pos])

    return types

listing = currentProgram.getListing()

fm = currentProgram.getFunctionManager()
for func in fm.getFunctions(True):
    addr = func.getBody()
    instructs = listing.getInstructions(addr, True)

    for ins in instructs:
        for i in range(ins.getNumOperands()):
            oper = ins.getOpObjects(i)
            if len(oper):
                print("================")
                print('instruction: ' + str(ins))
                print('    operand: ' + str(list(oper)))
                for obj in oper:
                    print('object type: ' + str(type(obj)))
                print('operandtype: ' + ', '.join(get_types(ins.getOperandType(i))))
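
For readers who want to check the sample output without running Ghidra, here is a minimal standalone sketch that decodes the four bitstrings shown above. It assumes the bit positions in the `operand_type` table of the script are correct; `OPERAND_TYPE_NAMES` and `decode_operand_type` are hypothetical names used only for illustration.

```python
# Flag names indexed by bit position, copied from the operand_type table in
# the script above (assumed to match ghidra.program.model.lang.OperandType).
OPERAND_TYPE_NAMES = [
    "READ", "WRITE", "INDIRECT", "IMMEDIATE", "RELATIVE", "IMPLICIT",
    "CODE", "DATA", "PORT", "REGISTER", "LIST", "FLAG", "TEXT",
    "ADDRESS", "SCALAR", "BIT", "BYTE", "WORD", "QUADWORD", "SIGNED",
    "FLOAT", "COP", "DYNAMIC",
]


def decode_operand_type(value):
    """Return the names of all flags set in value, lowest bit first."""
    return [name for bit, name in enumerate(OPERAND_TYPE_NAMES)
            if value & (1 << bit)]


# Bitstrings taken from the sample output in this report.
samples = [
    ("move.b (0x070a079c).l,(0x4,A5)", "10000000000000000000000"),
    ("movea.l (-0x380,A6),A5", "10000000010000000000000"),
    ("move.l #0x288,D0", "1000000000"),
    ("mulu.l #0x406,D0", "10000000000000000000000"),
]

for text, bits in samples:
    print(text + " -> " + ", ".join(decode_operand_type(int(bits, 2))))
```

Testing bits with a mask avoids the string indexing arithmetic in `get_types`, and lists the flags from the lowest bit upward.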

Environment (please complete the following information):

  • OS: Manjaro Linux 21.2.6
  • Java Version: 18.0.1.1
  • Ghidra Version: 10.1.3
  • Ghidra Origin: pacman
@alex-bellon changed the title from "OperandTypes not being set properly in 68000" to "M68000 OperandTypes not being set properly in" on Jun 18, 2022
@alex-bellon changed the title from "M68000 OperandTypes not being set properly in" to "M68000 OperandTypes not being set properly" on Jun 18, 2022
@ryanmkurtz added the "Feature: Processor/68000" and "Status: Triage" labels on Jun 22, 2022
@GhidorahRex assigned ghidra1 and unassigned GhidorahRex on Jun 28, 2022
@ghidra1
Collaborator

ghidra1 commented Jun 29, 2022

Operand type determination is very sensitive to how a language is written and to the complexity of its operand subconstructors. In some cases the issue stems from poor choices in the language specification; in other cases it is unavoidable. Could you please provide screen captures of the Instruction Information dialog for sample instruction cases? This dialog can be displayed via the popup menu in the listing for a selected instruction.

@ghidra1
Collaborator

ghidra1 commented Jun 29, 2022

You should probably eliminate the following restriction in your script to allow output for every operand. Not all operands produce op-objects but may produce an operand type.

oper = ins.getOpObjects(i)
if len(oper):

It is also confusing since some of your output relates to the entire instruction but is printed for each operand. Your nested loops, which both iterate per operand, seem odd:

for i in range(ins.getNumOperands()):  <<<<<<<<< 1st
    oper = ins.getOpObjects(i)
    if len(oper):
        print("================")
        print('instruction: ' + str(ins))
        print('    operand: ' + str(list(oper)))
        for obj in oper:  <<<<<<<<<<<< 2nd
            print('object type: ' + str(type(obj)))
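
One possible way to act on both suggestions is sketched below: the instruction is reported once, and every operand is reported even when `getOpObjects` returns nothing. This is only an illustration, not the maintainers' recommended script; `describe_instruction` is a hypothetical helper name, and the `currentProgram` guard assumes the code runs as a Ghidra Python script, where that variable is predefined.

```python
def describe_instruction(ins):
    """Return one line for the instruction plus one line per operand.

    Operands are reported even when getOpObjects() is empty, since an
    operand can still carry an operand type without producing op-objects.
    """
    lines = ["instruction: " + str(ins)]
    for i in range(ins.getNumOperands()):
        objs = list(ins.getOpObjects(i))
        bits = "{0:b}".format(ins.getOperandType(i))
        lines.append("  operand %d: objects=%s bitstring=%s" % (i, objs, bits))
    return lines


# Only runs inside Ghidra, where the script environment defines currentProgram.
if "currentProgram" in globals():
    listing = currentProgram.getListing()
    for ins in listing.getInstructions(True):
        print("================")
        for line in describe_instruction(ins):
            print(line)
```

Returning the lines instead of printing them directly keeps the per-instruction and per-operand output separate and makes the helper easy to try outside Ghidra with a stand-in object.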

@alex-bellon
Contributor Author

alex-bellon commented Jun 29, 2022

For the two iterators, the first loop is iterating through the operands, and the second is iterating through the objects in each operand. E.g. for the instruction movea.l (-0x380,A6),A5, the first iteration of the first loop is returning the list of objects [-0x380, A6], and then the second loop would iterate through this list.

And as for printing out the same instruction for each operand, that was just so I could have the context of what the instruction was in every print.

@ghidra1
Collaborator

ghidra1 commented Jun 29, 2022

Case 1: For your movea.l (-0x380,A6),A5 case, the presence of a memory/stack/external reference on the first operand causes the ADDRESS type to be added to what the instruction prototype generated (i.e., DYNAMIC). Such a reference is likely missing from the second operand of the move.b example. In some cases, analysis is able to determine the resulting address associated with a computed operand and add such a reference.
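
To check this explanation on a given instruction, a script could compare the ADDRESS bit against the references stored on the operand. The following is a rough sketch under that assumption; `reference_kinds` and `explain_address_flag` are hypothetical helper names, and `ADDRESS` is taken to be bit 13, as in the table from the original script.

```python
ADDRESS = 1 << 13  # bit 13, per the operand_type table in the original script


def reference_kinds(ins, op_index):
    """Classify the references attached to one operand of an instruction."""
    kinds = []
    for ref in ins.getOperandReferences(op_index):
        if ref.isStackReference():
            kinds.append("stack")
        elif ref.isExternalReference():
            kinds.append("external")
        elif ref.isMemoryReference():
            kinds.append("memory")
        else:
            kinds.append("other")
    return kinds


def explain_address_flag(ins, op_index):
    """Return (ADDRESS bit set?, kinds of references on that operand)."""
    has_address = bool(ins.getOperandType(op_index) & ADDRESS)
    return has_address, reference_kinds(ins, op_index)
```

If the explanation holds, the movea.l operand would be expected to report at least one reference, and the move.b operand none.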

Case 2: The move.l #0x288,D0 instruction uses a simple register attach on the operand, while the latter one, mulu.l #0x406,D0, uses submul, which does not export a value for the operand. Without more investigation, it is unclear to me why this mulu.l instruction has been implemented in such a complex manner.

:mul^mulsize e2l,submul		is opbig=0x4c & op67=0 & $(DAT_ALTER_ADDR_MODES); submul & mulsize; e2l [ savmod2=savmod1; regtsan=regtfan; ] { glbdenom=e2l; build submul; }

submul: regdq			is regdq & divsgn=0 & divsz=0				{ regdq = glbdenom * regdq; resflags(regdq); CF=0; }
submul: regdr-regdq		is regdq & divsgn=0 & divsz=1 & regdr			{ tmp1:8 = zext(glbdenom); tmp2:8 = zext(regdq); local res=tmp1*tmp2; regdq=res:4;

The DYNAMIC type is a default if no other specific type can be determined.

You can also look into Instruction.getDefaultOperandRepresentationList(int operand), which returns objects such as Scalar, Address, and Register that correspond to what you see rendered for an operand. This method avoids any extra markup that may be applied when the operand is formatted for the listing display.
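
A minimal sketch of how that method might be used from a script follows. `representation_objects` is a hypothetical helper name, and the `None` check is a defensive assumption rather than documented behavior; the list may also contain plain strings or characters for punctuation alongside the Scalar, Address, and Register objects.

```python
def representation_objects(ins, op_index):
    """Return (type, text) pairs for each object rendered for one operand."""
    rep = ins.getDefaultOperandRepresentationList(op_index)
    if rep is None:  # defensive: assume the list may be absent
        return []
    return [(str(type(obj)), str(obj)) for obj in rep]


# Only runs inside Ghidra, where the script environment defines currentProgram.
if "currentProgram" in globals():
    for ins in currentProgram.getListing().getInstructions(True):
        print("instruction: " + str(ins))
        for i in range(ins.getNumOperands()):
            print("  operand %d: %s" % (i, representation_objects(ins, i)))
```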

@ghidra1 added the "Feature: API" and "Reason: Working as intended" labels and removed the "Status: Triage" label on Jun 29, 2022
@ghidra1
Collaborator

ghidra1 commented Jun 29, 2022

@alex-bellon API appears to be working as intended - albeit confusing at times. Closing ticket, but feel free to add comments/questions if you like.

@ghidra1 closed this as completed on Jun 29, 2022