Add common extensions to Motorola 68k Assembly #4637
Merged: lildude merged 6 commits into github-linguist:master from idrougge:m68k-common-extensions on Jan 14, 2020
Commits
d94a49c Add common extensions to Motorola 68k (idrougge)
de3c2d4 Revert ACE mode for m68k assembly (idrougge)
ee17499 Merge branch 'master' into m68k-common-extensions (Alhadis)
6ef9030 Merge branch 'master' into m68k-common-extensions (Alhadis)
c8bface Add heuristics for Motorola 68K Assembly (Alhadis)
3547e2f Add SWIG language and `.i` Assembly extension (Alhadis)
@@ -0,0 +1,82 @@
; this file is part of Release, written by Malban in 2017
;
***********************************************************
; input list in X
; destroys u
; 0 move
; negative use as shift
; positive end
asm_draw_3ds:
ldu 2,x
lda 1,x;
starts:
sta $d004;
ldd ,u;
sta $d001;
clr $d000;
lda ,x;
inc $d000;
stb $d001;
sta $d00A;
clr $d005;
leax 4,x;
ldu 2,x;
lda ,x;
bgt end1s;
lda 1,x;
ldb #$40;
waits: bitb $d00D;
beq waits;
ldb #0
stb $d00A;
bra starts;
end1s: ldd #$0040;
ends: bitb $d00D;
beq ends;
sta $d00A
rts


asm_draw_3d:
ldu 1,x
start: ldd ,u;
sta $d001;
clr $d000;
lda ,x;
inc $d000;
stb $d001;
sta $d00A;
clr $d005;
leax 3,x;
ldu 1,x;
lda ,x;
bgt end1;
ldd #$0040;
wait: bitb $d00D;
beq wait;
sta $d00A;
bra start;
end1: ldd #$0040;
end: bitb $d00D;
beq end;
sta $d00A
rts



; Cosinus data
cosinus3d:
DB 63, 62, 61, 60, 58, 55, 52, 48, 43, 39, 34 ; 11
DB 28, 23, 17, 10, 4, -1, -7, -14, -20, -25, -31 ; 22
DB -36, -41, -46, -50, -53, -56, -59, -61, -62, -62, -62 ; 33
DB -62, -61, -59, -56, -53, -50, -46, -41, -36, -31, -25 ; 44
DB -20, -14, -7, -1, 4, 10, 17, 23, 28, 34, 39 ; 55
DB 43, 48, 52, 55, 58, 60, 61, 62, 63
; Sinus data
sinus3d:
DB 0, 6, 12, 18, 24, 30, 35, 40, 45, 49, 52 ; 11
DB 56, 58, 60, 62, 62, 62, 62, 61, 59, 57, 54 ; 22
DB 51, 47, 42, 38, 32, 27, 21, 15, 9, 3, -3 ; 33
DB -9, -15, -21, -27, -32, -38, -42, -47, -51, -54, -57 ; 44
DB -59, -61, -62, -62, -62, -62, -60, -58, -56, -52, -49 ; 55
DB -45, -40, -35, -30, -24, -18, -12, -6, -3
Are there any registers or opcodes unique to Motorola that we can use to disambiguate assembly files?
We're definitely going to need some heuristics for `.asm` and `.inc`. The latter is particularly important because it sees very general use across a range of (unrelated) languages…
M68k assembly is easily distinguished from other assembly languages, both by registers and opcodes.
How would such a heuristic look?
A regular expression. I'm happy to write it for you, provided you give me the names of substrings that are guaranteed not (or at least highly unlikely) to appear in the source code of any other assembly language.
Here are our existing heuristics.
(?im:moveq\b.*?d\d|move\.[bwl]\s+.*\b[ad]\d|movem\.[bwl]\b|btst\b|dbra\b)
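As a rough check of the expression above, the sketch below runs it against a few lines using Python's `re` module. Python accepts the same scoped `(?im:...)` flag syntax, but using it here is an assumption made for illustration; the heuristic itself is evaluated by Linguist, not by Python. The 68k and x86 lines are invented examples, while the 6809 lines come from the sample file in this diff.

```python
import re

# The heuristic proposed above, copied verbatim. The scoped (?im:...) group
# enables case-insensitive and multiline matching for the whole expression.
M68K = re.compile(
    r"(?im:moveq\b.*?d\d|move\.[bwl]\s+.*\b[ad]\d|movem\.[bwl]\b|btst\b|dbra\b)"
)

# Hand-written 68k lines (illustrative, not taken from the PR).
m68k_lines = [
    "moveq #0,d0",
    "move.l d0,a1",
    "movem.l d0-d7/a0-a6,-(sp)",
    "btst #6,$bfe001",
    "dbra d0,loop",
]

# Lines from the 6809 sample in this diff, plus one invented x86 line.
other_lines = [
    "ldu 2,x",
    "sta $d004;",
    "waits: bitb $d00D;",
    "mov eax, ebx",
]


def is_m68k(line: str) -> bool:
    """Return True if the heuristic matches anywhere in the line."""
    return M68K.search(line) is not None


for line in m68k_lines + other_lines:
    print(f"{line!r:35} -> {is_m68k(line)}")
```

Each 68k line should match through a different alternative, and none of the other lines should match. Note that every alternative is unanchored, so the pattern can also fire on a mnemonic that appears in the middle of a line.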
Would it be reasonable to limit the `moveq` heuristic to match two registers (one address, one data)? From what I hear, 68k is unique in differentiating between the two.
If so, we could try this:
When writing heuristics, it's best to be as specific as possible; anything which doesn't match is passed down to the (less accurate) classification techniques.
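The fall-through behaviour described above can be pictured with a toy model. This is not Linguist's implementation: the `RULES` table and the `detect` function are hypothetical names made up for this sketch, and only the pattern (the first draft posted earlier in this thread) and the language name come from the pull request.

```python
import re
from typing import Optional

# Hypothetical rule table of (language name, pattern) pairs. The table layout
# is an assumption; the pattern is the first draft quoted in this thread.
RULES = [
    (
        "Motorola 68K Assembly",
        re.compile(
            r"(?im:moveq\b.*?d\d|move\.[bwl]\s+.*\b[ad]\d"
            r"|movem\.[bwl]\b|btst\b|dbra\b)"
        ),
    ),
]


def detect(source: str) -> Optional[str]:
    """Return the first language whose pattern matches, else None.

    None stands for "no heuristic fired": the caller would then hand the
    file to a slower, less accurate classifier.
    """
    for language, pattern in RULES:
        if pattern.search(source):
            return language
    return None


print(detect("    move.w  d1,(a0)+\n    dbra    d0,loop\n"))
print(detect("    ldu 2,x\n    lda 1,x;\n"))
```

The first call returns the language name, while the second returns `None`, which is the case where a more specific pattern costs nothing: the file simply moves on to the next detection stage.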
Credit for this expression belongs to @zerkman, since it's taken from the `language-m68k` grammar we're using to highlight 68k on GitHub (I did clean it up and remove some redundant syntax for clarity).
I've amended the other parts of that expression to use other bits of that grammar, bringing us down to:
Notice that I've anchored the remaining parts to match at the beginning of a line (with or without indentation). This reduces the risk of incorrectly matching part of a comment in an unrelated file. For the same reason, you'll notice I avoid using wildcards (`.*`) when possible.
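To illustrate the anchoring point, the sketch below compares an unanchored `dbra\b` alternative (as in the first draft earlier in the thread) with a line-anchored variant. The anchored pattern is a hypothetical one written for this example, not the expression taken from the grammar, and both sample texts are invented.

```python
import re

# Unanchored alternative, as in the first draft of the heuristic.
UNANCHORED = re.compile(r"(?im)dbra\b")

# Hypothetical anchored variant: the mnemonic must be the first token on a
# line, optionally preceded by indentation.
ANCHORED = re.compile(r"(?im)^\s*dbra\b")

# An unrelated (x86-style) file whose comment happens to mention "dbra".
unrelated = (
    "; ported from 68k, replaces the old dbra loop\n"
    "    dec ecx\n"
    "    jnz loop\n"
)

# A 68k snippet with an indented instruction.
m68k = (
    "loop:\n"
    "    move.b  (a0)+,(a1)+\n"
    "    dbra    d0,loop\n"
)

for name, text in (("unrelated", unrelated), ("m68k", m68k)):
    print(name, bool(UNANCHORED.search(text)), bool(ANCHORED.search(text)))
```

The unanchored pattern matches both texts, while the anchored one matches only the 68k snippet. The trade-off of this particular variant is that an instruction sharing a line with its label (`loop: dbra d0,loop`) would no longer match.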
@idrougge If the above revisions look good to you, then the changes to make to `heuristics.yml` are below.
I'll still need to test them thoroughly on my end, as well as investigate any possible formats using the `.i` extension that we've not registered yet.
I tested the `^ \s* move (\.[bwl])? \s+ (sr|usp), \s* [^\s]+` line, and it only catches moves with `sr` or `usp` as the first (source) operand, not moves to any given register. I feel that the Motorola syntax of `move.size`, with the source or destination being a register named Dn or An, is sufficiently dissimilar to other assembly syntaxes to avoid confusion with other assembly languages while also catching even the shortest snippet.
Testing may prove the totality of heuristics to still be sufficient to catch all m68k assembly sources.
`.i`, like `.inc`, `.asm` or `.s`, is used by assemblers on most platforms AFAIK.
In that case, the third line of the `m68k` heuristic becomes:
- '(?im)^\s*move\.[bwl]\s+.*\b[ad]\d'
I'll update the diff I just posted.
There are currently 11,273,157 `.i` files publicly indexed on GitHub. Surely there must be other formats hidden out there...
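The difference between the `sr`/`usp` line tested earlier in the thread and the revised third line can be checked with a short sketch. The flags applied to the first pattern (case-insensitive, multiline, verbose) are an assumption based on its spaced-out notation, and the sample instructions are invented.

```python
import re

# Line quoted in the review, written in extended (verbose) syntax, where
# literal whitespace in the pattern is ignored. The flags are assumed.
SR_USP = re.compile(
    r"^ \s* move (\.[bwl])? \s+ (sr|usp), \s* [^\s]+",
    re.IGNORECASE | re.MULTILINE | re.VERBOSE,
)

# Revised third line of the heuristic, copied from the comment above.
MOVE_REG = re.compile(r"(?im)^\s*move\.[bwl]\s+.*\b[ad]\d")

samples = [
    "    move    sr,d0",      # status register as first operand, no size
    "    move.l  usp,a0",     # user stack pointer as first operand
    "    move.l  d0,a1",      # plain register-to-register move
    "    move.w  #$2700,sr",  # sr as second operand, no Dn/An register
    "    mov     eax, ebx",   # invented x86 line
]

for line in samples:
    print(
        f"{line!r:28} sr/usp={bool(SR_USP.search(line))} "
        f"move.size={bool(MOVE_REG.search(line))}"
    )
```

Only the revised line catches the plain `move.l d0,a1` case, which is the gap pointed out in the review. Neither pattern matches the `move.w #$2700,sr` line, so instructions like that depend on the other lines of the heuristic.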
That regex looks fine for heuristics.
A quick glance at those results indicates that a lot of `.i` files are SWIG files, which may need a language definition of their own.