Add iterative shifter #4
34019ef to 6b7c97b
}

val format = new eu.Execute(formatAt) {
  wb.valid := SHIFT_DONE
I think it is OK for wb.valid to be asserted on bad data (as long as the good data comes last).
So you can do wb.valid := SEL to reduce the logic to a minimum.
The wb interface itself is then checked for lane.valid + reschedule at the end point.
val shiftInput = busy ? shiftReg | signExtended
val shiftResult = zeroShift ? signExtended | shifted

shifted := (LEFT & True).mux(
I'm wondering whether we could do a "smarter" design with the following hardware shifts:
- >> 1
- >> 8
- reverse all bits
- the shift-by-zero case being emulated by doing 2 "reverse all bits"
That way we would have as many muxes as now, but much higher performance for big shifts, as we could compose shifts using the >> 8 and >> 1.
More generically, the set of available >> X shifts could be a List[Int] parameter.
So the design would be much closer to the actual BarrelShifterPlugin, but spread over multiple cycles.
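To make the proposal concrete, here is a rough software model of it in plain Scala (not SpinalHDL, and not code from this PR). The names `IterativeShiftModel`, `plan`, `reverseBits` and `shift` are hypothetical. It assumes a greedy decomposition of the shift amount over the available step sizes, models logical shifts only, and counts only the `>> X` steps as cycles (the bit reversals are not counted).

```scala
object IterativeShiftModel {
  // Reverse the low `width` bits of x (bit i moves to bit width-1-i).
  def reverseBits(x: BigInt, width: Int): BigInt =
    (0 until width).foldLeft(BigInt(0)) { (acc, i) =>
      if (x.testBit(i)) acc.setBit(width - 1 - i) else acc
    }

  // Greedy decomposition of `shamt` into the available right-shift steps.
  // Each element of the result stands for one ">> X" operation.
  // Assumption: 1 is always among the steps, so any amount can be reached.
  def plan(shamt: Int, steps: List[Int]): List[Int] = {
    require(steps.contains(1), "a >> 1 step is needed to reach every amount")
    var remaining = shamt
    val out = List.newBuilder[Int]
    for (s <- steps.sorted.reverse) while (remaining >= s) {
      out += s
      remaining -= s
    }
    out.result()
  }

  // Logical shift built only from ">> X" and "reverse all bits".
  // A left shift is modeled as reverse, right shift, reverse.
  // Returns (result, number of ">> X" steps used).
  def shift(value: BigInt, shamt: Int, left: Boolean,
            width: Int = 32, steps: List[Int] = List(1, 8)): (BigInt, Int) = {
    val mask = (BigInt(1) << width) - 1
    val in = if (left) reverseBits(value & mask, width) else value & mask
    val p = plan(shamt, steps)
    val shifted = p.foldLeft(in)((v, s) => v >> s)
    val out = if (left) reverseBits(shifted, width) else shifted
    (out, p.length)
  }
}
```

With `steps = List(1, 8)`, a shift by 31 takes 3 + 7 = 10 steps in this model instead of 31 single-bit steps; a shift by zero uses no `>> X` step, and for the left-shift path it degenerates to two bit reversals, which matches the "2 reverse all bits" idea above.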
val busy = RegInit(False)
val amplitudeReg = Reg(cloneOf(shamt))
val amplitude = busy ? amplitudeReg | shamt
val done = amplitude(4 downto 1) === 0
This only works for 32 bits ^^
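A small plain-Scala model (again not SpinalHDL, with hypothetical names `DoneCheck`, `doneFixed32` and `done`) of why the hard-coded `4 downto 1` slice is 32-bit specific. It assumes the point of the remark is that RV64 shift amounts are 6 bits wide, so bit 5 would be ignored by the fixed slice.

```scala
object DoneCheck {
  // Width of the shift amount for a power-of-two XLEN (5 for 32, 6 for 64).
  def shamtWidth(xlen: Int): Int = Integer.numberOfTrailingZeros(xlen)

  // Mirrors `amplitude(4 downto 1) === 0`: only bits 4..1 are inspected.
  def doneFixed32(amplitude: Int): Boolean = ((amplitude >> 1) & 0xF) == 0

  // Width-aware variant: inspect every bit of the amount above bit 0.
  def done(amplitude: Int, xlen: Int): Boolean = {
    val mask = (1 << shamtWidth(xlen)) - 1
    ((amplitude & mask) >> 1) == 0
  }
}
```

For a 64-bit shift by 32, `doneFixed32(32)` reports done although 32 positions remain, while `done(32, 64)` does not.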
f5e1c58 to 4f12c89
Fix busy for slow result
4f12c89 to 979ea27
Thanks :)
# Conflicts: # src/main/scala/vexiiriscv/execute/lsu/LsuCachelessPlugin.scala
Reproduce the issue with:
Test/runMain vexiiriscv.tester.TestBench --load-elf ext/NaxSoftware/baremetal/coremark/build/rv32ima/coremark.elf --with-konata --with-wave --with-spike-log --with-rvls-log --decoders=2 --lanes=2 --allow-bypass-from=0 --with-btb --with-late-alu --with-mul --with-div --with-iterative-shift