-
Notifications
You must be signed in to change notification settings - Fork 1.6k
S390x: emit new instructions added in z17 #12319
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
ae1c56a to
fac9929
Compare
Subscribe to Label ActionDetailsThis issue or pull request has been labeled: "cranelift", "cranelift:area:aarch64", "cranelift:area:x64", "cranelift:meta", "isle"Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
|
I'm going to shift review of this over to @uweigand |
|
or, well, I can't officially do that, but @uweigand I'm happy to rubber-stamp once you've approved |
uweigand
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks mostly good to me, but see inline comments.
In addition to what is implemented here, we now could implement vector integer division for 32-bit and 64-bit integer vectors - but there is currently no ISLE to even express this.
| (rule 16 (lower (has_type (and (vxrs_ext3_enabled) (vr128_ty ty)) (band (band y z) (bnot x)))) | ||
| (vec_eval ty 0b00000010 x y z)) | ||
| (rule 17 (lower (has_type (and (vxrs_ext3_enabled) (vr128_ty ty)) (band (band x (bnot y)) z))) | ||
| (vec_eval ty 0b00000010 y x z)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what if any canonicalization is done at the ISLE level here, but these four don't cover all possible combinations. E.g. (band x (band (bnot y) z)) is not covered.
I guess a more fundamental question is which combinations we should be covering. For example, why cover and-not with three inputs but not or-not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I stopped because I realized this would add hundreds of rules to get correct, and ran out of steam pretty quickly. I think it's an open question: what should we encode? what 3-input binary operations are actually used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a python script to handle this, and committed the results. My thinking is this:
- This only has to happen once
- It would be better to have the results interspersed in the file near their 2-input counterparts for better
| ; block0: ; offset 0x0 | ||
| ; .byte 0xe7, 0x8a | ||
| ; .byte 0x80, 0x02 | ||
| ; .byte 0x9f, 0x88 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also add z17 insns to the disassembler, but that is of course a different patch.
fac9929 to
73c5595
Compare
73c5595 to
a32e45e
Compare
This emits & tests a bunch of instructions:
* from Miscellaneous-Instruction-Extensions Facility 4:
* CLZ, 64bit
* CTZ, 64bit
* from Vector-Enhancements Facility 3:
* 32x4, 64x2 & 128x1 variants of the following:
* Divide
* Remainder
* 64x2 & 128x1 multiply variants
* 128x1 vaiants of:
* Compare
* CLZ
* CTZ
* Max
* Min
* Average
* Negation
* Evaluate
Co-authored-by: Jimmy Brisson <jbrisson@linux.ibm.com>
Now that s390x implements blendv as well, we should refer to the instruction without the x86 prefix.
a32e45e to
898cc0e
Compare
Z17 (arch15) includes some instructions that allow us to encode some more complicated operations in fewer instructions. This PR adds support to cranelift-codegen to emit these newer instructions when appropriate.
Further, Z17 includes a VBLEND instruction that mimics the same instruction on x64. Since this is no longer an x64-exclusive instruction type, I've renamed the appropriate stuff within cranelift codegen to reflect that this is not ISA-specific anymore.