cranelift_codegen/machinst/buffer.rs
//! In-memory representation of compiled machine code, with labels and fixups to
//! refer to those labels. Handles constant-pool island insertion and also
//! veneer insertion for out-of-range jumps.
//!
//! This code exists to solve three problems:
//!
//! - Branch targets for forward branches are not known until later, when we
//!   emit code in a single pass through the instruction structs.
//!
//! - On many architectures, address references or offsets have limited range.
//!   For example, on AArch64, conditional branches can only target code +/- 1MB
//!   from the branch itself.
//!
//! - The lowering of control flow from the CFG-with-edges produced by
//!   [BlockLoweringOrder](super::BlockLoweringOrder), combined with many empty
//!   edge blocks when the register allocator does not need to insert any
//!   spills/reloads/moves in edge blocks, results in many suboptimal branch
//!   patterns. The lowering also pays no attention to block order, and so
//!   two-target conditional forms (cond-br followed by uncond-br) can often be
//!   avoided because one of the targets is the fallthrough. There are several
//!   cases here where we can simplify to use fewer branches.
//!
//! This "buffer" implements a single-pass code emission strategy (with a later
//! "fixup" pass, but only through recorded fixups, not all instructions). The
//! basic idea is:
//!
//! - Emit branches as they are, including two-target (cond/uncond) compound
//!   forms, but with zero offsets and optimistically assuming the target will be
//!   in range. Record the "fixup" for later. Targets are denoted instead by
//!   symbolic "labels" that are then bound to certain offsets in the buffer as
//!   we emit code. (Nominally, there is a label at the start of every basic
//!   block.)
//!
//! - As we do this, track the offset in the buffer at which the first label
//!   reference "goes out of range". We call this the "deadline". If we reach the
//!   deadline and we still have not bound the label to which an unresolved branch
//!   refers, we have a problem!
//!
//! - To solve this problem, we emit "islands" full of "veneers". An island is
//!   simply a chunk of code inserted in the middle of the code actually produced
//!   by the emitter (e.g., vcode iterating over instruction structs). The emitter
//!   has some awareness of this: it either asks for an island between blocks, so
//!   it is not accidentally executed, or else it emits a branch around the island
//!   when all other options fail (see `Inst::EmitIsland` meta-instruction).
//!
//! - A "veneer" is an instruction (or sequence of instructions) in an "island"
//!   that implements a longer-range reference to a label. The idea is that, for
//!   example, a branch with a limited range can branch to a "veneer" instead,
//!   which is simply a branch in a form that can use a longer-range reference. On
//!   AArch64, for example, conditionals have a +/- 1 MB range, but a conditional
//!   can branch to an unconditional branch which has a +/- 128 MB range. Hence, a
//!   conditional branch's label reference can be fixed up with a "veneer" to
//!   achieve a longer range.
//!
//! - To implement all of this, we require the backend to provide a `LabelUse`
//!   type that implements a trait. This is nominally an enum that records one of
//!   several kinds of references to an offset in code -- basically, a relocation
//!   type -- and will usually correspond to different instruction formats. The
//!   `LabelUse` implementation specifies the maximum range, how to patch in the
//!   actual label location when known, and how to generate a veneer to extend the
//!   range.
//!
//! That satisfies label references, but we still may have suboptimal branch
//! patterns. To clean up the branches, we do a simple "peephole"-style
//! optimization on the fly. To do so, the emitter (e.g., `Inst::emit()`)
//! informs the buffer of branches in the code and, in the case of conditionals,
//! the code that would have been emitted to invert this branch's condition. We
//! track the "latest branches": these are branches that are contiguous up to
//! the current offset. (If any code is emitted after a branch, that branch or
//! run of contiguous branches is no longer "latest".) The latest branches are
//! those that we can edit by simply truncating the buffer and doing something
//! else instead.
//!
//! To optimize branches, we implement several simple rules, and try to apply
//! them to the "latest branches" when possible:
//!
//! - A branch with a label target, when that label is bound to the ending
//!   offset of the branch (the fallthrough location), can be removed altogether,
//!   because the branch would have no effect.
//!
//! - An unconditional branch that starts at a label location, and branches to
//!   another label, results in a "label alias": all references to the label bound
//!   *to* this branch instruction are instead resolved to the *target* of the
//!   branch instruction. This effectively removes empty blocks that just
//!   unconditionally branch to the next block. We call this "branch threading".
//!
//! - A conditional followed by an unconditional, when the conditional branches
//!   to the unconditional's fallthrough, results in (i) the truncation of the
//!   unconditional, (ii) the inversion of the conditional's condition, and (iii)
//!   replacement of the conditional's target (using the original target of the
//!   unconditional). This is a fancy way of saying "we can flip a two-target
//!   conditional branch's taken/not-taken targets if it works better with our
//!   fallthrough". To make this work, the emitter actually gives the buffer
//!   *both* forms of every conditional branch: the true form is emitted into the
//!   buffer, and the "inverted" machine-code bytes are provided as part of the
//!   branch-fixup metadata.
//!
//! - An unconditional branch B preceded by another unconditional branch P, when
//!   B's label(s) have been redirected to target(B), can be removed entirely.
//!   This is an extension of the branch-threading optimization, and is valid
//!   because if we know there will be no fallthrough into this branch instruction
//!   (the prior instruction is an unconditional jump), and if we know we have
//!   successfully redirected all labels, then this branch instruction is
//!   unreachable. Note that this works because the redirection happens before the
//!   label is ever resolved (fixups happen at island emission time, at which
//!   point latest-branches are cleared, or at the end of emission), so we are
//!   sure to catch and redirect all possible paths to this instruction.
//!
//! # Branch-optimization Correctness
//!
//! The branch-optimization mechanism depends on a few data structures with
//! invariants, which are always held outside the scope of top-level public
//! methods:
//!
//! - The latest-branches list. Each entry describes a span of the buffer
//!   (start/end offsets), the label target, the corresponding fixup-list entry
//!   index, and the bytes (which must be the same length) for the inverted form,
//!   if conditional. The list of labels that are bound to the start-offset of
//!   this branch is *complete* (if any label has a resolved offset equal to
//!   `start` and is not an alias, it must appear in this list) and *precise* (no
//!   label in this list can be bound to another offset). No label in this list
//!   should be an alias. No two branch ranges can overlap, and branches are in
//!   ascending-offset order.
//!
//! - The labels-at-tail list. This contains all MachLabels that have been bound
//!   to (whose resolved offsets are equal to) the tail offset of the buffer.
//!   No label in this list should be an alias.
//!
//! - The label_offsets array, containing the bound offset of a label or
//!   UNKNOWN. No label can be bound at an offset greater than the current
//!   buffer tail.
//!
//! - The label_aliases array, containing another label to which a label is
//!   bound or UNKNOWN. A label's resolved offset is the resolved offset
//!   of the label it is aliased to, if this is set.
//!
//! We argue below, at each method, how the invariants in these data structures
//! are maintained (grep for "Post-invariant").
//!
//! Given these invariants, we argue why each optimization preserves execution
//! semantics below (grep for "Preserves execution semantics").
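//!
//! # Example of use (illustrative sketch)
//!
//! The following is a rough, non-runnable sketch of how a backend's emission
//! loop drives this buffer. `MyInst` and `LabelUse::Branch` stand in for a
//! concrete backend's instruction and label-use types, `block0`/`block1` are
//! `BlockIndex` values, and the `put4` payloads are placeholder encodings.
//!
//! ```ignore
//! let mut buf = MachBuffer::<MyInst>::new();
//! buf.reserve_labels_for_blocks(2);
//!
//! // Block 0: an unconditional branch forward to block 1, recorded as a fixup.
//! buf.bind_label(MachLabel::from_block(block0));
//! let off = buf.cur_offset();
//! buf.use_label_at_offset(off, MachLabel::from_block(block1), LabelUse::Branch);
//! buf.add_uncond_branch(off, off + 4, MachLabel::from_block(block1));
//! buf.put4(0x0000_0000); // branch bytes with a zero offset, patched later
//!
//! // Block 1: binding the label lets the buffer resolve (and here remove) the branch.
//! buf.bind_label(MachLabel::from_block(block1));
//! buf.put4(0xd65f_03c0); // e.g., a return instruction
//!
//! let finalized = buf.finish();
//! ```
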
use crate::binemit::{Addend, CodeOffset, Reloc, StackMap};
use crate::ir::{ExternalName, Opcode, RelSourceLoc, SourceLoc, TrapCode};
use crate::isa::unwind::UnwindInst;
use crate::machinst::{
    BlockIndex, MachInstLabelUse, TextSectionBuilder, VCodeConstant, VCodeConstants, VCodeInst,
};
use crate::timing;
use crate::trace;
use cranelift_entity::{entity_impl, SecondaryMap};
use smallvec::SmallVec;
use std::convert::TryFrom;
use std::mem;
use std::string::String;
use std::vec::Vec;

#[cfg(feature = "enable-serde")]
use serde::{Deserialize, Serialize};

#[cfg(feature = "enable-serde")]
pub trait CompilePhase {
    type MachSrcLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
    type SourceLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
}

#[cfg(not(feature = "enable-serde"))]
pub trait CompilePhase {
    type MachSrcLocType: core::fmt::Debug + PartialEq + Clone;
    type SourceLocType: core::fmt::Debug + PartialEq + Clone;
}

/// Status of a compiled artifact that needs patching before being used.
#[derive(Clone, Debug, PartialEq)]
#[cfg_attr(feature = "enable-serde", derive(Serialize, Deserialize))]
pub struct Stencil;

/// Status of a compiled artifact ready to use.
#[derive(Clone, Debug, PartialEq)]
pub struct Final;

impl CompilePhase for Stencil {
    type MachSrcLocType = MachSrcLoc<Stencil>;
    type SourceLocType = RelSourceLoc;
}

impl CompilePhase for Final {
    type MachSrcLocType = MachSrcLoc<Final>;
    type SourceLocType = SourceLoc;
}

/// A buffer of output to be produced, fixed up, and then emitted to a CodeSink
/// in bulk.
///
/// This struct uses `SmallVec`s to support small-ish function bodies without
/// any heap allocation. As such, it will be several kilobytes large. This is
/// likely fine as long as it is stack-allocated for function emission then
/// thrown away; but beware if many buffer objects are retained persistently.
pub struct MachBuffer<I: VCodeInst> {
    /// The buffer contents, as raw bytes.
    data: SmallVec<[u8; 1024]>,
    /// Any relocations referring to this code. Note that only *external*
    /// relocations are tracked here; references to labels within the buffer are
    /// resolved before emission.
    relocs: SmallVec<[MachReloc; 16]>,
    /// Any trap records referring to this code.
    traps: SmallVec<[MachTrap; 16]>,
    /// Any call site records referring to this code.
    call_sites: SmallVec<[MachCallSite; 16]>,
    /// Any source location mappings referring to this code.
    srclocs: SmallVec<[MachSrcLoc<Stencil>; 64]>,
    /// Any stack maps referring to this code.
    stack_maps: SmallVec<[MachStackMap; 8]>,
    /// Any unwind info at a given location.
    unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
    /// The current source location in progress (after `start_srcloc()` and
    /// before `end_srcloc()`). This is a (start_offset, src_loc) tuple.
    cur_srcloc: Option<(CodeOffset, RelSourceLoc)>,
    /// Known label offsets; `UNKNOWN_LABEL_OFFSET` if unknown.
    label_offsets: SmallVec<[CodeOffset; 16]>,
    /// Label aliases: when one label points to an unconditional jump, and that
    /// jump points to another label, we can redirect references to the first
    /// label immediately to the second.
    ///
    /// Invariant: we don't have label-alias cycles. We ensure this by,
    /// before setting label A to alias label B, resolving B's alias
    /// target (iteratively until a non-aliased label); if B is already
    /// aliased to A, then we cannot alias A back to B.
    label_aliases: SmallVec<[MachLabel; 16]>,
    /// Constants that must be emitted at some point.
    pending_constants: SmallVec<[MachLabelConstant; 16]>,
    /// Traps that must be emitted at some point.
    pending_traps: SmallVec<[MachLabelTrap; 16]>,
    /// Fixups that must be performed after all code is emitted.
    fixup_records: SmallVec<[MachLabelFixup<I>; 16]>,
    /// Current deadline at which all constants are flushed and all code labels
    /// are extended by emitting long-range jumps in an island. This flush
    /// should be rare (e.g., on AArch64, the shortest-range PC-rel references
    /// are +/- 1MB for conditional jumps and load-literal instructions), so
    /// it's acceptable to track a minimum and flush-all rather than doing more
    /// detailed "current minimum" / sort-by-deadline trickery.
    island_deadline: CodeOffset,
    /// How many bytes are needed in the worst case for an island, given all
    /// pending constants and fixups.
    island_worst_case_size: CodeOffset,
    /// Latest branches, to facilitate in-place editing for better fallthrough
    /// behavior and empty-block removal.
    latest_branches: SmallVec<[MachBranch; 4]>,
    /// All labels at the current offset (emission tail). This is lazily
    /// cleared: it is actually accurate as long as the current offset is
    /// `labels_at_tail_off`, but if `cur_offset()` has grown larger, it should
    /// be considered as empty.
    ///
    /// For correctness, this *must* be complete (i.e., the vector must contain
    /// all labels whose offsets are resolved to the current tail), because we
    /// rely on it to update labels when we truncate branches.
    labels_at_tail: SmallVec<[MachLabel; 4]>,
    /// The last offset at which `labels_at_tail` is valid. It is conceptually
    /// always describing the tail of the buffer, but we do not clear
    /// `labels_at_tail` eagerly when the tail grows, rather we lazily clear it
    /// when the offset has grown past this (`labels_at_tail_off`) point.
    /// Always <= `cur_offset()`.
    labels_at_tail_off: CodeOffset,
    /// Map used constants to their [MachLabel].
    constant_labels: SecondaryMap<VCodeConstant, MachLabel>,
}

impl MachBufferFinalized<Stencil> {
    /// Get a finalized machine buffer by applying the function's base source location.
    pub fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachBufferFinalized<Final> {
        MachBufferFinalized {
            data: self.data,
            relocs: self.relocs,
            traps: self.traps,
            call_sites: self.call_sites,
            srclocs: self
                .srclocs
                .into_iter()
                .map(|srcloc| srcloc.apply_base_srcloc(base_srcloc))
                .collect(),
            stack_maps: self.stack_maps,
            unwind_info: self.unwind_info,
        }
    }
}

/// A `MachBuffer` once emission is completed: holds generated code and records,
/// without fixups. This allows the type to be independent of the backend.
#[derive(PartialEq, Debug, Clone)]
#[cfg_attr(feature = "enable-serde", derive(serde::Serialize, serde::Deserialize))]
pub struct MachBufferFinalized<T: CompilePhase> {
    /// The buffer contents, as raw bytes.
    pub(crate) data: SmallVec<[u8; 1024]>,
    /// Any relocations referring to this code. Note that only *external*
    /// relocations are tracked here; references to labels within the buffer are
    /// resolved before emission.
    pub(crate) relocs: SmallVec<[MachReloc; 16]>,
    /// Any trap records referring to this code.
    pub(crate) traps: SmallVec<[MachTrap; 16]>,
    /// Any call site records referring to this code.
    pub(crate) call_sites: SmallVec<[MachCallSite; 16]>,
    /// Any source location mappings referring to this code.
    pub(crate) srclocs: SmallVec<[T::MachSrcLocType; 64]>,
    /// Any stack maps referring to this code.
    pub(crate) stack_maps: SmallVec<[MachStackMap; 8]>,
    /// Any unwind info at a given location.
    pub unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
}

const UNKNOWN_LABEL_OFFSET: CodeOffset = 0xffff_ffff;
const UNKNOWN_LABEL: MachLabel = MachLabel(0xffff_ffff);

/// Threshold on max length of `labels_at_this_branch` list to avoid
/// unbounded quadratic behavior (see comment below at use-site).
const LABEL_LIST_THRESHOLD: usize = 100;

/// A label refers to some offset in a `MachBuffer`. It may not be resolved at
/// the point at which it is used by emitted code; the buffer records "fixups"
/// for references to the label, and will come back and patch the code
/// appropriately when the label's location is eventually known.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct MachLabel(u32);
entity_impl!(MachLabel);

impl MachLabel {
    /// Get a label for a block. (The first N MachLabels are always reserved for
    /// the N blocks in the vcode.)
    pub fn from_block(bindex: BlockIndex) -> MachLabel {
        MachLabel(bindex.index() as u32)
    }

    /// Get the numeric label index.
    pub fn get(self) -> u32 {
        self.0
    }

    /// Creates a string representing this label, for convenience.
    pub fn to_string(&self) -> String {
        format!("label{}", self.0)
    }
}

impl Default for MachLabel {
    fn default() -> Self {
        UNKNOWN_LABEL
    }
}

/// A stack map extent, when creating a stack map.
pub enum StackMapExtent {
    /// The stack map starts at this instruction, and ends after the number of upcoming bytes
    /// (note: this is a code offset diff).
    UpcomingBytes(CodeOffset),

    /// The stack map started at the given offset and ends at the current one. This helps
    /// on architectures where instructions do not have a fixed length.
    StartedAtOffset(CodeOffset),
}

impl<I: VCodeInst> MachBuffer<I> {
    /// Create a new, empty `MachBuffer`.
    pub fn new() -> MachBuffer<I> {
        MachBuffer {
            data: SmallVec::new(),
            relocs: SmallVec::new(),
            traps: SmallVec::new(),
            call_sites: SmallVec::new(),
            srclocs: SmallVec::new(),
            stack_maps: SmallVec::new(),
            unwind_info: SmallVec::new(),
            cur_srcloc: None,
            label_offsets: SmallVec::new(),
            label_aliases: SmallVec::new(),
            pending_constants: SmallVec::new(),
            pending_traps: SmallVec::new(),
            fixup_records: SmallVec::new(),
            island_deadline: UNKNOWN_LABEL_OFFSET,
            island_worst_case_size: 0,
            latest_branches: SmallVec::new(),
            labels_at_tail: SmallVec::new(),
            labels_at_tail_off: 0,
            constant_labels: SecondaryMap::new(),
        }
    }

    /// Current offset from start of buffer.
    pub fn cur_offset(&self) -> CodeOffset {
        self.data.len() as CodeOffset
    }

    /// Add a byte.
    pub fn put1(&mut self, value: u8) {
        trace!("MachBuffer: put byte @ {}: {:x}", self.cur_offset(), value);
        self.data.push(value);

        // Post-invariant: conceptual-labels_at_tail contains a complete and
        // precise list of labels bound at `cur_offset()`. We have advanced
        // `cur_offset()`, hence if it had been equal to `labels_at_tail_off`
        // before, it is not anymore (and it cannot become equal, because
        // `labels_at_tail_off` is always <= `cur_offset()`). Thus the list is
        // conceptually empty (even though it is only lazily cleared). No labels
        // can be bound at this new offset (by invariant on `label_offsets`).
        // Hence the invariant holds.
    }

    /// Add 2 bytes.
    pub fn put2(&mut self, value: u16) {
        trace!(
            "MachBuffer: put 16-bit word @ {}: {:x}",
            self.cur_offset(),
            value
        );
        let bytes = value.to_le_bytes();
        self.data.extend_from_slice(&bytes[..]);

        // Post-invariant: as for `put1()`.
    }

    /// Add 4 bytes.
    pub fn put4(&mut self, value: u32) {
        trace!(
            "MachBuffer: put 32-bit word @ {}: {:x}",
            self.cur_offset(),
            value
        );
        let bytes = value.to_le_bytes();
        self.data.extend_from_slice(&bytes[..]);

        // Post-invariant: as for `put1()`.
    }

    /// Add 8 bytes.
    pub fn put8(&mut self, value: u64) {
        trace!(
            "MachBuffer: put 64-bit word @ {}: {:x}",
            self.cur_offset(),
            value
        );
        let bytes = value.to_le_bytes();
        self.data.extend_from_slice(&bytes[..]);

        // Post-invariant: as for `put1()`.
    }

    /// Add a slice of bytes.
    pub fn put_data(&mut self, data: &[u8]) {
        trace!(
            "MachBuffer: put data @ {}: len {}",
            self.cur_offset(),
            data.len()
        );
        self.data.extend_from_slice(data);

        // Post-invariant: as for `put1()`.
    }

    /// Reserve appended space and return a mutable slice referring to it.
    pub fn get_appended_space(&mut self, len: usize) -> &mut [u8] {
        trace!("MachBuffer: put data @ {}: len {}", self.cur_offset(), len);
        let off = self.data.len();
        let new_len = self.data.len() + len;
        self.data.resize(new_len, 0);
        &mut self.data[off..]

        // Post-invariant: as for `put1()`.
    }

    /// Align up to the given alignment.
    pub fn align_to(&mut self, align_to: CodeOffset) {
        trace!("MachBuffer: align to {}", align_to);
        assert!(
            align_to.is_power_of_two(),
            "{} is not a power of two",
            align_to
        );
        while self.cur_offset() & (align_to - 1) != 0 {
            self.put1(0);
        }

        // Post-invariant: as for `put1()`.
    }

    /// Allocate a `Label` to refer to some offset. May not be bound to a fixed
    /// offset yet.
    pub fn get_label(&mut self) -> MachLabel {
        let l = self.label_offsets.len() as u32;
        self.label_offsets.push(UNKNOWN_LABEL_OFFSET);
        self.label_aliases.push(UNKNOWN_LABEL);
        trace!("MachBuffer: new label -> {:?}", MachLabel(l));
        MachLabel(l)

        // Post-invariant: the only mutation is to add a new label; it has no
        // bound offset yet, so it trivially satisfies all invariants.
    }

    /// Reserve the first N MachLabels for blocks.
    pub fn reserve_labels_for_blocks(&mut self, blocks: usize) {
        trace!("MachBuffer: first {} labels are for blocks", blocks);
        debug_assert!(self.label_offsets.is_empty());
        self.label_offsets.resize(blocks, UNKNOWN_LABEL_OFFSET);
        self.label_aliases.resize(blocks, UNKNOWN_LABEL);

        // Post-invariant: as for `get_label()`.
    }

    /// Reserve the next N MachLabels for constants.
    pub fn reserve_labels_for_constants(&mut self, constants: &VCodeConstants) {
        trace!(
            "MachBuffer: next {} labels are for constants",
            constants.len()
        );
        for c in constants.keys() {
            self.constant_labels[c] = self.get_label();
        }

        // Post-invariant: as for `get_label()`.
    }

    /// Retrieve the reserved label for a constant.
    pub fn get_label_for_constant(&self, constant: VCodeConstant) -> MachLabel {
        self.constant_labels[constant]
    }

    /// Bind a label to the current offset. A label can only be bound once.
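    ///
    /// A minimal illustrative sketch (not a runnable doctest; it assumes a
    /// `MachBuffer` for some concrete backend instruction type):
    ///
    /// ```ignore
    /// let l = buf.get_label();
    /// // ... emit some code ...
    /// buf.bind_label(l); // every use of `l` now resolves to this offset
    /// ```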
    pub fn bind_label(&mut self, label: MachLabel) {
        trace!(
            "MachBuffer: bind label {:?} at offset {}",
            label,
            self.cur_offset()
        );
        debug_assert_eq!(self.label_offsets[label.0 as usize], UNKNOWN_LABEL_OFFSET);
        debug_assert_eq!(self.label_aliases[label.0 as usize], UNKNOWN_LABEL);
        let offset = self.cur_offset();
        self.label_offsets[label.0 as usize] = offset;
        self.lazily_clear_labels_at_tail();
        self.labels_at_tail.push(label);

        // Invariants hold: bound offset of label is <= cur_offset (in fact it
        // is equal). If the `labels_at_tail` list was complete and precise
        // before, it is still, because we have bound this label to the current
        // offset and added it to the list (which contains all labels at the
        // current offset).

        self.optimize_branches();

        // Post-invariant: by `optimize_branches()` (see argument there).
    }

    /// Lazily clear `labels_at_tail` if the tail offset has moved beyond the
    /// offset that it applies to.
    fn lazily_clear_labels_at_tail(&mut self) {
        let offset = self.cur_offset();
        if offset > self.labels_at_tail_off {
            self.labels_at_tail_off = offset;
            self.labels_at_tail.clear();
        }

        // Post-invariant: either labels_at_tail_off was at cur_offset, and
        // state is untouched, or was less than cur_offset, in which case the
        // labels_at_tail list was conceptually empty, and is now actually
        // empty.
    }

    /// Resolve a label to an offset, if known. May return `UNKNOWN_LABEL_OFFSET`.
    pub(crate) fn resolve_label_offset(&self, mut label: MachLabel) -> CodeOffset {
        let mut iters = 0;
        while self.label_aliases[label.0 as usize] != UNKNOWN_LABEL {
            label = self.label_aliases[label.0 as usize];
            // To protect against an infinite loop (despite our assurances to
            // ourselves that the invariants make this impossible), assert out
            // after 1M iterations. The number of basic blocks is limited
            // in most contexts anyway so this should be impossible to hit with
            // a legitimate input.
            iters += 1;
            assert!(iters < 1_000_000, "Unexpected cycle in label aliases");
        }
        self.label_offsets[label.0 as usize]

        // Post-invariant: no mutations.
    }

    /// Emit a reference to the given label with the given reference type (i.e.,
    /// branch-instruction format) at the current offset. This is like a
    /// relocation, but handled internally.
    ///
    /// This can be called before the branch is actually emitted; fixups will
    /// not happen until an island is emitted or the buffer is finished.
    pub fn use_label_at_offset(&mut self, offset: CodeOffset, label: MachLabel, kind: I::LabelUse) {
        trace!(
            "MachBuffer: use_label_at_offset: offset {} label {:?} kind {:?}",
            offset,
            label,
            kind
        );

        // Add the fixup, and update the worst-case island size based on a
        // veneer for this label use.
        self.fixup_records.push(MachLabelFixup {
            label,
            offset,
            kind,
        });
        if kind.supports_veneer() {
            self.island_worst_case_size += kind.veneer_size();
            self.island_worst_case_size &= !(I::LabelUse::ALIGN - 1);
        }
        let deadline = offset.saturating_add(kind.max_pos_range());
        if deadline < self.island_deadline {
            self.island_deadline = deadline;
        }

        // Post-invariant: no mutations to branches/labels data structures.
    }

    /// Inform the buffer of an unconditional branch at the given offset,
    /// targeting the given label. May be used to optimize branches.
    /// The last added label-use must correspond to this branch.
    /// This must be called when the current offset is equal to `start`; i.e.,
    /// before actually emitting the branch. This implies that for a branch that
    /// uses a label and is eligible for optimizations by the MachBuffer, the
    /// proper sequence is:
    ///
    /// - Call `use_label_at_offset()` to emit the fixup record.
    /// - Call `add_uncond_branch()` to make note of the branch.
    /// - Emit the bytes for the branch's machine code.
    ///
    /// Additional requirement: no labels may be bound between `start` and `end`
    /// (exclusive on both ends).
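    ///
    /// A rough sketch of that sequence from a backend emitter (illustrative
    /// only; the 4-byte branch size, the `LabelUse::Branch26` kind, and the
    /// `enc_jump26` helper are stand-ins for backend-specific details):
    ///
    /// ```ignore
    /// let start = buffer.cur_offset();
    /// buffer.use_label_at_offset(start, target, LabelUse::Branch26);
    /// buffer.add_uncond_branch(start, start + 4, target);
    /// buffer.put4(enc_jump26(0)); // branch bytes; offset patched later
    /// ```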
    pub fn add_uncond_branch(&mut self, start: CodeOffset, end: CodeOffset, target: MachLabel) {
        assert!(self.cur_offset() == start);
        debug_assert!(end > start);
        assert!(!self.fixup_records.is_empty());
        let fixup = self.fixup_records.len() - 1;
        self.lazily_clear_labels_at_tail();
        self.latest_branches.push(MachBranch {
            start,
            end,
            target,
            fixup,
            inverted: None,
            labels_at_this_branch: self.labels_at_tail.clone(),
        });

        // Post-invariant: we asserted branch start is current tail; the list of
        // labels at branch is cloned from list of labels at current tail.
    }

    /// Inform the buffer of a conditional branch at the given offset,
    /// targeting the given label. May be used to optimize branches.
    /// The last added label-use must correspond to this branch.
    ///
    /// Additional requirement: no labels may be bound between `start` and `end`
    /// (exclusive on both ends).
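    ///
    /// Illustrative sketch (not a doctest; `enc_cond_br` and `invert_cond` are
    /// hypothetical backend helpers, and the 4-byte size is an assumption):
    ///
    /// ```ignore
    /// let start = buffer.cur_offset();
    /// buffer.use_label_at_offset(start, taken, LabelUse::Branch19);
    /// let inverted = enc_cond_br(invert_cond(cond), 0); // same length as the real branch
    /// buffer.add_cond_branch(start, start + 4, taken, &inverted);
    /// buffer.put4(enc_cond_br(cond, 0));
    /// ```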
    pub fn add_cond_branch(
        &mut self,
        start: CodeOffset,
        end: CodeOffset,
        target: MachLabel,
        inverted: &[u8],
    ) {
        assert!(self.cur_offset() == start);
        debug_assert!(end > start);
        assert!(!self.fixup_records.is_empty());
        debug_assert!(inverted.len() == (end - start) as usize);
        let fixup = self.fixup_records.len() - 1;
        let inverted = Some(SmallVec::from(inverted));
        self.lazily_clear_labels_at_tail();
        self.latest_branches.push(MachBranch {
            start,
            end,
            target,
            fixup,
            inverted,
            labels_at_this_branch: self.labels_at_tail.clone(),
        });

        // Post-invariant: we asserted branch start is current tail; labels at
        // branch list is cloned from list of labels at current tail.
    }

    fn truncate_last_branch(&mut self) {
        self.lazily_clear_labels_at_tail();
        // Invariants hold at this point.

        let b = self.latest_branches.pop().unwrap();
        assert!(b.end == self.cur_offset());

        // State:
        //    [PRE CODE]
        //  Offset b.start, b.labels_at_this_branch:
        //    [BRANCH CODE]
        //  cur_off, self.labels_at_tail -->
        //    (end of buffer)
        self.data.truncate(b.start as usize);
        self.fixup_records.truncate(b.fixup);
        while let Some(last_srcloc) = self.srclocs.last_mut() {
            if last_srcloc.end <= b.start {
                break;
            }
            if last_srcloc.start < b.start {
                last_srcloc.end = b.start;
                break;
            }
            self.srclocs.pop();
        }
        // State:
        //    [PRE CODE]
        //  cur_off, Offset b.start, b.labels_at_this_branch:
        //    (end of buffer)
        //
        //  self.labels_at_tail --> (past end of buffer)
        let cur_off = self.cur_offset();
        self.labels_at_tail_off = cur_off;
        // State:
        //    [PRE CODE]
        //  cur_off, Offset b.start, b.labels_at_this_branch,
        //  self.labels_at_tail:
        //    (end of buffer)
        //
        //  resolve_label_offset(l) for l in labels_at_tail:
        //    (past end of buffer)

        trace!(
            "truncate_last_branch: truncated {:?}; off now {}",
            b,
            cur_off
        );

        // Fix up resolved label offsets for labels at tail.
        for &l in &self.labels_at_tail {
            self.label_offsets[l.0 as usize] = cur_off;
        }
        // Old labels_at_this_branch are now at cur_off.
        self.labels_at_tail
            .extend(b.labels_at_this_branch.into_iter());

        // Post-invariant: this operation is defined to truncate the buffer,
        // which moves cur_off backward, and to move labels at the end of the
        // buffer back to the start-of-branch offset.
        //
        // latest_branches satisfies all invariants:
        // - it has no branches past the end of the buffer (branches are in
        //   order, we removed the last one, and we truncated the buffer to just
        //   before the start of that branch)
        // - no labels were moved to lower offsets than the (new) cur_off, so
        //   the labels_at_this_branch list for any other branch need not change.
        //
        // labels_at_tail satisfies all invariants:
        // - all labels that were at the tail after the truncated branch are
        //   moved backward to just before the branch, which becomes the new tail;
        //   thus every element in the list should remain (ensured by `.extend()`
        //   above).
        // - all labels that refer to the new tail, which is the start-offset of
        //   the truncated branch, must be present. The `labels_at_this_branch`
        //   list in the truncated branch's record is a complete and precise list
        //   of exactly these labels; we append these to labels_at_tail.
        // - labels_at_tail_off is at cur_off after truncation occurs, so the
        //   list is valid (not to be lazily cleared).
        //
        // The stated operation was performed:
        // - For each label at the end of the buffer prior to this method, it
        //   now resolves to the new (truncated) end of the buffer: it must have
        //   been in `labels_at_tail` (this list is precise and complete, and
        //   the tail was at the end of the truncated branch on entry), and we
        //   iterate over this list and set `label_offsets` to the new tail.
        //   None of these labels could have been an alias (by invariant), so
        //   `label_offsets` is authoritative for each.
        // - No other labels will be past the end of the buffer, because of the
        //   requirement that no labels be bound to the middle of branch ranges
        //   (see comments to `add_{cond,uncond}_branch()`).
        // - The buffer is truncated to just before the last branch, and the
        //   fixup record referring to that last branch is removed.
    }

    fn optimize_branches(&mut self) {
        self.lazily_clear_labels_at_tail();
        // Invariants valid at this point.

        trace!(
            "enter optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
            self.latest_branches,
            self.labels_at_tail,
            self.fixup_records
        );

        // We continue to munch on branches at the tail of the buffer until no
        // more rules apply. Note that the loop only continues if a branch is
        // actually truncated (or if labels are redirected away from a branch),
        // so this always makes progress.
        while let Some(b) = self.latest_branches.last() {
            let cur_off = self.cur_offset();
            trace!("optimize_branches: last branch {:?} at off {}", b, cur_off);
            // If there has been any code emission since the end of the last branch or
            // label definition, then there's nothing we can edit (because we
            // don't move code once placed, only back up and overwrite), so
            // clear the records and finish.
            if b.end < cur_off {
                break;
            }

            // If the "labels at this branch" list on this branch is
            // longer than a threshold, don't do any simplification,
            // and let the branch remain to separate those labels from
            // the current tail. This avoids quadratic behavior (see
            // #3468): otherwise, if a long string of "goto next;
            // next:" patterns are emitted, all of the labels will
            // coalesce into a long list of aliases for the current
            // buffer tail. We must track all aliases of the current
            // tail for correctness, but we are also allowed to skip
            // optimization (removal) of any branch, so we take the
            // escape hatch here and let it stand. In effect this
            // "spreads" the many thousands of labels in the
            // pathological case among an actual (harmless but
            // suboptimal) instruction once per N labels.
            if b.labels_at_this_branch.len() > LABEL_LIST_THRESHOLD {
                break;
            }

            // Invariant: we are looking at a branch that ends at the tail of
            // the buffer.

            // For any branch, conditional or unconditional:
            // - If the target is a label at the current offset, then remove
            //   the conditional branch, and reset all labels that targeted
            //   the current offset (end of branch) to the truncated
            //   end-of-code.
            //
            // Preserves execution semantics: a branch to its own fallthrough
            // address is equivalent to a no-op; in both cases, nextPC is the
            // fallthrough.
            if self.resolve_label_offset(b.target) == cur_off {
                trace!("branch with target == cur off; truncating");
                self.truncate_last_branch();
                continue;
            }

            // If latest is an unconditional branch:
            //
            // - If the branch's target is not its own start address, then for
            //   each label at the start of branch, make the label an alias of the
            //   branch target, and remove the label from the "labels at this
            //   branch" list.
            //
            //   - Preserves execution semantics: an unconditional branch's
            //     only effect is to set PC to a new PC; this change simply
            //     collapses one step in the step-semantics.
            //
            //   - Post-invariant: the labels that were bound to the start of
            //     this branch become aliases, so they must not be present in any
            //     labels-at-this-branch list or the labels-at-tail list. The
            //     labels are removed from the latest-branch record's
            //     labels-at-this-branch list, and are never placed in the
            //     labels-at-tail list. Furthermore, it is correct that they are
            //     not in either list, because they are now aliases, and labels
            //     that are aliases remain aliases forever.
            //
            // - If there is a prior unconditional branch that ends just before
            //   this one begins, and this branch has no labels bound to its
            //   start, then we can truncate this branch, because it is entirely
            //   unreachable (we have redirected all labels that make it
            //   reachable otherwise). Do so and continue around the loop.
            //
            //   - Preserves execution semantics: the branch is unreachable,
            //     because execution can only flow into an instruction from the
            //     prior instruction's fallthrough or from a branch bound to that
            //     instruction's start offset. Unconditional branches have no
            //     fallthrough, so if the prior instruction is an unconditional
            //     branch, no fallthrough entry can happen. The
            //     labels-at-this-branch list is complete (by invariant), so if it
            //     is empty, then the instruction is entirely unreachable. Thus,
            //     it can be removed.
            //
            //   - Post-invariant: ensured by truncate_last_branch().
            //
            // - If there is a prior conditional branch whose target label
            //   resolves to the current offset (branches around the
            //   unconditional branch), then remove the unconditional branch,
            //   and make the target of the unconditional the target of the
            //   conditional instead.
            //
            //   - Preserves execution semantics: previously we had:
            //
            //         L1:
            //            cond_br L2
            //            br L3
            //         L2:
            //            (end of buffer)
            //
            //     by removing the last branch, we have:
            //
            //         L1:
            //            cond_br L2
            //         L2:
            //            (end of buffer)
            //
            //     we then fix up the records for the conditional branch to
            //     have:
            //
            //         L1:
            //            cond_br.inverted L3
            //         L2:
            //
            //     In the original code, control flow reaches L2 when the
            //     conditional branch's predicate is true, and L3 otherwise. In
            //     the optimized code, the same is true.
            //
            //   - Post-invariant: all edits to latest_branches and
            //     labels_at_tail are performed by `truncate_last_branch()`,
            //     which maintains the invariants at each step.

            if b.is_uncond() {
                // Set any label equal to current branch's start as an alias of
                // the branch's target, if the target is not the branch itself
                // (i.e., an infinite loop).
                //
                // We cannot perform this aliasing if the target of this branch
                // ultimately aliases back here; if so, we need to keep this
                // branch, so break out of this loop entirely (and clear the
                // latest-branches list below).
                //
                // Note that this check is what prevents cycles from forming in
                // `self.label_aliases`. To see why, consider an arbitrary start
                // state:
                //
                //     label_aliases[L1] = L2, label_aliases[L2] = L3, ..., up to
                //     Ln, which is not aliased.
                //
                // We would create a cycle if we assigned label_aliases[Ln]
                // = L1. Note that the below assignment is the only write
                // to label_aliases.
                //
                // By our other invariants, we have that Ln (`l` below)
                // resolves to the offset `b.start`, because it is in the
                // set `b.labels_at_this_branch`.
                //
                // If L1 were already aliased, through some arbitrarily deep
                // chain, to Ln, then it must also resolve to this offset
                // `b.start`.
                //
                // By checking the resolution of `L1` against this offset,
                // and aborting this branch-simplification if they are
                // equal, we prevent the below assignment from ever creating
                // a cycle.
                if self.resolve_label_offset(b.target) != b.start {
                    let redirected = b.labels_at_this_branch.len();
                    for &l in &b.labels_at_this_branch {
                        trace!(
                            " -> label at start of branch {:?} redirected to target {:?}",
                            l,
                            b.target
                        );
                        self.label_aliases[l.0 as usize] = b.target;
                        // NOTE: we continue to ensure the invariant that labels
                        // pointing to tail of buffer are in `labels_at_tail`
                        // because we already ensured above that the last branch
                        // cannot have a target of `cur_off`; so we never have
                        // to put the label into `labels_at_tail` when moving it
                        // here.
                    }
                    // Maintain invariant: all branches have been redirected
                    // and are no longer pointing at the start of this branch.
                    let mut_b = self.latest_branches.last_mut().unwrap();
                    mut_b.labels_at_this_branch.clear();

                    if redirected > 0 {
                        trace!(" -> after label redirects, restarting loop");
                        continue;
                    }
                } else {
                    break;
                }

                let b = self.latest_branches.last().unwrap();

                // Examine any immediately preceding branch.
                if self.latest_branches.len() > 1 {
                    let prev_b = &self.latest_branches[self.latest_branches.len() - 2];
                    trace!(" -> more than one branch; prev_b = {:?}", prev_b);
                    // This uncond is immediately after another uncond; we
                    // should have already redirected labels to this uncond away
                    // (but check to be sure); so we can truncate this uncond.
                    if prev_b.is_uncond()
                        && prev_b.end == b.start
                        && b.labels_at_this_branch.is_empty()
                    {
                        trace!(" -> uncond follows another uncond; truncating");
                        self.truncate_last_branch();
                        continue;
                    }

                    // This uncond is immediately after a conditional, and the
                    // conditional's target is the end of this uncond, and we've
                    // already redirected labels to this uncond away; so we can
                    // truncate this uncond, flip the sense of the conditional, and
                    // set the conditional's target (in `latest_branches` and in
                    // `fixup_records`) to the uncond's target.
                    if prev_b.is_cond()
                        && prev_b.end == b.start
                        && self.resolve_label_offset(prev_b.target) == cur_off
                    {
                        trace!(" -> uncond follows a conditional, and conditional's target resolves to current offset");
                        // Save the target of the uncond (this becomes the
                        // target of the cond), and truncate the uncond.
                        let target = b.target;
                        let data = prev_b.inverted.clone().unwrap();
                        self.truncate_last_branch();

                        // Mutate the code and cond branch.
                        let off_before_edit = self.cur_offset();
                        let prev_b = self.latest_branches.last_mut().unwrap();
                        let not_inverted = SmallVec::from(
                            &self.data[(prev_b.start as usize)..(prev_b.end as usize)],
                        );

                        // Low-level edit: replaces bytes of branch with
                        // inverted form. cur_off remains the same afterward, so
                        // we do not need to modify label data structures.
                        self.data.truncate(prev_b.start as usize);
                        self.data.extend_from_slice(&data[..]);

                        // Save the original code as the inversion of the
                        // inverted branch, in case we later edit this branch
                        // again.
                        prev_b.inverted = Some(not_inverted);
                        self.fixup_records[prev_b.fixup].label = target;
                        trace!(" -> reassigning target of condbr to {:?}", target);
                        prev_b.target = target;
                        debug_assert_eq!(off_before_edit, self.cur_offset());
                        continue;
                    }
                }
            }

            // If we couldn't do anything with the last branch, then break.
            break;
        }

        self.purge_latest_branches();

        trace!(
            "leave optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
            self.latest_branches,
            self.labels_at_tail,
            self.fixup_records
        );
    }

    fn purge_latest_branches(&mut self) {
        // All of our branch simplification rules work only if a branch ends at
        // the tail of the buffer, with no following code; and branches are in
        // order in latest_branches; so if the last entry ends prior to
        // cur_offset, then clear all entries.
        let cur_off = self.cur_offset();
        if let Some(l) = self.latest_branches.last() {
            if l.end < cur_off {
                trace!("purge_latest_branches: removing branch {:?}", l);
                self.latest_branches.clear();
            }
        }

        // Post-invariant: no invariant requires any branch to appear in
        // `latest_branches`; it is always optional. The list-clear above thus
        // preserves all semantics.
    }

    /// Emit a constant at some point in the future, binding the given label to
    /// its offset. The constant will be placed at most `max_distance` from the
    /// current offset.
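    ///
    /// A minimal sketch of typical use from a backend (illustrative, not a
    /// doctest; the alignment, label-use kind, and load encoding are
    /// assumptions):
    ///
    /// ```ignore
    /// let lbl = buffer.get_label();
    /// buffer.defer_constant(lbl, 8, &value.to_le_bytes(), max_range);
    /// // Reference the constant from a PC-relative load; patched later.
    /// buffer.use_label_at_offset(buffer.cur_offset(), lbl, LabelUse::PCRel19);
    /// // ... emit the load-literal instruction bytes here ...
    /// ```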
    pub fn defer_constant(
        &mut self,
        label: MachLabel,
        align: CodeOffset,
        data: &[u8],
        max_distance: CodeOffset,
    ) {
        trace!(
            "defer_constant: eventually emit {} bytes aligned to {} at label {:?}",
            data.len(),
            align,
            label
        );
        self.update_deadline(data.len(), max_distance);
        self.pending_constants.push(MachLabelConstant {
            label,
            align,
            data: SmallVec::from(data),
        });
    }

    /// Emit a trap at some point in the future with the specified code and
    /// stack map.
    ///
    /// This function returns a [`MachLabel`] which will be the future address
    /// of the trap. Jumps should refer to this label, likely by using the
    /// [`MachBuffer::use_label_at_offset`] method, to get a relocation
    /// patched in once the address of the trap is known.
    ///
    /// This will batch all traps into the end of the function.
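    ///
    /// Illustrative sketch (not a doctest; the label-use kind and the branch
    /// encoding are backend-specific stand-ins):
    ///
    /// ```ignore
    /// let trap = buffer.defer_trap(TrapCode::HeapOutOfBounds, None);
    /// let off = buffer.cur_offset();
    /// buffer.use_label_at_offset(off, trap, LabelUse::Branch19);
    /// // ... emit the conditional branch that jumps to `trap` on failure ...
    /// ```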
    pub fn defer_trap(&mut self, code: TrapCode, stack_map: Option<StackMap>) -> MachLabel {
        let label = self.get_label();
        self.update_deadline(I::TRAP_OPCODE.len(), u32::MAX);
        self.pending_traps.push(MachLabelTrap {
            label,
            code,
            stack_map,
            loc: self.cur_srcloc.map(|(_start, loc)| loc),
        });
        label
    }

    fn update_deadline(&mut self, len: usize, max_distance: CodeOffset) {
        trace!("defer: eventually emit {} bytes", len);
        let deadline = self.cur_offset().saturating_add(max_distance);
        self.island_worst_case_size += len as CodeOffset;
        self.island_worst_case_size =
            (self.island_worst_case_size + I::LabelUse::ALIGN - 1) & !(I::LabelUse::ALIGN - 1);
        if deadline < self.island_deadline {
            self.island_deadline = deadline;
        }
    }

    /// Is an island needed within the next N bytes?
    pub fn island_needed(&self, distance: CodeOffset) -> bool {
        self.worst_case_end_of_island(distance) > self.island_deadline
    }

    /// Returns the maximal offset that islands can reach if `distance` more
    /// bytes are appended.
    ///
    /// This is used to determine if veneers need insertions since jumps that
    /// can't reach past this point must get a veneer of some form.
    fn worst_case_end_of_island(&self, distance: CodeOffset) -> CodeOffset {
        self.cur_offset()
            .saturating_add(distance)
            .saturating_add(self.island_worst_case_size)
    }

    /// Emit all pending constants and required pending veneers.
    ///
    /// Should only be called if `island_needed()` returns true, i.e., if we
    /// actually reach a deadline. It's not necessarily a problem to do so
    /// otherwise but it may result in unnecessary work during emission.
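    ///
    /// A rough sketch of how a backend's emission loop might drive this
    /// between blocks (illustrative; `worst_case_block_size` is a hypothetical
    /// per-backend estimate, not part of this API):
    ///
    /// ```ignore
    /// if buffer.island_needed(worst_case_block_size) {
    ///     buffer.emit_island(worst_case_block_size);
    /// }
    /// ```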
    pub fn emit_island(&mut self, distance: CodeOffset) {
        self.emit_island_maybe_forced(false, distance);
    }

    /// Same as `emit_island`, but an internal API with a `force_veneers`
    /// argument to force all veneers to always get emitted for debugging.
    fn emit_island_maybe_forced(&mut self, force_veneers: bool, distance: CodeOffset) {
        // We're going to purge fixups, so no latest-branch editing can happen
        // anymore.
        self.latest_branches.clear();

        // Reset internal calculations about islands since we're going to
        // change the calculus as we apply fixups. The `forced_threshold` is
        // used here to determine whether jumps to unknown labels will require
        // a veneer or not.
        let forced_threshold = self.worst_case_end_of_island(distance);
        self.island_deadline = UNKNOWN_LABEL_OFFSET;
        self.island_worst_case_size = 0;

        // End the current location tracking since anything emitted during this
        // function shouldn't be attributed to whatever the current source
        // location is.
        //
        // Note that the current source location, if it's set right now, will be
        // restored at the end of this island emission.
        let cur_loc = self.cur_srcloc.map(|(_, loc)| loc);
        if cur_loc.is_some() {
            self.end_srcloc();
        }

        // First flush out all traps/constants so we have more labels in case
        // fixups are applied against these labels.
        //
        // Note that traps are placed first since this typically happens at the
        // end of the function and for disassemblers we try to keep all the code
        // contiguously together.
        for MachLabelTrap {
            label,
            code,
            stack_map,
            loc,
        } in mem::take(&mut self.pending_traps)
        {
            // If this trap has source information associated with it then
            // emit this information for the trap instruction going out now too.
            if let Some(loc) = loc {
                self.start_srcloc(loc);
            }
            self.align_to(I::LabelUse::ALIGN);
            self.bind_label(label);
            self.add_trap(code);
            if let Some(map) = stack_map {
                let extent = StackMapExtent::UpcomingBytes(I::TRAP_OPCODE.len() as u32);
                self.add_stack_map(extent, map);
            }
            self.put_data(I::TRAP_OPCODE);
            if loc.is_some() {
                self.end_srcloc();
            }
        }

        for MachLabelConstant { label, align, data } in mem::take(&mut self.pending_constants) {
            self.align_to(align);
            self.bind_label(label);
            self.put_data(&data[..]);
        }

        for fixup in mem::take(&mut self.fixup_records) {
            trace!("emit_island: fixup {:?}", fixup);
            let MachLabelFixup {
                label,
                offset,
                kind,
            } = fixup;
            let label_offset = self.resolve_label_offset(label);
            let start = offset as usize;
            let end = (offset + kind.patch_size()) as usize;

            if label_offset != UNKNOWN_LABEL_OFFSET {
                // If the offset of the label for this fixup is known then
                // we're going to do something here-and-now. We're either going
                // to patch the original offset because it's an in-bounds jump,
                // or we're going to generate a veneer, patch the fixup to jump
                // to the veneer, and then keep going.
                //
                // If the label comes after the original fixup, then we should
                // be guaranteed that the jump is in-bounds. Otherwise there's
                // a bug somewhere because this method wasn't called soon
                // enough. All forward-jumps are tracked and should get veneers
                // before their deadline comes and they're unable to jump
                // further.
                //
                // Otherwise if the label is before the fixup, then that's a
                // backwards jump. If it's past the maximum negative range
                // then we'll emit a veneer that we can jump forward to, which
                // can then jump backwards.
                let veneer_required = if label_offset >= offset {
                    assert!((label_offset - offset) <= kind.max_pos_range());
                    false
                } else {
                    (offset - label_offset) > kind.max_neg_range()
                };
                trace!(
                    " -> label_offset = {}, known, required = {} (pos {} neg {})",
                    label_offset,
                    veneer_required,
                    kind.max_pos_range(),
                    kind.max_neg_range()
                );

                if (force_veneers && kind.supports_veneer()) || veneer_required {
                    self.emit_veneer(label, offset, kind);
                } else {
                    let slice = &mut self.data[start..end];
                    trace!("patching in-range!");
                    kind.patch(slice, offset, label_offset);
                }
            } else {
                // If the offset of this label is not known at this time then
                // there's one of two possibilities:
                //
                // * First we may be about to exceed the maximum jump range of
                //   this fixup. In that case a veneer is inserted to buy some
                //   more budget for the forward-jump. It's guaranteed that the
                //   label will eventually come after where we're at, so we know
                //   that the forward jump is necessary.
                //
                // * Otherwise we're still within range of the forward jump but
                //   the precise target isn't known yet. In that case we
                //   enqueue the fixup to get processed later.
                if forced_threshold - offset > kind.max_pos_range() {
                    self.emit_veneer(label, offset, kind);
                } else {
                    self.use_label_at_offset(offset, label, kind);
                }
            }
        }

        if let Some(loc) = cur_loc {
            self.start_srcloc(loc);
        }
    }

    /// Emits a "veneer" that allows the `kind` label-use at `offset` to reach
    /// `label`.
    ///
    /// This will generate extra machine code, using `kind`, to get a
    /// larger-jump-kind than `kind` allows. The code at `offset` is then
    /// patched to jump to our new code, and then the new code is enqueued for
    /// a fixup to get processed at some later time.
    fn emit_veneer(&mut self, label: MachLabel, offset: CodeOffset, kind: I::LabelUse) {
        // If this `kind` doesn't support a veneer then that's a bug in the
        // backend because we need to implement support for such a veneer.
        assert!(
            kind.supports_veneer(),
            "jump beyond the range of {:?} but a veneer isn't supported",
            kind,
        );

        // Allocate space for a veneer in the island.
        self.align_to(I::LabelUse::ALIGN);
        let veneer_offset = self.cur_offset();
        trace!("making a veneer at {}", veneer_offset);
        let start = offset as usize;
        let end = (offset + kind.patch_size()) as usize;
        let slice = &mut self.data[start..end];
        // Patch the original label use to refer to the veneer.
        trace!(
            "patching original at offset {} to veneer offset {}",
            offset,
            veneer_offset
        );
        kind.patch(slice, offset, veneer_offset);
        // Generate the veneer.
        let veneer_slice = self.get_appended_space(kind.veneer_size() as usize);
        let (veneer_fixup_off, veneer_label_use) =
            kind.generate_veneer(veneer_slice, veneer_offset);
        trace!(
            "generated veneer; fixup offset {}, label_use {:?}",
            veneer_fixup_off,
            veneer_label_use
        );
        // Register a new use of `label` with our new veneer fixup and offset.
        // This'll recalculate deadlines accordingly and enqueue this fixup to
        // get processed at some later time.
        self.use_label_at_offset(veneer_fixup_off, label, veneer_label_use);
    }

    fn finish_emission_maybe_forcing_veneers(&mut self, force_veneers: bool) {
        while !self.pending_constants.is_empty()
            || !self.pending_traps.is_empty()
            || !self.fixup_records.is_empty()
        {
            // `emit_island()` will emit any pending veneers and constants, and
            // as a side-effect, will also take care of any fixups with resolved
            // labels eagerly.
            self.emit_island_maybe_forced(force_veneers, u32::MAX);
        }

        // Ensure that all labels have been fixed up after the last island is emitted. This is a
        // full (release-mode) assert because an unresolved label means the emitted code is
        // incorrect.
        assert!(self.fixup_records.is_empty());
    }

    /// Finish any deferred emissions and/or fixups.
    pub fn finish(mut self) -> MachBufferFinalized<Stencil> {
        let _tt = timing::vcode_emit_finish();

        // Do any optimizations on branches at tail of buffer, as if we
        // had bound one last label.
        self.optimize_branches();

        self.finish_emission_maybe_forcing_veneers(false);

        let mut srclocs = self.srclocs;
        srclocs.sort_by_key(|entry| entry.start);

        MachBufferFinalized {
            data: self.data,
            relocs: self.relocs,
            traps: self.traps,
            call_sites: self.call_sites,
            srclocs,
            stack_maps: self.stack_maps,
            unwind_info: self.unwind_info,
        }
    }

1372 /// Add an external relocation at the current offset.
1373 pub fn add_reloc(&mut self, kind: Reloc, name: &ExternalName, addend: Addend) {
1374 let name = name.clone();
1375 // FIXME(#3277): This should use `I::LabelUse::from_reloc` to optionally
1376 // generate a label-use statement to track whether an island is possibly
1377 // needed to escape this function to actually get to the external name.
1378 // This is most likely to come up on AArch64 where calls between
1379 // functions use a 26-bit signed offset which gives +/- 64MB. This means
1380 // that if a function is 128MB in size and there's a call in the middle,
1381 // it's impossible to reach the actual target. Also, while it's
1382 // technically possible to jump to the start of a function and then jump
1383 // further, island insertion below always inserts islands after
1384 // previously appended code, so for Cranelift's own implementation this
1385 // is also a problem for 64MB functions on AArch64 that start with a
1386 // call instruction: those won't be able to escape.
1387 //
1388 // Ideally what needs to happen here is that a `LabelUse` is
1389 // transparently generated (or call-sites of this function are audited
1390 // to generate a `LabelUse` instead) and tracked internally. The actual
1391 // relocation would then change over time if and when a veneer is
1392 // inserted, where the relocation here would be patched by this
1393 // `MachBuffer` to jump to the veneer. The problem, though, is that all
1394 // this still needs to end up, in the case of a singular function,
1395 // generating a final relocation pointing either to this particular
1396 // relocation or to the veneer inserted. Additionally
1397 // `MachBuffer` needs the concept of a label which will never be
1398 // resolved, so `emit_island` doesn't trip over not actually ever
1399 // knowing what some labels are. Currently the loop in
1400 // `finish_emission_maybe_forcing_veneers` would otherwise infinitely
1401 // loop.
1402 //
1403 // For now this means that, because relocs aren't tracked at all,
1404 // AArch64 functions have a rough size limit of 64MB. That's somewhat
1405 // reasonable for the time being, and the failure mode is a panic in
1406 // `MachBuffer` when a relocation can't otherwise be resolved later, so
1407 // it shouldn't actually result in any memory unsafety or anything like that.
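//
// A hypothetical sketch of the label-based approach described above (not
// what this method does today; `label_for_external_name` is a made-up
// helper that does not exist in `MachBuffer`):
//
//     if let Some(label_use) = I::LabelUse::from_reloc(kind, addend) {
//         let label = self.label_for_external_name(&name); // hypothetical
//         self.use_label_at_offset(self.cur_offset(), label, label_use);
//         return;
//     }
//
// `I::LabelUse::from_reloc` is the same hook used by `resolve_reloc` in
// `MachTextSectionBuilder` below; the missing piece is a notion of a label
// that may never be bound within this buffer.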
1408 self.relocs.push(MachReloc {
1409 offset: self.data.len() as CodeOffset,
1410 kind,
1411 name,
1412 addend,
1413 });
1414 }
1415
1416 /// Add a trap record at the current offset.
1417 pub fn add_trap(&mut self, code: TrapCode) {
1418 self.traps.push(MachTrap {
1419 offset: self.data.len() as CodeOffset,
1420 code,
1421 });
1422 }
1423
1424 /// Add a call-site record at the current offset.
1425 pub fn add_call_site(&mut self, opcode: Opcode) {
1426 debug_assert!(
1427 opcode.is_call(),
1428 "adding call site info for a non-call instruction."
1429 );
1430 self.call_sites.push(MachCallSite {
1431 ret_addr: self.data.len() as CodeOffset,
1432 opcode,
1433 });
1434 }
1435
1436 /// Add an unwind record at the current offset.
1437 pub fn add_unwind(&mut self, unwind: UnwindInst) {
1438 self.unwind_info.push((self.cur_offset(), unwind));
1439 }
1440
1441 /// Set the `SourceLoc` for code from this offset until the offset at the
1442 /// next call to `end_srcloc()`.
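    ///
    /// An illustrative pairing (sketch only; `loc` is whatever `RelSourceLoc`
    /// applies to the upcoming instructions):
    ///
    /// ```ignore
    /// buf.start_srcloc(loc);
    /// // ... emit the instruction(s) attributed to `loc` ...
    /// buf.end_srcloc();
    /// ```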
1443 pub fn start_srcloc(&mut self, loc: RelSourceLoc) {
1444 self.cur_srcloc = Some((self.cur_offset(), loc));
1445 }
1446
1447 /// Mark the end of the `SourceLoc` segment started at the last
1448 /// `start_srcloc()` call.
1449 pub fn end_srcloc(&mut self) {
1450 let (start, loc) = self
1451 .cur_srcloc
1452 .take()
1453 .expect("end_srcloc() called without start_srcloc()");
1454 let end = self.cur_offset();
1455 // Skip zero-length extents.
1456 debug_assert!(end >= start);
1457 if end > start {
1458 self.srclocs.push(MachSrcLoc { start, end, loc });
1459 }
1460 }
1461
1462 /// Add stack map metadata for this program point: a set of stack offsets
1463 /// (from SP upward) that contain live references.
1464 ///
1465 /// The `offset_to_fp` value is the offset between the nominal SP (at which the `stack_offsets`
1466 /// are based) and the FP value. By subtracting `offset_to_fp` from each `stack_offsets`
1467 /// element, one can obtain live-reference offsets from FP instead.
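    /// For example (illustrative numbers only): with `offset_to_fp = 64`, a live
    /// reference recorded at nominal-SP offset 16 lies at FP offset 16 - 64 = -48.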
1468 pub fn add_stack_map(&mut self, extent: StackMapExtent, stack_map: StackMap) {
1469 let (start, end) = match extent {
1470 StackMapExtent::UpcomingBytes(insn_len) => {
1471 let start_offset = self.cur_offset();
1472 (start_offset, start_offset + insn_len)
1473 }
1474 StackMapExtent::StartedAtOffset(start_offset) => {
1475 let end_offset = self.cur_offset();
1476 debug_assert!(end_offset >= start_offset);
1477 (start_offset, end_offset)
1478 }
1479 };
1480 trace!("Adding stack map for offsets {start:#x}..{end:#x}");
1481 self.stack_maps.push(MachStackMap {
1482 offset: start,
1483 offset_end: end,
1484 stack_map,
1485 });
1486 }
1487}
1488
1489impl<T: CompilePhase> MachBufferFinalized<T> {
1490 /// Get a list of source location mapping tuples in sorted-by-start-offset order.
1491 pub fn get_srclocs_sorted(&self) -> &[T::MachSrcLocType] {
1492 &self.srclocs[..]
1493 }
1494
1495 /// Get the total required size for the code.
1496 pub fn total_size(&self) -> CodeOffset {
1497 self.data.len() as CodeOffset
1498 }
1499
1500 /// Return the code in this mach buffer as a hex string for testing purposes.
1501 pub fn stringify_code_bytes(&self) -> String {
1502 // This is pretty lame, but whatever ..
1503 use std::fmt::Write;
1504 let mut s = String::with_capacity(self.data.len() * 2);
1505 for b in &self.data {
1506 write!(&mut s, "{:02X}", b).unwrap();
1507 }
1508 s
1509 }
1510
1511 /// Get the code bytes.
1512 pub fn data(&self) -> &[u8] {
1513 // N.B.: we emit every section into the .text section as far as
1514 // the `CodeSink` is concerned; we do not bother to segregate
1515 // the contents into the actual program text, the jumptable and the
1516 // rodata (constant pool). This allows us to generate code assuming
1517 // that these will not be relocated relative to each other, and avoids
1518 // having to designate each section as belonging in one of the three
1519 // fixed categories defined by `CodeSink`. If this becomes a problem
1520 // later (e.g. because of memory permissions or similar), we can
1521 // add this designation and segregate the output; take care, however,
1522 // to add the appropriate relocations in this case.
1523
1524 &self.data[..]
1525 }
1526
1527 /// Get the list of external relocations for this code.
1528 pub fn relocs(&self) -> &[MachReloc] {
1529 &self.relocs[..]
1530 }
1531
1532 /// Get the list of trap records for this code.
1533 pub fn traps(&self) -> &[MachTrap] {
1534 &self.traps[..]
1535 }
1536
1537 /// Get the stack map metadata for this code.
1538 pub fn stack_maps(&self) -> &[MachStackMap] {
1539 &self.stack_maps[..]
1540 }
1541
1542 /// Get the list of call sites for this code.
1543 pub fn call_sites(&self) -> &[MachCallSite] {
1544 &self.call_sites[..]
1545 }
1546}
1547
1548/// A constant that is deferred to the next constant-pool opportunity.
1549struct MachLabelConstant {
1550 /// This label will refer to the constant's offset.
1551 label: MachLabel,
1552 /// Required alignment.
1553 align: CodeOffset,
1554 /// This data will be emitted when able.
1555 data: SmallVec<[u8; 16]>,
1556}
1557
1558 /// A trap that is deferred to the next time an island is emitted for
1559 /// traps, constants, or fixups.
1560struct MachLabelTrap {
1561 /// This label will refer to the trap's offset.
1562 label: MachLabel,
1563 /// The code associated with this trap.
1564 code: TrapCode,
1565 /// An optional stack map to associate with this trap.
1566 stack_map: Option<StackMap>,
1567 /// An optional source location to assign for this trap.
1568 loc: Option<RelSourceLoc>,
1569}
1570
1571/// A fixup to perform on the buffer once code is emitted. Fixups always refer
1572/// to labels and patch the code based on label offsets. Hence, they are like
1573/// relocations, but internal to one buffer.
1574#[derive(Debug)]
1575struct MachLabelFixup<I: VCodeInst> {
1576 /// The label whose offset controls this fixup.
1577 label: MachLabel,
1578 /// The offset to fix up / patch to refer to this label.
1579 offset: CodeOffset,
1580 /// The kind of fixup. This is architecture-specific; each architecture may have,
1581 /// e.g., several types of branch instructions, each with differently-sized
1582 /// offset fields and different places within the instruction to place the
1583 /// bits.
1584 kind: I::LabelUse,
1585}
1586
1587/// A relocation resulting from a compilation.
1588#[derive(Clone, Debug, PartialEq)]
1589#[cfg_attr(feature = "enable-serde", derive(serde::Serialize, serde::Deserialize))]
1590pub struct MachReloc {
1591 /// The offset at which the relocation applies, *relative to the
1592 /// containing section*.
1593 pub offset: CodeOffset,
1594 /// The kind of relocation.
1595 pub kind: Reloc,
1596 /// The external symbol / name to which this relocation refers.
1597 pub name: ExternalName,
1598 /// The addend to add to the symbol value.
1599 pub addend: i64,
1600}
1601
1602/// A trap record resulting from a compilation.
1603#[derive(Clone, Debug, PartialEq)]
1604#[cfg_attr(feature = "enable-serde", derive(serde::Serialize, serde::Deserialize))]
1605pub struct MachTrap {
1606 /// The offset at which the trap instruction occurs, *relative to the
1607 /// containing section*.
1608 pub offset: CodeOffset,
1609 /// The trap code.
1610 pub code: TrapCode,
1611}
1612
1613/// A call site record resulting from a compilation.
1614#[derive(Clone, Debug, PartialEq)]
1615#[cfg_attr(feature = "enable-serde", derive(serde::Serialize, serde::Deserialize))]
1616pub struct MachCallSite {
1617 /// The offset of the call's return address, *relative to the containing section*.
1618 pub ret_addr: CodeOffset,
1619 /// The call's opcode.
1620 pub opcode: Opcode,
1621}
1622
1623/// A source-location mapping resulting from a compilation.
1624#[derive(PartialEq, Debug, Clone)]
1625#[cfg_attr(feature = "enable-serde", derive(serde::Serialize, serde::Deserialize))]
1626pub struct MachSrcLoc<T: CompilePhase> {
1627 /// The start of the region of code corresponding to a source location.
1628 /// This is relative to the start of the function, not to the start of the
1629 /// section.
1630 pub start: CodeOffset,
1631 /// The end of the region of code corresponding to a source location.
1632 /// This is relative to the start of the function, not to the start of the
1633 /// section.
1634 pub end: CodeOffset,
1635 /// The source location.
1636 pub loc: T::SourceLocType,
1637}
1638
1639impl MachSrcLoc<Stencil> {
1640 fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachSrcLoc<Final> {
1641 MachSrcLoc {
1642 start: self.start,
1643 end: self.end,
1644 loc: self.loc.expand(base_srcloc),
1645 }
1646 }
1647}
1648
1649/// Record of stack map metadata: stack offsets containing references.
1650#[derive(Clone, Debug, PartialEq)]
1651#[cfg_attr(feature = "enable-serde", derive(serde::Serialize, serde::Deserialize))]
1652pub struct MachStackMap {
1653 /// The code offset at which this stack map applies.
1654 pub offset: CodeOffset,
1655 /// The code offset just past the "end" of the instruction: that is, the
1656 /// offset of the first byte of the following instruction, or equivalently,
1657 /// the start offset plus the instruction length.
1658 pub offset_end: CodeOffset,
1659 /// The stack map itself.
1660 pub stack_map: StackMap,
1661}
1662
1663/// Record of branch instruction in the buffer, to facilitate editing.
1664#[derive(Clone, Debug)]
1665struct MachBranch {
1666 start: CodeOffset,
1667 end: CodeOffset,
1668 target: MachLabel,
1669 fixup: usize,
1670 inverted: Option<SmallVec<[u8; 8]>>,
1671 /// All labels pointing to the start of this branch. For correctness, this
1672 /// *must* be complete (i.e., must contain all labels whose resolved offsets
1673 /// are at the start of this branch): we rely on being able to redirect all
1674 /// labels that could jump to this branch before removing it, if it is
1675 /// otherwise unreachable.
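    /// For example, if a label resolves to this branch's start and the branch
    /// turns out to be unreachable, that label is first redirected to the
    /// branch's target so it stays valid once the branch bytes are removed.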
1676 labels_at_this_branch: SmallVec<[MachLabel; 4]>,
1677}
1678
1679impl MachBranch {
1680 fn is_cond(&self) -> bool {
1681 self.inverted.is_some()
1682 }
1683 fn is_uncond(&self) -> bool {
1684 self.inverted.is_none()
1685 }
1686}
1687
1688/// Implementation of the `TextSectionBuilder` trait backed by `MachBuffer`.
1689///
1690 /// Note that `MachBuffer` was primarily written for intra-function references
1691 /// such as jumps between basic blocks, but it's also quite usable for entire text
1692 /// sections and resolving references between functions themselves. This
1693/// builder interprets "blocks" as labeled functions for the purposes of
1694/// resolving labels internally in the buffer.
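///
/// An illustrative usage sketch (placeholder names; `INST` stands in for a
/// concrete `VCodeInst` implementation, and the byte slices and offsets are
/// made up):
///
/// ```ignore
/// let mut builder = MachTextSectionBuilder::<INST>::new(2);
/// let off_a = builder.append(true, &func_a_bytes, 4);
/// let off_b = builder.append(true, &func_b_bytes, 4);
/// // Resolve a call in function 0 that targets function 1, if the
/// // relocation kind maps onto a `LabelUse` for this ISA.
/// let handled = builder.resolve_reloc(off_a + call_reloc_offset, reloc, addend, 1);
/// let text: Vec<u8> = builder.finish();
/// ```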
1695pub struct MachTextSectionBuilder<I: VCodeInst> {
1696 buf: MachBuffer<I>,
1697 next_func: usize,
1698 force_veneers: bool,
1699}
1700
1701impl<I: VCodeInst> MachTextSectionBuilder<I> {
1702 /// Creates a new text section builder which will have `num_funcs` functions
1703 /// pushed into it.
1704 pub fn new(num_funcs: usize) -> MachTextSectionBuilder<I> {
1705 let mut buf = MachBuffer::new();
1706 buf.reserve_labels_for_blocks(num_funcs);
1707 MachTextSectionBuilder {
1708 buf,
1709 next_func: 0,
1710 force_veneers: false,
1711 }
1712 }
1713}
1714
1715impl<I: VCodeInst> TextSectionBuilder for MachTextSectionBuilder<I> {
1716 fn append(&mut self, labeled: bool, func: &[u8], align: u32) -> u64 {
1717 // Conditionally emit an island if it's necessary to resolve jumps
1718 // between functions which are too far away.
1719 let size = func.len() as u32;
1720 if self.force_veneers || self.buf.island_needed(size) {
1721 self.buf.emit_island_maybe_forced(self.force_veneers, size);
1722 }
1723
1724 self.buf.align_to(align);
1725 let pos = self.buf.cur_offset();
1726 if labeled {
1727 self.buf
1728 .bind_label(MachLabel::from_block(BlockIndex::new(self.next_func)));
1729 self.next_func += 1;
1730 }
1731 self.buf.put_data(func);
1732 u64::from(pos)
1733 }
1734
1735 fn resolve_reloc(&mut self, offset: u64, reloc: Reloc, addend: Addend, target: usize) -> bool {
1736 let label = MachLabel::from_block(BlockIndex::new(target));
1737 let offset = u32::try_from(offset).unwrap();
1738 match I::LabelUse::from_reloc(reloc, addend) {
1739 Some(label_use) => {
1740 self.buf.use_label_at_offset(offset, label, label_use);
1741 true
1742 }
1743 None => false,
1744 }
1745 }
1746
1747 fn force_veneers(&mut self) {
1748 self.force_veneers = true;
1749 }
1750
1751 fn finish(&mut self) -> Vec<u8> {
1752 // Double-check all functions were pushed.
1753 assert_eq!(self.next_func, self.buf.label_offsets.len());
1754
1755 // Finish up any veneers, if necessary.
1756 self.buf
1757 .finish_emission_maybe_forcing_veneers(self.force_veneers);
1758
1759 // We don't need the data any more, so return it to the caller.
1760 mem::take(&mut self.buf.data).into_vec()
1761 }
1762}
1763
1764// We use an actual instruction definition to do tests, so we depend on the `arm64` feature here.
1765#[cfg(all(test, feature = "arm64"))]
1766mod test {
1767 use cranelift_entity::EntityRef as _;
1768
1769 use super::*;
1770 use crate::ir::UserExternalNameRef;
1771 use crate::isa::aarch64::inst::xreg;
1772 use crate::isa::aarch64::inst::{BranchTarget, CondBrKind, EmitInfo, Inst};
1773 use crate::machinst::MachInstEmit;
1774 use crate::settings;
1775 use std::default::Default;
1776 use std::vec::Vec;
1777
1778 fn label(n: u32) -> MachLabel {
1779 MachLabel::from_block(BlockIndex::new(n as usize))
1780 }
1781 fn target(n: u32) -> BranchTarget {
1782 BranchTarget::Label(label(n))
1783 }
1784
1785 #[test]
1786 fn test_elide_jump_to_next() {
1787 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
1788 let mut buf = MachBuffer::new();
1789 let mut state = Default::default();
1790
1791 buf.reserve_labels_for_blocks(2);
1792 buf.bind_label(label(0));
1793 let inst = Inst::Jump { dest: target(1) };
1794 inst.emit(&[], &mut buf, &info, &mut state);
1795 buf.bind_label(label(1));
1796 let buf = buf.finish();
1797 assert_eq!(0, buf.total_size());
1798 }
1799
1800 #[test]
1801 fn test_elide_trivial_jump_blocks() {
1802 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
1803 let mut buf = MachBuffer::new();
1804 let mut state = Default::default();
1805
1806 buf.reserve_labels_for_blocks(4);
1807
1808 buf.bind_label(label(0));
1809 let inst = Inst::CondBr {
1810 kind: CondBrKind::NotZero(xreg(0)),
1811 taken: target(1),
1812 not_taken: target(2),
1813 };
1814 inst.emit(&[], &mut buf, &info, &mut state);
1815
1816 buf.bind_label(label(1));
1817 let inst = Inst::Jump { dest: target(3) };
1818 inst.emit(&[], &mut buf, &info, &mut state);
1819
1820 buf.bind_label(label(2));
1821 let inst = Inst::Jump { dest: target(3) };
1822 inst.emit(&[], &mut buf, &info, &mut state);
1823
1824 buf.bind_label(label(3));
1825
1826 let buf = buf.finish();
1827 assert_eq!(0, buf.total_size());
1828 }
1829
1830 #[test]
1831 fn test_flip_cond() {
1832 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
1833 let mut buf = MachBuffer::new();
1834 let mut state = Default::default();
1835
1836 buf.reserve_labels_for_blocks(4);
1837
1838 buf.bind_label(label(0));
1839 let inst = Inst::CondBr {
1840 kind: CondBrKind::Zero(xreg(0)),
1841 taken: target(1),
1842 not_taken: target(2),
1843 };
1844 inst.emit(&[], &mut buf, &info, &mut state);
1845
1846 buf.bind_label(label(1));
1847 let inst = Inst::Nop4;
1848 inst.emit(&[], &mut buf, &info, &mut state);
1849
1850 buf.bind_label(label(2));
1851 let inst = Inst::Udf {
1852 trap_code: TrapCode::Interrupt,
1853 };
1854 inst.emit(&[], &mut buf, &info, &mut state);
1855
1856 buf.bind_label(label(3));
1857
1858 let buf = buf.finish();
1859
1860 let mut buf2 = MachBuffer::new();
1861 let mut state = Default::default();
1862 let inst = Inst::TrapIf {
1863 kind: CondBrKind::NotZero(xreg(0)),
1864 trap_code: TrapCode::Interrupt,
1865 };
1866 inst.emit(&[], &mut buf2, &info, &mut state);
1867 let inst = Inst::Nop4;
1868 inst.emit(&[], &mut buf2, &info, &mut state);
1869
1870 let buf2 = buf2.finish();
1871
1872 assert_eq!(buf.data, buf2.data);
1873 }
1874
1875 #[test]
1876 fn test_island() {
1877 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
1878 let mut buf = MachBuffer::new();
1879 let mut state = Default::default();
1880
1881 buf.reserve_labels_for_blocks(4);
1882
1883 buf.bind_label(label(0));
1884 let inst = Inst::CondBr {
1885 kind: CondBrKind::NotZero(xreg(0)),
1886 taken: target(2),
1887 not_taken: target(3),
1888 };
1889 inst.emit(&[], &mut buf, &info, &mut state);
1890
1891 buf.bind_label(label(1));
1892 while buf.cur_offset() < 2000000 {
1893 if buf.island_needed(0) {
1894 buf.emit_island(0);
1895 }
1896 let inst = Inst::Nop4;
1897 inst.emit(&[], &mut buf, &info, &mut state);
1898 }
1899
1900 buf.bind_label(label(2));
1901 let inst = Inst::Nop4;
1902 inst.emit(&[], &mut buf, &info, &mut state);
1903
1904 buf.bind_label(label(3));
1905 let inst = Inst::Nop4;
1906 inst.emit(&[], &mut buf, &info, &mut state);
1907
1908 let buf = buf.finish();
1909
1910 assert_eq!(2000000 + 8, buf.total_size());
1911
1912 let mut buf2 = MachBuffer::new();
1913 let mut state = Default::default();
1914 let inst = Inst::CondBr {
1915 kind: CondBrKind::NotZero(xreg(0)),
1916
1917 // This conditionally taken branch has a 19-bit constant, shifted
1918 // to the left by two, giving us a 21-bit range in total. Half of
1919 // this range is positive, so we should be around 1 << 20 bytes
1920 // away from our jump target.
1921 //
1922 // There are two pending fixups by the time we reach this point,
1923 // one for this 19-bit jump and one for the unconditional 26-bit
1924 // jump below. A 19-bit veneer is 4 bytes large and the 26-bit
1925 // veneer is 20 bytes large, which means that, pessimistically
1926 // assuming we'll need two veneers, we need 24 bytes of extra
1927 // space, meaning that the actual island should come 24 bytes
1928 // before the deadline.
1929 taken: BranchTarget::ResolvedOffset((1 << 20) - 4 - 20),
1930
1931 // This branch is in-range, so no veneers should be needed; it should
1932 // go directly to the target.
1933 not_taken: BranchTarget::ResolvedOffset(2000000 + 4 - 4),
1934 };
1935 inst.emit(&[], &mut buf2, &info, &mut state);
1936
1937 let buf2 = buf2.finish();
1938
1939 assert_eq!(&buf.data[0..8], &buf2.data[..]);
1940 }
1941
1942 #[test]
1943 fn test_island_backward() {
1944 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
1945 let mut buf = MachBuffer::new();
1946 let mut state = Default::default();
1947
1948 buf.reserve_labels_for_blocks(4);
1949
1950 buf.bind_label(label(0));
1951 let inst = Inst::Nop4;
1952 inst.emit(&[], &mut buf, &info, &mut state);
1953
1954 buf.bind_label(label(1));
1955 let inst = Inst::Nop4;
1956 inst.emit(&[], &mut buf, &info, &mut state);
1957
1958 buf.bind_label(label(2));
1959 while buf.cur_offset() < 2000000 {
1960 let inst = Inst::Nop4;
1961 inst.emit(&[], &mut buf, &info, &mut state);
1962 }
1963
1964 buf.bind_label(label(3));
1965 let inst = Inst::CondBr {
1966 kind: CondBrKind::NotZero(xreg(0)),
1967 taken: target(0),
1968 not_taken: target(1),
1969 };
1970 inst.emit(&[], &mut buf, &info, &mut state);
1971
1972 let buf = buf.finish();
1973
1974 assert_eq!(2000000 + 12, buf.total_size());
1975
1976 let mut buf2 = MachBuffer::new();
1977 let mut state = Default::default();
1978 let inst = Inst::CondBr {
1979 kind: CondBrKind::NotZero(xreg(0)),
1980 taken: BranchTarget::ResolvedOffset(8),
1981 not_taken: BranchTarget::ResolvedOffset(4 - (2000000 + 4)),
1982 };
1983 inst.emit(&[], &mut buf2, &info, &mut state);
1984 let inst = Inst::Jump {
1985 dest: BranchTarget::ResolvedOffset(-(2000000 + 8)),
1986 };
1987 inst.emit(&[], &mut buf2, &info, &mut state);
1988
1989 let buf2 = buf2.finish();
1990
1991 assert_eq!(&buf.data[2000000..], &buf2.data[..]);
1992 }
1993
1994 #[test]
1995 fn test_multiple_redirect() {
1996 // label0:
1997 // cbz x0, label1
1998 // b label2
1999 // label1:
2000 // b label3
2001 // label2:
2002 // nop
2003 // nop
2004 // b label0
2005 // label3:
2006 // b label4
2007 // label4:
2008 // b label5
2009 // label5:
2010 // b label7
2011 // label6:
2012 // nop
2013 // label7:
2014 // ret
2015 //
2016 // -- should become:
2017 //
2018 // label0:
2019 // cbz x0, label7
2020 // label2:
2021 // nop
2022 // nop
2023 // b label0
2024 // label6:
2025 // nop
2026 // label7:
2027 // ret
2028
2029 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2030 let mut buf = MachBuffer::new();
2031 let mut state = Default::default();
2032
2033 buf.reserve_labels_for_blocks(8);
2034
2035 buf.bind_label(label(0));
2036 let inst = Inst::CondBr {
2037 kind: CondBrKind::Zero(xreg(0)),
2038 taken: target(1),
2039 not_taken: target(2),
2040 };
2041 inst.emit(&[], &mut buf, &info, &mut state);
2042
2043 buf.bind_label(label(1));
2044 let inst = Inst::Jump { dest: target(3) };
2045 inst.emit(&[], &mut buf, &info, &mut state);
2046
2047 buf.bind_label(label(2));
2048 let inst = Inst::Nop4;
2049 inst.emit(&[], &mut buf, &info, &mut state);
2050 inst.emit(&[], &mut buf, &info, &mut state);
2051 let inst = Inst::Jump { dest: target(0) };
2052 inst.emit(&[], &mut buf, &info, &mut state);
2053
2054 buf.bind_label(label(3));
2055 let inst = Inst::Jump { dest: target(4) };
2056 inst.emit(&[], &mut buf, &info, &mut state);
2057
2058 buf.bind_label(label(4));
2059 let inst = Inst::Jump { dest: target(5) };
2060 inst.emit(&[], &mut buf, &info, &mut state);
2061
2062 buf.bind_label(label(5));
2063 let inst = Inst::Jump { dest: target(7) };
2064 inst.emit(&[], &mut buf, &info, &mut state);
2065
2066 buf.bind_label(label(6));
2067 let inst = Inst::Nop4;
2068 inst.emit(&[], &mut buf, &info, &mut state);
2069
2070 buf.bind_label(label(7));
2071 let inst = Inst::Ret { rets: vec![] };
2072 inst.emit(&[], &mut buf, &info, &mut state);
2073
2074 let buf = buf.finish();
2075
2076 let golden_data = vec![
2077 0xa0, 0x00, 0x00, 0xb4, // cbz x0, 0x14
2078 0x1f, 0x20, 0x03, 0xd5, // nop
2079 0x1f, 0x20, 0x03, 0xd5, // nop
2080 0xfd, 0xff, 0xff, 0x17, // b 0
2081 0x1f, 0x20, 0x03, 0xd5, // nop
2082 0xc0, 0x03, 0x5f, 0xd6, // ret
2083 ];
2084
2085 assert_eq!(&golden_data[..], &buf.data[..]);
2086 }
2087
2088 #[test]
2089 fn test_handle_branch_cycle() {
2090 // label0:
2091 // b label1
2092 // label1:
2093 // b label2
2094 // label2:
2095 // b label3
2096 // label3:
2097 // b label4
2098 // label4:
2099 // b label1 // note: not label0 (to make it interesting).
2100 //
2101 // -- should become:
2102 //
2103 // label0, label1, ..., label4:
2104 // b label0
2105 let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2106 let mut buf = MachBuffer::new();
2107 let mut state = Default::default();
2108
2109 buf.reserve_labels_for_blocks(5);
2110
2111 buf.bind_label(label(0));
2112 let inst = Inst::Jump { dest: target(1) };
2113 inst.emit(&[], &mut buf, &info, &mut state);
2114
2115 buf.bind_label(label(1));
2116 let inst = Inst::Jump { dest: target(2) };
2117 inst.emit(&[], &mut buf, &info, &mut state);
2118
2119 buf.bind_label(label(2));
2120 let inst = Inst::Jump { dest: target(3) };
2121 inst.emit(&[], &mut buf, &info, &mut state);
2122
2123 buf.bind_label(label(3));
2124 let inst = Inst::Jump { dest: target(4) };
2125 inst.emit(&[], &mut buf, &info, &mut state);
2126
2127 buf.bind_label(label(4));
2128 let inst = Inst::Jump { dest: target(1) };
2129 inst.emit(&[], &mut buf, &info, &mut state);
2130
2131 let buf = buf.finish();
2132
2133 let golden_data = vec![
2134 0x00, 0x00, 0x00, 0x14, // b 0
2135 ];
2136
2137 assert_eq!(&golden_data[..], &buf.data[..]);
2138 }
2139
2140 #[test]
2141 fn metadata_records() {
2142 let mut buf = MachBuffer::<Inst>::new();
2143
2144 buf.reserve_labels_for_blocks(1);
2145
2146 buf.bind_label(label(0));
2147 buf.put1(1);
2148 buf.add_trap(TrapCode::HeapOutOfBounds);
2149 buf.put1(2);
2150 buf.add_trap(TrapCode::IntegerOverflow);
2151 buf.add_trap(TrapCode::IntegerDivisionByZero);
2152 buf.add_call_site(Opcode::Call);
2153 buf.add_reloc(
2154 Reloc::Abs4,
2155 &ExternalName::User(UserExternalNameRef::new(0)),
2156 0,
2157 );
2158 buf.put1(3);
2159 buf.add_reloc(
2160 Reloc::Abs8,
2161 &ExternalName::User(UserExternalNameRef::new(1)),
2162 1,
2163 );
2164 buf.put1(4);
2165
2166 let buf = buf.finish();
2167
2168 assert_eq!(buf.data(), &[1, 2, 3, 4]);
2169 assert_eq!(
2170 buf.traps()
2171 .iter()
2172 .map(|trap| (trap.offset, trap.code))
2173 .collect::<Vec<_>>(),
2174 vec![
2175 (1, TrapCode::HeapOutOfBounds),
2176 (2, TrapCode::IntegerOverflow),
2177 (2, TrapCode::IntegerDivisionByZero)
2178 ]
2179 );
2180 assert_eq!(
2181 buf.call_sites()
2182 .iter()
2183 .map(|call_site| (call_site.ret_addr, call_site.opcode))
2184 .collect::<Vec<_>>(),
2185 vec![(2, Opcode::Call)]
2186 );
2187 assert_eq!(
2188 buf.relocs()
2189 .iter()
2190 .map(|reloc| (reloc.offset, reloc.kind))
2191 .collect::<Vec<_>>(),
2192 vec![(2, Reloc::Abs4), (3, Reloc::Abs8)]
2193 );
2194 }
2195}