cranelift_codegen/machinst/buffer.rs

//! In-memory representation of compiled machine code, with labels and fixups to
//! refer to those labels. Handles constant-pool island insertion and also
//! veneer insertion for out-of-range jumps.
//!
//! This code exists to solve three problems:
//!
//! - Branch targets for forward branches are not known until later, when we
//!   emit code in a single pass through the instruction structs.
//!
//! - On many architectures, address references or offsets have limited range.
//!   For example, on AArch64, conditional branches can only target code +/- 1MB
//!   from the branch itself.
//!
//! - The lowering of control flow from the CFG-with-edges produced by
//!   [BlockLoweringOrder](super::BlockLoweringOrder), combined with many empty
//!   edge blocks when the register allocator does not need to insert any
//!   spills/reloads/moves in edge blocks, results in many suboptimal branch
//!   patterns. The lowering also pays no attention to block order, and so
//!   two-target conditional forms (cond-br followed by uncond-br) can often be
//!   avoided because one of the targets is the fallthrough. There are several
//!   cases here where we can simplify to use fewer branches.
//!
//! This "buffer" implements a single-pass code emission strategy (with a later
//! "fixup" pass, but only through recorded fixups, not all instructions). The
//! basic idea is:
//!
//! - Emit branches as they are, including two-target (cond/uncond) compound
//!   forms, but with zero offsets and optimistically assuming the target will be
//!   in range. Record the "fixup" for later. Targets are denoted instead by
//!   symbolic "labels" that are then bound to certain offsets in the buffer as
//!   we emit code. (Nominally, there is a label at the start of every basic
//!   block.)
//!
//! - As we do this, track the offset in the buffer at which the first label
//!   reference "goes out of range". We call this the "deadline". If we reach the
//!   deadline and we still have not bound the label to which an unresolved branch
//!   refers, we have a problem!
//!
//! - To solve this problem, we emit "islands" full of "veneers". An island is
//!   simply a chunk of code inserted in the middle of the code actually produced
//!   by the emitter (e.g., vcode iterating over instruction structs). The emitter
//!   has some awareness of this: it either asks for an island between blocks, so
//!   it is not accidentally executed, or else it emits a branch around the island
//!   when all other options fail (see `Inst::EmitIsland` meta-instruction).
//!
//! - A "veneer" is an instruction (or sequence of instructions) in an "island"
//!   that implements a longer-range reference to a label. The idea is that, for
//!   example, a branch with a limited range can branch to a "veneer" instead,
//!   which is simply a branch in a form that can use a longer-range reference. On
//!   AArch64, for example, conditionals have a +/- 1 MB range, but a conditional
//!   can branch to an unconditional branch which has a +/- 128 MB range. Hence, a
//!   conditional branch's label reference can be fixed up with a "veneer" to
//!   achieve a longer range.
//!
//! - To implement all of this, we require the backend to provide a `LabelUse`
//!   type that implements a trait. This is nominally an enum that records one of
//!   several kinds of references to an offset in code -- basically, a relocation
//!   type -- and will usually correspond to different instruction formats. The
//!   `LabelUse` implementation specifies the maximum range, how to patch in the
//!   actual label location when known, and how to generate a veneer to extend the
//!   range.
//!
//! That satisfies label references, but we still may have suboptimal branch
//! patterns. To clean up the branches, we do a simple "peephole"-style
//! optimization on the fly. To do so, the emitter (e.g., `Inst::emit()`)
//! informs the buffer of branches in the code and, in the case of conditionals,
//! the code that would have been emitted to invert this branch's condition. We
//! track the "latest branches": these are branches that are contiguous up to
//! the current offset. (If any code is emitted after a branch, that branch or
//! run of contiguous branches is no longer "latest".) The latest branches are
//! those that we can edit by simply truncating the buffer and doing something
//! else instead.
//!
//! To optimize branches, we implement several simple rules, and try to apply
//! them to the "latest branches" when possible:
//!
//! - A branch with a label target, when that label is bound to the ending
//!   offset of the branch (the fallthrough location), can be removed altogether,
//!   because the branch would have no effect.
//!
//! - An unconditional branch that starts at a label location, and branches to
//!   another label, results in a "label alias": all references to the label bound
//!   *to* this branch instruction are instead resolved to the *target* of the
//!   branch instruction. This effectively removes empty blocks that just
//!   unconditionally branch to the next block. We call this "branch threading"
//!   (see the sketch after this list).
//!
//! - A conditional followed by an unconditional, when the conditional branches
//!   to the unconditional's fallthrough, results in (i) the truncation of the
//!   unconditional, (ii) the inversion of the conditional's condition, and (iii)
//!   replacement of the conditional's target (using the original target of the
//!   unconditional). This is a fancy way of saying "we can flip a two-target
//!   conditional branch's taken/not-taken targets if it works better with our
//!   fallthrough". To make this work, the emitter actually gives the buffer
//!   *both* forms of every conditional branch: the true form is emitted into the
//!   buffer, and the "inverted" machine-code bytes are provided as part of the
//!   branch-fixup metadata.
//!
//! - An unconditional B preceded by another unconditional P, when B's label(s) have
//!   been redirected to target(B), can be removed entirely. This is an extension
//!   of the branch-threading optimization, and is valid because if we know there
//!   will be no fallthrough into this branch instruction (the prior instruction
//!   is an unconditional jump), and if we know we have successfully redirected
//!   all labels, then this branch instruction is unreachable. Note that this
//!   works because the redirection happens before the label is ever resolved
//!   (fixups happen at island emission time, at which point latest-branches are
//!   cleared, or at the end of emission), so we are sure to catch and redirect
//!   all possible paths to this instruction.
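//!
//! As an illustration of the branch-threading rule above, a sketch derived
//! directly from the rule (`b` is an unconditional branch):
//!
//! ```text
//!     Before:                     After:
//!
//!     label1:                     label1: (now an alias of label2)
//!         b label2                    (branch removed)
//!     label2:                     label2:
//!         ...                         ...
//! ```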
//!
//! # Branch-optimization Correctness
//!
//! The branch-optimization mechanism depends on a few data structures with
//! invariants, which are always held outside the scope of top-level public
//! methods:
//!
//! - The latest-branches list. Each entry describes a span of the buffer
//!   (start/end offsets), the label target, the corresponding fixup-list entry
//!   index, and the bytes (must be the same length) for the inverted form, if
//!   conditional. The list of labels that are bound to the start-offset of this
//!   branch is *complete* (if any label has a resolved offset equal to `start`
//!   and is not an alias, it must appear in this list) and *precise* (no label
//!   in this list can be bound to another offset). No label in this list should
//!   be an alias. No two branch ranges can overlap, and branches are in
//!   ascending-offset order.
//!
//! - The labels-at-tail list. This contains all MachLabels that have been bound
//!   to (whose resolved offsets are equal to) the tail offset of the buffer.
//!   No label in this list should be an alias.
//!
//! - The label_offsets array, containing the bound offset of a label or
//!   UNKNOWN. No label can be bound at an offset greater than the current
//!   buffer tail.
//!
//! - The label_aliases array, containing another label to which a label is
//!   bound or UNKNOWN. A label's resolved offset is the resolved offset
//!   of the label it is aliased to, if this is set.
//!
//! We argue below, at each method, how the invariants in these data structures
//! are maintained (grep for "Post-invariant").
//!
//! Given these invariants, we argue why each optimization preserves execution
//! semantics below (grep for "Preserves execution semantics").
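//!
//! # Example
//!
//! A minimal sketch of the label/fixup API, assuming a hypothetical backend
//! instruction type `MyInst` whose `LabelUse` has a `Branch26` kind (the real
//! kinds are backend-specific):
//!
//! ```ignore
//! let mut buf = MachBuffer::<MyInst>::new();
//! let target = buf.get_label();
//!
//! // Emit a branch to the not-yet-bound label: record the fixup, then emit
//! // the instruction bytes with a zero offset (patched later).
//! let off = buf.cur_offset();
//! buf.use_label_at_offset(off, target, MyLabelUse::Branch26);
//! buf.put4(0x1400_0000); // e.g., AArch64 `B` with zero imm26
//!
//! // ... emit more code ...
//!
//! // Binding the label fixes its offset; the recorded fixup is patched at
//! // island emission or at finalization.
//! buf.bind_label(target);
//! ```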

use crate::binemit::{Addend, CodeOffset, Reloc, StackMap};
use crate::ir::{ExternalName, Opcode, RelSourceLoc, SourceLoc, TrapCode};
use crate::isa::unwind::UnwindInst;
use crate::machinst::{
    BlockIndex, MachInstLabelUse, TextSectionBuilder, VCodeConstant, VCodeConstants, VCodeInst,
};
use crate::timing;
use crate::trace;
use cranelift_entity::{entity_impl, SecondaryMap};
use smallvec::SmallVec;
use std::convert::TryFrom;
use std::mem;
use std::string::String;
use std::vec::Vec;

#[cfg(feature = "enable-serde")]
use serde::{Deserialize, Serialize};

#[cfg(feature = "enable-serde")]
pub trait CompilePhase {
    type MachSrcLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
    type SourceLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
}

#[cfg(not(feature = "enable-serde"))]
pub trait CompilePhase {
    type MachSrcLocType: core::fmt::Debug + PartialEq + Clone;
    type SourceLocType: core::fmt::Debug + PartialEq + Clone;
}

/// Status of a compiled artifact that needs patching before being used.
#[derive(Clone, Debug, PartialEq)]
#[cfg_attr(feature = "enable-serde", derive(Serialize, Deserialize))]
pub struct Stencil;

/// Status of a compiled artifact ready to use.
#[derive(Clone, Debug, PartialEq)]
pub struct Final;

impl CompilePhase for Stencil {
    type MachSrcLocType = MachSrcLoc<Stencil>;
    type SourceLocType = RelSourceLoc;
}

impl CompilePhase for Final {
    type MachSrcLocType = MachSrcLoc<Final>;
    type SourceLocType = SourceLoc;
}

/// A buffer of output to be produced, fixed up, and then emitted to a CodeSink
/// in bulk.
///
/// This struct uses `SmallVec`s to support small-ish function bodies without
/// any heap allocation. As such, it will be several kilobytes large. This is
/// likely fine as long as it is stack-allocated for function emission then
/// thrown away; but beware if many buffer objects are retained persistently.
pub struct MachBuffer<I: VCodeInst> {
    /// The buffer contents, as raw bytes.
    data: SmallVec<[u8; 1024]>,
    /// Any relocations referring to this code. Note that only *external*
    /// relocations are tracked here; references to labels within the buffer are
    /// resolved before emission.
    relocs: SmallVec<[MachReloc; 16]>,
    /// Any trap records referring to this code.
    traps: SmallVec<[MachTrap; 16]>,
    /// Any call site records referring to this code.
    call_sites: SmallVec<[MachCallSite; 16]>,
    /// Any source location mappings referring to this code.
    srclocs: SmallVec<[MachSrcLoc<Stencil>; 64]>,
    /// Any stack maps referring to this code.
    stack_maps: SmallVec<[MachStackMap; 8]>,
    /// Any unwind info at a given location.
    unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
    /// The current source location in progress (after `start_srcloc()` and
    /// before `end_srcloc()`). This is a (start_offset, src_loc) tuple.
    cur_srcloc: Option<(CodeOffset, RelSourceLoc)>,
    /// Known label offsets; `UNKNOWN_LABEL_OFFSET` if unknown.
    label_offsets: SmallVec<[CodeOffset; 16]>,
    /// Label aliases: when one label points to an unconditional jump, and that
    /// jump points to another label, we can redirect references to the first
    /// label immediately to the second.
    ///
    /// Invariant: we don't have label-alias cycles. We ensure this by,
    /// before setting label A to alias label B, resolving B's alias
    /// target (iteratively until a non-aliased label); if B is already
    /// aliased to A, then we cannot alias A back to B.
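    ///
    /// For example, with aliases A -> B and B -> C (and C unaliased),
    /// `resolve_label_offset(A)` chases the chain and returns C's bound
    /// offset.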
    label_aliases: SmallVec<[MachLabel; 16]>,
    /// Constants that must be emitted at some point.
    pending_constants: SmallVec<[MachLabelConstant; 16]>,
    /// Traps that must be emitted at some point.
    pending_traps: SmallVec<[MachLabelTrap; 16]>,
    /// Fixups that must be performed after all code is emitted.
    fixup_records: SmallVec<[MachLabelFixup<I>; 16]>,
    /// Current deadline at which all constants are flushed and all code labels
    /// are extended by emitting long-range jumps in an island. This flush
    /// should be rare (e.g., on AArch64, the shortest-range PC-rel references
    /// are +/- 1MB for conditional jumps and load-literal instructions), so
    /// it's acceptable to track a minimum and flush-all rather than doing more
    /// detailed "current minimum" / sort-by-deadline trickery.
    island_deadline: CodeOffset,
    /// How many bytes are needed in the worst case for an island, given all
    /// pending constants and fixups.
    island_worst_case_size: CodeOffset,
    /// Latest branches, to facilitate in-place editing for better fallthrough
    /// behavior and empty-block removal.
    latest_branches: SmallVec<[MachBranch; 4]>,
    /// All labels at the current offset (emission tail). This is lazily
    /// cleared: it is actually accurate as long as the current offset is
    /// `labels_at_tail_off`, but if `cur_offset()` has grown larger, it should
    /// be considered as empty.
    ///
    /// For correctness, this *must* be complete (i.e., the vector must contain
    /// all labels whose offsets are resolved to the current tail), because we
    /// rely on it to update labels when we truncate branches.
    labels_at_tail: SmallVec<[MachLabel; 4]>,
    /// The last offset at which `labels_at_tail` is valid. It is conceptually
    /// always describing the tail of the buffer, but we do not clear
    /// `labels_at_tail` eagerly when the tail grows, rather we lazily clear it
    /// when the offset has grown past this (`labels_at_tail_off`) point.
    /// Always <= `cur_offset()`.
    labels_at_tail_off: CodeOffset,
    /// Map used constants to their [MachLabel].
    constant_labels: SecondaryMap<VCodeConstant, MachLabel>,
}

impl MachBufferFinalized<Stencil> {
    /// Get a finalized machine buffer by applying the function's base source location.
    pub fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachBufferFinalized<Final> {
        MachBufferFinalized {
            data: self.data,
            relocs: self.relocs,
            traps: self.traps,
            call_sites: self.call_sites,
            srclocs: self
                .srclocs
                .into_iter()
                .map(|srcloc| srcloc.apply_base_srcloc(base_srcloc))
                .collect(),
            stack_maps: self.stack_maps,
            unwind_info: self.unwind_info,
        }
    }
}

/// A `MachBuffer` once emission is completed: holds generated code and records,
/// without fixups. This allows the type to be independent of the backend.
#[derive(PartialEq, Debug, Clone)]
#[cfg_attr(feature = "enable-serde", derive(serde::Serialize, serde::Deserialize))]
pub struct MachBufferFinalized<T: CompilePhase> {
    /// The buffer contents, as raw bytes.
    pub(crate) data: SmallVec<[u8; 1024]>,
    /// Any relocations referring to this code. Note that only *external*
    /// relocations are tracked here; references to labels within the buffer are
    /// resolved before emission.
    pub(crate) relocs: SmallVec<[MachReloc; 16]>,
    /// Any trap records referring to this code.
    pub(crate) traps: SmallVec<[MachTrap; 16]>,
    /// Any call site records referring to this code.
    pub(crate) call_sites: SmallVec<[MachCallSite; 16]>,
    /// Any source location mappings referring to this code.
    pub(crate) srclocs: SmallVec<[T::MachSrcLocType; 64]>,
    /// Any stack maps referring to this code.
    pub(crate) stack_maps: SmallVec<[MachStackMap; 8]>,
    /// Any unwind info at a given location.
    pub unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
}

const UNKNOWN_LABEL_OFFSET: CodeOffset = 0xffff_ffff;
const UNKNOWN_LABEL: MachLabel = MachLabel(0xffff_ffff);

/// Threshold on max length of `labels_at_this_branch` list to avoid
/// unbounded quadratic behavior (see comment below at use-site).
const LABEL_LIST_THRESHOLD: usize = 100;

/// A label refers to some offset in a `MachBuffer`. It may not be resolved at
/// the point at which it is used by emitted code; the buffer records "fixups"
/// for references to the label, and will come back and patch the code
/// appropriately when the label's location is eventually known.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct MachLabel(u32);
entity_impl!(MachLabel);

impl MachLabel {
    /// Get a label for a block. (The first N MachLabels are always reserved for
    /// the N blocks in the vcode.)
    pub fn from_block(bindex: BlockIndex) -> MachLabel {
        MachLabel(bindex.index() as u32)
    }

    /// Get the numeric label index.
    pub fn get(self) -> u32 {
        self.0
    }

    /// Creates a string representing this label, for convenience.
    pub fn to_string(&self) -> String {
        format!("label{}", self.0)
    }
}

impl Default for MachLabel {
    fn default() -> Self {
        UNKNOWN_LABEL
    }
}

/// A stack map extent, when creating a stack map.
pub enum StackMapExtent {
    /// The stack map starts at this instruction, and ends after the number of upcoming bytes
    /// (note: this is a code offset diff).
    UpcomingBytes(CodeOffset),

    /// The stack map started at the given offset and ends at the current one. This helps
    /// on architectures where instructions do not have a fixed length.
    StartedAtOffset(CodeOffset),
}

impl<I: VCodeInst> MachBuffer<I> {
    /// Create a new, empty `MachBuffer`.
    pub fn new() -> MachBuffer<I> {
        MachBuffer {
            data: SmallVec::new(),
            relocs: SmallVec::new(),
            traps: SmallVec::new(),
            call_sites: SmallVec::new(),
            srclocs: SmallVec::new(),
            stack_maps: SmallVec::new(),
            unwind_info: SmallVec::new(),
            cur_srcloc: None,
            label_offsets: SmallVec::new(),
            label_aliases: SmallVec::new(),
            pending_constants: SmallVec::new(),
            pending_traps: SmallVec::new(),
            fixup_records: SmallVec::new(),
            island_deadline: UNKNOWN_LABEL_OFFSET,
            island_worst_case_size: 0,
            latest_branches: SmallVec::new(),
            labels_at_tail: SmallVec::new(),
            labels_at_tail_off: 0,
            constant_labels: SecondaryMap::new(),
        }
    }

    /// Current offset from start of buffer.
    pub fn cur_offset(&self) -> CodeOffset {
        self.data.len() as CodeOffset
    }

    /// Add a byte.
    pub fn put1(&mut self, value: u8) {
        trace!("MachBuffer: put byte @ {}: {:x}", self.cur_offset(), value);
        self.data.push(value);

        // Post-invariant: conceptual-labels_at_tail contains a complete and
        // precise list of labels bound at `cur_offset()`. We have advanced
        // `cur_offset()`, hence if it had been equal to `labels_at_tail_off`
        // before, it is not anymore (and it cannot become equal, because
        // `labels_at_tail_off` is always <= `cur_offset()`). Thus the list is
        // conceptually empty (even though it is only lazily cleared). No labels
        // can be bound at this new offset (by invariant on `label_offsets`).
        // Hence the invariant holds.
    }

    /// Add 2 bytes.
    pub fn put2(&mut self, value: u16) {
        trace!(
            "MachBuffer: put 16-bit word @ {}: {:x}",
            self.cur_offset(),
            value
        );
        let bytes = value.to_le_bytes();
        self.data.extend_from_slice(&bytes[..]);

        // Post-invariant: as for `put1()`.
    }

    /// Add 4 bytes.
    pub fn put4(&mut self, value: u32) {
        trace!(
            "MachBuffer: put 32-bit word @ {}: {:x}",
            self.cur_offset(),
            value
        );
        let bytes = value.to_le_bytes();
        self.data.extend_from_slice(&bytes[..]);

        // Post-invariant: as for `put1()`.
    }

    /// Add 8 bytes.
    pub fn put8(&mut self, value: u64) {
        trace!(
            "MachBuffer: put 64-bit word @ {}: {:x}",
            self.cur_offset(),
            value
        );
        let bytes = value.to_le_bytes();
        self.data.extend_from_slice(&bytes[..]);

        // Post-invariant: as for `put1()`.
    }

    /// Add a slice of bytes.
    pub fn put_data(&mut self, data: &[u8]) {
        trace!(
            "MachBuffer: put data @ {}: len {}",
            self.cur_offset(),
            data.len()
        );
        self.data.extend_from_slice(data);

        // Post-invariant: as for `put1()`.
    }

    /// Reserve appended space and return a mutable slice referring to it.
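    ///
    /// A sketch of the typical pattern (reserve, then fill in place):
    ///
    /// ```ignore
    /// let space = buf.get_appended_space(8);
    /// space.copy_from_slice(&0x0123_4567_89ab_cdef_u64.to_le_bytes());
    /// ```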
    pub fn get_appended_space(&mut self, len: usize) -> &mut [u8] {
        trace!("MachBuffer: put data @ {}: len {}", self.cur_offset(), len);
        let off = self.data.len();
        let new_len = self.data.len() + len;
        self.data.resize(new_len, 0);
        &mut self.data[off..]

        // Post-invariant: as for `put1()`.
    }

    /// Align up to the given alignment.
    pub fn align_to(&mut self, align_to: CodeOffset) {
        trace!("MachBuffer: align to {}", align_to);
        assert!(
            align_to.is_power_of_two(),
            "{} is not a power of two",
            align_to
        );
        while self.cur_offset() & (align_to - 1) != 0 {
            self.put1(0);
        }

        // Post-invariant: as for `put1()`.
    }

    /// Allocate a `Label` to refer to some offset. May not be bound to a fixed
    /// offset yet.
    pub fn get_label(&mut self) -> MachLabel {
        let l = self.label_offsets.len() as u32;
        self.label_offsets.push(UNKNOWN_LABEL_OFFSET);
        self.label_aliases.push(UNKNOWN_LABEL);
        trace!("MachBuffer: new label -> {:?}", MachLabel(l));
        MachLabel(l)

        // Post-invariant: the only mutation is to add a new label; it has no
        // bound offset yet, so it trivially satisfies all invariants.
    }

    /// Reserve the first N MachLabels for blocks.
    pub fn reserve_labels_for_blocks(&mut self, blocks: usize) {
        trace!("MachBuffer: first {} labels are for blocks", blocks);
        debug_assert!(self.label_offsets.is_empty());
        self.label_offsets.resize(blocks, UNKNOWN_LABEL_OFFSET);
        self.label_aliases.resize(blocks, UNKNOWN_LABEL);

        // Post-invariant: as for `get_label()`.
    }

    /// Reserve the next N MachLabels for constants.
    pub fn reserve_labels_for_constants(&mut self, constants: &VCodeConstants) {
        trace!(
            "MachBuffer: next {} labels are for constants",
            constants.len()
        );
        for c in constants.keys() {
            self.constant_labels[c] = self.get_label();
        }

        // Post-invariant: as for `get_label()`.
    }

    /// Retrieve the reserved label for a constant.
    pub fn get_label_for_constant(&self, constant: VCodeConstant) -> MachLabel {
        self.constant_labels[constant]
    }

    /// Bind a label to the current offset. A label can only be bound once.
    pub fn bind_label(&mut self, label: MachLabel) {
        trace!(
            "MachBuffer: bind label {:?} at offset {}",
            label,
            self.cur_offset()
        );
        debug_assert_eq!(self.label_offsets[label.0 as usize], UNKNOWN_LABEL_OFFSET);
        debug_assert_eq!(self.label_aliases[label.0 as usize], UNKNOWN_LABEL);
        let offset = self.cur_offset();
        self.label_offsets[label.0 as usize] = offset;
        self.lazily_clear_labels_at_tail();
        self.labels_at_tail.push(label);

        // Invariants hold: bound offset of label is <= cur_offset (in fact it
        // is equal). If the `labels_at_tail` list was complete and precise
        // before, it is still, because we have bound this label to the current
        // offset and added it to the list (which contains all labels at the
        // current offset).

        self.optimize_branches();

        // Post-invariant: by `optimize_branches()` (see argument there).
    }

    /// Lazily clear `labels_at_tail` if the tail offset has moved beyond the
    /// offset that it applies to.
    fn lazily_clear_labels_at_tail(&mut self) {
        let offset = self.cur_offset();
        if offset > self.labels_at_tail_off {
            self.labels_at_tail_off = offset;
            self.labels_at_tail.clear();
        }

        // Post-invariant: either labels_at_tail_off was at cur_offset, and
        // state is untouched, or was less than cur_offset, in which case the
        // labels_at_tail list was conceptually empty, and is now actually
        // empty.
    }

    /// Resolve a label to an offset, if known. May return `UNKNOWN_LABEL_OFFSET`.
    pub(crate) fn resolve_label_offset(&self, mut label: MachLabel) -> CodeOffset {
        let mut iters = 0;
        while self.label_aliases[label.0 as usize] != UNKNOWN_LABEL {
            label = self.label_aliases[label.0 as usize];
            // To protect against an infinite loop (despite our assurances to
            // ourselves that the invariants make this impossible), assert out
            // after 1M iterations. The number of basic blocks is limited
            // in most contexts anyway so this should be impossible to hit with
            // a legitimate input.
            iters += 1;
            assert!(iters < 1_000_000, "Unexpected cycle in label aliases");
        }
        self.label_offsets[label.0 as usize]

        // Post-invariant: no mutations.
    }

    /// Emit a reference to the given label with the given reference type (i.e.,
    /// branch-instruction format) at the current offset. This is like a
    /// relocation, but handled internally.
    ///
    /// This can be called before the branch is actually emitted; fixups will
    /// not happen until an island is emitted or the buffer is finished.
    pub fn use_label_at_offset(&mut self, offset: CodeOffset, label: MachLabel, kind: I::LabelUse) {
        trace!(
            "MachBuffer: use_label_at_offset: offset {} label {:?} kind {:?}",
            offset,
            label,
            kind
        );

        // Add the fixup, and update the worst-case island size based on a
        // veneer for this label use.
        self.fixup_records.push(MachLabelFixup {
            label,
            offset,
            kind,
        });
        if kind.supports_veneer() {
            self.island_worst_case_size += kind.veneer_size();
            self.island_worst_case_size &= !(I::LabelUse::ALIGN - 1);
        }
        let deadline = offset.saturating_add(kind.max_pos_range());
        if deadline < self.island_deadline {
            self.island_deadline = deadline;
        }

        // Post-invariant: no mutations to branches/labels data structures.
    }

    /// Inform the buffer of an unconditional branch at the given offset,
    /// targeting the given label. May be used to optimize branches.
    /// The last added label-use must correspond to this branch.
    /// This must be called when the current offset is equal to `start`; i.e.,
    /// before actually emitting the branch. This implies that for a branch that
    /// uses a label and is eligible for optimizations by the MachBuffer, the
    /// proper sequence is:
    ///
    /// - Call `use_label_at_offset()` to emit the fixup record.
    /// - Call `add_uncond_branch()` to make note of the branch.
    /// - Emit the bytes for the branch's machine code.
    ///
    /// Additional requirement: no labels may be bound between `start` and `end`
    /// (exclusive on both ends).
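    ///
    /// A sketch of that sequence, assuming a hypothetical `Branch26` label-use
    /// kind and a 4-byte branch encoding:
    ///
    /// ```ignore
    /// let start = buf.cur_offset();
    /// buf.use_label_at_offset(start, target, MyLabelUse::Branch26);
    /// buf.add_uncond_branch(start, start + 4, target);
    /// buf.put4(0x1400_0000); // branch bytes with zero offset; patched later
    /// ```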
    pub fn add_uncond_branch(&mut self, start: CodeOffset, end: CodeOffset, target: MachLabel) {
        assert!(self.cur_offset() == start);
        debug_assert!(end > start);
        assert!(!self.fixup_records.is_empty());
        let fixup = self.fixup_records.len() - 1;
        self.lazily_clear_labels_at_tail();
        self.latest_branches.push(MachBranch {
            start,
            end,
            target,
            fixup,
            inverted: None,
            labels_at_this_branch: self.labels_at_tail.clone(),
        });

        // Post-invariant: we asserted branch start is current tail; the list of
        // labels at branch is cloned from list of labels at current tail.
    }

    /// Inform the buffer of a conditional branch at the given offset,
    /// targeting the given label. May be used to optimize branches.
    /// The last added label-use must correspond to this branch.
    ///
    /// Additional requirement: no labels may be bound between `start` and `end`
    /// (exclusive on both ends).
    pub fn add_cond_branch(
        &mut self,
        start: CodeOffset,
        end: CodeOffset,
        target: MachLabel,
        inverted: &[u8],
    ) {
        assert!(self.cur_offset() == start);
        debug_assert!(end > start);
        assert!(!self.fixup_records.is_empty());
        debug_assert!(inverted.len() == (end - start) as usize);
        let fixup = self.fixup_records.len() - 1;
        let inverted = Some(SmallVec::from(inverted));
        self.lazily_clear_labels_at_tail();
        self.latest_branches.push(MachBranch {
            start,
            end,
            target,
            fixup,
            inverted,
            labels_at_this_branch: self.labels_at_tail.clone(),
        });

        // Post-invariant: we asserted branch start is current tail; labels at
        // branch list is cloned from list of labels at current tail.
    }

    fn truncate_last_branch(&mut self) {
        self.lazily_clear_labels_at_tail();
        // Invariants hold at this point.

        let b = self.latest_branches.pop().unwrap();
        assert!(b.end == self.cur_offset());

        // State:
        //    [PRE CODE]
        //  Offset b.start, b.labels_at_this_branch:
        //    [BRANCH CODE]
        //  cur_off, self.labels_at_tail -->
        //    (end of buffer)
        self.data.truncate(b.start as usize);
        self.fixup_records.truncate(b.fixup);
        while let Some(last_srcloc) = self.srclocs.last_mut() {
            if last_srcloc.end <= b.start {
                break;
            }
            if last_srcloc.start < b.start {
                last_srcloc.end = b.start;
                break;
            }
            self.srclocs.pop();
        }
        // State:
        //    [PRE CODE]
        //  cur_off, Offset b.start, b.labels_at_this_branch:
        //    (end of buffer)
        //
        //  self.labels_at_tail -->  (past end of buffer)
        let cur_off = self.cur_offset();
        self.labels_at_tail_off = cur_off;
        // State:
        //    [PRE CODE]
        //  cur_off, Offset b.start, b.labels_at_this_branch,
        //  self.labels_at_tail:
        //    (end of buffer)
        //
        // resolve_label_offset(l) for l in labels_at_tail:
        //    (past end of buffer)

        trace!(
            "truncate_last_branch: truncated {:?}; off now {}",
            b,
            cur_off
        );

        // Fix up resolved label offsets for labels at tail.
        for &l in &self.labels_at_tail {
            self.label_offsets[l.0 as usize] = cur_off;
        }
        // Old labels_at_this_branch are now at cur_off.
        self.labels_at_tail
            .extend(b.labels_at_this_branch.into_iter());

        // Post-invariant: this operation is defined to truncate the buffer,
        // which moves cur_off backward, and to move labels at the end of the
        // buffer back to the start-of-branch offset.
        //
        // latest_branches satisfies all invariants:
        // - it has no branches past the end of the buffer (branches are in
        //   order, we removed the last one, and we truncated the buffer to just
        //   before the start of that branch)
        // - no labels were moved to lower offsets than the (new) cur_off, so
        //   the labels_at_this_branch list for any other branch need not change.
        //
        // labels_at_tail satisfies all invariants:
        // - all labels that were at the tail after the truncated branch are
        //   moved backward to just before the branch, which becomes the new tail;
        //   thus every element in the list should remain (ensured by `.extend()`
        //   above).
        // - all labels that refer to the new tail, which is the start-offset of
        //   the truncated branch, must be present. The `labels_at_this_branch`
        //   list in the truncated branch's record is a complete and precise list
        //   of exactly these labels; we append these to labels_at_tail.
        // - labels_at_tail_off is at cur_off after truncation occurs, so the
        //   list is valid (not to be lazily cleared).
        //
        // The stated operation was performed:
        // - For each label at the end of the buffer prior to this method, it
        //   now resolves to the new (truncated) end of the buffer: it must have
        //   been in `labels_at_tail` (this list is precise and complete, and
        //   the tail was at the end of the truncated branch on entry), and we
        //   iterate over this list and set `label_offsets` to the new tail.
        //   None of these labels could have been an alias (by invariant), so
        //   `label_offsets` is authoritative for each.
        // - No other labels will be past the end of the buffer, because of the
        //   requirement that no labels be bound to the middle of branch ranges
        //   (see comments to `add_{cond,uncond}_branch()`).
        // - The buffer is truncated to just before the last branch, and the
        //   fixup record referring to that last branch is removed.
    }

    fn optimize_branches(&mut self) {
        self.lazily_clear_labels_at_tail();
        // Invariants valid at this point.

        trace!(
            "enter optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
            self.latest_branches,
            self.labels_at_tail,
            self.fixup_records
        );

        // We continue to munch on branches at the tail of the buffer until no
        // more rules apply. Note that the loop only continues if a branch is
        // actually truncated (or if labels are redirected away from a branch),
        // so this always makes progress.
        while let Some(b) = self.latest_branches.last() {
            let cur_off = self.cur_offset();
            trace!("optimize_branches: last branch {:?} at off {}", b, cur_off);
            // If there has been any code emission since the end of the last branch or
            // label definition, then there's nothing we can edit (because we
            // don't move code once placed, only back up and overwrite), so
            // clear the records and finish.
            if b.end < cur_off {
                break;
            }

            // If the "labels at this branch" list on this branch is
            // longer than a threshold, don't do any simplification,
            // and let the branch remain to separate those labels from
            // the current tail. This avoids quadratic behavior (see
            // #3468): otherwise, if a long string of "goto next;
            // next:" patterns are emitted, all of the labels will
            // coalesce into a long list of aliases for the current
            // buffer tail. We must track all aliases of the current
            // tail for correctness, but we are also allowed to skip
            // optimization (removal) of any branch, so we take the
            // escape hatch here and let it stand. In effect this
            // "spreads" the many thousands of labels in the
            // pathological case among an actual (harmless but
            // suboptimal) instruction once per N labels.
            if b.labels_at_this_branch.len() > LABEL_LIST_THRESHOLD {
                break;
            }

            // Invariant: we are looking at a branch that ends at the tail of
            // the buffer.

            // For any branch, conditional or unconditional:
            // - If the target is a label at the current offset, then remove
            //   the branch, and reset all labels that targeted the current
            //   offset (end of branch) to the truncated end-of-code.
            //
            // Preserves execution semantics: a branch to its own fallthrough
            // address is equivalent to a no-op; in both cases, nextPC is the
            // fallthrough.
            if self.resolve_label_offset(b.target) == cur_off {
                trace!("branch with target == cur off; truncating");
                self.truncate_last_branch();
                continue;
            }

            // If latest is an unconditional branch:
            //
            // - If the branch's target is not its own start address, then for
            //   each label at the start of branch, make the label an alias of the
            //   branch target, and remove the label from the "labels at this
            //   branch" list.
            //
            //   - Preserves execution semantics: an unconditional branch's
            //     only effect is to set PC to a new PC; this change simply
            //     collapses one step in the step-semantics.
            //
            //   - Post-invariant: the labels that were bound to the start of
            //     this branch become aliases, so they must not be present in any
            //     labels-at-this-branch list or the labels-at-tail list. The
            //     labels are removed from the latest-branch record's
            //     labels-at-this-branch list, and are never placed in the
            //     labels-at-tail list. Furthermore, it is correct that they are
            //     not in either list, because they are now aliases, and labels
            //     that are aliases remain aliases forever.
            //
            // - If there is a prior unconditional branch that ends just before
            //   this one begins, and this branch has no labels bound to its
            //   start, then we can truncate this branch, because it is entirely
            //   unreachable (we have redirected all labels that make it
            //   reachable otherwise). Do so and continue around the loop.
            //
            //   - Preserves execution semantics: the branch is unreachable,
            //     because execution can only flow into an instruction from the
            //     prior instruction's fallthrough or from a branch bound to that
            //     instruction's start offset. Unconditional branches have no
            //     fallthrough, so if the prior instruction is an unconditional
            //     branch, no fallthrough entry can happen. The
            //     labels-at-this-branch list is complete (by invariant), so if it
            //     is empty, then the instruction is entirely unreachable. Thus,
            //     it can be removed.
            //
            //   - Post-invariant: ensured by truncate_last_branch().
            //
            // - If there is a prior conditional branch whose target label
            //   resolves to the current offset (branches around the
            //   unconditional branch), then remove the unconditional branch,
            //   and make the target of the unconditional the target of the
            //   conditional instead.
            //
            //   - Preserves execution semantics: previously we had:
            //
            //         L1:
            //            cond_br L2
            //            br L3
            //         L2:
            //            (end of buffer)
            //
            //     by removing the last branch, we have:
            //
            //         L1:
            //            cond_br L2
            //         L2:
            //            (end of buffer)
            //
            //     we then fix up the records for the conditional branch to
            //     have:
            //
            //         L1:
            //           cond_br.inverted L3
            //         L2:
            //
            //     In the original code, control flow reaches L2 when the
            //     conditional branch's predicate is true, and L3 otherwise. In
            //     the optimized code, the same is true.
            //
            //   - Post-invariant: all edits to latest_branches and
            //     labels_at_tail are performed by `truncate_last_branch()`,
            //     which maintains the invariants at each step.

            if b.is_uncond() {
                // Set any label equal to current branch's start as an alias of
                // the branch's target, if the target is not the branch itself
                // (i.e., an infinite loop).
                //
                // We cannot perform this aliasing if the target of this branch
                // ultimately aliases back here; if so, we need to keep this
                // branch, so break out of this loop entirely (and clear the
                // latest-branches list below).
                //
                // Note that this check is what prevents cycles from forming in
                // `self.label_aliases`. To see why, consider an arbitrary start
                // state:
                //
                // label_aliases[L1] = L2, label_aliases[L2] = L3, ..., up to
                // Ln, which is not aliased.
                //
                // We would create a cycle if we assigned label_aliases[Ln]
                // = L1. Note that the below assignment is the only write
                // to label_aliases.
                //
                // By our other invariants, we have that Ln (`l` below)
                // resolves to the offset `b.start`, because it is in the
                // set `b.labels_at_this_branch`.
                //
                // If L1 were already aliased, through some arbitrarily deep
                // chain, to Ln, then it must also resolve to this offset
                // `b.start`.
                //
                // By checking the resolution of `L1` against this offset,
                // and aborting this branch-simplification if they are
                // equal, we prevent the below assignment from ever creating
                // a cycle.
                if self.resolve_label_offset(b.target) != b.start {
                    let redirected = b.labels_at_this_branch.len();
                    for &l in &b.labels_at_this_branch {
                        trace!(
                            " -> label at start of branch {:?} redirected to target {:?}",
                            l,
                            b.target
                        );
                        self.label_aliases[l.0 as usize] = b.target;
                        // NOTE: we continue to ensure the invariant that labels
                        // pointing to tail of buffer are in `labels_at_tail`
                        // because we already ensured above that the last branch
                        // cannot have a target of `cur_off`; so we never have
                        // to put the label into `labels_at_tail` when moving it
                        // here.
                    }
                    // Maintain invariant: all branches have been redirected
                    // and are no longer pointing at the start of this branch.
                    let mut_b = self.latest_branches.last_mut().unwrap();
                    mut_b.labels_at_this_branch.clear();

                    if redirected > 0 {
                        trace!(" -> after label redirects, restarting loop");
                        continue;
                    }
                } else {
                    break;
                }

                let b = self.latest_branches.last().unwrap();

                // Examine any immediately preceding branch.
                if self.latest_branches.len() > 1 {
                    let prev_b = &self.latest_branches[self.latest_branches.len() - 2];
                    trace!(" -> more than one branch; prev_b = {:?}", prev_b);
                    // This uncond is immediately after another uncond; we
                    // should have already redirected labels to this uncond away
                    // (but check to be sure); so we can truncate this uncond.
                    if prev_b.is_uncond()
                        && prev_b.end == b.start
                        && b.labels_at_this_branch.is_empty()
                    {
                        trace!(" -> uncond follows another uncond; truncating");
                        self.truncate_last_branch();
                        continue;
                    }

                    // This uncond is immediately after a conditional, and the
                    // conditional's target is the end of this uncond, and we've
                    // already redirected labels to this uncond away; so we can
                    // truncate this uncond, flip the sense of the conditional, and
                    // set the conditional's target (in `latest_branches` and in
                    // `fixup_records`) to the uncond's target.
                    if prev_b.is_cond()
                        && prev_b.end == b.start
                        && self.resolve_label_offset(prev_b.target) == cur_off
                    {
                        trace!(" -> uncond follows a conditional, and conditional's target resolves to current offset");
                        // Save the target of the uncond (this becomes the
                        // target of the cond), and truncate the uncond.
                        let target = b.target;
                        let data = prev_b.inverted.clone().unwrap();
                        self.truncate_last_branch();

                        // Mutate the code and cond branch.
                        let off_before_edit = self.cur_offset();
                        let prev_b = self.latest_branches.last_mut().unwrap();
                        let not_inverted = SmallVec::from(
                            &self.data[(prev_b.start as usize)..(prev_b.end as usize)],
                        );

                        // Low-level edit: replaces bytes of branch with
                        // inverted form. cur_off remains the same afterward, so
                        // we do not need to modify label data structures.
                        self.data.truncate(prev_b.start as usize);
                        self.data.extend_from_slice(&data[..]);

                        // Save the original code as the inversion of the
                        // inverted branch, in case we later edit this branch
                        // again.
                        prev_b.inverted = Some(not_inverted);
                        self.fixup_records[prev_b.fixup].label = target;
                        trace!(" -> reassigning target of condbr to {:?}", target);
                        prev_b.target = target;
                        debug_assert_eq!(off_before_edit, self.cur_offset());
                        continue;
                    }
                }
            }

            // If we couldn't do anything with the last branch, then break.
            break;
        }

        self.purge_latest_branches();

        trace!(
            "leave optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
            self.latest_branches,
            self.labels_at_tail,
            self.fixup_records
        );
    }

    fn purge_latest_branches(&mut self) {
        // All of our branch simplification rules work only if a branch ends at
        // the tail of the buffer, with no following code; and branches are in
        // order in latest_branches; so if the last entry ends prior to
        // cur_offset, then clear all entries.
        let cur_off = self.cur_offset();
        if let Some(l) = self.latest_branches.last() {
            if l.end < cur_off {
                trace!("purge_latest_branches: removing branch {:?}", l);
                self.latest_branches.clear();
            }
        }

        // Post-invariant: no invariant requires any branch to appear in
        // `latest_branches`; it is always optional. The list-clear above thus
        // preserves all semantics.
    }

    /// Emit a constant at some point in the future, binding the given label to
    /// its offset. The constant will be placed at most `max_distance` from the
    /// current offset.
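    ///
    /// A sketch of use, assuming a hypothetical PC-relative load label-use
    /// kind with a +/- 1MB range:
    ///
    /// ```ignore
    /// let lbl = buf.get_label();
    /// let off = buf.cur_offset();
    /// buf.use_label_at_offset(off, lbl, MyLabelUse::PCRelLoad19);
    /// // ... emit the load instruction referring to `lbl` ...
    /// buf.defer_constant(lbl, /* align = */ 8, &1234u64.to_le_bytes(), 1 << 20);
    /// ```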
    pub fn defer_constant(
        &mut self,
        label: MachLabel,
        align: CodeOffset,
        data: &[u8],
        max_distance: CodeOffset,
    ) {
        trace!(
            "defer_constant: eventually emit {} bytes aligned to {} at label {:?}",
            data.len(),
            align,
            label
        );
        self.update_deadline(data.len(), max_distance);
        self.pending_constants.push(MachLabelConstant {
            label,
            align,
            data: SmallVec::from(data),
        });
    }

    /// Emit a trap at some point in the future with the specified code and
    /// stack map.
    ///
    /// This function returns a [`MachLabel`] which will be the future address
    /// of the trap. Jumps should refer to this label, likely by using the
    /// [`MachBuffer::use_label_at_offset`] method, to get a relocation
    /// patched in once the address of the trap is known.
    ///
    /// This will batch all traps into the end of the function.
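    ///
    /// A sketch of use (hypothetical label-use kind; the trap code is one of
    /// the real [`TrapCode`] variants):
    ///
    /// ```ignore
    /// let trap = buf.defer_trap(TrapCode::IntegerDivisionByZero, None);
    /// let off = buf.cur_offset();
    /// buf.use_label_at_offset(off, trap, MyLabelUse::Branch19);
    /// // ... emit the conditional branch whose target is `trap` ...
    /// ```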
    pub fn defer_trap(&mut self, code: TrapCode, stack_map: Option<StackMap>) -> MachLabel {
        let label = self.get_label();
        self.update_deadline(I::TRAP_OPCODE.len(), u32::MAX);
        self.pending_traps.push(MachLabelTrap {
            label,
            code,
            stack_map,
            loc: self.cur_srcloc.map(|(_start, loc)| loc),
        });
        label
    }

    fn update_deadline(&mut self, len: usize, max_distance: CodeOffset) {
        trace!("defer: eventually emit {} bytes", len);
        let deadline = self.cur_offset().saturating_add(max_distance);
        self.island_worst_case_size += len as CodeOffset;
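        // Round the accumulated worst-case size up to the island alignment;
        // e.g., with `ALIGN` = 4, a running total of 10 becomes 12.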
        self.island_worst_case_size =
            (self.island_worst_case_size + I::LabelUse::ALIGN - 1) & !(I::LabelUse::ALIGN - 1);
        if deadline < self.island_deadline {
            self.island_deadline = deadline;
        }
    }

    /// Is an island needed within the next N bytes?
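    ///
    /// A sketch of the typical emission-loop check, where `worst_case_size`
    /// is the caller's estimate of how many bytes it is about to emit:
    ///
    /// ```ignore
    /// if buf.island_needed(worst_case_size) {
    ///     buf.emit_island(worst_case_size);
    /// }
    /// ```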
    pub fn island_needed(&self, distance: CodeOffset) -> bool {
        self.worst_case_end_of_island(distance) > self.island_deadline
    }

    /// Returns the maximal offset that islands can reach if `distance` more
    /// bytes are appended.
    ///
    /// This is used to determine whether veneers need to be inserted, since
    /// jumps that can't reach past this point must get a veneer of some form.
1133    fn worst_case_end_of_island(&self, distance: CodeOffset) -> CodeOffset {
1134        self.cur_offset()
1135            .saturating_add(distance)
1136            .saturating_add(self.island_worst_case_size)
1137    }
1138
1139    /// Emit all pending constants and required pending veneers.
1140    ///
1141    /// Should only be called if `island_needed()` returns true, i.e., if we
1142    /// actually reach a deadline. It's not necessarily a problem to do so
1143    /// otherwise but it may result in unnecessary work during emission.
1144    pub fn emit_island(&mut self, distance: CodeOffset) {
1145        self.emit_island_maybe_forced(false, distance);
1146    }
1147
1148    /// Same as `emit_island`, but an internal API with a `force_veneers`
1149    /// argument to force all veneers to always get emitted for debugging.
1150    fn emit_island_maybe_forced(&mut self, force_veneers: bool, distance: CodeOffset) {
1151        // We're going to purge fixups, so no latest-branch editing can happen
1152        // anymore.
1153        self.latest_branches.clear();
1154
1155        // Reset internal calculations about islands since we're going to
1156        // change the calculus as we apply fixups. The `forced_threshold` is
1157        // used here to determine whether jumps to unknown labels will require
1158        // a veneer or not.
1159        let forced_threshold = self.worst_case_end_of_island(distance);
1160        self.island_deadline = UNKNOWN_LABEL_OFFSET;
1161        self.island_worst_case_size = 0;
1162
        // End the current location tracking since anything emitted during this
        // function shouldn't be attributed to whatever the current source
        // location is.
        //
        // Note that the current source location, if it's set right now, will be
        // restored at the end of this island emission.
        let cur_loc = self.cur_srcloc.map(|(_, loc)| loc);
        if cur_loc.is_some() {
            self.end_srcloc();
        }

        // First flush out all traps/constants so we have more labels in case
        // fixups are applied against these labels.
        //
        // Note that traps are placed first since this typically happens at the
        // end of the function and for disassemblers we try to keep all the code
        // contiguously together.
        for MachLabelTrap {
            label,
            code,
            stack_map,
            loc,
        } in mem::take(&mut self.pending_traps)
        {
            // If this trap has source information associated with it then
            // emit this information for the trap instruction going out now too.
            if let Some(loc) = loc {
                self.start_srcloc(loc);
            }
            self.align_to(I::LabelUse::ALIGN);
            self.bind_label(label);
            self.add_trap(code);
            if let Some(map) = stack_map {
                let extent = StackMapExtent::UpcomingBytes(I::TRAP_OPCODE.len() as u32);
                self.add_stack_map(extent, map);
            }
            self.put_data(I::TRAP_OPCODE);
            if loc.is_some() {
                self.end_srcloc();
            }
        }

        for MachLabelConstant { label, align, data } in mem::take(&mut self.pending_constants) {
            self.align_to(align);
            self.bind_label(label);
            self.put_data(&data[..]);
        }

        for fixup in mem::take(&mut self.fixup_records) {
            trace!("emit_island: fixup {:?}", fixup);
            let MachLabelFixup {
                label,
                offset,
                kind,
            } = fixup;
            let label_offset = self.resolve_label_offset(label);
            let start = offset as usize;
            let end = (offset + kind.patch_size()) as usize;

            if label_offset != UNKNOWN_LABEL_OFFSET {
                // If the offset of the label for this fixup is known then
                // we're going to do something here-and-now. We're either going
                // to patch the original offset because it's an in-bounds jump,
                // or we're going to generate a veneer, patch the fixup to jump
                // to the veneer, and then keep going.
                //
                // If the label comes after the original fixup, then we should
                // be guaranteed that the jump is in-bounds. Otherwise there's
                // a bug somewhere because this method wasn't called soon
                // enough. All forward-jumps are tracked and should get veneers
                // before their deadline comes and they're unable to jump
                // further.
                //
                // Otherwise, if the label is before the fixup, then that's a
                // backwards jump. If it's past the maximum negative range,
                // we'll emit a veneer that we can jump forward to, which can
                // then jump backwards.
                let veneer_required = if label_offset >= offset {
                    assert!((label_offset - offset) <= kind.max_pos_range());
                    false
                } else {
                    (offset - label_offset) > kind.max_neg_range()
                };
                trace!(
                    " -> label_offset = {}, known, required = {} (pos {} neg {})",
                    label_offset,
                    veneer_required,
                    kind.max_pos_range(),
                    kind.max_neg_range()
                );

                if (force_veneers && kind.supports_veneer()) || veneer_required {
                    self.emit_veneer(label, offset, kind);
                } else {
                    let slice = &mut self.data[start..end];
                    trace!("patching in-range!");
                    kind.patch(slice, offset, label_offset);
                }
            } else {
                // If the offset of this label is not known at this time then
                // there's one of two possibilities:
                //
                // * First we may be about to exceed the maximum jump range of
                //   this fixup. In that case a veneer is inserted to buy some
                //   more budget for the forward-jump. It's guaranteed that the
                //   label will eventually come after where we're at, so we know
                //   that the forward jump is necessary.
                //
                // * Otherwise we're still within range of the forward jump but
                //   the precise target isn't known yet. In that case we
                //   enqueue the fixup to get processed later.
                if forced_threshold - offset > kind.max_pos_range() {
                    self.emit_veneer(label, offset, kind);
                } else {
                    self.use_label_at_offset(offset, label, kind);
                }
            }
        }

        if let Some(loc) = cur_loc {
            self.start_srcloc(loc);
        }
    }

    /// Emits a "veneer" so that the `kind` use at `offset` can reach `label`.
    ///
    /// This will generate extra machine code, using `kind`, to get a
    /// larger-jump-kind than `kind` allows. The code at `offset` is then
    /// patched to jump to our new code, and then the new code is enqueued for
    /// a fixup to get processed at some later time.
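    ///
    /// Schematically, for a limited-range conditional branch (the offsets are
    /// purely illustrative):
    ///
    /// ```text
    /// 0x0000: cond_br -> 0x1000      ; original use, patched to the veneer
    ///         ...
    /// 0x1000: uncond_br -> label     ; veneer with a longer-range LabelUse
    /// ```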
    fn emit_veneer(&mut self, label: MachLabel, offset: CodeOffset, kind: I::LabelUse) {
        // If this `kind` doesn't support a veneer then that's a bug in the
        // backend because we need to implement support for such a veneer.
        assert!(
            kind.supports_veneer(),
            "jump beyond the range of {:?} but a veneer isn't supported",
            kind,
        );

        // Allocate space for a veneer in the island.
        self.align_to(I::LabelUse::ALIGN);
        let veneer_offset = self.cur_offset();
        trace!("making a veneer at {}", veneer_offset);
        let start = offset as usize;
        let end = (offset + kind.patch_size()) as usize;
        let slice = &mut self.data[start..end];
        // Patch the original label use to refer to the veneer.
        trace!(
            "patching original at offset {} to veneer offset {}",
            offset,
            veneer_offset
        );
        kind.patch(slice, offset, veneer_offset);
        // Generate the veneer.
        let veneer_slice = self.get_appended_space(kind.veneer_size() as usize);
        let (veneer_fixup_off, veneer_label_use) =
            kind.generate_veneer(veneer_slice, veneer_offset);
        trace!(
            "generated veneer; fixup offset {}, label_use {:?}",
            veneer_fixup_off,
            veneer_label_use
        );
        // Register a new use of `label` with our new veneer fixup and offset.
        // This'll recalculate deadlines accordingly and enqueue this fixup to
        // get processed at some later time.
        self.use_label_at_offset(veneer_fixup_off, label, veneer_label_use);
    }

    fn finish_emission_maybe_forcing_veneers(&mut self, force_veneers: bool) {
        while !self.pending_constants.is_empty()
            || !self.pending_traps.is_empty()
            || !self.fixup_records.is_empty()
        {
            // `emit_island()` will emit any pending veneers and constants, and
            // as a side-effect, will also take care of any fixups with resolved
            // labels eagerly.
            self.emit_island_maybe_forced(force_veneers, u32::MAX);
        }

        // Ensure that all labels have been fixed up after the last island is emitted. This is a
        // full (release-mode) assert because an unresolved label means the emitted code is
        // incorrect.
        assert!(self.fixup_records.is_empty());
    }

    /// Finish any deferred emissions and/or fixups.
    pub fn finish(mut self) -> MachBufferFinalized<Stencil> {
        let _tt = timing::vcode_emit_finish();

        // Do any optimizations on branches at tail of buffer, as if we
        // had bound one last label.
        self.optimize_branches();

        self.finish_emission_maybe_forcing_veneers(false);

        let mut srclocs = self.srclocs;
        srclocs.sort_by_key(|entry| entry.start);

        MachBufferFinalized {
            data: self.data,
            relocs: self.relocs,
            traps: self.traps,
            call_sites: self.call_sites,
            srclocs,
            stack_maps: self.stack_maps,
            unwind_info: self.unwind_info,
        }
    }

    /// Add an external relocation at the current offset.
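    ///
    /// A minimal sketch, mirroring the `metadata_records` test below (the
    /// external-name reference and addend are illustrative):
    ///
    /// ```ignore
    /// buf.add_reloc(Reloc::Abs8, &ExternalName::User(name_ref), 0);
    /// ```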
    pub fn add_reloc(&mut self, kind: Reloc, name: &ExternalName, addend: Addend) {
        let name = name.clone();
        // FIXME(#3277): This should use `I::LabelUse::from_reloc` to optionally
        // generate a label-use statement to track whether an island is possibly
        // needed to escape this function to actually get to the external name.
        // This is most likely to come up on AArch64 where calls between
        // functions use a 26-bit signed offset which gives +/- 64MB. This means
        // that if a function is 128MB in size and there's a call in the middle
        // it's impossible to reach the actual target. Also, while it's
        // technically possible to jump to the start of a function and then jump
        // further, island insertion below always inserts islands after
        // previously appended code, so for Cranelift's own implementation this
        // is also a problem for 64MB functions on AArch64 which start with a
        // call instruction: those won't be able to escape.
        //
        // Ideally what needs to happen here is that a `LabelUse` is
        // transparently generated (or call-sites of this function are audited
        // to generate a `LabelUse` instead) and tracked internally. The actual
        // relocation would then change over time if and when a veneer is
        // inserted, where the relocation here would be patched by this
        // `MachBuffer` to jump to the veneer. The problem, though, is that all
        // this still needs to end up, in the case of a singular function,
        // generating a final relocation pointing either to this particular
        // relocation or to the veneer inserted. Additionally
        // `MachBuffer` needs the concept of a label which will never be
        // resolved, so `emit_island` doesn't trip over never actually knowing
        // what some labels are. Currently the loop in
        // `finish_emission_maybe_forcing_veneers` would otherwise loop
        // infinitely.
        //
        // For now, because relocs aren't tracked at all, this means AArch64
        // functions have a rough size limit of 64MB. For now that's
        // somewhat reasonable and the failure mode is a panic in `MachBuffer`
        // when a relocation can't otherwise be resolved later, so it shouldn't
        // actually result in any memory unsafety or anything like that.
        self.relocs.push(MachReloc {
            offset: self.data.len() as CodeOffset,
            kind,
            name,
            addend,
        });
    }

    /// Add a trap record at the current offset.
    pub fn add_trap(&mut self, code: TrapCode) {
        self.traps.push(MachTrap {
            offset: self.data.len() as CodeOffset,
            code,
        });
    }

    /// Add a call-site record at the current offset.
    pub fn add_call_site(&mut self, opcode: Opcode) {
        debug_assert!(
            opcode.is_call(),
            "adding call site info for a non-call instruction."
        );
        self.call_sites.push(MachCallSite {
            ret_addr: self.data.len() as CodeOffset,
            opcode,
        });
    }

    /// Add an unwind record at the current offset.
    pub fn add_unwind(&mut self, unwind: UnwindInst) {
        self.unwind_info.push((self.cur_offset(), unwind));
    }

    /// Set the `SourceLoc` for code from this offset until the offset at the
    /// next call to `end_srcloc()`.
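    ///
    /// A minimal sketch of the intended pairing (`loc` is illustrative):
    ///
    /// ```ignore
    /// buf.start_srcloc(loc);
    /// // ... emit the instruction(s) attributed to `loc` ...
    /// buf.end_srcloc();
    /// ```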
    pub fn start_srcloc(&mut self, loc: RelSourceLoc) {
        self.cur_srcloc = Some((self.cur_offset(), loc));
    }

    /// Mark the end of the `SourceLoc` segment started at the last
    /// `start_srcloc()` call.
    pub fn end_srcloc(&mut self) {
        let (start, loc) = self
            .cur_srcloc
            .take()
            .expect("end_srcloc() called without start_srcloc()");
        let end = self.cur_offset();
        // Skip zero-length extents.
        debug_assert!(end >= start);
        if end > start {
            self.srclocs.push(MachSrcLoc { start, end, loc });
        }
    }

    /// Add stack map metadata for this program point: a set of stack offsets
    /// (from SP upward) that contain live references.
    ///
    /// The `extent` describes which range of code the stack map covers: either
    /// the next `insn_len` bytes about to be emitted, or the region from an
    /// earlier recorded offset up to the current offset.
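    ///
    /// A minimal sketch, assuming `stack_map` has already been computed for
    /// this safepoint and the upcoming instruction is 4 bytes long:
    ///
    /// ```ignore
    /// buf.add_stack_map(StackMapExtent::UpcomingBytes(4), stack_map);
    /// ```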
    pub fn add_stack_map(&mut self, extent: StackMapExtent, stack_map: StackMap) {
        let (start, end) = match extent {
            StackMapExtent::UpcomingBytes(insn_len) => {
                let start_offset = self.cur_offset();
                (start_offset, start_offset + insn_len)
            }
            StackMapExtent::StartedAtOffset(start_offset) => {
                let end_offset = self.cur_offset();
                debug_assert!(end_offset >= start_offset);
                (start_offset, end_offset)
            }
        };
        trace!("Adding stack map for offsets {start:#x}..{end:#x}");
        self.stack_maps.push(MachStackMap {
            offset: start,
            offset_end: end,
            stack_map,
        });
    }
}

impl<T: CompilePhase> MachBufferFinalized<T> {
    /// Get a list of source location mapping tuples in sorted-by-start-offset order.
    pub fn get_srclocs_sorted(&self) -> &[T::MachSrcLocType] {
        &self.srclocs[..]
    }

    /// Get the total required size for the code.
    pub fn total_size(&self) -> CodeOffset {
        self.data.len() as CodeOffset
    }

    /// Return the code in this mach buffer as a hex string for testing purposes.
    pub fn stringify_code_bytes(&self) -> String {
        // This is pretty lame, but whatever ..
        use std::fmt::Write;
        let mut s = String::with_capacity(self.data.len() * 2);
        for b in &self.data {
            write!(&mut s, "{:02X}", b).unwrap();
        }
        s
    }

    /// Get the code bytes.
    pub fn data(&self) -> &[u8] {
        // N.B.: we emit every section into the .text section as far as
        // the `CodeSink` is concerned; we do not bother to segregate
        // the contents into the actual program text, the jumptable and the
        // rodata (constant pool). This allows us to generate code assuming
        // that these will not be relocated relative to each other, and avoids
        // having to designate each section as belonging in one of the three
        // fixed categories defined by `CodeSink`. If this becomes a problem
        // later (e.g. because of memory permissions or similar), we can
        // add this designation and segregate the output; take care, however,
        // to add the appropriate relocations in this case.

        &self.data[..]
    }

    /// Get the list of external relocations for this code.
    pub fn relocs(&self) -> &[MachReloc] {
        &self.relocs[..]
    }

    /// Get the list of trap records for this code.
    pub fn traps(&self) -> &[MachTrap] {
        &self.traps[..]
    }

    /// Get the stack map metadata for this code.
    pub fn stack_maps(&self) -> &[MachStackMap] {
        &self.stack_maps[..]
    }

    /// Get the list of call sites for this code.
    pub fn call_sites(&self) -> &[MachCallSite] {
        &self.call_sites[..]
    }
}

/// A constant that is deferred to the next constant-pool opportunity.
struct MachLabelConstant {
    /// This label will refer to the constant's offset.
    label: MachLabel,
    /// Required alignment.
    align: CodeOffset,
    /// This data will be emitted when able.
    data: SmallVec<[u8; 16]>,
}

/// A trap that is deferred to the next time an island is emitted for either
/// traps, constants, or fixups.
struct MachLabelTrap {
    /// This label will refer to the trap's offset.
    label: MachLabel,
    /// The code associated with this trap.
    code: TrapCode,
    /// An optional stack map to associate with this trap.
    stack_map: Option<StackMap>,
    /// An optional source location to assign for this trap.
    loc: Option<RelSourceLoc>,
}

/// A fixup to perform on the buffer once code is emitted. Fixups always refer
/// to labels and patch the code based on label offsets. Hence, they are like
/// relocations, but internal to one buffer.
#[derive(Debug)]
struct MachLabelFixup<I: VCodeInst> {
    /// The label whose offset controls this fixup.
    label: MachLabel,
    /// The offset to fix up / patch to refer to this label.
    offset: CodeOffset,
    /// The kind of fixup. This is architecture-specific; each architecture may have,
    /// e.g., several types of branch instructions, each with differently-sized
    /// offset fields and different places within the instruction to place the
    /// bits.
    kind: I::LabelUse,
}

/// A relocation resulting from a compilation.
#[derive(Clone, Debug, PartialEq)]
#[cfg_attr(feature = "enable-serde", derive(serde::Serialize, serde::Deserialize))]
pub struct MachReloc {
    /// The offset at which the relocation applies, *relative to the
    /// containing section*.
    pub offset: CodeOffset,
    /// The kind of relocation.
    pub kind: Reloc,
    /// The external symbol / name to which this relocation refers.
    pub name: ExternalName,
    /// The addend to add to the symbol value.
    pub addend: i64,
}

/// A trap record resulting from a compilation.
#[derive(Clone, Debug, PartialEq)]
#[cfg_attr(feature = "enable-serde", derive(serde::Serialize, serde::Deserialize))]
pub struct MachTrap {
    /// The offset at which the trap instruction occurs, *relative to the
    /// containing section*.
    pub offset: CodeOffset,
    /// The trap code.
    pub code: TrapCode,
}

/// A call site record resulting from a compilation.
#[derive(Clone, Debug, PartialEq)]
#[cfg_attr(feature = "enable-serde", derive(serde::Serialize, serde::Deserialize))]
pub struct MachCallSite {
    /// The offset of the call's return address, *relative to the containing section*.
    pub ret_addr: CodeOffset,
    /// The call's opcode.
    pub opcode: Opcode,
}

/// A source-location mapping resulting from a compilation.
#[derive(PartialEq, Debug, Clone)]
#[cfg_attr(feature = "enable-serde", derive(serde::Serialize, serde::Deserialize))]
pub struct MachSrcLoc<T: CompilePhase> {
    /// The start of the region of code corresponding to a source location.
    /// This is relative to the start of the function, not to the start of the
    /// section.
    pub start: CodeOffset,
    /// The end of the region of code corresponding to a source location.
    /// This is relative to the start of the function, not to the start of the
    /// section.
    pub end: CodeOffset,
    /// The source location.
    pub loc: T::SourceLocType,
}

impl MachSrcLoc<Stencil> {
    fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachSrcLoc<Final> {
        MachSrcLoc {
            start: self.start,
            end: self.end,
            loc: self.loc.expand(base_srcloc),
        }
    }
}

/// Record of stack map metadata: stack offsets containing references.
#[derive(Clone, Debug, PartialEq)]
#[cfg_attr(feature = "enable-serde", derive(serde::Serialize, serde::Deserialize))]
pub struct MachStackMap {
    /// The code offset at which this stack map applies.
    pub offset: CodeOffset,
    /// The code offset just past the "end" of the instruction: that is, the
    /// offset of the first byte of the following instruction, or equivalently,
    /// the start offset plus the instruction length.
    pub offset_end: CodeOffset,
    /// The stack map itself.
    pub stack_map: StackMap,
}

/// Record of branch instruction in the buffer, to facilitate editing.
#[derive(Clone, Debug)]
struct MachBranch {
    start: CodeOffset,
    end: CodeOffset,
    target: MachLabel,
    fixup: usize,
    inverted: Option<SmallVec<[u8; 8]>>,
    /// All labels pointing to the start of this branch. For correctness, this
    /// *must* be complete (i.e., must contain all labels whose resolved offsets
    /// are at the start of this branch): we rely on being able to redirect all
    /// labels that could jump to this branch before removing it, if it is
    /// otherwise unreachable.
    labels_at_this_branch: SmallVec<[MachLabel; 4]>,
}

impl MachBranch {
    fn is_cond(&self) -> bool {
        self.inverted.is_some()
    }
    fn is_uncond(&self) -> bool {
        self.inverted.is_none()
    }
}

/// Implementation of the `TextSectionBuilder` trait backed by `MachBuffer`.
///
/// Note that `MachBuffer` was primarily written for intra-function references
/// of jumps between basic blocks, but it's also quite usable for entire text
/// sections and resolving references between functions themselves. This
/// builder interprets "blocks" as labeled functions for the purposes of
/// resolving labels internally in the buffer.
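///
/// A minimal usage sketch (the function bytes, alignment, and count are
/// illustrative):
///
/// ```ignore
/// let mut builder = MachTextSectionBuilder::<Inst>::new(2);
/// let off0 = builder.append(true, &func0_bytes, 4);
/// let off1 = builder.append(true, &func1_bytes, 4);
/// let text = builder.finish();
/// ```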
pub struct MachTextSectionBuilder<I: VCodeInst> {
    buf: MachBuffer<I>,
    next_func: usize,
    force_veneers: bool,
}

impl<I: VCodeInst> MachTextSectionBuilder<I> {
    /// Creates a new text section builder which will have `num_funcs` functions
    /// pushed into it.
    pub fn new(num_funcs: usize) -> MachTextSectionBuilder<I> {
        let mut buf = MachBuffer::new();
        buf.reserve_labels_for_blocks(num_funcs);
        MachTextSectionBuilder {
            buf,
            next_func: 0,
            force_veneers: false,
        }
    }
}

impl<I: VCodeInst> TextSectionBuilder for MachTextSectionBuilder<I> {
    fn append(&mut self, labeled: bool, func: &[u8], align: u32) -> u64 {
        // Conditionally emit an island if it's necessary to resolve jumps
        // between functions which are too far away.
        let size = func.len() as u32;
        if self.force_veneers || self.buf.island_needed(size) {
            self.buf.emit_island_maybe_forced(self.force_veneers, size);
        }

        self.buf.align_to(align);
        let pos = self.buf.cur_offset();
        if labeled {
            self.buf
                .bind_label(MachLabel::from_block(BlockIndex::new(self.next_func)));
            self.next_func += 1;
        }
        self.buf.put_data(func);
        u64::from(pos)
    }

    fn resolve_reloc(&mut self, offset: u64, reloc: Reloc, addend: Addend, target: usize) -> bool {
        let label = MachLabel::from_block(BlockIndex::new(target));
        let offset = u32::try_from(offset).unwrap();
        match I::LabelUse::from_reloc(reloc, addend) {
            Some(label_use) => {
                self.buf.use_label_at_offset(offset, label, label_use);
                true
            }
            None => false,
        }
    }

    fn force_veneers(&mut self) {
        self.force_veneers = true;
    }

    fn finish(&mut self) -> Vec<u8> {
        // Double-check all functions were pushed.
        assert_eq!(self.next_func, self.buf.label_offsets.len());

        // Finish up any veneers, if necessary.
        self.buf
            .finish_emission_maybe_forcing_veneers(self.force_veneers);

        // We don't need the data any more, so return it to the caller.
        mem::take(&mut self.buf.data).into_vec()
    }
}

// We use an actual instruction definition to do tests, so we depend on the `arm64` feature here.
#[cfg(all(test, feature = "arm64"))]
mod test {
    use cranelift_entity::EntityRef as _;

    use super::*;
    use crate::ir::UserExternalNameRef;
    use crate::isa::aarch64::inst::xreg;
    use crate::isa::aarch64::inst::{BranchTarget, CondBrKind, EmitInfo, Inst};
    use crate::machinst::MachInstEmit;
    use crate::settings;
    use std::default::Default;
    use std::vec::Vec;

    fn label(n: u32) -> MachLabel {
        MachLabel::from_block(BlockIndex::new(n as usize))
    }
    fn target(n: u32) -> BranchTarget {
        BranchTarget::Label(label(n))
    }

    #[test]
    fn test_elide_jump_to_next() {
        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
        let mut buf = MachBuffer::new();
        let mut state = Default::default();

        buf.reserve_labels_for_blocks(2);
        buf.bind_label(label(0));
        let inst = Inst::Jump { dest: target(1) };
        inst.emit(&[], &mut buf, &info, &mut state);
        buf.bind_label(label(1));
        let buf = buf.finish();
        assert_eq!(0, buf.total_size());
    }

    #[test]
    fn test_elide_trivial_jump_blocks() {
        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
        let mut buf = MachBuffer::new();
        let mut state = Default::default();

        buf.reserve_labels_for_blocks(4);

        buf.bind_label(label(0));
        let inst = Inst::CondBr {
            kind: CondBrKind::NotZero(xreg(0)),
            taken: target(1),
            not_taken: target(2),
        };
        inst.emit(&[], &mut buf, &info, &mut state);

        buf.bind_label(label(1));
        let inst = Inst::Jump { dest: target(3) };
        inst.emit(&[], &mut buf, &info, &mut state);

        buf.bind_label(label(2));
        let inst = Inst::Jump { dest: target(3) };
        inst.emit(&[], &mut buf, &info, &mut state);

        buf.bind_label(label(3));

        let buf = buf.finish();
        assert_eq!(0, buf.total_size());
    }

    #[test]
    fn test_flip_cond() {
        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
        let mut buf = MachBuffer::new();
        let mut state = Default::default();

        buf.reserve_labels_for_blocks(4);

        buf.bind_label(label(0));
        let inst = Inst::CondBr {
            kind: CondBrKind::Zero(xreg(0)),
            taken: target(1),
            not_taken: target(2),
        };
        inst.emit(&[], &mut buf, &info, &mut state);

        buf.bind_label(label(1));
        let inst = Inst::Nop4;
        inst.emit(&[], &mut buf, &info, &mut state);

        buf.bind_label(label(2));
        let inst = Inst::Udf {
            trap_code: TrapCode::Interrupt,
        };
        inst.emit(&[], &mut buf, &info, &mut state);

        buf.bind_label(label(3));

        let buf = buf.finish();

        let mut buf2 = MachBuffer::new();
        let mut state = Default::default();
        let inst = Inst::TrapIf {
            kind: CondBrKind::NotZero(xreg(0)),
            trap_code: TrapCode::Interrupt,
        };
        inst.emit(&[], &mut buf2, &info, &mut state);
        let inst = Inst::Nop4;
        inst.emit(&[], &mut buf2, &info, &mut state);

        let buf2 = buf2.finish();

        assert_eq!(buf.data, buf2.data);
    }

    #[test]
    fn test_island() {
        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
        let mut buf = MachBuffer::new();
        let mut state = Default::default();

        buf.reserve_labels_for_blocks(4);

        buf.bind_label(label(0));
        let inst = Inst::CondBr {
            kind: CondBrKind::NotZero(xreg(0)),
            taken: target(2),
            not_taken: target(3),
        };
        inst.emit(&[], &mut buf, &info, &mut state);

        buf.bind_label(label(1));
        while buf.cur_offset() < 2000000 {
            if buf.island_needed(0) {
                buf.emit_island(0);
            }
            let inst = Inst::Nop4;
            inst.emit(&[], &mut buf, &info, &mut state);
        }

        buf.bind_label(label(2));
        let inst = Inst::Nop4;
        inst.emit(&[], &mut buf, &info, &mut state);

        buf.bind_label(label(3));
        let inst = Inst::Nop4;
        inst.emit(&[], &mut buf, &info, &mut state);

        let buf = buf.finish();

        assert_eq!(2000000 + 8, buf.total_size());

        let mut buf2 = MachBuffer::new();
        let mut state = Default::default();
        let inst = Inst::CondBr {
            kind: CondBrKind::NotZero(xreg(0)),

            // This conditionally taken branch has a 19-bit constant, shifted
            // to the left by two, giving us a 21-bit range in total. Half of
            // this range is positive, so we should be around 1 << 20 bytes
            // away from our jump target.
            //
            // There are two pending fixups by the time we reach this point,
            // one for this 19-bit jump and one for the unconditional 26-bit
            // jump below. A 19-bit veneer is 4 bytes large and the 26-bit
            // veneer is 20 bytes large, which means that, pessimistically
            // assuming we'll need two veneers, we need 24 bytes of extra
            // space, meaning that the actual island should come 24 bytes
            // before the deadline.
            taken: BranchTarget::ResolvedOffset((1 << 20) - 4 - 20),

            // This branch is in-range so no veneers should be needed, it should
            // go directly to the target.
            not_taken: BranchTarget::ResolvedOffset(2000000 + 4 - 4),
        };
        inst.emit(&[], &mut buf2, &info, &mut state);

        let buf2 = buf2.finish();

        assert_eq!(&buf.data[0..8], &buf2.data[..]);
    }

    #[test]
    fn test_island_backward() {
        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
        let mut buf = MachBuffer::new();
        let mut state = Default::default();

        buf.reserve_labels_for_blocks(4);

        buf.bind_label(label(0));
        let inst = Inst::Nop4;
        inst.emit(&[], &mut buf, &info, &mut state);

        buf.bind_label(label(1));
        let inst = Inst::Nop4;
        inst.emit(&[], &mut buf, &info, &mut state);

        buf.bind_label(label(2));
        while buf.cur_offset() < 2000000 {
            let inst = Inst::Nop4;
            inst.emit(&[], &mut buf, &info, &mut state);
        }

        buf.bind_label(label(3));
        let inst = Inst::CondBr {
            kind: CondBrKind::NotZero(xreg(0)),
            taken: target(0),
            not_taken: target(1),
        };
        inst.emit(&[], &mut buf, &info, &mut state);

        let buf = buf.finish();

        assert_eq!(2000000 + 12, buf.total_size());

        let mut buf2 = MachBuffer::new();
        let mut state = Default::default();
        let inst = Inst::CondBr {
            kind: CondBrKind::NotZero(xreg(0)),
            taken: BranchTarget::ResolvedOffset(8),
            not_taken: BranchTarget::ResolvedOffset(4 - (2000000 + 4)),
        };
        inst.emit(&[], &mut buf2, &info, &mut state);
        let inst = Inst::Jump {
            dest: BranchTarget::ResolvedOffset(-(2000000 + 8)),
        };
        inst.emit(&[], &mut buf2, &info, &mut state);

        let buf2 = buf2.finish();

        assert_eq!(&buf.data[2000000..], &buf2.data[..]);
    }

    #[test]
    fn test_multiple_redirect() {
        // label0:
        //   cbz x0, label1
        //   b label2
        // label1:
        //   b label3
        // label2:
        //   nop
        //   nop
        //   b label0
        // label3:
        //   b label4
        // label4:
        //   b label5
        // label5:
        //   b label7
        // label6:
        //   nop
        // label7:
        //   ret
        //
        // -- should become:
        //
        // label0:
        //   cbz x0, label7
        // label2:
        //   nop
        //   nop
        //   b label0
        // label6:
        //   nop
        // label7:
        //   ret

        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
        let mut buf = MachBuffer::new();
        let mut state = Default::default();

        buf.reserve_labels_for_blocks(8);

        buf.bind_label(label(0));
        let inst = Inst::CondBr {
            kind: CondBrKind::Zero(xreg(0)),
            taken: target(1),
            not_taken: target(2),
        };
        inst.emit(&[], &mut buf, &info, &mut state);

        buf.bind_label(label(1));
        let inst = Inst::Jump { dest: target(3) };
        inst.emit(&[], &mut buf, &info, &mut state);

        buf.bind_label(label(2));
        let inst = Inst::Nop4;
        inst.emit(&[], &mut buf, &info, &mut state);
        inst.emit(&[], &mut buf, &info, &mut state);
        let inst = Inst::Jump { dest: target(0) };
        inst.emit(&[], &mut buf, &info, &mut state);

        buf.bind_label(label(3));
        let inst = Inst::Jump { dest: target(4) };
        inst.emit(&[], &mut buf, &info, &mut state);

        buf.bind_label(label(4));
        let inst = Inst::Jump { dest: target(5) };
        inst.emit(&[], &mut buf, &info, &mut state);

        buf.bind_label(label(5));
        let inst = Inst::Jump { dest: target(7) };
        inst.emit(&[], &mut buf, &info, &mut state);

        buf.bind_label(label(6));
        let inst = Inst::Nop4;
        inst.emit(&[], &mut buf, &info, &mut state);

        buf.bind_label(label(7));
        let inst = Inst::Ret { rets: vec![] };
        inst.emit(&[], &mut buf, &info, &mut state);

        let buf = buf.finish();

        let golden_data = vec![
            0xa0, 0x00, 0x00, 0xb4, // cbz x0, 0x14
            0x1f, 0x20, 0x03, 0xd5, // nop
            0x1f, 0x20, 0x03, 0xd5, // nop
            0xfd, 0xff, 0xff, 0x17, // b 0
            0x1f, 0x20, 0x03, 0xd5, // nop
            0xc0, 0x03, 0x5f, 0xd6, // ret
        ];

        assert_eq!(&golden_data[..], &buf.data[..]);
    }

    #[test]
    fn test_handle_branch_cycle() {
        // label0:
        //   b label1
        // label1:
        //   b label2
        // label2:
        //   b label3
        // label3:
        //   b label4
        // label4:
        //   b label1  // note: not label0 (to make it interesting).
        //
        // -- should become:
        //
        // label0, label1, ..., label4:
        //   b label0
        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
        let mut buf = MachBuffer::new();
        let mut state = Default::default();

        buf.reserve_labels_for_blocks(5);

        buf.bind_label(label(0));
        let inst = Inst::Jump { dest: target(1) };
        inst.emit(&[], &mut buf, &info, &mut state);

        buf.bind_label(label(1));
        let inst = Inst::Jump { dest: target(2) };
        inst.emit(&[], &mut buf, &info, &mut state);

        buf.bind_label(label(2));
        let inst = Inst::Jump { dest: target(3) };
        inst.emit(&[], &mut buf, &info, &mut state);

        buf.bind_label(label(3));
        let inst = Inst::Jump { dest: target(4) };
        inst.emit(&[], &mut buf, &info, &mut state);

        buf.bind_label(label(4));
        let inst = Inst::Jump { dest: target(1) };
        inst.emit(&[], &mut buf, &info, &mut state);

        let buf = buf.finish();

        let golden_data = vec![
            0x00, 0x00, 0x00, 0x14, // b 0
        ];

        assert_eq!(&golden_data[..], &buf.data[..]);
    }

    #[test]
    fn metadata_records() {
        let mut buf = MachBuffer::<Inst>::new();

        buf.reserve_labels_for_blocks(1);

        buf.bind_label(label(0));
        buf.put1(1);
        buf.add_trap(TrapCode::HeapOutOfBounds);
        buf.put1(2);
        buf.add_trap(TrapCode::IntegerOverflow);
        buf.add_trap(TrapCode::IntegerDivisionByZero);
        buf.add_call_site(Opcode::Call);
        buf.add_reloc(
            Reloc::Abs4,
            &ExternalName::User(UserExternalNameRef::new(0)),
            0,
        );
        buf.put1(3);
        buf.add_reloc(
            Reloc::Abs8,
            &ExternalName::User(UserExternalNameRef::new(1)),
            1,
        );
        buf.put1(4);

        let buf = buf.finish();

        assert_eq!(buf.data(), &[1, 2, 3, 4]);
        assert_eq!(
            buf.traps()
                .iter()
                .map(|trap| (trap.offset, trap.code))
                .collect::<Vec<_>>(),
            vec![
                (1, TrapCode::HeapOutOfBounds),
                (2, TrapCode::IntegerOverflow),
                (2, TrapCode::IntegerDivisionByZero)
            ]
        );
        assert_eq!(
            buf.call_sites()
                .iter()
                .map(|call_site| (call_site.ret_addr, call_site.opcode))
                .collect::<Vec<_>>(),
            vec![(2, Opcode::Call)]
        );
        assert_eq!(
            buf.relocs()
                .iter()
                .map(|reloc| (reloc.offset, reloc.kind))
                .collect::<Vec<_>>(),
            vec![(2, Reloc::Abs4), (3, Reloc::Abs8)]
        );
    }
}