wasmtime_runtime/externref.rs

//! # `VMExternRef`
//!
//! `VMExternRef` is a reference-counted box for any kind of data that is
//! external and opaque to running Wasm. Sometimes it might hold a Wasmtime
//! thing, other times it might hold something from a Wasmtime embedder and is
//! opaque even to us. It is morally equivalent to `Rc<dyn Any>` in Rust, but
//! additionally always fits in a pointer-sized word. `VMExternRef` is
//! non-nullable, but `Option<VMExternRef>` is still pointer-sized, with `None`
//! represented by the null pointer.
//!
//! The one part of `VMExternRef` that can't ever be opaque to us is the
//! reference count. Even when we don't know what's inside a `VMExternRef`, we
//! need to be able to manipulate its reference count as we add and remove
//! references to it. And we need to do this from compiled Wasm code, so the
//! `VMExternData` that holds it must be `repr(C)`!
//!
//! ## Memory Layout
//!
//! `VMExternRef` itself is just a pointer to a `VMExternData`, which holds the
//! opaque, boxed value, its reference count, and its vtable pointer.
//!
//! The `VMExternData` struct is *preceded* by the dynamically-sized value boxed
//! up and referenced by one or more `VMExternRef`s:
//!
//! ```text
//!      ,-------------------------------------------------------.
//!      |                                                       |
//!      V                                                       |
//!     +----------------------------+-----------+-----------+   |
//!     | dynamically-sized value... | ref_count | value_ptr |---'
//!     +----------------------------+-----------+-----------+
//!                                  | VMExternData          |
//!                                  +-----------------------+
//!                                   ^
//! +-------------+                   |
//! | VMExternRef |-------------------+
//! +-------------+                   |
//!                                   |
//! +-------------+                   |
//! | VMExternRef |-------------------+
//! +-------------+                   |
//!                                   |
//!   ...                            ===
//!                                   |
//! +-------------+                   |
//! | VMExternRef |-------------------'
//! +-------------+
//! ```
//!
//! The `value_ptr` member always points backwards to the start of the
//! dynamically-sized value (which is also the start of the heap allocation for
//! this value-and-`VMExternData` pair). Because it is a `dyn` pointer, it is
//! fat, and also points to the value's `Any` vtable.
//!
//! The boxed value and the `VMExternData` footer are held in a single heap
//! allocation. The layout described above makes satisfying the value's
//! alignment easy: we just need to ensure that the heap allocation used to hold
//! everything satisfies the value's alignment. It also ensures that we don't
//! need a ton of excess padding between the `VMExternData` and the value for
//! values with large alignment.
//!
//! ## Reference Counting, Wasm Functions, and Garbage Collection
//!
//! For host VM code, we use plain reference counting, where cloning increments
//! the reference count, and dropping decrements it. We can avoid many of the
//! on-stack increment/decrement operations that typically plague the
//! performance of reference counting via Rust's ownership and borrowing system.
//! Moving a `VMExternRef` avoids mutating its reference count, and borrowing it
//! either avoids the reference count increment or delays it until the
//! `VMExternRef` is actually cloned.
//!
//! When passing a `VMExternRef` into compiled Wasm code, we don't want to do
//! reference count mutations for every compiled `local.{get,set}`, nor for
//! every function call. Therefore, we use a variation of **deferred reference
//! counting**, where we only mutate reference counts when storing
//! `VMExternRef`s somewhere that outlives the activation: into a global or
//! table. Simultaneously, we over-approximate the set of `VMExternRef`s that
//! are inside Wasm function activations. Periodically, we walk the stack at GC
//! safe points, and use stack map information to precisely identify the set of
//! `VMExternRef`s inside Wasm activations. Then we take the difference between
//! this precise set and our over-approximation, and decrement the reference
//! count for each of the `VMExternRef`s that are in our over-approximation but
//! not in the precise set. Finally, the over-approximation is replaced with the
//! precise set.
//!
//! The `VMExternRefActivationsTable` implements the over-approximated set of
//! `VMExternRef`s referenced by Wasm activations. Calling a Wasm function and
//! passing it a `VMExternRef` moves the `VMExternRef` into the table, and the
//! compiled Wasm function logically "borrows" the `VMExternRef` from the
//! table. Similarly, `global.get` and `table.get` operations clone the
//! retrieved `VMExternRef` into the `VMExternRefActivationsTable` and then
//! "borrow" the reference out of the table.
//!
//! When a `VMExternRef` is returned to host code from a Wasm function, the host
//! increments the reference count (because the reference is logically
//! "borrowed" from the `VMExternRefActivationsTable` and the reference count
//! from the table will be dropped at the next GC).
//!
//! For more general information on deferred reference counting, see *An
//! Examination of Deferred Reference Counting and Cycle Detection* by Quinane:
//! <https://openresearch-repository.anu.edu.au/bitstream/1885/42030/2/hon-thesis.pdf>
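//!
//! As a rough sketch of the host-side flow (eliding error handling and the real
//! `ModuleInfoLookup` wiring, and using only types defined in this module;
//! `activations_table` is a hypothetical embedder-owned
//! `VMExternRefActivationsTable`):
//!
//! ```ignore
//! // Wrap a host value so Wasm can refer to it opaquely.
//! let externref = VMExternRef::new(String::from("hello"));
//!
//! // Before handing the reference to a Wasm activation, root it in the
//! // over-approximated set; compiled code then "borrows" it from the table.
//! activations_table.insert_without_gc(externref.clone());
//!
//! // ... call into Wasm, which may freely copy the reference around ...
//!
//! // Later, at a GC safe point, `gc` walks the stack, computes the precise
//! // root set from stack maps, and drops the table's stale references.
//! ```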

use std::alloc::Layout;
use std::any::Any;
use std::cell::UnsafeCell;
use std::cmp;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};
use std::mem;
use std::ops::Deref;
use std::ptr::{self, NonNull};
use std::sync::atomic::{self, AtomicUsize, Ordering};
use wasmtime_environ::StackMap;

use crate::Backtrace;

/// An external reference to some opaque data.
///
/// `VMExternRef`s dereference to their underlying opaque data as `dyn Any`.
///
/// Unlike the `externref` in the Wasm spec, `VMExternRef`s are non-nullable,
/// and always point to a valid value. You may use `Option<VMExternRef>` to
/// represent nullable references, and `Option<VMExternRef>` is guaranteed to
/// have the same size and alignment as a raw pointer, with `None` represented
/// by the null pointer.
///
/// `VMExternRef`s are reference counted, so cloning is a cheap, shallow
/// operation. It also means they are inherently shared, so you may not get a
/// mutable, exclusive reference to their inner contents, only a shared,
/// immutable reference. You may use interior mutability with `RefCell` or
/// `Mutex` to work around this restriction, if necessary.
///
/// `VMExternRef`s have pointer-equality semantics, not structural-equality
/// semantics. Given two `VMExternRef`s `a` and `b`, `a == b` only if `a` and
/// `b` point to the same allocation. Even if `a` and `b` are two identical
/// copies of the same data, they are not considered equal when they live in
/// different allocations. The hashing and ordering implementations also only
/// operate on the pointer.
///
/// # Example
///
/// ```
/// # fn foo() -> Result<(), Box<dyn std::error::Error>> {
/// use std::cell::RefCell;
/// use wasmtime_runtime::VMExternRef;
///
/// // Open a file. Wasm doesn't know about files, but we can let Wasm instances
/// // work with files via opaque `externref` handles.
/// let file = std::fs::File::create("some/file/path")?;
///
/// // Wrap the file up as a `VMExternRef` that can be passed to Wasm.
/// let extern_ref_to_file = VMExternRef::new(file);
///
/// // `VMExternRef`s dereference to `dyn Any`, so you can use `Any` methods to
/// // perform runtime type checks and downcasts.
///
/// assert!(extern_ref_to_file.is::<std::fs::File>());
/// assert!(!extern_ref_to_file.is::<String>());
///
/// if let Some(mut file) = extern_ref_to_file.downcast_ref::<std::fs::File>() {
///     use std::io::Write;
///     writeln!(&mut file, "Hello, `VMExternRef`!")?;
/// }
/// # Ok(())
/// # }
/// ```
#[derive(Debug)]
#[repr(transparent)]
pub struct VMExternRef(NonNull<VMExternData>);

impl std::fmt::Pointer for VMExternRef {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        std::fmt::Pointer::fmt(&self.0, f)
    }
}

// The contained data is always `Send + Sync`, so these impls are safe.
unsafe impl Send for VMExternRef {}
unsafe impl Sync for VMExternRef {}

#[repr(C)]
pub(crate) struct VMExternData {
    // Implicit, dynamically-sized member that always precedes a
    // `VMExternData`.
    //
    // value: [u8],
    //
    /// The reference count for this `VMExternData` and value. When it reaches
    /// zero, we can safely destroy the value and free this heap
    /// allocation. This is an atomic counter, rather than a plain integer,
    /// because it can be modified by compiled Wasm code and by other threads
    /// holding clones of the same `VMExternRef`.
    ///
    /// Note: this field's offset must be kept in sync with
    /// `wasmtime_environ::VMOffsets::vm_extern_data_ref_count()` which is
    /// currently always zero.
    ref_count: AtomicUsize,

    /// Always points to the implicit, dynamically-sized `value` member that
    /// precedes this `VMExternData`.
    value_ptr: NonNull<dyn Any + Send + Sync>,
}

impl Clone for VMExternRef {
    #[inline]
    fn clone(&self) -> VMExternRef {
        self.extern_data().increment_ref_count();
        VMExternRef(self.0)
    }
}

impl Drop for VMExternRef {
    #[inline]
    fn drop(&mut self) {
        let data = self.extern_data();

        // Note that the memory orderings here also match the standard library
        // itself. More thorough documentation is available in the
        // implementation of `Arc`, but the general idea is that this is a
        // special pattern allowed by the C standard with atomic orderings
        // where we "release" for all the decrements and only the final
        // decrementer performs an acquire fence. This properly ensures that
        // the final thread, which actually destroys the data, sees all the
        // updates from all other threads.
        if data.ref_count.fetch_sub(1, Ordering::Release) != 1 {
            return;
        }
        atomic::fence(Ordering::Acquire);

        // Drop our live reference to `data` before we deallocate the memory it
        // lives in.
        drop(data);
        unsafe {
            VMExternData::drop_and_dealloc(self.0);
        }
    }
}

impl VMExternData {
    /// Get the `Layout` for a value with the given size and alignment, and the
    /// offset within that layout where the `VMExternData` footer resides.
    ///
    /// This doesn't take a `value: &T` because `VMExternRef::new_with` hasn't
    /// constructed a `T` value yet, and it isn't generic over `T` because
    /// `VMExternData::drop_and_dealloc` doesn't know what `T` to use, and has
    /// to use `std::mem::{size,align}_of_val` instead.
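    ///
    /// For example (a sketch assuming a typical 64-bit target, where
    /// `VMExternData` is 24 bytes large and 8-byte aligned), a value with
    /// size 20 and alignment 16 yields:
    ///
    /// ```text
    /// value_and_padding_size = round_up(20, 8) = 24   (the footer offset)
    /// alloc_align            = max(16, 8)      = 16
    /// alloc_size             = 24 + 24         = 48
    /// ```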
    unsafe fn layout_for(value_size: usize, value_align: usize) -> (Layout, usize) {
        let extern_data_size = mem::size_of::<VMExternData>();
        let extern_data_align = mem::align_of::<VMExternData>();

        let value_and_padding_size = round_up_to_align(value_size, extern_data_align).unwrap();

        let alloc_align = std::cmp::max(value_align, extern_data_align);
        let alloc_size = value_and_padding_size + extern_data_size;

        debug_assert!(
            Layout::from_size_align(alloc_size, alloc_align).is_ok(),
            "should create a `Layout` for size={} and align={} okay",
            alloc_size,
            alloc_align,
        );
        (
            Layout::from_size_align_unchecked(alloc_size, alloc_align),
            value_and_padding_size,
        )
    }

    /// Drop the inner value and then free this `VMExternData` heap allocation.
    pub(crate) unsafe fn drop_and_dealloc(mut data: NonNull<VMExternData>) {
        log::trace!("Dropping externref data @ {:p}", data);

        // Note: we introduce a block scope so that we drop the live
        // reference to the data before we free the heap allocation it
        // resides within after this block.
        let (alloc_ptr, layout) = {
            let data = data.as_mut();
            debug_assert_eq!(data.ref_count.load(Ordering::SeqCst), 0);

            // Same thing, but for dropping the reference to `value` before
            // we drop the value itself.
            let (layout, _) = {
                let value = data.value_ptr.as_ref();
                Self::layout_for(mem::size_of_val(value), mem::align_of_val(value))
            };

            ptr::drop_in_place(data.value_ptr.as_ptr());
            let alloc_ptr = data.value_ptr.cast::<u8>();

            (alloc_ptr, layout)
        };

        ptr::drop_in_place(data.as_ptr());
        std::alloc::dealloc(alloc_ptr.as_ptr(), layout);
    }

    #[inline]
    fn increment_ref_count(&self) {
        // This is only used during cloning operations, and like the standard
        // library we use `Relaxed` here. The rationale is better documented in
        // libstd's implementation of `Arc`, but the general gist is that we're
        // creating a new pointer for our own thread, so there's no need to have
        // any synchronization with orderings. The synchronization with other
        // threads with respect to orderings happens when the pointer is sent to
        // another thread.
        self.ref_count.fetch_add(1, Ordering::Relaxed);
    }
}

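/// Round `n` up to the nearest multiple of `align` (which must be a power of
/// two), returning `None` on overflow. For example, rounding 25 up to an
/// alignment of 8 yields 32, while 24 is already aligned and comes back
/// unchanged.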
#[inline]
fn round_up_to_align(n: usize, align: usize) -> Option<usize> {
    debug_assert!(align.is_power_of_two());
    let align_minus_one = align - 1;
    Some(n.checked_add(align_minus_one)? & !align_minus_one)
}

impl VMExternRef {
    /// Wrap the given value inside a `VMExternRef`.
    pub fn new<T>(value: T) -> VMExternRef
    where
        T: 'static + Any + Send + Sync,
    {
        VMExternRef::new_with(|| value)
    }

    /// Construct a new `VMExternRef` in place by invoking `make_value`.
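    ///
    /// For example, using only the APIs defined in this module:
    ///
    /// ```
    /// use wasmtime_runtime::VMExternRef;
    ///
    /// // Build the boxed value in place, directly inside the new allocation.
    /// let r = VMExternRef::new_with(|| vec![1, 2, 3]);
    ///
    /// assert!(r.is::<Vec<i32>>());
    /// assert_eq!(r.strong_count(), 1);
    /// ```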
    pub fn new_with<T>(make_value: impl FnOnce() -> T) -> VMExternRef
    where
        T: 'static + Any + Send + Sync,
    {
        unsafe {
            let (layout, footer_offset) =
                VMExternData::layout_for(mem::size_of::<T>(), mem::align_of::<T>());

            let alloc_ptr = std::alloc::alloc(layout);
            let alloc_ptr = NonNull::new(alloc_ptr).unwrap_or_else(|| {
                std::alloc::handle_alloc_error(layout);
            });

            let value_ptr = alloc_ptr.cast::<T>();
            ptr::write(value_ptr.as_ptr(), make_value());

            let extern_data_ptr =
                alloc_ptr.cast::<u8>().as_ptr().add(footer_offset) as *mut VMExternData;
            ptr::write(
                extern_data_ptr,
                VMExternData {
                    ref_count: AtomicUsize::new(1),
                    // Cast from `*mut T` to `*mut dyn Any` here.
                    value_ptr: NonNull::new_unchecked(value_ptr.as_ptr()),
                },
            );

            log::trace!("New externref data @ {:p}", extern_data_ptr);
            VMExternRef(NonNull::new_unchecked(extern_data_ptr))
        }
    }

    /// Turn this `VMExternRef` into a raw, untyped pointer.
    ///
    /// Unlike `into_raw`, this does not consume and forget `self`. It is *not*
    /// safe to use `from_raw` on pointers returned from this method; only use
    /// `clone_from_raw`!
    ///
    /// Nor does this method increment the reference count. You must ensure
    /// that `self` (or some other clone of `self`) stays alive until
    /// `clone_from_raw` is called.
    #[inline]
    pub fn as_raw(&self) -> *mut u8 {
        let ptr = self.0.cast::<u8>().as_ptr();
        ptr
    }

    /// Consume this `VMExternRef` into a raw, untyped pointer.
    ///
    /// # Safety
    ///
    /// This method forgets `self`, so it is possible to leak the underlying
    /// reference-counted data if the pointer is never turned back into a
    /// `VMExternRef`.
    ///
    /// Use `from_raw` to recreate the `VMExternRef`.
    pub unsafe fn into_raw(self) -> *mut u8 {
        let ptr = self.0.cast::<u8>().as_ptr();
        std::mem::forget(self);
        ptr
    }

    /// Recreate a `VMExternRef` from a pointer returned from a previous call to
    /// `into_raw`.
    ///
    /// # Safety
    ///
    /// Unlike `clone_from_raw`, this does not increment the reference count of
    /// the underlying data. It is not safe to continue to use the pointer
    /// passed to this function.
    pub unsafe fn from_raw(ptr: *mut u8) -> Self {
        debug_assert!(!ptr.is_null());
        VMExternRef(NonNull::new_unchecked(ptr).cast())
    }

    /// Recreate a `VMExternRef` from a pointer returned from a previous call to
    /// `as_raw`.
    ///
    /// # Safety
    ///
    /// Wildly unsafe to use with anything other than the result of a previous
    /// `as_raw` call!
    ///
    /// Additionally, it is your responsibility to ensure that this raw
    /// `VMExternRef`'s reference count has not dropped to zero. Failure to do
    /// so will result in a use after free!
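    ///
    /// For example, given a pointer obtained from `as_raw`:
    ///
    /// ```
    /// use wasmtime_runtime::VMExternRef;
    ///
    /// let a = VMExternRef::new(42_u32);
    ///
    /// // `as_raw` borrows the reference; the count stays at 1.
    /// let raw = a.as_raw();
    /// assert_eq!(a.strong_count(), 1);
    ///
    /// // `clone_from_raw` mints a new `VMExternRef`, incrementing the count.
    /// let b = unsafe { VMExternRef::clone_from_raw(raw) };
    /// assert_eq!(a.strong_count(), 2);
    /// assert!(VMExternRef::eq(&a, &b));
    /// ```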
    pub unsafe fn clone_from_raw(ptr: *mut u8) -> Self {
        debug_assert!(!ptr.is_null());
        let x = VMExternRef(NonNull::new_unchecked(ptr).cast());
        x.extern_data().increment_ref_count();
        x
    }

    /// Get the strong reference count for this `VMExternRef`.
    ///
    /// Note that this loads with a `SeqCst` ordering to synchronize with other
    /// threads.
    pub fn strong_count(&self) -> usize {
        self.extern_data().ref_count.load(Ordering::SeqCst)
    }

    #[inline]
    fn extern_data(&self) -> &VMExternData {
        unsafe { self.0.as_ref() }
    }
}

/// Methods that would normally be trait implementations, but aren't, to avoid
/// potential footguns around `VMExternRef`'s pointer-equality semantics.
///
/// Note that none of these methods take a `self` receiver; they all require a
/// fully-qualified `VMExternRef::foo(my_ref)` invocation.
impl VMExternRef {
    /// Check whether two `VMExternRef`s point to the same inner allocation.
    ///
    /// Note that this uses pointer-equality semantics, not structural-equality
    /// semantics: only the pointers are compared, and the pointed-to values'
    /// `Eq` or `PartialEq` implementations (if any) are never invoked.
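    ///
    /// For example:
    ///
    /// ```
    /// use wasmtime_runtime::VMExternRef;
    ///
    /// let a = VMExternRef::new("hello".to_string());
    /// let b = a.clone();
    /// let c = VMExternRef::new("hello".to_string());
    ///
    /// // `a` and `b` point at the same allocation.
    /// assert!(VMExternRef::eq(&a, &b));
    ///
    /// // `a` and `c` hold equal values, but in different allocations.
    /// assert!(!VMExternRef::eq(&a, &c));
    /// ```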
    #[inline]
    pub fn eq(a: &Self, b: &Self) -> bool {
        ptr::eq(a.0.as_ptr(), b.0.as_ptr())
    }

    /// Hash a given `VMExternRef`.
    ///
    /// Note that this just hashes the pointer to the inner value; it does *not*
    /// use the inner value's `Hash` implementation (if any).
    #[inline]
    pub fn hash<H>(externref: &Self, hasher: &mut H)
    where
        H: Hasher,
    {
        ptr::hash(externref.0.as_ptr(), hasher);
    }

    /// Compare two `VMExternRef`s.
    ///
    /// Note that this uses pointer-equality semantics, not structural-equality
    /// semantics: only the pointers are compared, and the pointed-to values'
    /// `Ord` or `PartialOrd` implementations (if any) are never invoked.
    #[inline]
    pub fn cmp(a: &Self, b: &Self) -> cmp::Ordering {
        let a = a.0.as_ptr() as usize;
        let b = b.0.as_ptr() as usize;
        a.cmp(&b)
    }
}

impl Deref for VMExternRef {
    type Target = dyn Any;

    fn deref(&self) -> &dyn Any {
        unsafe { self.extern_data().value_ptr.as_ref() }
    }
}

/// A wrapper around a `VMExternRef` that implements `Eq` and `Hash` with
/// pointer semantics.
///
/// We use this so that we can put `VMExternRef`s inside of `HashSet`s, even
/// though `VMExternRef` itself deliberately does not implement `Eq` and `Hash`
/// (to avoid foot guns).
#[derive(Clone, Debug)]
struct VMExternRefWithTraits(VMExternRef);

impl Hash for VMExternRefWithTraits {
    fn hash<H>(&self, hasher: &mut H)
    where
        H: Hasher,
    {
        VMExternRef::hash(&self.0, hasher)
    }
}

impl PartialEq for VMExternRefWithTraits {
    fn eq(&self, other: &Self) -> bool {
        VMExternRef::eq(&self.0, &other.0)
    }
}

impl Eq for VMExternRefWithTraits {}

type TableElem = UnsafeCell<Option<VMExternRef>>;

/// A table that over-approximates the set of `VMExternRef`s that any Wasm
/// activation on this thread is currently using.
///
/// Under the covers, this is a simple bump allocator that allows duplicate
/// entries. Deduplication happens at GC time.
#[repr(C)] // `alloc` must be the first member; it's accessed from JIT code.
pub struct VMExternRefActivationsTable {
    /// Structures used to perform fast bump allocation of storage for externref
    /// values.
    ///
    /// This is the only member of this structure accessed from JIT code.
    alloc: VMExternRefTableAlloc,

    /// When unioned with `chunk`, this is an over-approximation of the GC roots
    /// on the stack, inside Wasm frames.
    ///
    /// This is used by slow-path insertion, and when a GC cycle finishes, is
    /// re-initialized to the just-discovered precise set of stack roots (which
    /// immediately becomes an over-approximation again as soon as Wasm runs and
    /// potentially drops references).
    over_approximated_stack_roots: HashSet<VMExternRefWithTraits>,

    /// The precise set of on-stack, inside-Wasm GC roots that we discover via
    /// walking the stack and interpreting stack maps.
    ///
    /// This is *only* used inside the `gc` function, and is empty otherwise. It
    /// is just part of this struct so that we can reuse the allocation, rather
    /// than create a new hash set every GC.
    precise_stack_roots: HashSet<VMExternRefWithTraits>,

    /// A debug-only field for asserting that we are in a region of code where
    /// GC is okay to perform.
    #[cfg(debug_assertions)]
    gc_okay: bool,
}

#[repr(C)] // This is accessed from JIT code.
struct VMExternRefTableAlloc {
    /// Bump-allocation finger within the `chunk`.
    ///
    /// NB: this is an `UnsafeCell` because it is written to by compiled Wasm
    /// code.
    next: UnsafeCell<NonNull<TableElem>>,

    /// Pointer to just after the `chunk`.
    ///
    /// This is *not* within the current chunk and therefore is not a valid
    /// place to insert a reference!
    end: NonNull<TableElem>,

    /// Bump allocation chunk that stores fast-path insertions.
    ///
    /// This is not accessed from JIT code.
    chunk: Box<[TableElem]>,
}

// This gets around the usage of `UnsafeCell` throughout the internals of this
// allocator, but the storage should all be Send/Sync and synchronization isn't
// necessary since operations require `&mut self`.
unsafe impl Send for VMExternRefTableAlloc {}
unsafe impl Sync for VMExternRefTableAlloc {}

fn _assert_send_sync() {
    fn _assert<T: Send + Sync>() {}
    _assert::<VMExternRefActivationsTable>();
    _assert::<VMExternRef>();
}

impl VMExternRefActivationsTable {
    const CHUNK_SIZE: usize = 4096 / mem::size_of::<usize>();

    /// Create a new `VMExternRefActivationsTable`.
    pub fn new() -> Self {
        // Start with an empty chunk in case this activations table isn't used.
        // This means that there's no space in the bump-allocation area, which
        // will force any path trying to use this to the slow GC path. The first
        // time this happens, though, the slow GC path will allocate a new chunk
        // for actual fast-bumping.
        let mut chunk: Box<[TableElem]> = Box::new([]);
        let next = chunk.as_mut_ptr();
        let end = unsafe { next.add(chunk.len()) };

        VMExternRefActivationsTable {
            alloc: VMExternRefTableAlloc {
                next: UnsafeCell::new(NonNull::new(next).unwrap()),
                end: NonNull::new(end).unwrap(),
                chunk,
            },
            over_approximated_stack_roots: HashSet::new(),
            precise_stack_roots: HashSet::new(),
            #[cfg(debug_assertions)]
            gc_okay: true,
        }
    }

    fn new_chunk(size: usize) -> Box<[UnsafeCell<Option<VMExternRef>>]> {
        assert!(size >= Self::CHUNK_SIZE);
        (0..size).map(|_| UnsafeCell::new(None)).collect()
    }

    /// Get the remaining capacity, in bytes, of the bump allocation chunk.
    #[inline]
    pub fn bump_capacity_remaining(&self) -> usize {
        let end = self.alloc.end.as_ptr() as usize;
        let next = unsafe { *self.alloc.next.get() };
        end - next.as_ptr() as usize
    }

    /// Try to insert a `VMExternRef` into this table.
    ///
    /// This is a fast path that only succeeds when the bump chunk has the
    /// capacity for the requested insertion.
    ///
    /// If the insertion fails, then the `VMExternRef` is given back. Callers
    /// may attempt a GC to free up space and try again, or may call
    /// `insert_without_gc` to infallibly insert the reference (potentially
    /// allocating additional space in the table to hold it).
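    ///
    /// For example (a sketch using only this module's public API):
    ///
    /// ```
    /// use wasmtime_runtime::{VMExternRef, VMExternRefActivationsTable};
    ///
    /// let mut table = VMExternRefActivationsTable::new();
    /// let r = VMExternRef::new("hello");
    ///
    /// // A fresh table starts with an empty bump chunk, so the fast path
    /// // fails and hands the reference back...
    /// assert_eq!(table.bump_capacity_remaining(), 0);
    /// let r = table.try_insert(r).unwrap_err();
    ///
    /// // ...but the GC-free slow path always succeeds.
    /// table.insert_without_gc(r);
    /// ```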
    #[inline]
    pub fn try_insert(&mut self, externref: VMExternRef) -> Result<(), VMExternRef> {
        unsafe {
            let next = *self.alloc.next.get();
            if next == self.alloc.end {
                return Err(externref);
            }

            debug_assert!(
                (*next.as_ref().get()).is_none(),
                "slots >= the `next` bump finger are always `None`"
            );
            ptr::write(next.as_ptr(), UnsafeCell::new(Some(externref)));

            let next = NonNull::new_unchecked(next.as_ptr().add(1));
            debug_assert!(next <= self.alloc.end);
            *self.alloc.next.get() = next;

            Ok(())
        }
    }

    /// Insert a reference into the table, falling back on a GC to clear up
    /// space if the table is already full.
    ///
    /// # Unsafety
    ///
    /// The same as `gc`.
    #[inline]
    pub unsafe fn insert_with_gc(
        &mut self,
        externref: VMExternRef,
        module_info_lookup: &dyn ModuleInfoLookup,
    ) {
        #[cfg(debug_assertions)]
        assert!(self.gc_okay);

        if let Err(externref) = self.try_insert(externref) {
            self.gc_and_insert_slow(externref, module_info_lookup);
        }
    }

    #[inline(never)]
    unsafe fn gc_and_insert_slow(
        &mut self,
        externref: VMExternRef,
        module_info_lookup: &dyn ModuleInfoLookup,
    ) {
        gc(module_info_lookup, self);

        // Might as well insert right into the hash set, rather than the bump
        // chunk, since we are already on a slow path and we get de-duplication
        // this way.
        self.over_approximated_stack_roots
            .insert(VMExternRefWithTraits(externref));
    }

    /// Insert a reference into the table, without ever performing GC.
    #[inline]
    pub fn insert_without_gc(&mut self, externref: VMExternRef) {
        if let Err(externref) = self.try_insert(externref) {
            self.insert_slow_without_gc(externref);
        }
    }

    #[inline(never)]
    fn insert_slow_without_gc(&mut self, externref: VMExternRef) {
        self.over_approximated_stack_roots
            .insert(VMExternRefWithTraits(externref));
    }

    fn num_filled_in_bump_chunk(&self) -> usize {
        let next = unsafe { *self.alloc.next.get() };
        let bytes_unused = (self.alloc.end.as_ptr() as usize) - (next.as_ptr() as usize);
        let slots_unused = bytes_unused / mem::size_of::<TableElem>();
        self.alloc.chunk.len().saturating_sub(slots_unused)
    }

    fn elements(&self, mut f: impl FnMut(&VMExternRef)) {
        for elem in self.over_approximated_stack_roots.iter() {
            f(&elem.0);
        }

        // The bump chunk may not be full, so only iterate over its filled-in
        // slots.
        let num_filled = self.num_filled_in_bump_chunk();
        for slot in self.alloc.chunk.iter().take(num_filled) {
            if let Some(elem) = unsafe { &*slot.get() } {
                f(elem);
            }
        }
    }

    fn insert_precise_stack_root(
        precise_stack_roots: &mut HashSet<VMExternRefWithTraits>,
        root: NonNull<VMExternData>,
    ) {
        let root = unsafe { VMExternRef::clone_from_raw(root.as_ptr().cast()) };
        log::trace!("Found externref on stack: {:p}", root);
        precise_stack_roots.insert(VMExternRefWithTraits(root));
    }

    /// Sweep the bump allocation table after we've discovered our precise stack
    /// roots.
    fn sweep(&mut self) {
        log::trace!("begin GC sweep");

        // Sweep our bump chunk.
        let num_filled = self.num_filled_in_bump_chunk();
        unsafe {
            *self.alloc.next.get() = self.alloc.end;
        }
        for slot in self.alloc.chunk.iter().take(num_filled) {
            unsafe {
                *slot.get() = None;
            }
        }
        debug_assert!(
            self.alloc
                .chunk
                .iter()
                .all(|slot| unsafe { (*slot.get()).as_ref().is_none() }),
            "after sweeping the bump chunk, all slots should be `None`"
        );

        // If this is the first GC, then the initial chunk is empty, so we
        // lazily allocate space for fast bump-allocation in the future.
        if self.alloc.chunk.is_empty() {
            self.alloc.chunk = Self::new_chunk(Self::CHUNK_SIZE);
            self.alloc.end =
                NonNull::new(unsafe { self.alloc.chunk.as_mut_ptr().add(self.alloc.chunk.len()) })
                    .unwrap();
        }

        // Reset our `next` finger to the start of the bump allocation chunk.
        unsafe {
            let next = self.alloc.chunk.as_mut_ptr();
            debug_assert!(!next.is_null());
            *self.alloc.next.get() = NonNull::new_unchecked(next);
        }

        // The current `precise_stack_roots` becomes our new over-approximated
        // set for the next GC cycle.
        mem::swap(
            &mut self.precise_stack_roots,
            &mut self.over_approximated_stack_roots,
        );

        // And finally, the new `precise_stack_roots` should be cleared and
        // remain empty until the next GC cycle.
        //
        // Note that this may run arbitrary code as we run externref
        // destructors. Because of our `&mut` borrow above on this table,
        // though, we're guaranteed that nothing will touch this table.
        self.precise_stack_roots.clear();

        log::trace!("end GC sweep");
    }

    /// Set whether it is okay to GC or not right now.
    ///
    /// This is provided as a helper for enabling various debug-only assertions
    /// and checking places where the `wasmtime-runtime` user expects there not
    /// to be any GCs.
    #[inline]
    pub fn set_gc_okay(&mut self, okay: bool) -> bool {
        #[cfg(debug_assertions)]
        {
            return std::mem::replace(&mut self.gc_okay, okay);
        }
        #[cfg(not(debug_assertions))]
        {
            let _ = okay;
            return true;
        }
    }
}

/// Used by the runtime to look up information about a module given a
/// program counter value.
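///
/// For illustration only, a minimal implementation that reports no module
/// information (a sketch, suitable only when no compiled Wasm frames can be on
/// the stack; `NoWasm` is a hypothetical embedder type) might look like:
///
/// ```ignore
/// use wasmtime_runtime::{ModuleInfo, ModuleInfoLookup};
///
/// struct NoWasm;
///
/// impl ModuleInfoLookup for NoWasm {
///     fn lookup(&self, _pc: usize) -> Option<&dyn ModuleInfo> {
///         // No compiled Wasm code, so no module covers any program counter.
///         None
///     }
/// }
/// ```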
pub trait ModuleInfoLookup {
    /// Look up the module information for a program counter value.
    fn lookup(&self, pc: usize) -> Option<&dyn ModuleInfo>;
}

/// Used by the runtime to query module information.
pub trait ModuleInfo {
    /// Look up the stack map at a program counter value.
    fn lookup_stack_map(&self, pc: usize) -> Option<&StackMap>;
}

#[derive(Debug, Default)]
struct DebugOnly<T> {
    inner: T,
}

impl<T> std::ops::Deref for DebugOnly<T> {
    type Target = T;

    fn deref(&self) -> &T {
        if cfg!(debug_assertions) {
            &self.inner
        } else {
            panic!(
                "only deref `DebugOnly` when `cfg(debug_assertions)` or \
                 inside a `debug_assert!(..)`"
            )
        }
    }
}

impl<T> std::ops::DerefMut for DebugOnly<T> {
    fn deref_mut(&mut self) -> &mut T {
        if cfg!(debug_assertions) {
            &mut self.inner
        } else {
            panic!(
                "only deref `DebugOnly` when `cfg(debug_assertions)` or \
                 inside a `debug_assert!(..)`"
            )
        }
    }
}

/// Perform garbage collection of `VMExternRef`s.
///
/// # Unsafety
///
/// You must have called `VMExternRefActivationsTable::set_stack_canary` for at
/// least the oldest host-->Wasm stack frame transition on this thread's stack
/// (it is idempotent to call it more than once) and keep its return value alive
/// across the duration of that host-->Wasm call.
///
/// Additionally, you must have registered the stack maps for every Wasm module
/// that has frames on the stack with the given `module_info_lookup`.
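///
/// For illustration, a hedged sketch of triggering a collection from host code
/// (`lookup` and `table` are hypothetical embedder-provided values implementing
/// `ModuleInfoLookup` and owning the `VMExternRefActivationsTable`,
/// respectively):
///
/// ```ignore
/// // Safety: the requirements described above are upheld by the caller.
/// unsafe {
///     gc(&lookup, &mut table);
/// }
/// ```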
pub unsafe fn gc(
    module_info_lookup: &dyn ModuleInfoLookup,
    externref_activations_table: &mut VMExternRefActivationsTable,
) {
    log::debug!("start GC");

    #[cfg(debug_assertions)]
    assert!(externref_activations_table.gc_okay);

    debug_assert!({
        // This set is only non-empty within this function. It is built up when
        // walking the stack and interpreting stack maps, and then swapped into
        // the activations table's over-approximated set at the end. Therefore,
        // it should always be empty upon entering this function.
        externref_activations_table.precise_stack_roots.is_empty()
    });

    // This function proceeds by:
    //
    // * walking the stack,
    //
    // * finding the precise set of roots inside Wasm frames via our stack maps,
    //   and
    //
    // * resetting our bump-allocated table's over-approximation to the
    //   newly-discovered precise set.

    // The `activations_table_set` is used for `debug_assert!`s checking that
    // every reference we read out from the stack via stack maps is actually in
    // the table. If that weren't true, then either we forgot to insert a
    // reference in the table when passing it into Wasm (a bug) or we are
    // reading invalid references from the stack (another bug).
    let mut activations_table_set: DebugOnly<HashSet<_>> = Default::default();
    if cfg!(debug_assertions) {
        externref_activations_table.elements(|elem| {
            activations_table_set.insert(elem.as_raw() as *mut VMExternData);
        });
    }

    log::trace!("begin GC trace");
    Backtrace::trace(|frame| {
        let pc = frame.pc();
        debug_assert!(pc != 0, "we should always get a valid PC for Wasm frames");

        let fp = frame.fp();
        debug_assert!(
            fp != 0,
            "we should always get a valid frame pointer for Wasm frames"
        );

        let module_info = module_info_lookup
            .lookup(pc)
            .expect("should have module info for Wasm frame");

        let stack_map = match module_info.lookup_stack_map(pc) {
            Some(sm) => sm,
            None => {
                log::trace!("No stack map for this Wasm frame");
                return std::ops::ControlFlow::Continue(());
            }
        };
        log::trace!(
            "We have a stack map that maps {} words in this Wasm frame",
            stack_map.mapped_words()
        );

        let sp = fp - stack_map.mapped_words() as usize * mem::size_of::<usize>();

        for i in 0..(stack_map.mapped_words() as usize) {
            // Stack maps have one bit per word in the frame, and the
            // zero^th bit is the *lowest* addressed word in the frame,
            // i.e. the closest to the SP. So to get the `i`^th word in
            // this frame, we add `i * sizeof(word)` to the SP.
            let stack_slot = sp + i * mem::size_of::<usize>();

            if !stack_map.get_bit(i) {
                log::trace!(
                    "Stack slot @ {:p} does not contain externrefs",
                    stack_slot as *const (),
                );
                continue;
            }

            let stack_slot = stack_slot as *const *mut VMExternData;
            let r = std::ptr::read(stack_slot);
            log::trace!("Stack slot @ {:p} = {:p}", stack_slot, r);

            debug_assert!(
                r.is_null() || activations_table_set.contains(&r),
                "every on-stack externref inside a Wasm frame should \
                 have an entry in the VMExternRefActivationsTable; \
                 {:?} is not in the table",
                r
            );

            if let Some(r) = NonNull::new(r) {
                VMExternRefActivationsTable::insert_precise_stack_root(
                    &mut externref_activations_table.precise_stack_roots,
                    r,
                );
            }
        }

        std::ops::ControlFlow::Continue(())
    });
    log::trace!("end GC trace");

    externref_activations_table.sweep();

    log::debug!("end GC");
}

#[cfg(test)]
mod tests {
    use super::*;
    use std::convert::TryInto;

    #[test]
    fn extern_ref_is_pointer_sized_and_aligned() {
        assert_eq!(mem::size_of::<VMExternRef>(), mem::size_of::<*mut ()>());
        assert_eq!(mem::align_of::<VMExternRef>(), mem::align_of::<*mut ()>());
        assert_eq!(
            mem::size_of::<Option<VMExternRef>>(),
            mem::size_of::<*mut ()>()
        );
        assert_eq!(
            mem::align_of::<Option<VMExternRef>>(),
            mem::align_of::<*mut ()>()
        );
    }

    #[test]
    fn ref_count_is_at_correct_offset() {
        let s = "hi";
        let s: &(dyn Any + Send + Sync) = &s as _;
        let s: *const (dyn Any + Send + Sync) = s as _;
        let s: *mut (dyn Any + Send + Sync) = s as _;

        let extern_data = VMExternData {
            ref_count: AtomicUsize::new(0),
            value_ptr: NonNull::new(s).unwrap(),
        };

        let extern_data_ptr = &extern_data as *const _;
        let ref_count_ptr = &extern_data.ref_count as *const _;

        let actual_offset = (ref_count_ptr as usize) - (extern_data_ptr as usize);

        let offsets = wasmtime_environ::VMOffsets::from(wasmtime_environ::VMOffsetsFields {
            ptr: 8,
            num_imported_functions: 0,
            num_imported_tables: 0,
            num_imported_memories: 0,
            num_imported_globals: 0,
            num_defined_tables: 0,
            num_defined_memories: 0,
            num_owned_memories: 0,
            num_defined_globals: 0,
            num_escaped_funcs: 0,
        });
        assert_eq!(
            offsets.vm_extern_data_ref_count(),
            actual_offset.try_into().unwrap(),
        );
    }

    #[test]
    fn table_next_is_at_correct_offset() {
        let table = VMExternRefActivationsTable::new();

        let table_ptr = &table as *const _;
        let next_ptr = &table.alloc.next as *const _;

        let actual_offset = (next_ptr as usize) - (table_ptr as usize);

        let offsets = wasmtime_environ::VMOffsets::from(wasmtime_environ::VMOffsetsFields {
            ptr: 8,
            num_imported_functions: 0,
            num_imported_tables: 0,
            num_imported_memories: 0,
            num_imported_globals: 0,
            num_defined_tables: 0,
            num_defined_memories: 0,
            num_owned_memories: 0,
            num_defined_globals: 0,
            num_escaped_funcs: 0,
        });
        assert_eq!(
            offsets.vm_extern_ref_activation_table_next() as usize,
            actual_offset
        );
    }

    #[test]
    fn table_end_is_at_correct_offset() {
        let table = VMExternRefActivationsTable::new();

        let table_ptr = &table as *const _;
        let end_ptr = &table.alloc.end as *const _;

        let actual_offset = (end_ptr as usize) - (table_ptr as usize);

        let offsets = wasmtime_environ::VMOffsets::from(wasmtime_environ::VMOffsetsFields {
            ptr: 8,
            num_imported_functions: 0,
            num_imported_tables: 0,
            num_imported_memories: 0,
            num_imported_globals: 0,
            num_defined_tables: 0,
            num_defined_memories: 0,
            num_owned_memories: 0,
            num_defined_globals: 0,
            num_escaped_funcs: 0,
        });
        assert_eq!(
            offsets.vm_extern_ref_activation_table_end() as usize,
            actual_offset
        );
    }
}