TransWikia.com

Simulated WeakReference class

Code Review Asked on October 27, 2021

I’ve created a fake WeakReference class for VBA by chance. A Weak Reference is not counted (i.e. IUnknown::AddRef method is not called). I knew about the Weak Reference concept from Swift and I accidentaly read about a COM Variant.

In short, I am using a Variant (ByRef) to manipulate the first 2 bytes (var type) in a second Variant in order to flip between an Object and a Long/LongLong Variant.

The advantage of this approach is that only some inital API calls are needed to set things up. When the referenced object is needed only plain VBA calls are done thus making code fast even if called millions of times. Also, the ‘Object’ property safely returns Nothing if the referenced object has been destroyed already. Finally, because the Variants used are ByRef, the Application can safely clean up memory even if state is lost.

The full code with explanation is under MIT license on GitHub at VBA-WeakReference. I’ve been asked by Greedquest to post the code here on Code Review. So, here it is:

WeakReference class:

Option Explicit

#If Mac Then
    #If VBA7 Then
        Private Declare PtrSafe Function CopyMemory Lib "/usr/lib/libc.dylib" Alias "memmove" (Destination As Any, Source As Any, ByVal Length As LongPtr) As LongPtr
    #Else
        Private Declare Function CopyMemory Lib "/usr/lib/libc.dylib" Alias "memmove" (Destination As Any, Source As Any, ByVal Length As Long) As Long
    #End If
#Else 'Windows
    'https://msdn.microsoft.com/en-us/library/mt723419(v=vs.85).aspx
    #If VBA7 Then
        Private Declare PtrSafe Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" (Destination As Any, Source As Any, ByVal Length As LongPtr)
    #Else
        Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" (Destination As Any, Source As Any, ByVal Length As Long)
    #End If
#End If

'https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-oaut/3fe7db9f-5803-4dc4-9d14-5425d3f5461f
'https://docs.microsoft.com/en-us/windows/win32/api/oaidl/ns-oaidl-variant?redirectedfrom=MSDN
'Flag used to simulate ByRef Variants in order to avoid memory reclaim
Private Const VT_BYREF As Long = &H4000 'Makes it all possible

'A memory address Long Integer
Private Type MEM_ADDRESS
    #If VBA7 Then
        ptr As LongPtr 'Defaults to LongLong on x64 or Long on x32
    #Else
        ptr As Long    'For VB6
    #End If
End Type

Private Type FAKE_REFERENCE
    remoteVarType As Variant 'Manipulates the variant type for 'reference'
    reference As Variant     'Will be holding the object reference/address
    vTable As MEM_ADDRESS    'Initial address of virtual table
    vTableByRef As Variant   'Address used to check if reference is still valid
    vbLongPtr As Long        'Data type enum (vbLong = 3 or vbLongLong = 20)
    isValid As Boolean       'Indicates if the memory reference is valid
End Type

Private m_fake As FAKE_REFERENCE

'~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
'Class Constructor
'~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Private Sub Class_Initialize()
    'Save address of the Variant that will hold the target reference/address
    m_fake.remoteVarType = VarPtr(m_fake.reference)
    '
    'Change remoteVT variant type to Integer ByRef. This will now be linked
    '   to the first 2 bytes of the Variant holding the target reference
    'Setting the VT_BYREF flag makes sure that the 2 bytes are not reclaimed
    '   twice when both 'remoteVarType' and 'reference' go out of scope
    'And most importantly this gives the ability to switch the variant type of
    '   the reference at will, just by changing the Integer value of remoteVT
    CopyMemory ByVal VarPtr(m_fake.remoteVarType), vbInteger + VT_BYREF, 2
    '
    'Store the data type enum for mem addresses (vbLong = 3 or vbLongLong = 20)
    m_fake.vbLongPtr = VBA.VarType(ObjPtr(Nothing))
End Sub

'*******************************************************************************
'Sets the weak/fake reference to an object
'*******************************************************************************
Public Property Let Object(obj As Object)
    'Save memory address of the object
    m_fake.reference = ObjPtr(obj)
    '
    m_fake.isValid = (m_fake.reference <> 0)
    If Not m_fake.isValid Then Exit Property
    '
    'Save the default interface's virtual table address by reference. The vTable
    '   address is found at the first 4 (x32) or 8 (x64) bytes at the referenced
    '   interface address
    m_fake.vTableByRef = m_fake.reference
    CopyMemory ByVal VarPtr(m_fake.vTableByRef), m_fake.vbLongPtr + VT_BYREF, 2
    '
    'Save the current vTable address. This is needed later to compare with the
    '   vTableByRef address in order to establish if the Object has been
    '   destroyed and it's memory reclaimed.
    'vTableByRef can still be read within the scope of this method
    m_fake.vTable.ptr = m_fake.vTableByRef
End Property

'*******************************************************************************
'Safely retrieves the object that the saved reference is pointing to
'No external API calls are needed!
'*******************************************************************************
Public Property Get Object() As Object
    If Not m_fake.isValid Then Exit Property
    '
    'Compare the current vTable address value with the initial address
    'The current redirected value vTableByRef can NOT be read directly anymore
    '   so it must be passed ByRef to an utility function
    m_fake.isValid = (GetRemoteAddress(m_fake.vTableByRef).ptr = m_fake.vTable.ptr)
    '
    If m_fake.isValid Then
        'Address is still valid. Retrive the object
        'Turn the reference into an object (needs to be done ByRef)
        VarType(m_fake.remoteVarType) = vbObject
        Set Object = m_fake.reference
    End If
    '
    'The fake object is not counted (reference count was never incremented by
    '   calling the IUnknown::AddRef method) so a crash will occur if the
    '   Variant type remains as vbObject, because when the Variant goes out
    '   of scope the object count is decremented one more time than it should
    'Meanwhile, as Integer, the Variant can safely go out of scope anytime
    VarType(m_fake.remoteVarType) = m_fake.vbLongPtr 'vbLong or vbLongLong
End Property

'*******************************************************************************
'Utility. Changes the data type for the reference Variant while preserving the
'   level of redirection of remoteVarType
'*******************************************************************************
Private Property Let VarType(ByRef v As Variant, newType As Integer)
    v = newType
End Property

'*******************************************************************************
'Returns the value of a Variant that has the VT_BYREF flag set
'*******************************************************************************
Private Function GetRemoteAddress(ByRef memAddress As Variant) As MEM_ADDRESS
    GetRemoteAddress.ptr = memAddress
End Function

Can this code be better?


A quick demo showing how a reference cycle can be avoided:

DemoParent class:

Option Explicit

Private m_child As DemoChild

Public Property Let Child(ch As DemoChild)
    Set m_child = ch
End Property
Public Property Get Child() As DemoChild
    Set Child = m_child
End Property

Private Sub Class_Terminate()
    Set m_child = Nothing
    Debug.Print "Parent terminated " & Now
End Sub

And a DemoChild class:

Option Explicit

Private m_parent As WeakReference

Public Property Let Parent(newParent As DemoParent)
    Set m_parent = New WeakReference
    m_parent.Object = newParent
End Property
Public Property Get Parent() As DemoParent
    Set Parent = m_parent.Object
End Property

Private Sub Class_Terminate()
    Debug.Print "Child terminated " & Now
End Sub

Here’s a demo:

Sub DemoTerminateParentFirst()
    Dim c As New DemoChild
    Dim p As New DemoParent
    '
    p.Child = c
    c.Parent = p
    '
    Debug.Print TypeName(p.Child.Parent)
End Sub

And another demo:

Sub DemoTerminateChildFirst()
    Dim c As New DemoChild
    Dim p As New DemoParent
    '
    p.Child = c
    c.Parent = p
    Set c = Nothing
    '
    Debug.Print TypeName(p.Child.Parent)
End Sub

Both Parent and Child Class_Terminate events are firing properly.

Updated version and more demos are available at the GitHub repository VBA-WeakReference.

3 Answers

The improvements found in this answer were triggered by the great answer that @Greedo provided on this question. Many thanks for his effort and apologies I took so long to act on his suggestions.


VTable check
By far, the most important aspect touched in the above-mentioned answer is that the VTable check is not sufficient to cover all cases (refer to point 3) and could lead to crashes or worse - pointing to a wrong object. The most likely case is when an instance of an object targeted by a WeakReference is terminated and the same memory address is overwritten with another different instance of the same class. It is very easy to produce:

Sub VTableCheckProblem()
    Dim c As Class1
    Dim w As New WeakReference
    
    Set c = New Class1
    c.x = 1
    Set w.Object = c
    
    Debug.Print w.Object.x 'Prints 1 (correct)
    Set c = Nothing
    Set c = New Class1
    Debug.Print w.Object.x 'Prints 0 (wrong - w.Object should return Nothing)
End Sub

The suggested improvement:

... it may be worth investing in an additional interface for parent classes that exposes something like a GUID, so that once you successfully dereference the Object, you cast it to an IGUID interface and check it has a GUID that matches what you expect, if so then return the parent Object ...

works very nicely (tested) but only if an actual object resides at the referenced address. However, if this happens:

If there parent instance has been overwritten with an invalid object, but it so happens that the first 4/8 bytes of the memory have been reused to store a string of binary that coincidentally matches the VTable pointer exactly, then once again your class will not catch this

Indeed, this would crash the whole Application (tested).

If not using an interface, then the issue stated here (same author) also crashes the Application.

Solution
Force the implementation of an interface IWeakable by changing code (inside WeakReference class) from:

Public Property Let Object(obj As Object)

to:

Public Property Set Object(obj As IWeakable) 

and then to somehow inform all the weak references pointing to the IWeakable object that the object has terminated (from that object's Class_Terminate or in another way).
Note that Let has changed to Set thanks to the answer provided by @MathieuGuindon

In order for the referenced object to inform the weak references about termination, it needs to be aware of all weak references pointing to it.

Here is the IWeakable interface:

Option Explicit

Public Sub AddWeakRef(wRef As WeakReference)
End Sub

and the modified property:

Public Property Set Object(obj As IWeakable)
    m_fake.reference = ObjPtr(GetDefaultInterface(obj))
    If m_fake.reference = 0 Then Exit Property
    '
    obj.AddWeakRef Me
End Property

inside the improved WeakReference class:

Option Explicit

#If Mac Then
    #If VBA7 Then
        Private Declare PtrSafe Function CopyMemory Lib "/usr/lib/libc.dylib" Alias "memmove" (Destination As Any, Source As Any, ByVal Length As LongPtr) As LongPtr
    #Else
        Private Declare Function CopyMemory Lib "/usr/lib/libc.dylib" Alias "memmove" (Destination As Any, Source As Any, ByVal Length As Long) As Long
    #End If
#Else 'Windows
    'https://msdn.microsoft.com/en-us/library/mt723419(v=vs.85).aspx
    #If VBA7 Then
        Private Declare PtrSafe Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" (Destination As Any, Source As Any, ByVal Length As LongPtr)
    #Else
        Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" (Destination As Any, Source As Any, ByVal Length As Long)
    #End If
#End If

'https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-oaut/3fe7db9f-5803-4dc4-9d14-5425d3f5461f
'https://docs.microsoft.com/en-us/windows/win32/api/oaidl/ns-oaidl-variant?redirectedfrom=MSDN
'Flag used to simulate ByRef Variants
Private Const VT_BYREF As Long = &H4000

Private Type FAKE_REFERENCE
    remoteVarType As Variant  'Manipulates the variant type for 'reference'
    reference As Variant      'Will be holding the object reference/address
End Type

#If Win64 Then
    #If Mac Then
        Const vbLongLong As Long = 20 'Apparently missing for x64 on Mac
    #End If
    Const vbLongPtr As Long = vbLongLong
#Else
    Const vbLongPtr As Long = vbLong
#End If

Private m_fake As FAKE_REFERENCE

'~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
'Class Constructor
'~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Private Sub Class_Initialize()
    'Save address of the Variant that will hold the target reference/address
    m_fake.remoteVarType = VarPtr(m_fake.reference)
    '
    'Change remoteVT variant type to Integer ByRef. This will now be linked
    '   to the first 2 bytes of the Variant holding the target reference
    'Setting the VT_BYREF flag makes sure that the 2 bytes are not reclaimed
    '   twice when both 'remoteVarType' and 'reference' go out of scope
    'And most importantly this gives the ability to switch the variant type of
    '   the reference at will, just by changing the Integer value of remoteVT
    CopyMemory m_fake.remoteVarType, vbInteger + VT_BYREF, 2
End Sub

'*******************************************************************************
'Saves the memory address to an object's default interface (not to IWeakable)
'*******************************************************************************
Public Property Set Object(obj As IWeakable)
    m_fake.reference = ObjPtr(GetDefaultInterface(obj))
    If m_fake.reference = 0 Then Exit Property
    '
    obj.AddWeakRef Me
End Property

'*******************************************************************************
'Returns the default interface for an object
'Casting from IUnknown to IDispatch (Object) forces a call to QueryInterface for
'   the IDispatch interface (which knows about the default interface)
'*******************************************************************************
Private Function GetDefaultInterface(obj As IUnknown) As Object
    Set GetDefaultInterface = obj
End Function

'*******************************************************************************
'Retrieves the object pointed by the saved reference
'No external API calls are needed!
'*******************************************************************************
Public Property Get Object() As Object
    If m_fake.reference = 0 Then Exit Property
    '
    Set Object = DeReferenceByVarType(m_fake.remoteVarType)
End Property

'*******************************************************************************
'Utility function needed to redirect remoteVT - See Class_Initialize comments
'*******************************************************************************
Private Function DeReferenceByVarType(ByRef remoteVT As Variant) As Object
    remoteVT = vbObject
    Set DeReferenceByVarType = m_fake.reference
    remoteVT = vbLongPtr
End Function

'*******************************************************************************
'Needs to be called when the referenced object is terminated
'*******************************************************************************
#If VBA7 Then
Public Sub ObjectTerminated(refAddress As LongPtr)
#Else
Public Sub ObjectTerminated(refAddress As Long)
#End If
    If m_fake.reference = refAddress Then m_fake.reference = 0
End Sub

'*******************************************************************************
'Returns the referenced memory address
'*******************************************************************************
#If VBA7 Then
Public Function ReferencedAddress() As LongPtr
#Else
Public Function ReferencedAddress() As Long
#End If
    ReferencedAddress = m_fake.reference
End Function

All there is left to do is to inform the weak reference objects about the termination of the object they are targeting.
Unfortunately, the Class_Terminate event is not part of the interface so it cannot be forced to do anything.

Because too much boilerplate code would need to be added to all classes implementing IWeakable it is probably best to encapsulate all the logic inside a separate class called WeakRefInformer:

'*******************************************************************************
'' When terminated, informs all stored WeakReference objects about termination
'*******************************************************************************

Option Explicit

Private m_refs As Collection
#If VBA7 Then
    Private m_reference As LongPtr
#Else
    Private m_reference As Long
#End If

Public Sub AddWeakRef(wRef As WeakReference, obj As IWeakable)
    'Store the address for the object implementing IWeakable
    'When Class_Terminate is triggered, this will be passed to each
    '   WeakReference object in case the WeakReference will be set to point
    '   to a different target (in-between this call and the termination call)
    If m_reference = 0 Then m_reference = ObjPtr(GetDefaultInterface(obj))
    '
    If wRef.ReferencedAddress = m_reference Then m_refs.Add wRef
End Sub

Private Function GetDefaultInterface(obj As IUnknown) As Object
    Set GetDefaultInterface = obj
End Function

Private Sub Class_Initialize()
    Set m_refs = New Collection
End Sub

Private Sub Class_Terminate()
    Dim wRef As WeakReference
    '
    For Each wRef In m_refs
        wRef.ObjectTerminated m_reference
    Next wRef
    Set m_refs = Nothing
End Sub

and the only code needed in any class implementing IWeakable would be:

Implements IWeakable

Private Sub IWeakable_AddWeakRef(wRef As WeakReference)
    Static informer As New WeakRefInformer
    informer.AddWeakRef wRef, Me
End Sub

The main idea is that by not exposing the contained WeakRefInformer object, it will surely go out of scope when the object implementing IWeakable is terminated.

A quick visual example. Consider a "parent" object containing 2 "child" objects pointing back through weak references and a 3rd "loose" weak reference. This would look like:
Reference diagram

Finally, a check is made inside the ObjectTerminated method of the WeakReference class to be sure the current referenced object has terminated (and not a previously referenced object).

Demo

Class1 class:

Option Explicit

Implements IWeakable

Public x As Long

Private Sub IWeakable_AddWeakRef(wRef As WeakReference)
    Static informer As New WeakRefInformer
    informer.AddWeakRef wRef, Me
End Sub

And the test:

Sub TestWeakReference()
    Dim c As Class1
    Dim w1 As New WeakReference
    Dim w2 As New WeakReference
    Dim w3 As New WeakReference
    '
    Set c = New Class1
    c.x = 1
    '
    Set w1.Object = c
    Set w2.Object = c
    Set w3.Object = c
    
    Debug.Print w1.Object.x         'Prints 1 (correct)
    Debug.Print w2.Object.x         'Prints 1 (correct)
    Debug.Print w3.Object.x         'Prints 1 (correct)
    Debug.Print TypeName(w1.Object) 'Prints Class1 (correct)
    Debug.Print TypeName(w2.Object) 'Prints Class1 (correct)
    Debug.Print TypeName(w3.Object) 'Prints Class1 (correct)
    '
    Dim temp As Class1
    Set temp = New Class1
    Set w3.Object = temp
    temp.x = 2
    '
    Set c = Nothing 'Note this only resets w1 and w2 (not w3)
    Set c = New Class1
    c.x = 3
    '
    Debug.Print TypeName(w1.Object) 'Prints Nothing (correct)
    Debug.Print TypeName(w2.Object) 'Prints Nothing (correct)
    Debug.Print TypeName(w3.Object) 'Prints Class1 (correct)
    Debug.Print w3.Object.x         'Prints 2 (correct)
End Sub

The rest of this answer is focused on all the other improvements suggested in the same mentioned answer.


Let/Set Performance

You use modified ByRef variants to do the memory manipulation in the performance critical area*
...
*well, if you don't count the Let procedure as performance critical, which it probably isn't in the typical use case. It's called once at the Child's birth, while the Get is potentially called many times in the Child's lifetime. However best not to make assumptions on how users will interact with your code, especially something as fundamental as this

There is no need to do the vTable check since the weak reference is informed about the termination so Let (now Set) does not have any API calls anymore (so it is fast). There is no need for the GetRemoteAddress method as well.

Speed comparison

Here I've run an integer dereference (read 2 bytes from one variable and write to another) many times (x axis) and calculated the average time per call (y axis) for the standard, ByRef and GetMem2 techniques, and the latter comes out on top.

I've decided to test this on the 2 Windows computers I have. On my third machine, a Mac, the msvbvm60 library is missing.

Machine 1 (M1) configuration:
Intel Xeon CPU E5-2699A v4 @ 2.40GHz, 6.00GB RAM, 64-bit Operating System
Excel version 1902 (Build 11328.20420 Click-to-run)
VBA x32

Machine 2 (M2) configuration:
Intel Core i7-9750H CPU @ 2.60GHz, 16.00GB RAM, 64-bit Operating System
Excel version 2007 (Build 13029.20344 Click-to-run)
VBA x64

I've tested method:

Set Object = DeReferenceByVarType(m_fake.remoteVarType)

for ByRef and:

SetVariantType m_fake.reference, vbObject
Set Object = m_fake.reference
SetVariantType m_fake.reference, vbLongPtr

for PutMem2

inside a loop directly in the Public Property Get Object() As Object property using CTimer. CTimer seems to be consistent with the VBA Timer function for the longer runs (where the latter has enough resolution).

On Machine 1 I got:
M1 chart
which seems to be off by a factor of 10 from what the other answer showed for the ByRef approach and way off (much slower) for the PutMem2 approach.

On machine 2 I got:
M2 prompt

Since that is not really helpful, I compared the ByRef approach between M1 and M2:
enter image description here
which seems to be consistent.

Considering that the msvbvm60.dll library is only present on some Windows machines and that the speed is quite different from machine to machine (looking at this answer and the mentioned answer), the ByRef approach seems to be the correct choice. Readability has been improved slightly by wrapping the calls into the DeReferenceByVarType function.

Misc 1

This should not be linked to the class instance, it should be defined with conditional compilation

#If Win64 Then
    Const vbLongPtr As Long = vbLongLong
#Else
    Const vbLongLong As Long = 20
    Const vbLongPtr As Long = vbLong
#End If 

True, with the added note that on Mac the vbLongLong is missing for x64:

#If Win64 Then
    #If Mac Then
        Const vbLongLong As Long = 20 'Apparently missing for x64 on Mac
    #End If
    Const vbLongPtr As Long = vbLongLong
#Else
    Const vbLongPtr As Long = vbLong
#End If

Misc 2

ByVal VarPtr(blah) is like telling the function that the argument it is receiving "has a value equal to the pointer to blah" rather than "is the pointer to blah". No difference

Absolutely. I've only noticed this when reading the answer. It was a leftover from previous testing code where the assignment happened in the 9th byte of the code with:

CopyMemory ByVal VarPtr(m_fake.vTableByRef) + 8, m_fake.reference, 8 'Or 4 on x32

which obviously is just a mere:

m_fake.vTableByRef = m_fake.reference

It got propagated through copy-paste. Nice attention to detail by @Greedo


As stated in the question, the full code with explanation is under MIT license on GitHub at VBA-WeakReference.

Many thanks to @Greedo and @MathieuGuindon for their contribution!

Answered by Cristian Buse on October 27, 2021

Sorry for taking so long with this review despite being the one to prompt you to post your code here, but I hope you (and others) may still find it useful.

Now, although I do have a number of more general points to talk about, as Matt says I think it'd be nice to dive into the "meat" of your code and dissect the approach you've taken, then hopefully include some of the general points along the way.


Let's remind ourselves first of the "standard approach", as you referred to it in the GitHub repository, which is more or less the approach Matt uses in the question I linked (minus a nice constructor and some other OOP stuff), and looks something like this:

Property Get ObjectFromPtr() As Object
    Dim result As Object
    CopyMemory result, ByVal this.ptr, LenB(ptr) 'De-reference cached ptr into temp object
    Set ObjectFromPtr = result                   'IUnknown::AddRef once for the return value
    ZeroMemory result, LenB(ptr)                 'Manually clear the temp reference so IUnknown::Release isn't called when it goes out of scope
End Property

For the sake of having something to compare to, what's good and bad about this code?

Pros:

  • Quite simple technique; only requires basic knowledge of pointers and reference types
  • Short clear code
  • Minimal number of API calls
  • Small instance memory footprint (only 1 cached LongPtr)

Cons:

  • Fairly slow API used for most performance critical part
  • If parent has been nulled and cached pointer references a bit of memory that no longer represents a real object instance, Excel will likely crash when the returned Object is inspected
  • If parent has been nulled, but the memory has been overwritten with a valid but different object instance, then this approach will appear to succeed, yet return an incorrect/unintended object, since Object is effectively weakly typed in VBA

So how does your approach differ (ideally maintaining or adding to the pros while reducing the cons)? I see 3 key areas where your underlying approach is different:

  1. You use modified ByRef variants to do the memory manipulation in the performance critical area*
  2. Rather than creating a temporary Object instance and filling it with the Object pointer, you toggle the VarType flag of a Variant to create the temporary Object
  3. You partially circumvent the weak typing of the Object return type by caching the parent's VTable pointer in the Let method and then manually checking it still matches the referenced object instance every time Get is called

Let's take a look at these 3 differences in turn to see what they bring to the implementation as a whole

*well, if you don't count the Let procedure as performance critical, which it probably isn't in the typical use case. It's called once at the Child's birth, while the Get is potentially called many times in the Child's lifetime. However best not to make assumptions on how users will interact with your code, especially something as fundamental as this

1) ByRef Variants for moving memory

You set up these "remote variables" by manually modifying the VarType of a Variant:

CopyMemory ByVal VarPtr(m_fake.vTableByRef), m_fake.vbLongPtr + VT_BYREF, 2

I haven't seen this before, impressive to come up with a totally new approach, well done! At a glance it seems to offer a number of benefits:

  • Make use of super-fast native VBA code to do the pointer dereference + memory overwrite for you instead of an API call
  • Simplify call sites by interacting with native VBA variants
  • Avoid the VBA interpreter trying to reclaim the same bit of memory twice by using ByRef

However there are some issues with all of these arguments...


To begin with, I'm not sure reclaiming memory was ever really a concern; value types aren't reference counted so there was never any risk of double reclaiming. The real risk to watch out for is where the variable that owns the memory goes out of scope before the remote variable does. This leaves the remote variable pointing to a section of memory that has been reclaimed.

In the case of reading memory like with your vTableByRef, it's sufficient to know that the value it reads could be anything. However when you set up a variable to write memory, then you have to be very careful you don't end up corrupting memory you don't own. This isn't too much of a risk for your code, since reference and remoteVarType are in the same scope, however if the Child_Terminate code runs after the parent's, and the child attempts to access its parent's reference at this point, then in some circumstances I'll discuss later the remoteVarType will allow writing to an un-owned bit of memory, which is, needless to say, a bad thing!

So accessing memory with remote variables doesn't do much to protect you compared to an API call.


Secondly, does using ByRef variants really help to simplify call sites compared to an API?

'*******************************************************************************
'Utility. Changes the data type for the reference Variant while preserving the
'   level of redirection of remoteVarType
'*******************************************************************************
Private Property Let VarType(ByRef v As Variant, newType As Integer)
    v = newType
End Property

'*******************************************************************************
'Returns the value of a Variant that has the VT_BYREF flag set
'*******************************************************************************
Private Function GetRemoteAddress(ByRef memAddress As Variant) As MEM_ADDRESS
    GetRemoteAddress.ptr = memAddress
End Function

The fact that you need these 2 methods to interact with the remote variables is itself a warning sign. It would be great if you could simplify your calling sites to this:

m_fake.isValid = (m_fake.vTableByRef = m_fake.vTable) 'check live value against the cache

Or

m_fake.remoteVarType = vbObject                'toggle the VarType flag

... which would be a big improvement over accessing the memory the old way:

CopyMemory m_fake.remoteVarType, vbObject, 2   'much less clear

But in fact the call sites are not nearly that clear:

VarType(m_fake.remoteVarType) = vbObject
m_fake.isValid = (GetRemoteAddress(m_fake.vTableByRef).ptr = m_fake.vTable.ptr)

VarType and GetRemoteAddress indicate that storing Variants ByRef beyond their typical function argument scope is not something VBA is happy about, hence the additional redirection required to get around VBA's complaints.


Final point regarding these remote variables is performance. Low level APIs are always risky and VBA's complaints haven't stopped me in the past, so maybe this technique's speed will make it worthwhile? While it's true that native is native, Variant is not Integer, and using variants for dereferencing brings overhead as they are essentially dynamically sized variables. Since we know dynamic sizing isn't something to worry about (the memory these remote variables work with is fixed in size), it is more efficient to move memory around in pre-defined chunks. Fortunately the VB6 (msvbvm60.dll) runtime exposes a family of undocumented methods to do just that, let's compare everything for speed:

Comparison of integer dereferencing techniques

Here I've run an integer dereference (read 2 bytes from one variable and write to another) many times (x axis) and calculated the average time per call (y axis) for the standard, ByRef and GetMem2 techniques, and the latter comes out on top.

All things considered, the remote variable technique you use doesn't in fact improve readability, safety or performance, and requires additional knowledge of COM Variants that means people looking at your code for the first time (myself included) may need a couple of takes to understand what's going on - ultimately hindering maintainability and accessibility of the approach. So should you scrap the remote variables? Well there is still one important advantage over the faster Get/PutMem functions which is that I can't seem to find any examples of using them on Mac! I'm fairly certain it has to be possible since they should ship with VBA, but I haven't found them in Office's VBE7.dll, only Windows' msvbvm60.dll so I'm not so sure. So maybe on Mac you could fall-back to ByRef Variants as they still outperform MoveMem, but if anyone has better suggestions do drop them in the comments.

2) Object References

So while the standard code has this for creating an object from a pointer

CopyMemory result, ByVal ptr, LenB(ptr)
Set ObjectFromPtr = result
ZeroMemory result, LenB(ptr)

Yours has

VarType(m_fake.remoteVarType) = vbObject
Set Object = m_fake.reference
VarType(m_fake.remoteVarType) = m_fake.vbLongPtr

I think the only drawback of your approach over the standard (ignoring the dereferencing technique discussed above) is the conceptual one; the standard method requires understanding of Object pointers, the method you've used also requires additional knowledge of COM Variants, so is just a slightly steeper learning curve. In terms of performance, both have 2 dereferencing steps and one native Set call, so probably nothing in it (although you could time it to see if copying 2 bytes is faster than 4). Some better naming might help with the conceptual difficulties:

Private Declare PtrSafe Sub SetVariantType Lib "msvbvm60" Alias "PutMem2" (ByRef target As Variant, ByVal varTypeFlag As Integer)

'Toggle the varType flag on the variant to create a temporary, non reference-counted Object
SetVariantType m_fake.reference, vbObject
Set Object = m_fake.reference
SetVariantType m_fake.reference, vbLongPtr

Renaming imports introduces a simple layer of abstraction that clarifies the intent of the code, reducing the need for comments (in fact, you could even declare varTypeFlag As VbVarType - a Long with intellisense, since Longs are stored little-endian in VBA so the first 2 bytes at the pointer to a Long are the same as an Integer with the same decimal value). It also allows for type checking of parameters which is nice.

3) VTable check

Finally we come to it, what I think is the most innovative part of your code. As I mentioned at the start of this post, one of the downsides of the standard approach is that if the parent instance goes out of scope, and its memory is overwritten, then 2 things can happen:

  • It can be overwritten with a valid object instance, perhaps even a separate instance of the same Class as the parent! That's really bad and will lead to a successful dereference but undefined behaviour and nasty hard to diagnose bugs.
  • More likely (purely by probability) the memory will be re-allocated to an invalid object instance (i.e. something that's not an object, or maybe a load of zeros). This will likely lead to a crash - which seems nasty for the developer but is actually the best course of action when dealing with a bad pointer - at least you know something's definitely wrong.

Your approach vastly reduces the number of headaches for developers by eliminating most of the false positives, so really well done for that. There are still a few exceptions I can see:

  • If the parent instance is overwritten with another, different instance of the same class, it will have the same VTable so your check will not catch this. I don't know how likely this is to happen, but it may be worth investing in an additional interface for parent classes that exposes something like a GUID, so that once you successfully dereference the Object, you cast it to an IGUID interface and check it has a GUID that matches what you expect, if so then return the parent Object. This will bring false positives from this mode of failure down to zero (or as good as)
  • If the parent instance has been overwritten with an invalid object, but it so happens that the first 4/8 bytes of the memory have been reused to store a string of binary that coincidentally matches the VTable pointer exactly, then once again your class will not catch this. What's worse is that rather than crashing, everything will plow on but with random data you don't own populating an imaginary instance of the parent class! Who knows what will happen... *

What I'm trying to say is that the behavior of pointer dereferencing once the thing they point to has gone out of scope is undefined, so while returning Nothing is very nice for a developer and cuts down many of the false positives, it doesn't mean that the Something can be trusted any more than before, they will still need to perform other checks or employ other measures like carefully handling scope to ensure bugs don't creep in.

*Well... IUnknown::AddRef will attempt to increase the class' reference count, incrementing the random chunk of memory at ObjPtr + 4. You may then cast to the IGUID interface, incrementing some other memory - which might actually succeed because the IUnknown methods are the real ones from a valid VTable and don't know the instance data isn't from a real object. If you get this far then it should be obvious the GUIDs don't match, but then what? It's possible you'll be able to undo all of the effects if you do work out that the instance you started with doesn't match the one you currently have, but it most likely assumes a lot about the layout of classes in memory which may be true generally, but what if the parent class isn't user defined, but a COM object from another library?

D) Misc

m_fake.vbLongPtr

This should not be linked to the class instance, it should be defined with conditional compilation

#If Win64 Then
    Const vbLongPtr As Long = vbLongLong
#Else
    Const vbLongLong As Long = 20
    Const vbLongPtr As Long = vbLong
#End If

Or if you don't trust #Win64 and prefer to keep VBA.VarType(ObjPtr(Nothing)) then put it in a standard module or a static class instance perhaps


CopyMemory ByVal VarPtr(m_fake.remoteVarType), vbInteger + VT_BYREF, 2

should be

CopyMemory m_fake.remoteVarType, vbInteger + VT_BYREF, 2

ByVal VarPtr(blah) is like telling the function that the argument it is receiving "has a value equal to the pointer to blah" rather than "is the pointer to blah". No difference


vTable As MEM_ADDRESS

I'd probably rename to cachedVTablePointer and get rid of MEM_ADDRESS altogether, just put the conditional compilation inside the FAKE_REFERENCE type

Also you could simplify the vtable check potentially. Right now you dereference the original objptr and cache it. You then have the remote variable which essentially dereferences the objptr again, live, to see if the vtable pointer is still there. Finally you compare these two dereferenced variables for equality. You could instead check for equality in place without any explicit dereferencing using RtlCompareMemory which can be imported from kernel32 or ntdll e.g.

Private Declare Function EqualMemory Lib "ntdll" Alias "RtlCompareMemory" (Destination As Any, Source As Any, ByVal Length As LongPtr) As LongPtr

might be faster, maybe a bit clearer


Overall, nice job, I've enjoyed reading through it and thinking about it

Answered by Greedo on October 27, 2021

Looks great overall, although I'm really not a fan of banner comments (some of them would do well as @Description annotations), but I like that the commenting is very extensive. Good job!

This is dangerous though:

'*******************************************************************************
'Sets the weak/fake reference to an object
'*******************************************************************************
Public Property Let Object(obj As Object)

It breaks a very well-established convention where object references are assigned using the Set keyword. By defining the property as a Property Let member, the consuming code has all rights to consider this legal:

Set weakRef.Object = someObject

But they'll be met with a confusing "invalid use of property" compile-time error.

Public Property Let Parent(newParent As DemoParent)
    Set m_parent = New WeakReference
    m_parent.Object = newParent
End Property

That should read:

Public Property Set Parent(ByVal newParent As DemoParent)
    Set m_parent = New WeakReference
    Set m_parent.Object = newParent
End Property

(note: Property Let/Set RHS argument is always passed ByVal; the implicit default being ByRef everywhere else, it's a good idea to make it explicit here)

Why? Because depending on how the newParent object is defined, this code might not do what you think it does:

    m_parent.Object = newParent

Indeed, classes in VBA can have hidden member attributes. If you have Rubberduck, you can do this:

'@DefaultMember
Public Property Get Something() As Long
    Something = 42
End Property

And when you synchronize the Rubberduck annotations (via inspection results), the member would look like this if you exported the module:

'@DefaultMember
Public Property Get Something() As Long
Attribute Something.VB_UserMemId = 0
    Something = 42
End Property

If that's what the DemoParent class does, then this:

    m_parent.Object = newParent

Is implicitly doing this, through a mechanism known as let coercion, where an object can be coerced into a value:

    Let m_parent.Object = newParent.Something

That makes WeakReference not work with most classes that define a default member/property. Granted, most classes should not define such a member (implicit code is code that says one thing and does another: avoid it!), but it wouldn't be uncommon to see it adorn a custom collection class' Item property - if each item in that custom data structure has a reference to its containing collection, then the error would be complaining about an argument (to a method we don't intend to invoke, and whose name won't appear in the error message) not being optional...

VBA uses the Set keyword specifically to disambiguate this assignment scenario:

    [Let] m_parent.Object = newParent 'ambiguous, RHS could be let-coerced
    Set m_parent.Object = newParent 'unambiguous

The Let keyword is redundant and can safely be omitted, but not the Set keyword.

The keyword is not needed in later versions of Visual Basic, because in these versions, the compiler will refuse to allow the definition of a parameterless default member: the possible presence of a parameterless default member on a VBA class is why the Set keyword is required: skirting around it introduces unexpected implicit behavior that can be very hard to diagnose and/or debug.

Answered by Mathieu Guindon on October 27, 2021

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP