TransWikia.com

A VBA Product Dictionary with a Collection of Product Services

Code Review Asked on November 11, 2021

I need to extract unique Product Group names along with its corresponding services from a table in a worksheet. The table is generated by a bot and is not filtered, I have sorted it by alphabetical order. The data is not fixed and can contain anywhere from 5 – 100 rows of data, depending on the month which the report from the bot is generated.

I decided to use a Dictionary to store the the Product Group Name as they Key, while using a Collection to store services. The Collection only stores unique services by using On Error Resume Next

What changes could I make to my code?

Snippet of my Table

enter image description here

Code

Public Sub BuildTMProductDictionary()

Dim tmData As Variant
tmData = Sheet1.ListObjects("Table1").DataBodyRange.Value

Dim i As Long
For i = LBound(tmData, 1) To UBound(tmData, 1)
    Dim product As String
    product = tmData(i, 1)
    
    'store unique services in a collection, On Error Resume Next used to avoid duplicates
    On Error Resume Next
    Dim services As New Collection
    services.Add (tmData(i, 2)), (tmData(i, 2))
    
    'get the product name of the next row
    Dim nextProduct As String
    nextProduct = tmData(i + 1, 2)
    
    'compare the current product against the next product create New Dictionary if <>
    If product <> nextProduct Then
        Dim productGroup As New Dictionary
        productGroup.Add product, services
        Set services = New Collection
    End If
Next
End Sub

Edit
My Collection of services needs to be unique. As an example "Positive Pay" which belong to the "ARP" product group should only appear once in the collection.

2 Answers

I have sorted it by alphabetical order

A year from now are you going to remember that the data is supposed to be presorted? Adding a comment notating it would be helpful. Better yet would be suffix it to the routines name:

Public Sub BuildTMProductDictionaryFromSortedTable()

The best approach is not to rely on the data being sorted in the first place. The reason we use dictionaries in the first place is for lightning fast lookups and the ability to check if a key exists. Simply, store a new collection each time you create a key in the dictionary and use the key to retrieve the collection as needed.

    If Not productGroup.Exists(product) Then productGroup.Add product, New Collection
    On Error Resume Next
    productGroup(product).Add tmData(i, 2)
    On Error GoTo 0

It is best to limit the scope of On Error Resume Next as much as possible by using On Error GoTo 0. The tighter the scope the better chance we will find the errors while debugging.

Public Sub BuildTMProductDictionary()

So you have a sub routine that builds the compiles the data just the way you want it. Excellent! Now what? You could, of course, add some more functionality to the method but that isn't what you should be doing. Ideally, every routine should do as few things as possible and do them flawlessly in a very easy to read manor.

It would be better to change BuildTMProductDictionary() from a sub routine to a function and have it return the data.

Something like this:

Public Function GetTMProductDictionary()
    Const productCol As Long = 1, serviceCol As Long = 1
    
    Dim Data As Variant
    Data = Sheet1.ListObjects("Table1").DataBodyRange.Value

    Dim productGroup As New Dictionary
    Dim i As Long
    
    For i = LBound(Data, 1) To UBound(Data, 1)
        If Not productGroup.Exists(Data(i, productCol)) Then productGroup.Add Data(i, productCol), New Collection
        
        On Error Resume Next
        productGroup(Data(i, productCol)).Add Data(i, serviceCol)
        On Error GoTo 0            
    Next
    
    Set GetTMProductDictionary = productGroup
End Function

This is pretty good but is the function as simple as it can be? What does it actually do?

  • Retrieve data
  • Compile the data
  • Return the data

If the function is compiling data, it really need to return it. But does it need to retrieve the data?

  • No not really. We could pass the data in as a parameter.

What effects would passing the data in as a parameter have our overall design?

  • By decoupling data gathering from data processing makes it far easier to test the code. In this case we could make a test table an a unit test that will run regardless independently from the actual data.

  • It reduces the size of the method, which in turn, makes the code easier to read and modify.

    Public Function GetTMProductDictionary(Data As Variant) Const productCol As Long = 1, serviceCol As Long = 1

      Dim productGroup As New Dictionary
      Dim i As Long
    
      For i = LBound(Data, 1) To UBound(Data, 1)
          If Not productGroup.Exists(Data(i, productCol)) Then productGroup.Add Data(i, productCol), New Collection
    
          On Error Resume Next
          productGroup(Data(i, productCol)).Add Data(i, serviceCol)
          On Error GoTo 0            
      Next
    
      Set GetTMProductDictionary = productGroup
    

    End Function

How does this effect the meaning of our names? Should the variable names remain the same?

The larger the scope of the more descriptive the names should be.

Lets take a closer look at the names. Can they be simplified or improved? Can they be shortened or generalized?

GetTMProductDictionary(), productCol, serviceColThis all makes sense.

But productGroup? What is a productGroup? Its a dictionary. How many dictionaries are there in this small function? Only 1. Why not just call it Dictionary? I name my dictionaries Map or somethingMap because it is a simple and clean naming pattern and I hate seeing dic.

So now we have a Map. Maps use key/value pairs. The Map doesn't care if the key is a product group or that the product group or that the value is a collection. Does knowing about product groups and services even help us review the code? Maybe...just a little.

What would happen if we just generalized the code? If we gave everything simple, common, familiar, and meaningful names that we see every time we work with this type of code? What would it look like?

Public Function GetMapCollection(Data As Variant, keyColumn As Long, valueColumn As Long)
    Dim Map As New Dictionary
    Dim i As Long
    
    For i = LBound(Data, 1) To UBound(Data, 1)
        If Not Map.Exists(Data(i, keyColumn)) Then Map.Add Data(i, keyColumn), New Collection
        
        On Error Resume Next
        Map(Data(i, keyColumn)).Add Data(i, valueColumn)
        On Error GoTo 0            
    Next
    
    Set GetMapCollection = Map
End Function

Looks to me that we found a generic reusable function hiding in the code. Not only has the data retrieval and compilation been decouple but the context, in which, the compiled data is going to used has been washed away.

This is what we should strive for when we are refactoring. Our methods should be so small and simple that they only know the bare minimum.

Addendum

I modified the function to use only dictionaries and added sample usage.

Locals Window

Sub Usage()
    Dim productGroupServices As Scripting.Dictionary
    Dim serviceProductGroups As Scripting.Dictionary
    
    Dim tmData As Variant
    tmData = Sheet1.ListObjects("Table1").DataBodyRange.Value

    Set productGroupServices = GetUniqueGroups(tmData, 1, 2)
    
    Set serviceProductGroups = GetUniqueGroups(tmData, 2, 1)
    
    Stop
End Sub

Public Function GetUniqueGroups(Data As Variant, keyColumn As Long, valueColumn As Long) As Dictionary
    Dim Map As New Dictionary
    Dim i As Long
    Dim Key As Variant
    Dim Value As Variant
    
    For i = LBound(Data, 1) To UBound(Data, 1)
        Key = Data(i, keyColumn)
        Value = Data(i, valueColumn)
        
        If Not Map.Exists(Key) Then Map.Add Key, New Dictionary
        If Not Map(Key).Exists(Value) Then Map(Key).Add Value, Value
    Next
    
    Set GetUniqueGroups = Map
End Function

Answered by TinMan on November 11, 2021

You seem to be misunderstanding how to use a Scripting.Dictionary.

There is no need to sort the data before processing into a dictionary.

There is also no need to construct a collection before you add to the dictionary.

Its also slightly more sensible to write the sub as a function.

As a final tweak I'd pass the array in as a parameter rather than hardwiring it into the function, but I'll leave that as an exercise for the reader (smile)

Public Function BuildTMProductDictionary() As Scripting.Dictionary

    Dim tmData As Variant
    tmData = Sheet1.ListObjects("Table1").DataBodyRange.Value
    
    
    Dim myDict As Scripting.Dictionary
    Set myDict = New Scripting.Dictionary
    
    Dim i As Long
    For i = LBound(tmData, 1) To UBound(tmData, 1)
    
        Dim myProduct As String
        myProduct = tmData(i, 1)
        
        Dim myService As String
        myService = tmData(i, 2)
    
        If Not myDict.exists(myProduct) Then
        
            myDict.Add myProduct, New Collection
        
        End If
        
        myDict.Item(myProduct).Add myService
        
    Next
    
    Set BuildTMProductDictionary = myDict

End Function

Replace

   If Not myDict.exists(myProduct) Then

        myDict.Add myProduct, New Collection

    End If

    myDict.Item(myProduct).Add myService

with

    If Not myDict.exists(myProduct) Then
    
        myDict.Add myProduct, New Scripting.Dictionary
    
    End If
    
    If Not myDict.Item(myProduct).exists(myService) Then
    
        myDict.Item(myProduct).Add myService,myService
        
    End If

Answered by Freeflow on November 11, 2021

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP