Looking for help creating a macro; which functions are used and how is it organized? Details Below!

Hi everyone,

I’m working in a sheet that has 4 tabs with information about insurance companies. The sheet is used for updating a database that tells users which states a company works in as well as what type of insurance they sell.

There are two tabs for states – add and delete, as well as two for insurance codes – add and delete as well. If Company A is marked with an x in the add column for AZ, then it will delete Arizona in the database. Same thing for the insurance codes. This is what it looks like currently:

image description

image description

The issue is that the upload to the database needs each state to link to each code. Basically, we can’t tell it to just “add AUTO to Company A”. Instead, you need to tell it to add AUTO to Company A for WI, then add it for IL, and so on.

The end result should look like this:

image description

and so on far all of the states and codes. I understand how to do this algorithmically but have no idea how to turn it into a macro or even which functions to use. Because its relatively simple to do in a stepwise fashion I assume it is possible, I am just not sure how. It would take days or weeks to do in the actual sheet which has tens of thousands of rows.

Here are the steps involved in doing it manually:

  1. Count the number of states being added for each insurance company. In this case, Company A has 6 states.
  2. Under the first row in the CODE sheet, add that number minus one. In this case you add 5 rows under the row with AUTO, giving you 6.
  3. Copy and paste the states into the “States” column in the CODE sheet.
  4. Copy the CODE value for all of the cells until you reach the next code value.
  5. Do the same for the Company Name value and the “x” under “add” or “delete”.

I believe that does it. I understand how this works conceptually but keep in mind I am a total novice when it comes to computers in general and have never coded at all much less worked in BASIC. But this definitely seems doable.

Any ideas? Thank you!

1 Like

Converting data from ordered tables is not a very difficult task, a macro will do the job in a few seconds … Oops! tens of thousands of rows? Well, okay - in a few minutes. But you should be prepared for the macro code to be slightly longer than your task description. First of all, the algorithm you described is good. But it shouldn’t be used. The fact is that read-write operations to cells are quite slow, if you write lines one by one, then the macro execution time will be very long. You should read all the necessary data into arrays at one time, form an array with the results, and write the results to the specified location at one time. And you mentioned the “x” in “add”/“delete” too briefly. What should you do if Company A has “add” in Arizona and “delete” in AUTO?

The main questions are those of design: Is the chosen design appropriate? Is it suitable in specific to map all the relevant details expectable for a real-world-case?

Are you sure, e.g, a Co offering AUTO in WI, and also active in OH will offer AUTO there? They may serve different selections of Cd in different states, can’t they?

If there may occur a case as mentioned above - probably only in a single example - a program as you dream of can’t do the task.

Maintaining data, one of the few advantages of spreadsheets is their capability of accommodating humans with their ways of thinking and viewing - but this only in 2 dimensions.
You may need to first create a contingency table Co X St, to normalize the filled table then to a next one which you extend to a contigency table (Co X St) X Cd. (And so on…)

If your normalized table only shall be a representation of a 3D space getting the selectors later, you should state this. No “x” there.

Hello, thanks for your response.

On your first point; yes, in our system each company offering a type of insurance will offer it in each state. It is possible as you describe that they offer AUTO in CO but not WI, but this isn’t something we’re supporting for now and is easier to deal with in rarer situations where it arises.

I have no idea what contingency tables are, I will start researching and see if it gets any clearer.

A contingency-table is a kind of matrix with one sequence in the position of the vertical index and one for the horizontal index. The matrix elements are Boolean (true = isMember; false = notaMember) in the original case, where the tables is used to describe what is called a relation in mathematics (where the sequences simply are sets). In a more general case the elements may be any kind of values, but in this case we need to add a third dimension to get the structure under the term of “relation”. The described structure also is often called cross-table or similar.
Databases only use tables with one index (primary key) and need a kind of normalized representation of relations therefore. For a selection the field names can be used like a second index.
(That’s Lupp-speak, not database textbook.)

Ok let’s do it together
The final program will have to

  • read the original tables into arrays,
  • pre-convert them (order),
  • then perform the Cartesian product and place them in the result array, and at the very end
  • write this array to a sheet of your spreadsheet.

Let’s write a function that will read all data from the sheet into an array
This function will take one sheet as a parameter (as an object), create a cursor for this sheet, make this cursor select all the data on the sheet up to the very last cell, and return the values of all selected cells as an array of rows (each row is an array of cells)

Function getDataFromSheet(oSheet As Variant) As Variant
Dim oCursor As Variant
   oCursor = oSheet.createCursor()
   oCursor.gotoEndOfUsedArea(True)
   getDataFromSheet = oCursor.getDataArray()
End Function

You were going to count the number of states for each of the companies and the number of insurance services for sale. We will do better - we will immediately collect the necessary data into arrays and attach these arrays to the company name.
The following procedure will help us to collect data into arrays (it simply increases the array by one element and writes the specified value to this new element):

Sub addToArray(key As Variant, arr As Variant)
Dim uB As Long
   uB = UBound(arr)+1
   ReDim Preserve arr(uB)
   arr(uB) = key
End Sub

The procedure for “packing” the list of companies is not much more complicated:

Sub AddElement(key As String, aData As Variant, sValue As Variant)
Dim l&, r&, m&, N&, i&
   l=LBound(aData)
   r=UBound(aData)+1
   N=r
   While (l<r)
      m=l+Int((r-l)/2)
      If aData(m)(0)<key Then 
         l=m+1
      Else
         r=m
      EndIf
   Wend
   If r=N Then
      addToArray(Array(key, Array(sValue)), aData)
   ElseIf  aData(r)(0)=key Then	
      addToArray(sValue, aData(r)(1))
   Else
      ReDim Preserve aData(0 To N)
      For i = N-1 To r Step -1
         aData(i+1)=aData(i)
      Next i
      aData(r) = Array(key, Array(sValue))
   EndIf
End Sub

The first eight lines are a binary search, the program looks in the array for a place where the specified element could be located. And the next If... ElseIf...Else statement simply adds, complements, or inserts an element at the specified location. Thus, we get a sorted array.
In a very similar way, you can quickly get the value of the desired element from a sorted array.

To get the Cartesian product (connect each to each), you only need three cycles, nested one inside the other. And writing the result to a sheet is just one line of code:

aResult = Array()
addToArray(Array("Company Name","CODE","State","delete","add"), aResult)
For i = 0 To UBound(aSortProducts)
	aTemp = getElement(aSortProducts(i)(0), aSortCompany)
	For j = 0 To UBound(aSortProducts(i)(1))
		For k = 0 To UBound(aTemp)
			addToArray(Array(aSortProducts(i)(0), aSortProducts(i)(1)(j), _
				aTemp(k)(0),aTemp(k)(1),aTemp(k)(2)), aResult)
		Next k
	Next j
Next i
oSheets.getByName("Result").getCellRangeByPosition(0,0,UBound(aResult(0)),UBound(aResult)).setDataArray(aResult)

The full text of the program takes less than 100 lines of code, but I hope it will help you solve your problem.
But be careful! The Cartesian product is very gluttonous! Let’s count. Let’s say you have a list of 30 insurance products. Let’s say a company can operate in each of the 50 states. That’s 150 rows for one company. Therefore, consider that more than 7000 companies will not fit on the list - try not to exceed this limit.

Full text of the program with test data - here InsuranceBase.ods

You asked for a macro and we wrote a macro. If you asked “How to do this using LibreOffice?”, the answer would be completely different: create a Base database, put all your tables there and immediately get the result using one query - left join of two tables.

Update. How To - see HERE

I am going to need quite a bit of time to look over this over the next few days but just wanted to say thank you so much for the incredibly thorough reply! This is amazing

Thank you sooo much!!! I am going to test it a few more times on our real data but from my initial test using Base this worked like a charm. You have saved me hours of work, I cannot thank you enough!

Interesting case. I had a similar problem with www.versicherungsriese.de and also found help in this forum!