Merging MULTIPLE excel spreadsheets using Libre

The company i work for left the documentation of the hardware we use daily to us. In order to facilitate our search for specific items, one of my co-workers created an excel spreadsheet containing 11 columns and over 1000 rows. The columns are Location, Part Numer, Description, Other Locations, and then 7 columns dedicated to other numbers the hardware might use depending on the context needed.

The issue is that it is not a jointly edited document, she simply sends out a new iteration every so often with changes she has consolidated into the sheet.

What I’m wanting to know is:
1.) Is Libre capable of, and how do I, start with the oldest document, add in the data from the next iteration, then delete duplicates leaving only the consolidated data from the spreadsheet?
2.) Then sort the table as a whole by the FIRST column, LOCATION, and keep the rows together so the data moves with the location information.

I currently have 6 iterations that i would like to consolidate, and then i am looking to push it through MSTeams as a living document that all of us can view, edit and use as we need.

Thanks for any and all assistance.

https://forum.openoffice.org/en/forum/viewtopic.php?f=21&t=77069

1 Like

On a highlevel, export files to CSV, merge files, sort, remove duplicates, import CSV. I was initially thinking use your OS tools to merge, sort and remove duplicates, but Libreoffice can Handle Duplicate Records. 6000 rows isn’t much data. An enhancement would be to create some forms to keep the data clean and consistent. Just a side note, make lots of backups before processing :slight_smile:

Calc Guide - Read online or download the PDF.

This is what every database engine does automagically with millions of records. A spreadsheet is not a database, not even close.

True. But in general, any collection of organized data is a database. Whatever is maintainable and meets OP goals. Perhaps a suggestion. for phase 2 :slight_smile: . Fun stuff for OP.


PS: I want to stay focused on OP’s question and I hope OP gets many great responses because this is an interesting topic.

I’m not so sure about the fun. Spreadsheets tend to be inconsistent. Databases refuse to store inconsistent data, and spreadsheet users tend to not understand why. All I can offer is another spreadsheet template, where you dump raw table data, click a macro button, fill out an “input form” on spreadsheet cells and get an SQL statement to generate a table in MySQL, Firebird or HSQL. Some very simple spreadsheet formulas detect inconsistencies in the pasted data, such as mixed numbers and text, missing values. You have to fix your data before you can generate the table accepting correct numbers, dates, booleans, currency values etc.
create_table.ots (41.7 KB)
and a combination of both templates:
Merge_and_Export.ots (39.4 KB)

Well, sorting data is one click if they happen to be consistent.

I don’t understand this one. Sounds like a pivot table could be the solution.

Calc can store a consistent rectangle of cells as a dBase table which is a database in a directory with one file per table. Then you connect a database document to the directory and build input forms.

I had thought @Berenost has 6 versions of the same file and wants to boil them down again to one. What I don’t understand is the combination with

If “she” had done this right @Berenost could just use the last version he got (unless other persons edited their copy between versions).

The main question is the how to find/define duplicates. Has a duplicate all columns identical? Or Is it a new version of a record for the same "Part numer"? Maybe even "Part numer"@"location".

To approach this in a spreadsheet I would

  • Load Oldest file in Sheet1 and the next to compare in Sheet
  • In both sheets add a column to the left to hold the unique-index like =Partnum or =Location&'@'&Partnum etc.
  • In the second sheet add a test to find the “unique-index” in sheet1. Could use COUNTIF or VLOOKUP etc.
  • For COUNTIF: Sort sheet2 by the result of Countif and you see easily the “new” records
  • Append the new record via copy/paste to sheet1 (It is maybe a good idea to have an additional column with the origin of the row, like in Villeroys code.

Vary according to your needs (changed records ...)

Are the NEW VERSIONS you receive for: ADDING NEW DATA; are they CHANGES to the data; or BOTH?

Is it possible to post an example of this file?


As NOVAS VERSÕES que recebe, são para: ADICIONAR NOVOS DADOS; são ALTERAÇÕES nos dados ou AMBOS OS CASOS?

É possível postar um exemplo deste arquivo?

Sorry for the late response. Work is pushing us to six days a week, soon to be sevens. I’ll try to answer questions here.

I am a complete noob to all things excel and spreadsheet, so please bear with me.

  • The iterations i speak of were all the same original file, but the material handlers including myself made changes/additions to our own individual files. i found out about this and had them send me their files in the hopes i could combine them all together into a singular file.

  • At work, the Shipping and Receiving supervisor created a spreadsheet that is used to order items that are stored on semi-trailers off site, bulk items that can’t be stored in the main warehouse. That spreadsheet is linked through Teams, and everyone with a link to the file can open it and add to it in order to request a trailer to be brought to the warehouse and have an item offloaded from it to the warehouse. I would love to be able to do the same with our spreadsheet of hardware locations, a file that all us material handlers can open, search, add to, etc. and be a unified document instead of all 8 of us using individual files.

  • The duplicates i refer to is when i copy/paste spreadsheet 1 into spreadsheet 2, they started as the same document, with roughly 98% of the document being identical, but say i have added 12 items to spreadsheet 1, and BOB added 14 items to spreadsheet 2, the 98% is identical, and a few of our entries might be the same. is there a way to remove the 98% identical “duplicates” from the spreadsheet?

  • As for: Are the NEW VERSIONS you receive for: ADDING NEW DATA; are they CHANGES to the data; or BOTH?
    Is it possible to post an example of this file?

The answer is both. new added data and changes that have been incorporated. i can post/pin/link and example of what the documents look like, a mock one i just created, to show the format. Just keep in mind the actual documents have around 1000 rows.
Sample Sheet.xlsx (56.6 KB)
^^ That is the format the documents are in.

Ola @Berenost, detalhes a considerar:

  • Em Sheet1, pela amostra existe informações alfanumĂ©ricas, para facilitar as analises, formatar todas as colunas como “texto”, limpar todos os dados retirando a informação de texto (’) de algumas cĂ©lulas e espaços vazio antes e depois dos dados. De todos os arquivos.

  • Considerando que todas as colunas sĂŁo idĂŞnticas em todos as planilhas, em uma nova planilha, copiar as seis em uma sĂł alocando os dados um abaixo do outro.

  • Sobre eliminação de duplicatas, como deve ser a analise: por alguma coluna especifica ou considerando todos os dados das colunas de cada item?

Segue arquivo formatado conforme sugestĂŁo acima.


Hi @Berenost, details to consider:

  • In Sheet1, based on the sample, there is alphanumeric information. To facilitate analysis, format all columns as “text,” clear all data by removing the text information (’) from some cells and empty spaces before and after the data. Do this for all files.

  • Considering that all columns are identical in all spreadsheets, in a new spreadsheet, copy all six into one, allocating the data one below the other.

  • Regarding the elimination of duplicates, how should the analysis be done: by a specific column or considering all the data from the columns of each item?

The formatted file follows the suggestion above.

Sample Sheet_GS.xlsx (54,7,KB)