Ask Your Question

Help with construction of database from file names [closed]

asked 2012-06-12 21:06:55 +0200

this post is marked as community wiki

This post is a wiki. Anyone with karma >75 is welcome to improve it.

Hi all,

I am a Musical Director working on a cruise ship and I have a Windows 7 folder with in excess of 19,000 files, which are all music arrangements in PDF format (an arrangment is a collection of parts for different instruments, eg Piano and Bass and Drums and Trumpet and Sax and Trombone. The last three are collectively known as 'horns' (short for 'horn section') The files are all named according to the following standard:

Name (composer) - description - vocal - Arranger.pdf

the description field is used to identify the numer of parts to the arrangement, eg "13 Horns", "Piano Only", "Rhythm section only", etc.

The 'vocal' field signifies whether this is a vocal arrangement; if it's an instrumental arrangement then the field is blank and the "-" separators are not included.

I want to be able to search all this data and I think I need to create a searchable database to do it. I'm aware that Windows Search will look at all the files and search for whatever my search term is, but I want to go further and have 2 or 3 search parameters, possibly searching on every field. I also want to include an element of fuzzy logic.

Can anyone tell me how to go about creating a database where each record is generated from these filenames?

And what clever things can I do with the database once created?

Thanks for your help.

edit retag flag offensive reopen merge delete

Closed for the following reason the question is answered, right answer was accepted by Alex Kemp
close date 2015-10-16 21:48:18.777537

3 Answers

Sort by » oldest newest most voted

answered 2012-06-21 09:57:37 +0200

this post is marked as community wiki

This post is a wiki. Anyone with karma >75 is welcome to improve it.

Hmmm, parsing 19,000 file names into a hsqldb embedded database as provided by the native ODB container in LibreOffice will represent something of a challenge in terms of performance and stability. However, it is in theory doable via several routes :

  • use a windows scripting host script to parse the directory, split the file names into respective fields, save as CSV (not being an expert in WSH, I wouldn't know how to do that), and then import into a hsqldb ODB - this would be statically linked, i.e. you wouldn't be able to update the data held in the underlying CSV file;

  • write a LO Basic macro that would parse the file names, create an array, set up a connection to a new or existing ODB db file, and then load the contents of the array into the database - doable, but probably fairly poor performance, since everything would be loaded into memory of the LO process, including all the JVM required to handled the database bit ;

  • write some other script (python/php/perl, etc), to parse the list of files, and write it to an output format that can be imported by LO Base, or preferably, to a db server backend table - the advantage of this route is that the server would do all the heavy db lifting work, and you could just use an ODB file as a frontend for forms and reports.

If you trawl on the internet, there a quite a few scripts out there that deal with parsing file names and chopping them up into something usable. After that, it is up to you how you want to import them (or not) into Base.

edit flag offensive delete link more

answered 2012-06-21 10:16:21 +0200

this post is marked as community wiki

This post is a wiki. Anyone with karma >75 is welcome to improve it.

Oh, forgot to mention that Base doesn't provide for "fuzzy logic" searches, it uses regular SQL expressions, you might want to search for a db engine that provides that kind of functionality, not even MS SQL Server had it (but it might have been implemented since then) :

Note that even with other DB engines, this is not always that easy :

edit flag offensive delete link more

answered 2012-06-21 10:33:12 +0200

this post is marked as community wiki

This post is a wiki. Anyone with karma >75 is welcome to improve it.

Maybe programs like CubicExplorer ( could be an easier solution.

edit flag offensive delete link more

Question Tools


Asked: 2012-06-12 21:06:55 +0200

Seen: 721 times

Last updated: Jun 21 '12