Ask Your Question
1

Help with construction of database from file names

asked 2012-06-12 21:06:55 +0200

this post is marked as community wiki

This post is a wiki. Anyone with karma >750 is welcome to improve it.

updated 2013-01-28 01:33:43 +0200

qubit gravatar image qubit flag of United States
5693 3 48 41

Hi all,

I am a Musical Director working on a cruise ship and I have a Windows 7 folder with in excess of 19,000 files, which are all music arrangements in PDF format (an arrangment is a collection of parts for different instruments, eg Piano and Bass and Drums and Trumpet and Sax and Trombone. The last three are collectively known as 'horns' (short for 'horn section') The files are all named according to the following standard:

Name (composer) - description - vocal - Arranger.pdf

the description field is used to identify the numer of parts to the arrangement, eg "13 Horns", "Piano Only", "Rhythm section only", etc.

The 'vocal' field signifies whether this is a vocal arrangement; if it's an instrumental arrangement then the field is blank and the "-" separators are not included.

I want to be able to search all this data and I think I need to create a searchable database to do it. I'm aware that Windows Search will look at all the files and search for whatever my search term is, but I want to go further and have 2 or 3 search parameters, possibly searching on every field. I also want to include an element of fuzzy logic.

Can anyone tell me how to go about creating a database where each record is generated from these filenames?

And what clever things can I do with the database once created?

Thanks for your help.

delete close flag offensive retag edit

3 Answers

Sort by » oldest newest most voted
1

answered 2012-06-21 09:57:37 +0200

this post is marked as community wiki

This post is a wiki. Anyone with karma >750 is welcome to improve it.

updated 2012-06-21 09:57:37 +0200

Alex Thurgood gravatar image Alex Thurgood
166 8 3

Hmmm, parsing 19,000 file names into a hsqldb embedded database as provided by the native ODB container in LibreOffice will represent something of a challenge in terms of performance and stability. However, it is in theory doable via several routes :

  • use a windows scripting host script to parse the directory, split the file names into respective fields, save as CSV (not being an expert in WSH, I wouldn't know how to do that), and then import into a hsqldb ODB - this would be statically linked, i.e. you wouldn't be able to update the data held in the underlying CSV file;

  • write a LO Basic macro that would parse the file names, create an array, set up a connection to a new or existing ODB db file, and then load the contents of the array into the database - doable, but probably fairly poor performance, since everything would be loaded into memory of the LO process, including all the JVM required to handled the database bit ;

  • write some other script (python/php/perl, etc), to parse the list of files, and write it to an output format that can be imported by LO Base, or preferably, to a db server backend table - the advantage of this route is that the server would do all the heavy db lifting work, and you could just use an ODB file as a frontend for forms and reports.

If you trawl on the internet, there a quite a few scripts out there that deal with parsing file names and chopping them up into something usable. After that, it is up to you how you want to import them (or not) into Base.

link delete flag offensive edit
1

answered 2012-06-21 10:16:21 +0200

this post is marked as community wiki

This post is a wiki. Anyone with karma >750 is welcome to improve it.

updated 2012-06-21 10:16:21 +0200

Alex Thurgood gravatar image Alex Thurgood
166 8 3

Oh, forgot to mention that Base doesn't provide for "fuzzy logic" searches, it uses regular SQL expressions, you might want to search for a db engine that provides that kind of functionality, not even MS SQL Server had it (but it might have been implemented since then) :

http://consultingblogs.emc.com/jamiethomson/archive/2008/05/21/fuzzyness.aspx

Note that even with other DB engines, this is not always that easy :

http://www.yvoschaap.com/weblog/easy_fuzzy_logic_with_mysql_the_end_of_no_results_found

link delete flag offensive edit
0

answered 2012-06-21 10:33:12 +0200

this post is marked as community wiki

This post is a wiki. Anyone with karma >750 is welcome to improve it.

updated 2012-06-21 10:33:12 +0200

mariosv gravatar image mariosv flag of Spain
4629 20 46

Maybe programs like CubicExplorer (http://portableapps.com/apps/utilities/cubicexplorer_portable) could be an easier solution.

link delete flag offensive edit

Login/Signup to Answer

Donate

LibreOffice is made available by volunteers around the globe, backed by a charitable Foundation. Please support our efforts: Your donation helps us to deliver a better product!

Question tools

Follow

subscribe to rss feed

Stats

Asked: 2012-06-12 21:06:55 +0200

Seen: 161 times

Last updated: Jun 21 '12