Importing Spreadsheet to Database

Hi
I’ve created a LibreOffice spreadsheet which I call “Index of Films in Print.” It has four columns: title, director, premiere date, and availability (where you can buy it or watch online). Eventually I want to create a web page where the data can be sorted by each column. So far I have 2,400 rows. My goal is to index every film in print. My understanding is that for large amounts of data I should use a database rather than a spreadsheet for the web page. I’m an absolute beginner with database. I’ve tried setting up one, connecting to my spreadsheet (which I saved as a csv file) and I created a form in design view with four label boxes and four text boxes which I grouped together. I can see all my data and it’s in the correct boxes. Here are the problems:

  1. the text in my spreadsheet had hyperlinks and it seems they weren’t imported
  2. special characters have been reformatted (e.g. “Crazy Cinématographe. Europäisches Jahrmarktkino” was converted to “Crazy Cinématographe: Europäisches Jahrmarktkino”) When I try to enter text in the text boxes (with design mode off) nothing happens

Any suggestions or pointers would be very much appreciated.

Hi @nerves,

If you want to index every film in print and provide something online, I suggest that you look into other database technologies like MySQL. You could even use a Base front-end to connect to MySQL.

Migrating a Base database to MySQL at a later point is possible, but given that you’re already running into some issues with character encoding, I’d try to move from your spreadsheet directly to a database that you can use online.

Do you have web hosting right now? If so, ask them about what kinds of databases they make available to their users. That could help you decide what tools to use to create and manage your film database.

One more quick design note: Given the number of films in print right now, I suggest that you only display a fraction of the film titles on each page. That will make your web application much faster and make it easier for people with older computers or limited system resources to view your website.

Thanks quibit. I have web hosting with dreamhost and will contact them.

Hi qubit, dreamhost does have MySQL and I was able to import my spreadsheet as a csv file. I’m not sure what the next step is. I want it to look pretty much like this: http://www.danwillardmusic.com/filmindex/ . The problem is that the HTML tables in the web design software I’ve used (Kompozer and

Seamonkey) are limited to 1,000 rows. So is it possible to create a page(s) like the one above except with an unlimited amount of rows? If I can do this with MySQL please point me in the right direction. I want the columns to be sortable and have anchor text.

Hi @nerves,

Based on what you’ve described, it looks like you don’t even need a database. Below you’ll find a script that can take in a tab-delimited CSV file and pump out some HTML for you. That can just be pushed up to your website (using scp or whatever :slight_smile:

Notes:

  • If you want sortable columns, you’ll want to sprinkle a little javascript magic on this :slight_smile:
  • You’ll need ruby on your machine. If you’ve running a popular GNU/Linux distribution or OSX, I think you’ll have that already installed, but if you’re on Windows or don’t seem to have it, please see these instructions.

Here’s the central part of the script – it pulls apart the CSV and throws everything into a table:

# Strip surrounding whitespace and/or quotes from data chunks.
class String
  def strip_moar
    self.strip!
    self.sub!(/^\"(.*)\"$/, '\1')
  end
end

c = "th"
# Pull in the CSV and process it line by line.
File.foreach(input_filename, "\n") {|line|
  # Split the line up and toss into cells.
  line = line.split("\t").each{|s| s.strip_moar}.join("</#{c}><#{c}>")
  outfile << "     <tr><#{c}>#{line}</#{c}></tr>\n"

  # Only the first row uses table headers.
  c = "td"
}

Files:

  • This file isn’t a JPG, but I we can’t even upload text files to the Ask site – so download it and take off the “.jpg” extension:

Thanks again. I downloaded and renamed the file from .jpg to .rb and it opened with textedit when I double clicked. I’m on a mac. Does that mean I don’t have ruby installed? And to install it I use terminal?

@nerves,

Try opening a terminal and typing:

which ruby [ENTER]

If it gives you a path like ‘/usr/bin/ruby’, then you’ve got ruby installed. Read the script for more information – you’ll want to run it from the terminal like this:

ruby films-in-print_ask13300.rb input-file-here.csv [ENTER]

Thanks. I do have ruby installed. Does the .rb file you created have to reside in any particular place on my computer? Do I need to specify the path to the csv file in the command? Here is what I entered in Terminal: ruby films-in-print_ask13300.rb /Users/nirvesh/Documents/Movies/index.csv but I get an error : ruby: No such file or directory – films-in-print_ask13300.rb (LoadError)

@nerves – When you run the command in the terminal, you need to be in the same directory as the .rb file. It’s probably easiest to save the script to the same directory as your input CSV.

Open Terminal, type ‘cd ~/Documents/Movies/’, then run the command. If you’re in the same directory as your CSV file, you don’t have to specify the absolute path, you just can just enter the filename.

@qubit1 - Thanks. It works but there are formatting problems for special characters and the hyperlinks from my spreadsheet didn’t translate to HTML. Is there an easier way to fix this other than editing the HTML? http://www.danwillardmusic.com/filmindex/

@nerves,

More notes on special characters and hyperlinks:

  1. the text in my spreadsheet had hyperlinks and it seems they weren’t imported

I agree – hyperlinks in a Calc document do not appear to survive CSV export. This could be a good candidate for an enhancement for CSV export.

As a workaround, could you just put the hyperlink in a separate column in your spreadsheet?

  1. special characters have been reformatted (e.g. “Crazy Cinématographe. Europäisches Jahrmarktkino” was converted to “Crazy Cinématographe: Europäisches Jahrmarktkino”)

Special characters appear to export and display correctly for me. Here are the steps I took:

  • Open your Calc file
  • Navigate to File → Save As
  • Choose “Text CSV” from the drop-down and give your file a name
  • Check the box next to “Edit filter settings”
  • In the “Confirm File Format” dialog, choose “Use Text CSV Format”
  • In the “Export Text File” dialog, choose
    • Character set: Unicode (UTF-8)
    • Field delimeter: {Tab}
    • Text delimiter: "
    • Only check “Save cell content as shown”
  • Click “OK”

The CSV file from your text above appears to have properly-accented characters in it. Do you see something different?

@qubit1 - As for the links: I wanted them to be anchor text for three of the fields - title, director, and availability. Do links survive CSV export in Excel? As for special characters: I did follow the steps you outlined above but if you’ll scroll down here: http://www.danwillardmusic.com/filmindex/ to the first entry in the director field, you can see that what was Lumière in my calc file has become Lumière.

@nerves – Hmm… ah, okay. The charset is set in the HTML header to be ISO-8859-1, not UTF-8. Just change ‘charset=ISO-8859-’ to ‘charset=UTF-8’ in the ruby script, and it should display properly in your browser.

@qubit1 - Thanks! Worked like a charm. Any further suggestions regarding exporting links? Do you know if I can do it with Excel?

@nerves, don’t know about exporting links, or how to do it in Excel, sorry.

@qubit1 - Thanks for all your help. I think I’m set: http://www.danwillardmusic.com/filmindex/. I downloaded Excel for mac on a trial basis and tried a couple things. It allows you to save as an html file which preserves the links but the file was 56mb which crashed my browser. I looked at the code and it repeated the formatting code for every row. I also tried saving as a csv file but there was no option to set the delimiter to tab. I suppose I could alter your script as a workaround but when I

opened the file generated by your script in Dreamweaver it looked as though none of the links survived so it would probably be a waste of time. Anyway, I can add links and new data directly to my html file from now on. My only concern is that if I get to 70,000 rows the file would be too big. Is there a way to edit the html so the table is split into several pages and when the columns are sorted it would sort the entire table and split into several pages?

@nerves – If you want to sort data across multiple pages, then the easiest thing would be to put your data in a database…

@qubit1 - Could I keep the format I have now with the rows but have x amount of rows per page which are sortable by the columns (and sorting by columns would sort the entire table - not just the current page)?