Plotting histograms in calc?

I frequently have to plot frequency of occurrence of values using varying window sizes and it is frustrating apparently not to be able to do it in Calc without having to pre-process the data in some way. Does anyone have a solution? Is it a feature we can expect “real soon now”?

My dataset sizes can vary between a few objects up to hundreds and beyond and I need to be able to adjust the “bucket” size. In theory it should be straightforward… :wink:

Thanks

PS I did actually write a Python program to do it, but it required exporting, reformatting etc etc before I even got to starting the plot.

1 Like

Is this the same as How do I make a frequency-density (unequal bin width) histogram? https://bugs.freedesktop.org/show_bug.cgi?id=81530 ?

Duplicate of this question.

oweng: Could you please provide a working link?

Check my answer in this question (Percentage based histogram). Could be a solution for you.

A histogram is basically a graph of frequencies vs. categories (i.e. ranges of values indicated by bins), so you could use the frequency() function provided. It’s an “array” function, meaning you need to highlight the range of cells you want filled with the results. Here’s a FAQ on this function.

Basically, you need to start with a column listing your data (e.g. scores on an exam?), a 2nd column with bins (e.g. 0, 10, 20, …, 80, 90, 100), and a third column for frequencies. Highlight the cells in your frequency column (should be one less cells than your # bins, since the bins indicate the range for each category) and enter the formula as follows:

=frequency(data,classes)

“data” are your data that you want a to create a histogram to describe. “classes” would be the bins you’ve chosen to specify the ranges for each category.

If you’ve entered the formula correctly you just need to plot the values it gave with a bar chart. It might take a bit of fooling around, but it works well, and it’s a very effective way to make histograms of all types.

Read the answers in this question. Might help ;-)).

The link is broken

Could you please update your answer with a working link?

Thx, but it doesn’t really answer it. The problem is that data is in general unsorted, so that we need to be able to specify a column and have the algorithm first sort the data into as/descending order and within each bucket (ie selected plotting interval) count the number of occurences. The results should be offered as a simple column histogram chart. It’s the first step that the current package fails to offer. it ought to be straightforward; the parameters are column containing data and bucket width.

PeterD, this should have better been a comment to an answer (which one?) instead of a separate new answer.

The “array formula” FREQUENCY creates histogram data from unsorted raw values.