Weighted frequencies

Hello everyone, I want to do something on Calc and just can’t find a way to get it to work.

so I’m trying to make a file to manage the number and age of a bunch of mice for my lab, long story short my file is set up in lines, where i have litters of mice: number, sex, birthdate and age (in weeks, calculated based on “today()”), so it looks like " 5, :male_sign:, 01/04/2022, 2"

Now what i want to do is to pull a population graph like you often see when talking about the population increase or decline in country XYZ.

Using the “Frequency” function I can quite easily pull how many times the last “2” appears, but that only tell me how many litters of 2 week old mice I have. Because it’s a litter of 5 I would want this “2” to be counted five times, so that my frequency chart properly reports 5 mice of that age.

Does anyone knows how i could possibly do that ?

I understand that the easiest fix would be to always have one line per mouse, but sometimes they give birth to nine pups at once, and sometimes you have to move them around, so that would make keeping the file up to date quite a chore. maybe. also my file is arranged to mimic the arrangement of the cages in the animal facility, so adding and deleting lines all the time is sure to make a mess.

Edit: not quite as important, but if anyone happens to know how I can put a condition in the frequency counting process, that would also be great to count males and females separately.

Thank you very much !

Attach a sample file with what you have, so we have actual data to work with and see your chart. That would better show us (potential helpers) what you are trying to achieve, relieve us from creating a (possibly flawed) data set for testing our suggestion, and eliminate some guessing on our part.

Also, a precise description of the columns may help. Sometimes a grasp of the subject helps, even when the task is “purely mathemathical”. Just assume that we know nothing about lab life. E.g.:

  • Does the “number” column hold the count of offspring in the litter, is it an ID number ifentifying the mother or father, or something else?
  • Is “weeks” the current age of a litter, age of the mother at birth, or what?

Use a Pivot Table

Just select any cell in your data table (assuming you have column headers that directly touch the data themselves and nothing else does) then click Data>Pivot Table>Insert or Edit. Then press OK for Current Selection. Then from the right slide your Weeks Age header label into the Row Field box and your Count header label into the Data Field Box. A freq chart of your ages will appear on a new worksheet.

Note

This is inherently a “denormalized” way to enter the data…unless all of your litters are all male or all female (what an experiment!). So, not only is @Villeroy right about using a database, you should also find a way to not aggregate data while entering it. Instead, you should have each litter entry as:

MalesCount, FemalesCount, Date

and, as you are doing, calculate Age on the fly.

That said, taking what you have already, you can also use SUMIF(). It’s made for this exactly. I also imagined using SUMPRODUCT(IF()), but it was no where near as obvious.

I’ve attached an ODS with both pivot table and SUMIF.

SumForMice.ods (17.4 KB)

1 Like

Hello everyone,

Thank you very much for your responses. I never used databases nor Pivot tables, but i will look into it.

In the meantime it dawned on me that i could actually sort of “bruteforce” the process by using a table full of sum of sumifs. it’s not pretty but seems to work.

I attached the file in question if you want to check it out
edit : for the record, the “w old” that appear after the age is not real text, it’s inserted by formatting. and yes, the number is the number of offspring, there is cage IDs but it is not relevant here. and “weeks” is the age of litter.

mouse cages.ods (37.2 KB)

This should be a scientific/medical use-case.

In addition to what others already told:
Never use unstandardized date formats. If something like 01/04/2022 gets exported as a text in this format, it is seriously ambiguous (D/M/Y or funny M/D/Y?), can’t be sorted into a bunch of equally formatted dates, and …
The one and only correct format for dates occurring in collections of data is ISO 8601 (extended): YYYY-MM-DD.
For entering dates into a Calc spreadsheet you can still define your preferred date acceptance patterns. (Don’t miss to check the results by eye.)

BTW: The second and the third sheet of your example aren’t data in a reasonable sense. Entries are scattered over the sheets in a way making it extremely difficult to evaluate them and to maintain the sheets (considering some changes/enhancements/scaling may be needed one day). This is independent of the question whether data should be kept in spreadsheets at all.,
Create a table where you collect the actually observable data/events row by row. You may use a sheet of paper looking similar to what you have as an interim scrap.
If a kind of pretty print looking similar to what your sheets show is urgently needed, create it in extra sheets using formulas (and probably even macros) to fill the printing positions.

For any task like this (storing records in tables) one should always use a database instead of a spreadsheet. The one and only feature that brings a tiny little bit of database into sheets is the pivot table.
t76574.ods (50.3 KB)