Convert comma separated string to serialized data

I am uploading a CSV file to a website. I need to convert data in one of the fields (i.e. column of cells) to serialised data for the website plugin to accept it and populate correctly.
I have it in this form (this is an example)
Monday 11 May, Afternoon, Monday 11 May, Evening, Friday 15 May, Morning, Friday 15 May, Afternoon, Friday 15 May, Evening, Saturday 16 May, Morning, Saturday 16 May, Afternoon, Saturday 16 May, Evening, Sunday 17 May, Morning, Sunday 17 May, Afternoon, Sunday 17 May, Evening
I need to convert it to
a:11:{i:0;s:24:”Monday 11 May, Afternoon”;i:1;s:22:”Monday 11 May, Evening”;i:2;s:22:”Friday 15 May, Morning”;i:3;s:24:”Friday 15 May, Afternoon”;i:4;s:22:”Friday 15 May, Evening”;i:5;s:24:”Saturday 16 May, Morning”;i:6;s:26:”Saturday 16 May, Afternoon”;i:7;s:24:”Saturday 16 May, Evening”;i:8;s:22:”Sunday 17 May, Morning”;i:9;s:24:”Sunday 17 May, Afternoon”;i:10;s:22:”Sunday 17 May, Evening”;}
Where a is the number of terms (‘Sunday 17 May, Afternoon’ counts as 1 term)
i is the a-1 ordinal number of the following term
s is the character count, including commas and spaces in the following term.
I need a formula or macro to convert each cell to that format - the ‘a’ count of terms varies from entry to entry.

please attach an example.ods which shows unambiguous how your data is organized in the spreadsheet,
use the ⇑ -upload button in the middle of the toolbar above the edit-frame.

I would not try to do all in one step. You could use the spreadsheet itself to get a more usable arrangement of the data. For example: Use the TEXTSPLIT function to convert the one long string into a row where each cell has only one content. Then use the WRAPROWS function to arrange this row to a two-dimensional array where the days are stacked vertically. Then add a column for the character count and other things you want to calculate.


To combine it back to one long string you can use TOROW function to generate a horizontal sequence of cells and the TEXTJOIN function to combine these cells into a single string.


There exist similar functions for the other orientation: WRAPCOLS, TOCOL. Anyway, you need the newest LibreOffice version for these functions.

Here is an example of the data. 2 records with an id number (user) for each. The second column has 2 examples of the data that needs transrorming.
example of data.ods (13.4 KB)

Assuming your data is in one Column, the following python replace each cell-content with the »serialized« 
 select the cells with the data and run:

import re

rex = re.compile(r"\w+ \d+ \w+, \w+")

def funny_serialize(*_):
    doc = XSCRIPTCONTEXT.getDocument()
    data = (sel:=doc.CurrentSelection).DataArray
    out = []
    for entry in data:    
        a=len(raw_list:=rex.findall(entry[0]))
        _payload = ";".join(f'i:{i};s:{len(entry)}:"{entry}"' for i, entry in enumerate(raw_list))
        out.append( (f'a:{a}:{{{_payload};}}',) )
    sel.DataArray = out

probably you’ll need help to organize the python-stuff, use apso.oxt from here

edit:
and here’s the pure-calc-monster-formula to convert:

=LET(data;TRIM(WRAPROWS(TEXTSPLIT(B2;",") ;2));
pay;INDEX(data;;1)&", "&INDEX(data;;2);
"a:"&COUNTA(pay)&":{"&
TEXTJOIN(";";1;"i:"&SEQUENCE(COUNTA(pay);;0)&":s:"&LEN(pay)&":"""&pay&"""")&";}")

edit: @MHBurton @Regina slightly shorter Formula which works also with LO25.2

=LET(r;REGEX(B2;"\w+ \d+ \w+, \w+";;SEQUENCE(99));
   f;FILTER(r;NOT(ISNA(r)));
   a;COUNTA(f);
   "a:"& a & ":{" & TEXTJOIN(";";1;"i:" & SEQUENCE(a;;0) & ":s:" & LEN(f) & ":""" & f & """" ) & ";}")
1 Like

Thank you.
Unfortunately the formula didn’t work - I just got a ?NAME error.
I’ll try the script but that won’t be for a week now.

maybe you will need update LO to version 25.8.x ?

1 Like

Still struggling I’m afraid.
I downloaded the latest LibreOffice (25.2.6.2) and then the apso.oxt extension file but the extension failed with this error on trying to install it: see screenshot of error message.


So I can’t install your script. The formula gives me a ?NAME error but I assume it needs the python script installing before it will work.

the Formula needs nothing else, except LO25.8.x.x, but you have:

Perhaps an algorithm could be used herein:
example of data_v002_081946.ods (26.1 KB)

Thanks. In the end I used your algorithmic approach and then a series of nested formulae to concatenate the strings I need. It is a bit cumbersome but now done it will allow me to convert the comma separated list to the serialised string. (The formulae given above didn’t work for me - CALC didn’t understand the term ‘data’ in the LET function.)

As the purpose of =LET is to define a meaning for “data” this is a funny conclusion

https://help.libreoffice.org/latest/en-US/text/scalc/01/func_let.html?DbPAR=CALC

My guess is the culprit is WRAPROWS (already mentioned above you need 25.8. for the formila.

https://help.libreoffice.org/25.8/en-US/text/scalc/01/func_wraprows.html?&DbPAR=SHARED&System=UNIX

:thinking:

You don’t read Uyghur /ug/ ?? :wink:
.
I corrected the links for en-us (and I never understood, why google often provides links to other languages for LibreOffice help. They are in english, as long as nobody translates them, so there is a hit, but why prefer over en-us?)

yep.
→ 164964 – Hebrew help contains uninterpreted \</emph\> etc.

and generally difficult to understand what the directions are :expressionless:
https://community.documentfoundation.org/t/documentation-team-meeting-minutes-friday-october-31st-2025-15-00-utc/13340/19