Ask Your Question
0

A method of using regex to apply classifications

asked 2012-10-21 10:46:05 +0200

anonymous user

Anonymous

updated 2013-02-01 05:04:31 +0200

anonymous user

Anonymous

I have a spreadsheet of bank transactions. One column contains a description of the transaction however the text is not always exactly the same. e.g.

Date (a1), Description(b1), Amount(b3)
1/1/2012, Caltex sturt, $50.00
1/2/2012, Caltex mayford, $34.00
2/3/2012, Hobble Rd Caltex Mayford, $102.00
etc....

I would like to apply a classification to each entry, e.g. fuel, based on a regex for the classification. e.g. ^Caltex .*. I was hoping that vlookup would do this by allowing me to create a table with the classification patterns, e.g.

Regex(f1), Classification(g1)
.*Caltex.*, Fuel
^Coles.*, Food
etc.....

And then apply =VLOOKUP(b2,f2:g5,2,0) to each row with the result as:

Date (a1), Description(b1), Amount(b1), Classification (c1)
1/1/2012, Caltex sturt, $50.00, Fuel
1/2/2012, Caltex mayford, $34.00, Fuel
2/3/2012, Hobble Rd Caltex Mayford, $102.00, Fuel

My take away from the documentation and trials is that this will not work. Could someone confirm this and possibly suggest an alternative?

22/10/2012 - In response to JohnSUN's answer.
I see my question examples do suggest that I am after the first word however other examples could have the pattern of interest anywhere in the text, hence the need for the power of regex matching. For example, the description could have:
Hobble St Caltex Mayford
Caltex-Stafford Maryland
Rockford Caltex

delete close flag offensive retag edit

Comments

In response to JohnSUN's answer.

Jasonm ( 2012-10-22 05:02:22 +0200 )edit

2 Answers

Sort by » oldest newest most voted
2

answered 2012-10-21 18:25:56 +0200

JohnSUN gravatar image JohnSUN flag of Ukraine
2338 2 23 36
http://wmstrong.ru/

updated 2012-10-22 20:31:56 +0200

As I understand your idea, you want to determine the category of the operation by the first word? In this case, you can use the formula

=VLOOKUP(LEFT(B2;FIND(" ";B2&" ";1)-1);$F$2:$G$5;2;0)

22/10/2012 If you want to organize a search for any of the words, change the LEFT on a combination of CONCATENATE and SUBSTITUTE:

=VLOOKUP(CONCATENATE(".*";SUBSTITUTE(B2;" ";",*|.*");".*");$F$2:$G$5;2;0)

In your column "RegEx" wildcard characters are not needed. They need to search operators.

link delete flag offensive edit
0

answered 2012-10-22 23:22:20 +0200

mariosv gravatar image mariosv flag of Spain
4644 20 46

I think would be easy to work with data distributed on columns. Select the range with data, and Menu/Data/Text to column, select ',' as separator. Once data are properly distributed, Menu/Data/Pivot table, is a very good option to analyze data.

image description

link delete flag offensive edit

Login/Signup to Answer

Donate

LibreOffice is made available by volunteers around the globe, backed by a charitable Foundation. Please support our efforts: Your donation helps us to deliver a better product!

Question tools

Follow

subscribe to rss feed

Stats

Asked: 2012-10-21 10:46:05 +0200

Seen: 134 times

Last updated: Oct 22 '12