Ask Your Question
0

A method of using regex to apply classifications [closed]

asked 2012-10-21 10:46:05 +0100

anonymous user

Anonymous

updated 2013-06-10 21:27:57 +0100

manj_k gravatar image

I have a spreadsheet of bank transactions. One column contains a description of the transaction however the text is not always exactly the same. e.g.

Date (a1), Description(b1), Amount(b3)
1/1/2012, Caltex sturt, $50.00
1/2/2012, Caltex mayford, $34.00
2/3/2012, Hobble Rd Caltex Mayford, $102.00
etc....

I would like to apply a classification to each entry, e.g. fuel, based on a regex for the classification. e.g. ^Caltex .*. I was hoping that vlookup would do this by allowing me to create a table with the classification patterns, e.g.

Regex(f1), Classification(g1)
.*Caltex.*, Fuel
^Coles.*, Food
etc.....

And then apply =VLOOKUP(b2,f2:g5,2,0) to each row with the result as:

Date (a1), Description(b1), Amount(b1), Classification (c1)
1/1/2012, Caltex sturt, $50.00, Fuel
1/2/2012, Caltex mayford, $34.00, Fuel
2/3/2012, Hobble Rd Caltex Mayford, $102.00, Fuel

My take away from the documentation and trials is that this will not work. Could someone confirm this and possibly suggest an alternative?

22/10/2012 - In response to JohnSUN's answer.
I see my question examples do suggest that I am after the first word however other examples could have the pattern of interest anywhere in the text, hence the need for the power of regex matching. For example, the description could have:
Hobble St Caltex Mayford
Caltex-Stafford Maryland
Rockford Caltex

edit retag flag offensive reopen merge delete

Closed for the following reason the question is answered, right answer was accepted by Alex Kemp
close date 2015-10-18 18:50:25.601516

Comments

In response to JohnSUN's answer.

Jasonm gravatar imageJasonm ( 2012-10-22 05:02:22 +0100 )edit

2 Answers

Sort by » oldest newest most voted
2

answered 2012-10-21 18:25:56 +0100

JohnSUN gravatar image

updated 2012-10-22 20:31:56 +0100

As I understand your idea, you want to determine the category of the operation by the first word? In this case, you can use the formula

=VLOOKUP(LEFT(B2;FIND(" ";B2&" ";1)-1);$F$2:$G$5;2;0)

22/10/2012 If you want to organize a search for any of the words, change the LEFT on a combination of CONCATENATE and SUBSTITUTE:

=VLOOKUP(CONCATENATE(".*";SUBSTITUTE(B2;" ";",*|.*");".*");$F$2:$G$5;2;0)

In your column "RegEx" wildcard characters are not needed. They need to search operators.

edit flag offensive delete link more
0

answered 2012-10-22 23:22:20 +0100

m.a.riosv gravatar image

I think would be easy to work with data distributed on columns. Select the range with data, and Menu/Data/Text to column, select ',' as separator. Once data are properly distributed, Menu/Data/Pivot table, is a very good option to analyze data.

image description

edit flag offensive delete link more

Question Tools

Stats

Asked: 2012-10-21 10:46:05 +0100

Seen: 744 times

Last updated: Oct 22 '12