Thursday, 26 December 2013

Number of keywords uniquely present on a particular statement

From the list of keywords, we will find the number of keywords uniquely present on a particular statement.
For example :- We have keywords like "people" , "morning", etc.
In a statement, "He gave people by that morning, O Deep Thought is it?" :- Two keywords i.e "morning" and "people" are present , so an answer should be 2.

Let's see Excel formula to this issue :

In sheet 2, I have some list of keywords :
morning
people
fabric
nobody
cabinet
sauce
cynical
time
temper
curious
difficult
beneath


In Sheet 1 , I have list of statements, refer below print shot :-



Formula, in column B, simple and easy one :-

=SUMPRODUCT(--ISNUMBER(FIND(keyword,A2)))

But, above formula consists few problems, as observed in a statement  "little difficulty ruffles his temper". 
Problem : In listed keyword it is "difficult" ,  but above formula also takes "difficulty" as a keyword , which we do not need.

So to rectify the above formula , in column C we used :
=SUMPRODUCT(--ISNUMBER(FIND(" "&TRIM(keyword)&" ",A2)))

Our motive is to define a word to its limit only with no other alphabet(s) are attached to it. For that, we placed a space before and after the word.  We used the TRIM function and concatenate the one space before & after the word . As adding space imples that a word is attached with a space, no other alphabets(s). But with this formula too, our problem is not actually solved as in the same statement "little difficulty ruffles his temper" , "difficulty" word problem is solved but a "temper" word is in a keyword list but unfortunately it is not taking it as a keyword because in a statement after "temper" word there is no space and with formula above we did modify the keywords with space. Therefore, it is not considering it as similar in a keyword list. Same problem lies with this formula in a statement "He gave people by that morning, O Deep Thought is it?" , "morning" is in a keyword list but because in a statement it is "morning ,(comma)" instead of "morning (space)" our formula is considering not similar as in keyword list. Same case seen in a statement "wanted that we picked beneath cabinet stuck in a curious" with "curious" keyword , it is in a keyword list but it is not considering it as keyword because of not having a space after it in statement.

To rectify above problems we used formula in column D :
=SUMPRODUCT(--ISNUMBER(FIND(" "&TRIM(keyword)&" "," "&text&" ")))

We required to match the word(s) having comma and full stop and having no space before and after it (Mainly, first and last word(s) have no space before and after it ,respectively) with modified keyword .

For that we first used the SUBSTITUTE function (to remove comma and full stop) and concatenate it with a space before and after the statement (by adding the space in a statement , it actually add space before the first and after the last words).


To remove fullstop(.) and comma(,) use formula

=SUBSTITUTE(SUBSTITUTE(Sheet1!$A2,".",""),",","")
and we defined it as a name "text" for the use of column D formula

For a reference please consider below print shot, "Name Manager" shows that there is a formula named as text;


So, by a formula in Column D, we succeed getting our desired result except one, that the FIND function is case-sensitive. Sometimes we do not have a proper case word in a statement.

Like in a statement "STUFFED WITH TOMATO SAUCE Even" . FIND function did not consider "sauce" keyword as a keyword because in a statement it is in uppercase case and in a keyword list it is in lowercase. Use SEARCH function instead of FIND function if you do not required case-sensitive outcome. 

Column E used the following SEARCH function :
=SUMPRODUCT(--ISNUMBER(SEARCH(" "&TRIM(keyword)&" "," "&text&" ")))

No comments:

Post a Comment