Finding matching or duplicate or "contains matches" across tables in Power Pivot

We Want to Flag Rows in our Companies Table (on left) When They
Contain a Keyword from our MatchList Table (on right)

These are a few of my favorite things…

Perhaps the only thing that makes me happier than a new “X” function (I still badly want a CONCATENATEX) is “inventing” a new one (like we’ve seen with PRODUCTX).

The other day I was looking at a Power Pivot model and thinking “gee, it sure would be nice to have a CONTAINSX.”

Turns out we can “make” our own CONTAINSX using SUMX.

Does this row’s value in list one also appear in list two?

imageReduced to this generic form, this a pretty common overall need.

Sometimes you can do this VERY quickly in Power Pivot by relating the two tables, and then writing a =RELATED calc column in table 1 to see if it has a matching value in table 2.

But there are times when that doesn’t work.  For instance, when you’re not looking for an exact match, but a “contains” match.

Cutting to the chase

I won’t drag this one out.  Let’s give you a formula (I won’t call it THE formula, because there are definitely other ways and probably better ways).

  [Is this company a metals company] =

  =IF(
      SUMX(MatchList,
           FIND(
                UPPER(MatchList[Keyword]),
                UPPER(Companies[Company])
                ,,0
               )
          ) > 0,
      “YES!”,
      “Probably Not”
     )

In English
 

  1. The SUMX part – Step through every row in MatchList.  For each row in MatchList, evaluate the FIND function.  FIND will return a number.  Sum up all the values you get from FIND.  (There will be 5 values to sum up, because there are 5 rows in MatchList.
  2. FIND – for this row of MatchList, see if you find that substring in the current row of Companies.  If you do, return the number of the position where that substring is found, like FIND always does.  If you don’t find a substring match, return 0.
  3. UPPER – I did this to make the FIND case-INsensitive.  If you want it to be case-sensitive, remove the UPPERs.
  4. IF – if you get 0 back from the SUMX, that means no matches were found, so return “Probably Not.”  If anything other than 0 comes back, there was at least one match (maybe more!), so return “YES!”

No relationship!

image

No Relationships Are Required for this Technique

A million variations

I bet there are lots of practical ways to twist this:  Return the number of matches, rather than Yes/No.  Case-sensitive versus insensitive.  Begins with, ends with, exact match.  Go nuts.

Grab the workbook

Seriously, just grab it already. Smile