Filtering Yahoo Mail and Live Mail in Google Analytics

[Note: Please leave a comment if there is another type of filter you would like to see, or an issue you have with GA data.]

One of the time-consuming and annoying things about Google Analytics is that it handles the sub-domains from Yahoo and Live mail as separate referral sources. Google's documentation does not sufficiently explain how to condense these mail programs into a single source, so I will show you how.

First, if you don’t already have one, you should create a Test Profile. That way, if anything goes wrong, you won’t screw up your existing profile’s data.

Now make a Custom Advanced Filter:

Google Analytics Mail Filter

Extract Campaign Source. Here I am extracting anything that ends in mail.yahoo.com and overwriting the source with “mail.yahoo.com”:

  1. Field A -> Extract A: Campaign Source: (.*)\.mail\.yahoo\.com$
  2. Output To -> Constructor: Campaign Source: mail.yahoo.com
  3. Field A Required: Yes
  4. Override Output Field: Yes
  5. Case Sensitive: No

This filter collects all referral sources that end in .mail.yahoo.com and condenses them into the single source mail.yahoo.com. You can build a similar filter for Live or any other e-mail service that shows up as multiple referrers, so your goal conversion tab will be more accurate and useful.
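To sanity-check the pattern before applying it to a profile, the filter's behavior can be approximated outside Google Analytics. The following is a minimal Python sketch, not Google's implementation: the function name `condense_source`, the sample hostnames, and the use of Python's `re` module in place of the Analytics regex engine are all assumptions made for illustration.

```python
import re

# Same pattern as Field A above; IGNORECASE mirrors "Case Sensitive: No".
YAHOO_MAIL = re.compile(r"(.*)\.mail\.yahoo\.com$", re.IGNORECASE)


def condense_source(source: str) -> str:
    """Return "mail.yahoo.com" for any Yahoo Mail sub-domain, else the source unchanged."""
    if YAHOO_MAIL.search(source):
        return "mail.yahoo.com"
    return source


# Hypothetical referral sources, invented for this example.
samples = [
    "us.mc123.mail.yahoo.com",
    "UK.MG4.MAIL.YAHOO.COM",
    "mail.yahoo.com",
    "www.example.com",
]
for s in samples:
    print(s, "->", condense_source(s))
```

Note that the bare source mail.yahoo.com does not match the pattern, because the pattern requires a period before "mail". That is harmless here, since the source already has the value the filter would write.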

Why a Custom Filter is Necessary

Google Analytics has a pre-made filter called Search and Replace, but because it does not accept Regular Expression commands you would need to create a separate filter for every webmail account, whereas this filter handles the problem at the provider level.

A note on regular expressions: the extract fields give special meaning to some characters (see below). Make sure that you put a backslash (\) before any period that is supposed to be read as a literal period; otherwise you may get bad results.
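To see why the backslash matters, here is a small Python sketch comparing an escaped and an unescaped version of the pattern. The hostnames are made up for the example, and Python's `re` module stands in for the Analytics regex engine, which is an assumption.

```python
import re

unescaped = re.compile(r"mail.yahoo.com$")    # each "." matches ANY character
escaped = re.compile(r"mail\.yahoo\.com$")    # each "\." matches only a period


def is_match(pattern: re.Pattern, text: str) -> bool:
    """True if the compiled pattern is found anywhere in text."""
    return pattern.search(text) is not None


# Hypothetical hostnames: one real-looking, one look-alike without periods.
for host in ["us.mail.yahoo.com", "mailxyahooxcom"]:
    print(host, "| unescaped:", is_match(unescaped, host),
          "| escaped:", is_match(escaped, host))
```

Both patterns accept the genuine hostname, but only the unescaped one also accepts the look-alike, which is the kind of bad result the note above warns about.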

From Google FAQ

Regular Expression Characters


Wildcards

. Matches any single character (letter, number or symbol). Example: goo.gle matches gooogle, goodgle, goo8gle
* Matches zero or more of the previous item. The default previous item is the previous character. Example: goo*gle matches gooogle, goooogle
+ Just like a star, except that a plus sign must match at least one previous item. Example: gooo+gle matches goooogle, but never google
? Matches zero or one of the previous item. Example: labou?r matches both labor and labour
| Lets you do an “or” match. Example: a|b matches a or b
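The wildcard examples above can be checked with a short script. This is only a sketch: it uses Python's `re` module as a stand-in for the Analytics regex engine (an assumption), and the helper name `matches` is invented for the example.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """True if the pattern is found anywhere in text (an unanchored match)."""
    return re.search(pattern, text) is not None


# (pattern, text, expected result) taken from the examples in the list above.
checks = [
    ("goo.gle", "goodgle", True),
    ("goo.gle", "goo8gle", True),
    ("goo*gle", "goooogle", True),
    ("gooo+gle", "goooogle", True),
    ("gooo+gle", "google", False),
    ("labou?r", "labor", True),
    ("labou?r", "labour", True),
    ("a|b", "b", True),
]
for pattern, text, expected in checks:
    result = matches(pattern, text)
    print(f"{pattern!r} vs {text!r}: {result} (expected {expected})")
```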

Anchors

^ Requires that your data be at the beginning of its field. Example: ^site matches site but not mysite
$ Requires that your data be at the end of its field. Example: site$ matches site but not sitescan
Note: to understand why anchors are necessary, please read Tips for Regular Expressions at the bottom of this page.

Grouping

() Use parentheses to create an item, instead of accepting the default. Example: Thank(s|you) will match both Thanks and Thankyou
[] Use brackets to create a list of items to match to. Example: [abc] creates a list with a, b and c in it
- Use dashes with brackets to extend your list. Example: [A-Z] creates a list for the uppercase English alphabet
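As a quick illustration of grouping and lists, the sketch below tests the three examples using Python's `re.fullmatch`, which requires the whole string to match. Python's `re` is used here as an assumed stand-in for the Analytics engine, and the helper name `whole_match` is hypothetical.

```python
import re


def whole_match(pattern: str, text: str) -> bool:
    """True only if the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


print(whole_match(r"Thank(s|you)", "Thanks"))    # group with an "or" inside
print(whole_match(r"Thank(s|you)", "Thankyou"))
print(whole_match(r"Thank(s|you)", "Thank"))     # neither alternative present
print(whole_match(r"[abc]", "b"))                # b is in the list
print(whole_match(r"[abc]", "d"))                # d is not in the list
print(whole_match(r"[A-Z]", "Q"))                # uppercase letter in range
print(whole_match(r"[A-Z]", "q"))                # lowercase is outside A-Z
```

The last line depends on case sensitivity: Python is case sensitive by default, whereas the Analytics filter has its own Case Sensitive setting.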

Other

\ Turns a regular expression character into an everyday character. Example: mysite\.com keeps the dot from being a wildcard

Tips for Regular Expressions

  1. Make the regular expression as simple as possible so that you and your colleagues can work with them easily in the future.
  2. Be sure to use a backslash if you have characters like “?” or “.” and you wish to match those literal characters — otherwise, they will be interpreted as special regular expression characters.
  3. Not all regular expressions include special characters. For example, you can specify that a Google Analytics goal be a regular expression, and even if you don’t have any special characters, your goal will be interpreted according to the rules of regular expressions.

Regular expressions are greedy. For example, site matches mysite and yoursite and sitescan. If site is your regular expression, it is the equivalent of asking to match to all strings that contain site. Therefore, you should use anchors whenever necessary, to get a more accurate match. ^site$, which uses both a beginning ^ and ending $ anchor, will ensure that the expression has to start with site and end with site and include nothing else. Notice, too, that there were no special characters in the regular expression site – it is interpreted as a regular expression only if it is in a regular expression-sensitive field.
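The effect of anchors described above can be demonstrated with a few lines of Python. This is an illustrative sketch only: Python's `re.search` is assumed to behave like an unanchored Analytics field match, and the helper name `found` is made up for the example.

```python
import re


def found(pattern: str, text: str) -> bool:
    """True if the pattern matches anywhere in text, as in an unanchored field."""
    return re.search(pattern, text) is not None


words = ["site", "mysite", "yoursite", "sitescan"]

# For each pattern, list which of the sample words it matches.
for pattern in ["site", "^site", "site$", "^site$"]:
    hits = [w for w in words if found(pattern, w)]
    print(f"{pattern!r} matches {hits}")
```

Without anchors, site matches all four words. With both anchors, ^site$ matches only site itself.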

25 Responses to “Filtering Yahoo Mail and Live Mail in Google Analytics”

  1. [...] Hundred Dollar SEO You Get What You Pay For Skip to content ContactArchivesSitemapSexiest Man In SEO « Filtering Yahoo Mail and Live Mail in Google Analytics [...]

  2. [...] you should be linking with an appropriate description of the content that you are linking to like: filtering mail in Google Analytics. If you need a full sentence to describe a link something is suspect — both as a reader and [...]

  3. [...] Filtering Yahoo Mail and Live Mail in Google Analytics – 100 Dollar SEO [...]

  6. Services SEO March 12, 2009

    Is there a way to put this on a trial basis?

  7. Carlos del Rio March 12, 2009

    I’m not sure what you mean by trial basis. You can do it on a separate profile.

  8. denise martens March 15, 2009

    This is very interesting information. I just bookmarked it now.

  9. metafever June 24, 2009

    I have to say I’m really impressed with your posts and blog overall. I stumbled on your site accidentally but am now happy I did. I’ll be stopping in to read more often now. Thanks again !
    Thanks,
    Lou

  10. blaze July 23, 2009

    Nice post. Thanks for sharing these tips.

  11. Midwest Fire August 18, 2009

    Thank you for sharing such a complicated topic with the distinct screen shots. Great info to know when setting up our Analytics!

  12. Mark November 1, 2009

    Well, I don’t actually want any email hosts appearing in my referrals.

    I already tag my emails so I’m wondering if GA is actually double counting when it lists email hosts.

    IE When I look in my Medium section it shows me X number of visits came from email campaigns.

    Now, when I look in referrers I see various email hosts. Have these already been counted under email campaigns?

    If so I may as well filter them out altogether.

    Any thoughts?

  13. Carlos del Rio November 1, 2009

    No, Google isn’t double counting; it is just a different way of labeling the visitors. Combining them with this filter will allow you to better see Yahoo and MSN mail as channels in conjunction with, and independent of, your tagged campaigns.

  14. Mark November 2, 2009

    Thanks Carlos,

    that’s what I wanted to know.

  15. Mark November 2, 2009

    Another Question.

    Could I also use this to fix my own URL referring issues?

    IE: “mysite.com.au” often appears in the Referring sites report.

    Apart from the fact that it’s annoying, I’d rather it was not counted as a referrer.

    It’s not a really large amount, and I’ve done the usual things like setting up sub domain inclusion, excluding staff traffic etc, but I still get a certain amount of referrals from my own domain.

    The standard explanation is users have loaded a page on their browser and left it for long enough for the session to expire. Then when they re-engage with the page it gets recorded as a referral from my own site.

    Fair enough, but I still don’t want it appearing in the referrals report.

    Can I adapt your method to overcome this?

  16. Seth January 21, 2010

    Great, thank you for this post. When does the filter get applied to the data? Does it affect future visits or will it adjust past records of referrals? Thanks.

  17. Carlos del Rio January 21, 2010

    It is applied immediately and only affects future visits.

  18. Seth January 21, 2010

    Thanks Carlos!

  19. [...] the most important areas you can apply campaign tagging is e-mail. And though you can get fancy and create a filter that combines email sources you should not have to do this because your campaigns should be tagged to begin with. So definitely [...]

  24. Denis March 7, 2010

    If a customer clicks on a pay per click ad and then sends us an email, we respond and the customer clicks on a tagged email link what does GA report?

  25. Carlos del Rio March 8, 2010

    @Denis
    In the case that you described GA would tag the initial click as referred from PPC, the second click as referred from e-mail AND as part of the user defined tag group.

    The purpose of the above filter is to consolidate all users of a particular web e-mail client into a cohesive group, instead of treating each e-mail user as a separate referrer.
