Traffic sources in Google Analytics: who is more important? Tracking Yandex Direct campaigns in Google Analytics

Hello everyone!

Google Analytics offers a wealth of tools for analyzing traffic sources and channels, and you need to learn to use them in order to find new ways to optimize and non-standard solutions for unique situations. One of these capabilities is importing cost data from third-party sources, for example from Yandex.Direct.

Why import costs from Direct into Google Analytics when you can use the contextual advertising system itself and Yandex.Metrica to analyze return on investment? Indeed, if you work only with Yandex.Direct, that is all you need. But when several traffic sources come into play (AdWords, Direct, advertising on Facebook, Instagram, VKontakte, for example), Metrica alone is no longer sufficient. You either analyze each source separately and then combine the results, or upload the data to Google Analytics (since Metrica does not allow this), calmly run a comparative analysis, and then distribute the budget between online advertising systems more effectively. I think the second option is better.

Setting up import into Google Analytics

First, we need to set up data import in Google Analytics so that we can upload the cost data and Google Analytics can accept it without problems. But, looking ahead, I want to clarify one point: by default, Analytics reports costs and revenue in US dollars, which is why costs imported from Direct turn into fabulous sums of thousands of “bucks”. To prevent this, just change the currency in the view settings to the one you work with:

Now everything related to money in Analytics will be displayed in rubles. Let's go to the import settings.

So we have two options: load the data from a CSV file, that is, prepare a separate table, or use third-party online services that import the data automatically. Today we will use the first option, and in the next lesson I will talk about the second.

I'll tell you right away: filling out a CSV file is a challenge. The risk of making mistakes that block the import is high, and correcting them takes a lot of time (as it did for me). Still, this method is worth using the first time to understand how the system handles imports. Or at least for fun.

We'll move on to the table a little later, but for now let's set up the import:


After all these steps, a notification should appear saying that the data set has been created and the data can be uploaded. A “Get schema” button will appear, which you need to click:

Then simply download the CSV template:

I recommend working with the CSV file in LibreOffice Calc or OpenOffice Calc to avoid encoding problems. In addition, Google Analytics accepts CSV with a comma delimiter, and in Calc it is easy to select the required delimiter and encoding, so Analytics will read everything without trouble:

You must fill out the file following certain rules (you can read them), otherwise the data will not upload. Personally, I struggled with the date format: it has to be YYYYMMDD, for example 20180907. There were also problems with the cost of clicks: here you need to enter whole numbers, with no decimal values.
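The formatting rules above can also be applied in code instead of by hand. The following is only a sketch in Python: the column names in `COLUMNS` are an assumption and should be replaced with the header from the template you downloaded, the `yandex` / `cpc` labels are hypothetical, and the sample row is made up.

```python
import csv
import io
from datetime import date

# Assumed header: copy the real one from the CSV template you downloaded.
COLUMNS = ["ga:date", "ga:source", "ga:medium", "ga:campaign",
           "ga:adCost", "ga:adClicks", "ga:impressions"]


def format_row(day, campaign, cost, clicks, impressions):
    """Return one CSV row that follows the rules described above."""
    return [
        day.strftime("%Y%m%d"),  # date as YYYYMMDD, e.g. 20180907
        "yandex",                # hypothetical source label
        "cpc",                   # hypothetical medium label
        campaign,
        str(round(cost)),        # whole numbers only, as the author found
        str(int(clicks)),
        str(int(impressions)),
    ]


def build_csv(rows):
    """Serialize rows as comma-delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=",", lineterminator="\n")
    writer.writerow(COLUMNS)
    for row in rows:
        writer.writerow(format_row(*row))
    return buffer.getvalue()


# Made-up sample: one day of spend for one campaign.
sample = [(date(2018, 9, 7), "brand_search", 1520.4, 37, 910)]
csv_text = build_csv(sample)
```

Writing `csv_text` to a file opened with `encoding="utf-8"` gives a comma-delimited file in the shape described above; whether it passes validation still depends on the schema of your own data set.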

Here is an example of filling out the file:

It is not very convenient to work with large volumes of data here, since it takes quite a lot of time to process and fill it out correctly.

So, once you have filled out the file, you need to upload it to Google Analytics. This is done under “Manage uploads” in the data set you created:

Next, simply select the desired file and upload it. The data will be checked for errors. If any are found, the system will show where exactly the error is and describe it. If everything is fine, then you will see the following:

After successfully loading the data, you need to wait a little for the analytics system to accept your expense data and be able to display it in reports. Import of Yandex.Direct expense data is completed. Now you can carry out the analysis.

When it comes to direct traffic in Google Analytics, there are two deeply ingrained misconceptions.

The first is that direct traffic is almost always caused by users entering a site's URL into the browser's address bar (or clicking on a bookmark). The second misconception is that direct traffic is a bad thing; not because it has any negative impact on the operation of the site, but because it is not subject to further analysis.

Most digital marketers believe that direct traffic is an inevitable inconvenience. As a result, discussions on this topic focus on ways to assign it to other channels and troubleshoot problems associated with it.

In this article we will take a modern look at direct traffic in Google Analytics. Not only will we look at how data on referral sources can be lost, but we will also look at several tools and tactics that can be used to reduce the level of direct traffic in your reports. Finally, we'll learn how advanced analysis and segmentation can unlock the mysteries of direct traffic and shed light on who your most valuable users might actually be.

What is direct traffic?

In short, Google Analytics records direct traffic when it has no data about how the user came to the site, or when the referral source has been configured to be ignored. In general, direct traffic can be considered Google Analytics' fallback option for cases where the system was unable to attribute a session to a specific source.

To understand why direct traffic occurs, it is important to understand how GA handles traffic sources.

In general terms, and without regard to user-configurable overrides, GA runs through the following chain of checks:

AdWords parameters > Campaign overrides > UTM campaign parameters > Referred by a search engine > Referred by another website > Previous campaign within timeout period > Direct traffic

Note the penultimate step (previous campaign within the timeout period), which significantly affects the Direct channel. For example, a user learns about your site through organic search, and a week later returns through a direct link. Both sessions will be attributed to organic search. In fact, campaign data is retained for up to six months by default. The key point here is that Google Analytics already tries to minimize the impact of direct traffic for you.
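The chain of checks can be sketched as a small function. This is a simplified model, not Google's implementation: the function name, the three-entry search engine list, and the `previous_campaign` argument are all assumptions made for illustration, and the campaign overrides step and referral exclusions are left out.

```python
from urllib.parse import urlparse, parse_qs

# Illustrative subset; the real list of recognized search engines is longer.
SEARCH_ENGINES = {"google", "yandex", "bing"}


def resolve_source(landing_url, referrer=None, previous_campaign=None):
    """Return (source, medium) for one session, following the chain above."""
    params = parse_qs(urlparse(landing_url).query)

    # AdWords auto-tagging adds a gclid parameter to the landing URL.
    if "gclid" in params:
        return ("google", "cpc")

    # Manually added UTM campaign parameters.
    if "utm_source" in params:
        medium = params.get("utm_medium", ["(not set)"])[0]
        return (params["utm_source"][0], medium)

    # Referrer: a search engine first, then any other website.
    if referrer:
        host = urlparse(referrer).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        engine = host.split(".")[0]
        if engine in SEARCH_ENGINES:
            return (engine, "organic")
        return (host, "referral")

    # Previous campaign that is still within the timeout period.
    if previous_campaign:
        return previous_campaign

    # Nothing known about the session: the fallback.
    return ("(direct)", "(none)")
```

For the example in the text, the returning visitor has no referrer but still has a stored campaign, so `resolve_source("https://example.com/", previous_campaign=("google", "organic"))` yields `("google", "organic")` rather than direct.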

What causes direct traffic?

Contrary to popular belief, there are actually many reasons why a session might be missing campaign and traffic source data. Below we will look at the most common of them.

  1. Manually entering addresses and bookmarks

This is a classic scenario for getting direct traffic. If the user enters the site URL into the browser's address bar or clicks on a bookmark in the browser, then this session will be counted as direct traffic.

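  2. HTTPS > HTTP

When a user follows a link from a secure (HTTPS) page to a non-secure (HTTP) page, no referrer data is passed, so the session is recorded as direct traffic instead of referral. Note that this is the default behavior. It is part of how the secure protocol was designed and does not affect other scenarios: HTTP to HTTP, HTTPS to HTTPS, and even HTTP to HTTPS transitions all pass referrer data.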

Therefore, if your referral traffic has decreased, but your direct traffic has increased, perhaps one of your main referral sources has moved to HTTPS. The reverse is also true: if you switch to HTTPS and link to HTTP sites, the traffic you send to them will be recorded by Google Analytics as direct.

If your referrers have moved to HTTPS and you have remained on HTTP, you should also consider migrating your site to HTTPS. By doing this (and updating your backlinks to point to HTTPS URLs), you'll get back the referral data that was previously lost.

If, on the other hand, you've already switched to HTTPS and are concerned about your users being registered as direct traffic on affiliate sites, you can set up a referrer meta tag. This is a way to tell the browser to pass referrer data to sites served over HTTP. It can be implemented as a meta element or as an HTTP header.
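The protocol rule from this section fits in a few lines. The function below is a hypothetical helper that models only the default behavior as described here; what a real browser sends also depends on its version and on any referrer policy set by the linking site.

```python
from urllib.parse import urlparse


def referrer_is_sent(from_url, to_url):
    """Model the default rule above: only HTTPS -> HTTP drops the referrer."""
    source_scheme = urlparse(from_url).scheme
    target_scheme = urlparse(to_url).scheme
    return not (source_scheme == "https" and target_scheme == "http")


# The four combinations mentioned in the text (example.org / example.com are placeholders).
combinations = {
    "http -> http": referrer_is_sent("http://example.org/a", "http://example.com/"),
    "https -> https": referrer_is_sent("https://example.org/a", "https://example.com/"),
    "http -> https": referrer_is_sent("http://example.org/a", "https://example.com/"),
    "https -> http": referrer_is_sent("https://example.org/a", "http://example.com/"),
}
```

Only the last combination evaluates to `False`, which is the case that ends up in the Direct channel.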

  3. Missing or broken tracking code

Let's say you changed the landing page template and forgot to add the GA tracking code. Or imagine that your Google Tag Manager container is a tangle of poorly configured triggers and the tracking code simply does not fire.

So, users land on a page with no tracking code. They click a link and move to a page where the code is present. From the point of view of Google Analytics, the first hit will be the visit to the second page, and your own website will act as the referral source (a self-referral). If your domain is on the list of excluded referral sources (as it is by default), the session will be registered as direct. This will happen even if the first URL contains UTM parameters.

As a short term solution, you can simply add the missing tracking code. To prevent this from happening again, conduct a thorough audit of Google Analytics, move to implementing tracking through Google Tag Manager, and promote a culture of data-driven marketing.

  4. Incorrect redirect

Everything is simple here. Don't use meta refresh or JavaScript-based redirects: they may erase or replace referral data, resulting in direct traffic in Google Analytics. Also keep a close eye on server-side redirects and check your redirect file frequently. Complex chains of redirects increase the likelihood of losing referral data, as well as UTM parameters.

Again, control what you can: use carefully crafted 301 redirects to retain referral data where possible.

  5. Non-web documents

Links in Microsoft Word documents, presentations, or PDF files do not pass referral information. By default, users who click on these links are recorded as direct traffic. Clicks from mobile applications (especially those with a built-in browser) are also stripped of referral data.

To a certain extent this is inevitable. Similar to so-called “dark social” visits (discussed in detail below), non-web links are bound to generate some amount of direct traffic. However, you can always control the controllable.

If you publish scientific articles or offer downloads of PDF documents, you should add UTM parameters to your embedded hyperlinks. Probably no email campaign launches without tracking, so why would you distribute other types of materials without tracking this process? In some cases, this is even more important, given that these materials have a durability that email campaigns lack.

Below is an example of a URL with UTM parameters that will be added to the document as a hyperlink:

https://builtvisible.com/embedded-whitepaper-url/?…_medium=offline_document&utm_campaign=201711_utm_whitepaper
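Tagged links like this one are easier to build programmatically than by hand. Below is a minimal sketch using Python's standard library; the function name, the example.com address, and the source and campaign values are hypothetical, and only the `offline_document` medium is taken from the example above.

```python
from urllib.parse import urlparse, urlunparse, parse_qsl, urlencode


def add_utm(url, source, medium, campaign):
    """Append UTM parameters to a URL, keeping any existing query string."""
    parts = urlparse(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    # Drop utm_* values that are already present so they are not duplicated.
    query = [(key, value) for key, value in query if not key.startswith("utm_")]
    query += [
        ("utm_source", source),
        ("utm_medium", medium),
        ("utm_campaign", campaign),
    ]
    return urlunparse(parts._replace(query=urlencode(query)))


# Hypothetical link to embed in a PDF whitepaper.
tagged = add_utm(
    "https://example.com/whitepaper/?lang=en",
    source="whitepaper_pdf",
    medium="offline_document",
    campaign="201711_whitepaper",
)
```

Going through `urlencode` rather than string concatenation means values containing spaces or ampersands are escaped correctly.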

The same goes for URLs in offline content. For major campaigns, it's common to choose a short, memorable URL (such as moz.com/tv/) and create an entirely new landing page. You can bypass page creation altogether: just redirect this URL to the URL of an existing page, correctly tagged with UTM parameters.

So, whether you tag URLs directly, use redirected URLs, or (if you don't like UTM parameters) track hashes (URL fragments) using Google Tag Manager, the takeaway is the same: use campaign parameters wherever appropriate.

  6. “Dark” social traffic

This is a large source of referrals and probably the least understood by marketers.

The term “dark social” was first used in 2012 by Alexis Madrigal in an article for The Atlantic. Essentially, it refers to social sharing methods that cannot be easily attributed to a specific source. These include email, instant messages, Skype, WhatsApp, Facebook Messenger, etc.

Recent studies have shown that more than 80% of the sharing that starts on publisher and company sites now happens through these private channels. In terms of the number of active users, instant messengers surpass social networks. All activity generated by these platforms is usually recorded by analytics systems as direct traffic.

People who use the controversial phrase "social media marketing" usually mean advertising: you broadcast your message and hope people hear it. Even if you overcome consumer apathy with a well-targeted campaign, any subsequent interactions are affected by their public nature. The privacy of so-called “dark social” channels instead represents a potential goldmine for more personal, targeted and relevant interactions with high conversion potential. The nebulous and difficult-to-track world of dark social holds enormous potential for effective marketing.

So how can we minimize the amount of such traffic that is recorded as direct clicks? The sad truth is that there are no silver bullets: proper attribution of this traffic requires careful campaign tracking.

The optimal approach will vary greatly depending on your industry, audience, offer, etc. However, for many websites, a good first step is to provide user-friendly and properly configured sharing buttons for private platforms such as email, WhatsApp, and Slack. This will allow users to share URLs with UTM parameters appended to them (or shortened URLs redirected to those addresses). This way you can illuminate some of your “dark” social traffic.
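As a rough illustration of such sharing buttons, the sketch below builds share links whose target URL already carries UTM parameters. Everything here is an assumption for the sake of the example: the function name, the `share_button` medium, the example.com page, and the use of the `https://wa.me/?text=` and `mailto:` link formats. It also assumes the page URL has no `#fragment`.

```python
from urllib.parse import quote, urlencode


def share_links(page_url, campaign):
    """Build share links whose target URL carries channel-specific UTM tags."""

    def tagged(source):
        separator = "&" if "?" in page_url else "?"
        params = urlencode({
            "utm_source": source,
            "utm_medium": "share_button",  # hypothetical medium label
            "utm_campaign": campaign,
        })
        return page_url + separator + params

    return {
        # The shared URL is percent-encoded so it survives inside another URL.
        "whatsapp": "https://wa.me/?text=" + quote(tagged("whatsapp"), safe=""),
        "email": "mailto:?body=" + quote(tagged("email"), safe=""),
    }


links = share_links("https://example.com/article/", "dark_social")
```

Anyone who opens a link shared through one of these buttons arrives with `utm_source` set to the channel, so the session is attributed to that channel instead of landing in Direct.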

Checklist: minimizing direct traffic

To minimize direct traffic in your reporting, follow these steps:

  • Migrate to HTTPS. A secure protocol is not only about access to HTTP/2 and the future of the Internet. It also has a huge positive impact on your ability to track referral traffic.
  • Optimize redirects. Avoid redirect chains and ditch client-side redirects in favor of carefully crafted server-side 301 redirects. If you are using shortened URLs to redirect to pages with UTM parameters, check that you have configured everything correctly.
  • Use campaign tags. Even among data-driven marketers, the common belief is that UTM tagging begins and ends with switching on automatic tagging in email marketing software. Others go to the other extreme and even tag internal links. Control what you can control and you will be able to track the results of your work more effectively.
  • Conduct a Google Analytics audit. Data integrity is vital, so consider this when assessing the effectiveness of your work. A GA audit is about more than just checking for missing tracking codes. A good audit includes a review of the measurement plan and thorough testing at the page and resource level.

Follow these principles and you can see significant reductions in your direct traffic in Google Analytics. The following example involves migrating to HTTPS, GTM and a complete overhaul of internal campaign tracking processes over a six-month period:

However, the saga of direct traffic does not end there! Once this channel has been cleaned up, what remains can become one of your most valuable traffic segments.

Analyze why direct traffic can be really valuable

For the reasons we've already discussed, traffic from bookmarks and dark social is an extremely valuable segment to analyze. These will likely be some of your most loyal and engaged users, and it's not uncommon to see a noticeably higher conversion rate for a clean direct channel compared to the site average. You should make an effort to get to know these people better.

The number of potential avenues to explore is endless, but here are some good starting points:

  • Create meaningful user segments by defining subgroups within direct traffic based on landing page, location, device, repeat visits and purchasing patterns.
  • Track meaningful engagement metrics using modern GTM triggers such as scrolling and element visibility tracking. Measure how your direct users use and view your content.
  • Watch for correlations with your other marketing activities and use them as opportunities to improve your tagging and segmentation techniques. Set up custom alerts to monitor spikes in direct traffic.
  • Check out the Goal Flow and Behavior Flow reports to understand how your direct traffic is converting.
  • Ask your users for help! If you've isolated a valuable segment of traffic that's eluding deeper analysis, add a button to your page that offers visitors a free eBook or other useful material if they tell you how they discovered your page.
  • Start thinking (if you haven’t already) about a metric such as LTV (lifetime value). Revisiting the attribution model and introducing user IDs are good steps to overcome indifference and frustration regarding direct traffic.

After creating a site, we immediately become interested in who is visiting it and how users find it in the first place. You can, of course, ask users how they came to you :), but the easiest way is to install analytics systems. I use both Metrica and Google Analytics, but I prefer the latter. And in this article I will tell you how different traffic sources behave and interact with each other.

When you log into your Google Analytics account, you see something like this:

Organic Search – traffic from search engines (Google, Yandex, etc.).
Direct – direct visits to the site.
Referral – traffic you receive from links on other sites.
Social – traffic from social networks.
Other – traffic whose source Analytics could not determine.
Email – visits from your mailings.
Campaigns – a separate type of traffic that arrives via links with UTM tagging.

It would seem that everything is quite simple: a user found you on Google and came to your website - this is an organic visit; you were mentioned on the forum, and users who want to learn more came to you - this is referral traffic; someone shared your link on Twitter and people came to you through it - that's social traffic.

But think about it: you often find something interesting through search, save it to your bookmarks, and come back to it later. The first visit is organic, the second is direct. How will this be shown in Analytics?

Hierarchy of traffic sources, or who is more important.

First, try to answer the question: which traffic source gives you the least useful information? In my opinion, it is direct traffic, and that is why it is the “weakest link” in the hierarchy of sources: it is the one that gets “eaten up” by all other channels.

Examples of recording traffic sources in Google Analytics

Organic -> Direct = Organic

Misha reads a lot and follows all the new trends in real estate. He found an interesting article on Google (organic), but there was no time to read it, so Misha simply left it open in the browser until better times. They arrived a week later, when Misha got to the article, which was still open in his browser (direct). However, this young man will still be counted as a user who came from a search engine.

Direct -> Referral -> Organic = Organic

Sasha came to the site via a link sent to him by a friend (direct); a week later, while reading some blog that links to your site, Sasha clicks on the link and again ends up with you (referral). After another 2 weeks, Sasha had a very important question, and in order to solve it, he Googled it, as everyone usually does in the modern world. From the search, Sasha clicked on your site again. In Analytics, this visit will appear as an organic one that “overshadowed” the referral visit.

After Sasha was convinced for the third time that your site was useful, he saved it to his bookmarks, and then returned to you from them several more times. Attention, question: what source of the visit will be recorded in these cases?

Referral -> Direct -> Direct = Referral

Masha came to the culinary site from a forum (referral), where she was looking for how to make mulled wine. She liked the recipe and saved the link to it in her recipe file. After that, there were several more holidays when Masha needed to prepare mulled wine, and she came to the culinary site, copying the link from her file. But in Google Analytics, Masha was still listed as a user who came from another site (referral). Then Masha remembered the recipe and stopped visiting that culinary site, but she often looks for other recipes. But that's a completely different story.

It is important to know

All previous conclusions on the “seniority” of traffic sources were made under the following conditions:

The user visits the site from the same computer and browser.

The user always has JavaScript enabled in the browser.

The user does not clear cookies.

No more than 6 months pass between the first and last visit in the chain*.

* 6 months is how long information about any campaign is stored in Google Analytics by default. Universal Analytics allows you to change this period, up to 24 months.
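The three stories above follow one rule: a direct visit inherits the most recent non-direct source as long as the campaign data has not expired. Here is a minimal sketch of that rule in Python. The function name and the 183-day approximation of six months are assumptions, the dates are made up, and the timeout is counted from the last non-direct visit, which is a simplification.

```python
from datetime import date, timedelta

CAMPAIGN_TIMEOUT = timedelta(days=183)  # roughly the 6-month default


def attribute_sessions(visits, timeout=CAMPAIGN_TIMEOUT):
    """Return the source reported for each visit in a list of (day, source)."""
    reported = []
    last_source = None  # most recent non-direct source
    last_seen = None    # day that source was recorded
    for day, source in visits:
        if source != "direct":
            # Any non-direct source overwrites the stored campaign.
            last_source, last_seen = source, day
            reported.append(source)
        elif last_source is not None and day - last_seen <= timeout:
            # Direct visit within the timeout inherits the stored source.
            reported.append(last_source)
        else:
            reported.append("direct")
    return reported


# Made-up dates for the three stories above.
misha = attribute_sessions([(date(2018, 9, 1), "organic"),
                            (date(2018, 9, 8), "direct")])
sasha = attribute_sessions([(date(2018, 9, 1), "direct"),
                            (date(2018, 9, 8), "referral"),
                            (date(2018, 9, 22), "organic"),
                            (date(2018, 9, 25), "direct")])
masha = attribute_sessions([(date(2018, 9, 1), "referral"),
                            (date(2018, 10, 1), "direct"),
                            (date(2018, 12, 31), "direct")])
```

Sasha's later bookmark visits are reported as organic, which answers the question posed in his story. A direct visit that comes after the timeout has passed would be reported as direct.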

This is how the users who come to you every day are tracked. The system once again proves a simple principle of life: not everything is always as it seems, and a user who arrives via a direct link is not always recorded as direct. I hope I was able to help you understand how traffic sources really interact.

When I do a seminar or a training (I am a Google Regional Trainer in AdWords & Analytics), direct traffic is often one of the tougher things for the audience to understand. My audience is usually comprised of digital marketers, SEOs, AdWords specialists, and SMB owners, so an IT/web development background is relatively rare. That’s why their intuitive definition of direct traffic is that it’s type-in traffic: just users returning to your site by typing your URL or using bookmarks. And why wouldn’t they, as even the definition in Google’s own help center states:

“users that typed your URL directly into their browser, or who had bookmarked your site”

However, this definition is highly misleading and inaccurate, as I can demonstrate with a simple screenshot:

These direct sessions are not that different from our organic traffic, right? Tons of new users in the direct segment: these can’t all be prior users and bookmarks!

In order to understand what “direct / none” in your Source / Medium report really is you need to know at least a bit about the technical side of how Google Analytics is able to say where visitors to your site come from.

How does Google Analytics recognize referrer sources?

The web runs on a number of protocols and one of the most widely used ones is HTTP. In it there is a specification for how a browser might pass information about the referring source to a web server:

The Referer request-header field allows the client to specify, for the server's benefit, the address (URI) of the resource from which the Request-URI was obtained (the “referrer”, although the header field is misspelled.) The Referer request-header allows a server to generate lists of back-links to resources for interest, logging, optimized caching, etc. It also allows obsolete or mistyped links to be traced for maintenance. The Referer field MUST NOT be sent if the Request-URI was obtained from a source that does not have its own URI, such as input from the user keyboard.

Basically, when a user’s browser requests a page on your site it can supply this “Referer” field, which is then accessible to Google Analytics. GA reads and parses the value of the field, processes it and then displays it in your Source / Medium report. Note that this field is not mandatory and also that it “MUST NOT” be set if the source doesn’t have a Uniform Resource Identifier (URI). This will be important in just a second.

What is Direct traffic in Google Analytics then?

The second thing you need to know in order to understand what direct sessions really represent is how Google Analytics attributes traffic to traffic sources and mediums. If you have a deeper interest in the subject I highly recommend getting yourself acquainted with the processing flow chart at the bottom of this page from the Google Analytics help center.

Here is the short version of it: GA will check for: AdWords auto-tagging, UTM campaign tagging parameters, and the HTTP referrer field we just discussed, in that order. If none of these are set AND if there is no prior campaign data associated with the user’s browser (ID is clientId in the _ga cookie), then Google Analytics will mark such traffic as… wait for it... direct/none.

Direct traffic is traffic for which the referrer is unknown and for which no prior campaign data could be found for the cookie (user).

So direct traffic is not direct at all, it’s just unknown, undefined. Google has no idea if your user typed in your URL, used a bookmark, or if something else happened. Let that really sink in, and to help that process, let us see in what other cases the user’s browser will not set the “Referer” field.

The different types of “Direct”, a.k.a. “Unknown” traffic

Here is an incomplete list of the cases when a user will navigate to your site and Google Analytics will not know where the user came from so the sessions will be marked as “direct / none” (unless previous campaign data exists for that cookie):

  1. User types in a URL
  2. User clicks on a bookmark
  3. User clicks on a link in an e-mail from Outlook or Thunderbird or similar desktop software
  4. User clicks on a link in Skype or other desktop messengers
  5. User clicks on a link in a PDF, DocX, ODF, XLSX or a different type of document.
  6. User clicks on a link in a mobile app
  7. User clicks on a link from a secured site (https://something) to your non-secured site (just http://something)
  8. User clicks through a URL-shortener or in a different scenario where certain JS is being used (rare)
  9. User clicks on a link in any desktop software in general…

As you can easily see, there is a plethora of very common cases in which a user will NOT type in your URL and will NOT be using a bookmark and will still be tracked as direct in GA. That’s why I think “unknown” is a much better term. Most of these are related to other applications forcing the browser to open a link, but in the case of https to http we actually have users coming from another site who are still not tracked as proper referrals. That’s because in the case of SSL to non-SSL the browser is required not to pass the referral information.

Now that we know what direct / none really is, let’s see what we can do to understand it better and gain some insights.

Knowing the Unknown: Demystifying Direct Traffic

There are two things you can do in order to see inside the black box that is direct traffic.

1. Breakdown by Landing Page

The simplest thing you can do immediately and retroactively is to take a look at the landing page breakdown for the direct / none source/medium. Are people landing on pages that are only accessible to logged-in users? Are they landing on pages that you’ve used in e-mail campaigns or system/operational emails? Are they landing on pages to which you have links in your desktop or mobile software (if you’re a desktop or mobile software company)?

Let’s see how that looks for our own site, www.analytics-toolkit.com:

As is generally the case, most direct traffic arrives at the homepage. However, we have large amounts of it landing on various tool pages, our prices page, login page and our account activation page. You can see that the % new sessions metric is very helpful in this case: it varies greatly between landing pages, suggesting very different types of traffic are landing on them.

If you are a more advanced user you can use the landing pages to define Custom Segments, Custom Channels, and Custom MCF Channels in order to get a better idea of the different types of traffic from unknown sources.
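If you export session-level data, the same breakdown can be reproduced outside the interface. The sketch below assumes a hypothetical export in which each session is a dictionary with `source_medium`, `landing_page` and `is_new` keys; these names and the sample rows are made up for illustration.

```python
from collections import defaultdict


def direct_landing_pages(sessions):
    """Summarize direct / none sessions by landing page.

    Returns {landing_page: (session_count, percent_new_sessions)}.
    """
    counts = defaultdict(lambda: [0, 0])  # page -> [sessions, new sessions]
    for session in sessions:
        if session["source_medium"] != "(direct) / (none)":
            continue
        page = session["landing_page"]
        counts[page][0] += 1
        counts[page][1] += int(session["is_new"])
    return {
        page: (total, round(100 * new / total, 1))
        for page, (total, new) in counts.items()
    }


# Made-up sample rows.
sample_sessions = [
    {"source_medium": "(direct) / (none)", "landing_page": "/", "is_new": True},
    {"source_medium": "(direct) / (none)", "landing_page": "/", "is_new": False},
    {"source_medium": "(direct) / (none)", "landing_page": "/login", "is_new": False},
    {"source_medium": "google / organic", "landing_page": "/", "is_new": True},
]
summary = direct_landing_pages(sample_sessions)
```

A page such as `/login` with close to 0% new sessions points to returning users, while a page with a high share of new sessions suggests untagged links from email, documents or apps.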

2. UTM Campaign Tagging

This is something you can do for direct traffic that you control: emails of all kinds, links embedded in desktop software or mobile apps, links in materials distributed as PDFs or other types of documents, affiliate links, etc. By tagging your links you’ll supply Google Analytics with the referral data you want. You can use Google's URL builder tool or our own Advanced URL Builder to assist in this task. However, this will only apply for traffic going forward, so if you want to analyze prior direct sessions see solution #1 above.

As you can see, understanding direct traffic is not the easiest thing, but it’s not that hard once you have a bit of technical background. If you like this article, let me know, as we are planning on releasing a series of these more educational articles in the very near future.

Updated Aug 3, 2017: clarity and style improvements.

Georgi is an expert internet marketer working passionately in the areas of SEO, SEM and Web Analytics since 2004. He is the founder of Analytics-Toolkit.com and owner of an online marketing agency & consulting company: Web Focus LLC and also a Google Certified Trainer in AdWords & Analytics. His special interest lies in data-driven approaches to testing and optimization in internet advertising.