HTML sitemap rules. Using multiple sitemaps. Does a sitemap affect promotion?

The sitemap.xml file is a tool that allows webmasters to inform search engines about the site pages that are available for indexing. In the XML sitemap you can also specify additional page parameters: the date of the last update, the update frequency and the priority relative to other pages. Information in sitemap.xml can influence the behavior of the search crawler and, in general, the process of indexing new documents. The sitemap contains directives for including pages in the crawl queue and complements robots.txt, which contains directives for excluding pages.

In this guide you will find answers to all questions regarding the use of sitemap.xml.

Do I need sitemap.xml

Search engines use the sitemap to find new documents on the site (HTML documents or media content) that are not accessible through navigation but need to be crawled. Having a link to a document in sitemap.xml does not guarantee that it will be crawled or indexed, but most often the file helps large sites get indexed better. In addition, data from XML sitemaps is used in determining canonical pages, unless a canonical page is explicitly specified in the rel=canonical tag.

Sitemap.xml is important for sites where:

  • Some sections are not accessible through the navigation menu.
  • There are many isolated pages or poorly connected pages.
  • Technologies that are poorly supported by search engines are used (for example, Ajax, Flash or Silverlight).
  • There are a lot of pages and there is a chance that the search crawler will miss new content.

If this is not your case, then most likely you do not need sitemap.xml. For sites where every page important for indexing is reachable within 2 clicks, where JavaScript or Flash is not used to display content, where canonical and regional tags are used where necessary, and where fresh content appears no more often than the robot visits the site, a sitemap.xml file is not necessary.

For small projects where the only problem is deep nesting of documents, it can easily be solved with an HTML sitemap, without resorting to an XML sitemap. But if you decide that you still need sitemap.xml, read this guide in its entirety.

Technical information

  • Sitemap.xml is a text file in XML format. However, search engines also support a plain text format (see the next section).
  • Each sitemap can contain a maximum of 50,000 addresses and weigh no more than 50 MB (10 MB for Yandex).
  • You can use gzip compression to reduce the size of the sitemap.xml file and increase its transfer speed. In this case, use the gz extension (sitemap.xml.gz). At the same time, weight restrictions remain for uncompressed sitemaps.
  • The location of the Sitemap determines the set of URLs that can be included in it. A sitemap containing the addresses of the pages of the entire site should be located in the root. If the sitemap is located in a folder, then all URLs in this sitemap should be located in this folder or deeper.
  • Addresses in sitemap.xml must be absolute.
  • The maximum URL length is 2048 characters (1024 characters for Yandex).
  • Special characters in the URL (such as the ampersand "&" or quotes) must be escaped as entities (for example, "&" becomes "&amp;").
  • The pages specified in the map must display a 200 http status code.
  • The addresses listed in the map should not be closed in the robots.txt file or in meta-robots.
  • The sitemap should not be closed in robots.txt, otherwise the search engine will not crawl it. The file itself may be in the index, this is normal.
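Some of the restrictions above are easy to check automatically before a URL goes into the file. The following is a minimal sketch, not a complete validator: the function names check_sitemap_url and escape_for_xml are made up for illustration, and only the "absolute address", "maximum length" and "escape special characters" rules are covered.

```python
from urllib.parse import urlsplit
from xml.sax.saxutils import escape

MAX_URL_LENGTH = 2048  # limit quoted in the list above (1024 for Yandex)


def check_sitemap_url(url, max_length=MAX_URL_LENGTH):
    """Return a list of problems found for one candidate URL (empty if none)."""
    problems = []
    parts = urlsplit(url)
    # An absolute address needs both a scheme and a host name.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        problems.append("not an absolute http(s) URL")
    if len(url) > max_length:
        problems.append(f"longer than {max_length} characters")
    return problems


def escape_for_xml(url):
    """Escape &, <, > and both quote characters for use inside <loc>."""
    return escape(url, {'"': "&quot;", "'": "&apos;"})
```

For Yandex, the same check can be run with max_length=1024. The HTTP status and robots.txt rules from the list still have to be checked separately.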

Sitemap formats

Search engines support a simple text sitemap format, which simply lists the URLs of pages without additional parameters. In this case, the file must be UTF-8 encoded and have the extension .txt.

Search engines also support the standard XML protocol. Google additionally supports sitemaps for images, videos, and news.

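An example sitemap containing only one address:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://сайт/</loc>
    <lastmod>2018-06-14</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.9</priority>
  </url>
</urlset>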

XML tags
urlset (required) - encapsulates the file and specifies the current protocol standard.
url (required) - the parent tag for each URL.
loc (required) - document URL, must be absolute.
lastmod - date of the last change to the document, in W3C Datetime format.
changefreq - frequency of page changes (always, hourly, daily, weekly, monthly, yearly, never). The value of this tag is a hint to search engines, not a command.
priority - URL priority relative to other addresses (from 0 to 1), used for crawl order. If not specified, the default is 0.5.

XML sitemap for images

Some optimizers insert links to images into sitemap.xml in the same way as links to HTML documents. This can be done, but for Google it is better to use the extension of the standard protocol and send additional information about the images along with the URLs. Creating XML image sitemaps is useful if images need to be crawled and indexed but are not directly accessible to the bot (for example, when they are loaded with JavaScript).

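Example of a sitemap containing one page and its associated images:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>http://example.com/primer.html</loc>
    <image:image>
      <image:loc>http://example.com/kartinka.jpg</image:loc>
    </image:image>
    <image:image>
      <image:loc>http://example.com/photo.jpg</image:loc>
      <image:caption>View of Balaklava</image:caption>
      <image:geo_location>Sevastopol, Crimea</image:geo_location>
      <image:license>http://creativecommons.org/licenses/by-nd/3.0/legalcode</image:license>
    </image:image>
  </url>
</urlset>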

XML tags
image:image (required) - information about one image. A maximum of 1000 images per page can be listed.
image:loc (required) - path to the image file. If a CDN is used, it is acceptable to link to another domain, provided that domain is verified in the webmaster panel.
image:caption - caption for the image (may contain long text).
image:title - image title (usually short text).
image:geo_location - the place where the photo was taken.
image:license - image license URL. Used for advanced image search.

XML sitemap for video

Similar to the image map, Google also has a video sitemap extension where you can specify detailed information about video content, which affects display in video search. A video sitemap is necessary when the site uses videos that are hosted locally, and when indexing these videos is difficult due to the technologies used. If you are embedding a video from YouTube on your website, then a video-sitemap is not needed here.

News Sitemap

If the site has news content and takes part in Google News, it is useful to use a news Sitemap, so that Google quickly finds your latest materials and indexes all news articles. In this case, the Sitemap should contain only the addresses of pages published in the last 2 days and no more than 1000 URLs.
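The two news Sitemap limits (2 days, 1000 URLs) can be applied when selecting articles for the file. This is only a sketch: select_news_urls is a hypothetical helper, and it assumes you already have each article's URL and a timezone-aware publication time.

```python
from datetime import datetime, timedelta, timezone

MAX_NEWS_URLS = 1000           # URL limit quoted above
MAX_AGE = timedelta(days=2)    # only pages published in the last 2 days


def select_news_urls(articles, now=None):
    """Pick URLs for a news Sitemap.

    articles: iterable of (url, published) pairs, where published is a
    timezone-aware datetime. Returns the newest URLs first, at most 1000.
    """
    now = now or datetime.now(timezone.utc)
    fresh = [
        (published, url)
        for url, published in articles
        if timedelta(0) <= now - published <= MAX_AGE
    ]
    fresh.sort(reverse=True)  # newest first
    return [url for _, url in fresh[:MAX_NEWS_URLS]]
```

Passing now explicitly makes the selection reproducible, which is convenient when the function is run from a scheduled job.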

Using multiple sitemaps

If necessary, you can use several sitemaps, combining them into one index sitemap. Multiple sitemap.xml are used in cases where:

  • The site uses several engines (CMS).
  • The site has more than 50,000 pages.
  • It is necessary to set up convenient error tracking in sections.

In the latter case, every large section of the site has its own sitemap.xml, and all of them are added to the webmaster panel, where it is convenient to see which section has the most errors (see the section on finding errors in the sitemap).

If you have 2 or more sitemaps, they need to be combined into an index sitemap, which looks the same as a regular sitemap (except that it uses the sitemapindex and sitemap tags instead of urlset and url), has similar restrictions and can only link to regular XML sitemaps (not to other index sitemaps).

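Example Sitemap Index:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.example.com/sitemap-blog.xml.gz</loc>
    <lastmod>2004-10-01T18:23:17+00:00</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/sitemap-webinars.xml.gz</loc>
    <lastmod>2005-01-01</lastmod>
  </sitemap>
</sitemapindex>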

sitemapindex (required) - specifies the current protocol standard.
sitemap (required) - contains information about a separate sitemap.
loc (required) - sitemap location (in xml format, or txt or rss for Google).
lastmod - time of the last sitemap change. Allows search engines to quickly discover new URLs on large sites.
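For a site with more than 50,000 pages, the split into several files and the index that ties them together can be generated by a script. The sketch below uses only the Python standard library. The names chunk_urls and build_sitemap_index are illustrative, and setting xmlns as a plain attribute is a simplification that works for this one-namespace document.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file limit from the technical section


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a list of page URLs into groups that each fit one sitemap file."""
    urls = list(urls)
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls, lastmod=None):
    """Return the index sitemap as UTF-8 bytes.

    sitemap_urls: absolute URLs of the regular sitemap files.
    lastmod: optional W3C Datetime string applied to every entry.
    """
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = url
        if lastmod:
            ET.SubElement(entry, "lastmod").text = lastmod
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)
```

Each group returned by chunk_urls would be written to its own file (for example sitemap-1.xml, sitemap-2.xml, names chosen here only for illustration), and the URLs of those files are then passed to build_sitemap_index.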

How to create sitemap.xml

Ways to create an XML Sitemap:

  • Internal CMS tools. Many CMSs already support sitemap creation. To find out, read the documentation for your CMS, look at the menu items in the admin panel, or contact the engine's technical support. Try opening https://yoursite.com/sitemap.xml on your site; the file may already exist and be generated dynamically.
  • External plugins. If the CMS does not have functionality for generating a sitemap but supports plugins, search for a plugin that handles sitemap.xml for your engine and install it. In some cases, you need to ask programmers to write such a plugin for you.
  • Separate script on the site. Knowing the XML sitemap protocol and its technical limitations, you can create sitemap.xml yourself and run the generation script from CRON. If you are not a programmer, use the other items in this list.
  • Sitemap generators. There are many sitemap.xml generators that scan your site and give you a ready-made map to download. The disadvantage here is that every time the site is updated, you need to manually generate a sitemap.
  • Parsers. Desktop programs designed for technical analysis of a website usually provide the opportunity to download sitemap.xml, generated based on crawled pages. It works similarly to sitemap generators, only it runs locally on your machine.
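For the "separate script" option, a generator can be quite short. The following is a sketch under stated assumptions rather than a finished tool: the page list is passed in as plain dictionaries (in a real project it would come from your database or CMS), and the function names build_sitemap and write_sitemap are made up for this example.

```python
import gzip
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
URL_TAGS = ("loc", "lastmod", "changefreq", "priority")


def build_sitemap(pages):
    """Return sitemap XML as UTF-8 bytes.

    pages: iterable of dicts with a required 'loc' key and optional
    'lastmod', 'changefreq' and 'priority' keys.
    """
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(root, "url")
        for tag in URL_TAGS:
            if page.get(tag) is not None:
                # ElementTree escapes &, < and > in text automatically.
                ET.SubElement(url, tag).text = str(page[tag])
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def write_sitemap(pages, path="sitemap.xml"):
    """Write the sitemap to path; a .gz suffix enables gzip compression.

    Returns the uncompressed size in bytes, since the size limit applies
    to the uncompressed file.
    """
    data = build_sitemap(pages)
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "wb") as fh:
        fh.write(data)
    return len(data)
```

A script like this can be run from CRON so that the file is regenerated whenever the site changes, which avoids the manual step that online generators require.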

Popular online sitemap generators

XML-Sitemaps.com

Allows you to get sitemap.xml in a few clicks. Supports XML, HTML, TXT and GZ formats. Convenient to use for small sites (up to 500 pages).

A similar generator, but has a little more settings and allows you to create a map of up to 2000 pages for free.

Has many settings, allows you to import URLs from a CSV file. Scans up to 500 URLs for free.

There are no limits on the number of pages to scan. But for large sites, the generation process may freeze for several tens of minutes.

Local programs for generating XML Sitemap

G-Mapper Sitemap Generator

Free desktop version of the sitemap generator for Windows.

Screaming Frog SEO Spider

Flexible sitemap generation tool with many settings. Convenient if you already use Screaming Frog for other SEO tasks. After scanning the site, use the menu item Sitemaps -> Create XML Sitemap.

Netpeak Spider

A less flexible, but also convenient solution for quickly generating sitemap.xml. After scanning the site, you need to use the menu item Tools -> Generate Sitemap.

What is a site map

The content of any web resource will sooner or later be indexed by search engines. How can we make this happen faster?

One of the most effective ways is to use a so-called site map (Sitemap).

A site map (Sitemap) is an XML file with information for search engines about the pages of a web resource that are subject to indexing. A Sitemap helps search engines determine the location of the resource's pages, the time of their last update, their update frequency, and their priority.

The Sitemap protocol format consists of XML tags.

The file must use UTF-8 encoding.

Sitemap XML tags

<urlset> - required. Encapsulates the file and specifies the current protocol standard;

<url> - required. The parent tag for each URL entry. The remaining tags are children of this tag;

<loc> - required. The URL of the page; it must begin with a protocol prefix (for example, http://) and end with a trailing slash if your web server requires it. The length of this value must not exceed 2048 characters;

<lastmod> - optional. The date the file was last modified; it must be in W3C Datetime format. This format allows you to omit the time portion if necessary and use the YYYY-MM-DD form;

<changefreq> - optional. How frequently the page is likely to change. This value gives search engines general information and may not match how often the page is actually crawled. Valid values: always, hourly, daily, weekly, monthly, yearly, never;

<priority> - optional. The priority of this URL relative to other URLs on your site. Valid values range from 0.0 to 1.0. This value does not affect how your pages compare with pages on other sites; it only lets you tell search engines which pages you consider more important for crawlers (the priority you assign to a page does not influence the position of your URLs in any search engine's result pages). The default priority of a page is 0.5.

Example of a Sitemap XML file (the lastmod, changefreq and priority tags are optional):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://сайт/</loc>
    <lastmod>2010-04-19</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>http://сайт/aldan.htm</loc>
    <lastmod>2009-10-03</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>http://сайт/aldan-weather.htm</loc>
    <lastmod>2010-04-15</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>

If your site contains many web pages, you can omit the optional tags (this will significantly reduce the size of the Sitemap file):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://сайт/</loc>
  </url>
  <url>
    <loc>http://сайт/aldan.htm</loc>
  </url>
  <url>
    <loc>http://сайт/aldan-weather.htm</loc>
  </url>
</urlset>

Using Sitemap index files

A Sitemap file must contain no more than 50,000 URLs, and its size must not exceed 10 MB. If necessary, the Sitemap file can be compressed with the gzip archiver to reduce bandwidth requirements.

If you need to list more than 50,000 URLs, create several Sitemap files and list each of them in a Sitemap index file. A Sitemap index file can list a maximum of 50,000 Sitemap files, and its size must not exceed 10 MB.

How to create a Sitemap

To create a Sitemap, you can use so-called generators, or you can do everything yourself:

  1. Open Notepad;
  2. Fill out the file following the rules of the Sitemap protocol;
  3. Enter the file name in the appropriate text field (for example, sitemap.xml);
  4. In the File type drop-down list, select All files (*.*);
  5. In the Encoding drop-down list, select UTF-8;
  6. Press Save;
  7. Upload sitemap.xml to the root directory of your site.

Notifying search engine crawlers about the presence and location of the Sitemap file

After the Sitemap file has been created and placed on a web server, its location must be reported to the search engines that support this protocol. This can be done as follows.

Uploading a Sitemap using the search engine's web interface

To send the Sitemap file directly to a search engine, which lets you obtain information about its processing status and errors, consult the search engine's help system.

For example, transfer

Have you thought about creating a “Site Map”? Let's try to figure out how to do it correctly.

What is a sitemap?

A site map is a separate page that lists all sections, subsections and articles. It is something like a directory in which all the articles on the site are recorded, with links to those articles.

Why do you need a site map? XML or HTML: which map is better?

A sitemap in HTML format is needed for visitors, to make it easier for them to find the information they need. Such a map must be present on large sites with more than 30 pages.

A sitemap in XML format is needed for search engines, so that the search robot can see all the links on the site and index the site better.

All in all, it is better to create 2 separate sitemaps: sitemap.xml for robots and an HTML one for visitors.

Using a Sitemap file, you can tell Yandex which pages of your site need to be indexed, how often the information on the site is updated, and which pages are most important to index. It is useful to look at maps of large sites or good sites on your topic that are in the TOP.

We'll look at the xml format later, first let's try to figure out the html format, i.e. with a map familiar to us, which we see on almost every portal.

HTML Sitemap - 7 Iron Rules

    Place it on a separate page that can be reached from the main menu. That is, the link to the site map should be visible from any page of the site.

    The structure of the map should reflect the hierarchy of the site pages; it should be clear where the main sections are and where the subsections are.

    It is good to place a short description of the site itself at the beginning of the site map, so that the visitor can quickly find out which site he is on.

    Do not overload the site map with unnecessary pictures; it is better to do without them altogether.

    In section descriptions, stick to the rule that brevity is the soul of wit. Section headings should be concise and clear. You can add a short description of the section, for example:

    «- About the company
    This section briefly describes the main operating principles of our company, the history of its origins and development, as well as our long-term aspirations.»

    Make sure your site map is up to date. If some pages are removed from the structure or, conversely, new ones are added, do not forget to reflect this in the map.

Follow these 7 rules, and your sitemap will become an excellent navigator for your visitors.

Sitemap sitemap.xml: why you need it and how to make it yourself

An XML Sitemap is a file located in the root directory of the site, with information for search engines (such as Yandex, Google, Rambler, Bing and others) about the pages of your site. This file is needed to make it easier for search engines to index your site.

How does sitemap.xml work?

When visiting a site, a search robot first of all reads the instructions in the robots.txt file on how to index the site. If you indicate there that a sitemap.xml exists, the robot will go to the specified address, which lists the URLs of the most important pages of the site that should be indexed.

Therefore, do not forget that the sitemap.xml file must not only be placed on your website; you must also specify the path to it in robots.txt, in the Sitemap directive.

User-agent: Yandex
Allow: /
Sitemap: http://mysite.ru/site_structure/my_sitemaps.xml

This is how a sitemap.xml file makes the search engine's work easier and helps your site get indexed well.
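To confirm that the Sitemap directive in robots.txt is written in a form a parser can read, you can run the file through Python's standard urllib.robotparser module (the site_maps method requires Python 3.8 or newer). This is a small illustrative check. The helper name sitemaps_from_robots is an assumption, and real crawlers may parse the file somewhat differently.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_from_robots(robots_txt):
    """Return the list of Sitemap URLs declared in a robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap directive is present.
    return parser.site_maps() or []


ROBOTS_EXAMPLE = """\
User-agent: Yandex
Allow: /
Sitemap: http://mysite.ru/site_structure/my_sitemaps.xml
"""

declared = sitemaps_from_robots(ROBOTS_EXAMPLE)
```

An empty list here would mean the directive is missing or misspelled.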

Sitemap sitemap.xml for Yandex

Yandex supports the Sitemap protocol. To convey information, use the following elements:

  • loc - page address;

How to make a sitemap.xml sitemap yourself and for free?

It is not hard. There are several free programs and sites on the web that will generate such a map for you automatically. Here are some of them: sitemapgenerator.ru, xml-sitemaps.com, cy-pr.com/tools/sitemap/

You're simply a cretin if you haven't given sitemap creation the attention it deserves. It is enough to understand the issue once to avoid a large number of mistakes in the future, so let's do it now.

Your humble servant was also such a cretin in his younger years, when he had just started promoting websites at one office. Back then I was handed a website to promote which, it should be said, was just crap. And this crap had problems with indexing. Naturally, if the site had been of sufficient quality, both search engines would have indexed it regardless of the problems, but the owners had skimped on a decent designer, layout designer and programmer, and in such a case the SEO specialist can only, so to speak, open the bottle with scissors. I tried everything on it: the last-modified setting, speeding up indexing with the fastbot that was fashionable at the time, and buying links. And only then did it turn out that the problem was that the sitemap was not updated automatically! When I updated it, all the pages flew into the index.

What is a sitemap and why is it needed?

What is a sitemap? It is a file with information about the site pages that need to be indexed. Typically, a sitemap is created for Yandex and Google to notify search robots about pages that need to be included in the index. Using a sitemap, you can also indicate how often pages are updated and which web documents are most important to index.

Does having a sitemap affect promotion?

If you do not have a sitemap, this does not mean that search engines will not index the resource. Search robots often scan sites quite well without this and include them in the search. But sometimes glitches can occur, due to which sometimes it is not possible to find all web documents. The main reasons are:

  1. Sections of the site that can only be reached by making a long chain of transitions;
  2. Dynamic URLs.

So, creating sitemap.xml helps solve this problem in many ways. This file affects SEO only insofar as it facilitates/speeds up the indexing of pages. It also increases the chance that web pages will be indexed before your competitors can copy the content and publish it on their site.

What other format does a sitemap come in and why is it made in XML format?

We have figured out why you need a site map. Now let's look at what formats it can be made in:

  1. In HTML format. It is created as an ordinary page with links leading to the main sections of the resource. This type of map helps you quickly find your way around and is designed more for people than for search robots. An HTML sitemap can hold only a limited number of links (no more than 100), because if there are more, not all of them will be included in the index. Search robots may even exclude such a page from the search entirely because of the excessive number of URLs, even internal ones.
  2. As an XML sitemap file. There are no overly strict restrictions on the number of links, and search engines handle it better, because the XML sitemap file contains complete information in a form the robot understands. It is especially important for projects with hundreds or thousands of documents of equal importance, where links to all of them have to be listed. This type of sitemap can hold up to 50 thousand URLs, and in addition you can set the update frequency and an approximate priority, which is not possible in an HTML map. It is for these reasons that a sitemap is almost always created in XML.


How to make the right sitemap

Let's look at how to make a proper xml map. The following requirements must be met:

  1. The file size should be no more than 10 MB;
  2. The map should contain no more than 50,000 links. In cases where there are more links, you can create several maps and include them in the main xml map;
  3. The sitemap address should be entered in robots.txt;
  4. Also upload the sitemap to Yandex and Google (how to add a file is described below);
  5. Search engines must have access to the map. It is necessary to use special tags that let search engines understand that this is a map and not something else;
  6. The sitemap must have UTF-8 encoding.

Let me give you a simple example of a map:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://site.ru/</loc>
    <lastmod>2016-11-20T19:45:08+03:00</lastmod>
    <changefreq>always</changefreq>
    <priority>0.9</priority>
  </url>
  <url>
    <loc>http://site.ru/category/</loc>
    <lastmod>2016-11-20T19:46:38+03:00</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.6</priority>
  </url>
  <url>
    <loc>http://site.ru/page/</loc>
    <lastmod>2016-11-20T19:48:41+03:00</lastmod>
    <changefreq>yearly</changefreq>
    <priority>0.4</priority>
  </url>
</urlset>

The url and loc tags are required. The first contains all the information about a specific URL. The second contains the address itself.

The lastmod, changefreq, priority tags are not mandatory, but it is still recommended to use them.

Lastmod in the sitemap is responsible for the date of the last update.

Changefreq indicates the frequency of page changes. The values can be as follows:

  1. Hourly – updates hourly;
  2. Always – always updated;
  3. Weekly – updated once a week;
  4. Daily – updates occur daily;
  5. Monthly – updates occur once a month;
  6. Yearly – once a year;
  7. Never – not updated (it is better not to use this value).

Priority tells search engines how important a page is compared to others. The priority can be set from 0.0 (low) to 1.0 (high), written with a decimal point.

This was just an example map; you do not need to specify these exact values. In general, it is recommended to set priority as follows: the maximum for the home page (1), a medium value for categories (0.6), and the minimum for individual posts (0.4).
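Because changefreq accepts only a fixed set of words and priority must be a number written with a decimal point (a value typed as 0,9 with a comma is a common slip), it can be worth normalizing both before they are written to the file. The helper names below are hypothetical, and treating a decimal comma as a typo to repair rather than an error is a design choice of this sketch.

```python
ALLOWED_CHANGEFREQ = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def format_priority(value):
    """Return priority as a string such as '0.9', accepting '0,9' as input."""
    number = float(str(value).replace(",", "."))
    if not 0.0 <= number <= 1.0:
        raise ValueError(f"priority out of range 0.0-1.0: {value!r}")
    return f"{number:.1f}"


def check_changefreq(value):
    """Return the lower-cased changefreq value or raise ValueError."""
    normalized = value.strip().lower()
    if normalized not in ALLOWED_CHANGEFREQ:
        raise ValueError(f"unknown changefreq: {value!r}")
    return normalized
```

For example, format_priority("0,9") yields "0.9" and check_changefreq("Weekly") yields "weekly", while a value such as "sometimes" is rejected.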

Now let's look at an example where there are more than 50 thousand links. In this case, the file includes other maps:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://site.ru/sitemaps/sitemap01.xml</loc>
    <lastmod>2016-11-20T21:37:28+03:00</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://site.ru/sitemaps/sitemap02.xml</loc>
    <lastmod>2016-11-20T21:37:29+03:00</lastmod>
  </sitemap>
</sitemapindex>


There are several ways to create an xml map, let's look at them:

  1. Download a map produced by an online generator on another resource;
  2. Generate it using a special program. Keep in mind that programs of this kind are mostly paid. An example of such a generator: Wonder WebWare SiteMap Generator. Screaming Frog also has this feature;
  3. Create a sitemap manually;
  4. Automatically create a map using a CMS (for example, such a function is available on WordPress).


Plugins for creating sitemaps on WordPress

You can create a sitemap in WordPress using a special plugin called Google XML Sitemaps. Everything is simple here: download the plugin, install it, then start creating the file. To do this, open Console-Settings and select XML-sitemap. Next we set the settings. We leave the priority as default.

To add the Sitemap file in Yandex.Webmaster:

  1. Select a site from the list.
  2. In the field, enter the URL where the file is available. For example, https://example.com/sitemap.xml.
  3. Click the Add button.

After adding the file, it is queued for processing. The robot will download it within two weeks. Each added file, including those attached to the Sitemap index file, is processed by the robot separately.

After downloading, next to each file you will see one of the statuses:

Each status, with its description and the recommended action:

  • "OK": The file is formed correctly and loaded into the robot database. The date of the last download is displayed next to the file. Indexed pages will appear in search results within two weeks.
  • "Redirect": The specified URL redirects to another address. Remove the redirect and notify the robot about the update.
  • "Error": The file is not formed correctly. Click the Error link for details. After making changes to the file, notify the robot about the update.
  • "Not indexed": When accessing the Sitemap, the server returns an HTTP code other than 200. Check whether the file is accessible to the robot using the Check Server Response tool, specifying the full path to the file. If the file is not available, contact the administrator of the site or server on which it is located.
  • "Not indexed": Access to the file is denied in robots.txt using the Disallow directive. Allow access to the Sitemap and notify the robot about the update.
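The last cause in the list, a Disallow rule that covers the Sitemap itself, can be checked locally before the file is submitted. This is a sketch using Python's standard urllib.robotparser. The function name is an assumption, and the module's matching rules may not coincide exactly with those of the Yandex robot.

```python
from urllib.robotparser import RobotFileParser


def sitemap_blocked_by_robots(robots_txt, sitemap_url, user_agent="*"):
    """Return True if robots.txt rules forbid fetching sitemap_url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, sitemap_url)


ROBOTS_EXAMPLE = """\
User-agent: *
Disallow: /sitemaps/
"""

blocked = sitemap_blocked_by_robots(
    ROBOTS_EXAMPLE, "https://example.com/sitemaps/sitemap.xml"
)
allowed = not sitemap_blocked_by_robots(
    ROBOTS_EXAMPLE, "https://example.com/sitemap.xml"
)
```

If the check reports a block, either move the Sitemap to an allowed path or relax the Disallow rule, then notify the robot about the update as described in the list.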

Sitemap update

If you have changed the Sitemap file added to Yandex.Webmaster, you do not need to delete it and upload it again - the robot regularly checks the file for updates and errors.

To speed up file crawling, click the icon. If you are using a Sitemap index file, you can start processing each file listed in it. The robot will download the data within three days. You can use the function up to 10 times for one host.

Once you have used up all attempts, the next one will be available 30 days after the first. The exact date is displayed in the Webmaster interface.

Removing Sitemap

In the Yandex.Webmaster interface, you can delete the files that were added on the Sitemap Files page. If a Sitemap directive was also added to the robots.txt file, delete it as well. After making changes, information about the Sitemap will disappear from the robot and Yandex.Webmaster database within a few weeks.