Googlebot Crawls and Indexes the First 15MB of HTML Content Per Page
Googlebot crawls only the first 15 MB of an HTML file or supported text-based file, and anything past that point is not used for indexing or ranking the page, according to the tech giant. Resources referenced in the HTML, such as images, videos, CSS and JavaScript, are fetched separately, according to Google's help documentation.
When a file exceeds 15 MB in size, Googlebot stops crawling it and considers only the first 15 MB for indexing. The documentation adds that the file size limit is applied to the uncompressed data, so a large page does not escape the limit just because it is served gzip-compressed.
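Because the limit applies to uncompressed data, the size to check is the size of the HTML after decompression, not the size sent over the wire. The following is a minimal Python sketch of that check. The function name `check_html_size` and the choice to treat 15 MB as 15 × 1024 × 1024 bytes are assumptions made for illustration, not something Google specifies in the quoted material.

```python
import gzip

# Assumption: "15 MB" is interpreted here as 15 * 1024 * 1024 bytes.
LIMIT_BYTES = 15 * 1024 * 1024


def check_html_size(body: bytes, limit: int = LIMIT_BYTES) -> dict:
    """Compare an HTML payload with the limit, using its uncompressed size."""
    # gzip streams start with the magic bytes 0x1f 0x8b; decompress if present.
    raw = gzip.decompress(body) if body[:2] == b"\x1f\x8b" else body
    return {
        "uncompressed_bytes": len(raw),
        "over_limit": len(raw) > limit,
        # Bytes that fall after the cut-off and would not be considered.
        "bytes_past_limit": max(0, len(raw) - limit),
    }


if __name__ == "__main__":
    # A synthetic page of roughly 19 MB that compresses to far less.
    page = b"<html><body>" + b"<p>filler</p>" * 1_500_000 + b"</body></html>"
    compressed = gzip.compress(page)
    print("compressed bytes:", len(compressed))
    print(check_html_size(compressed))
```

In this sketch the compressed payload is far below the limit, yet the page is still flagged, because only the decompressed size is compared.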
According to the report, SEO experts were left wondering whether Googlebot would ignore text that falls below images at the cut-off point in an HTML file. John Mueller, a Google Search Advocate, clarified on Twitter that the limit is specific to the HTML file itself, as it is written. Resources pulled in through IMG tags are not part of the HTML file, according to him.
If your page is more than 15MB in size, however, SEO experts warn that your users will have to wait a long time for it to load.
How Does This Affect SEO?
Content that matters for search should be placed near the top of the page. Code should be organised so that SEO-relevant information falls within the first 15 MB of an HTML or supported text-based file. It also implies that images and videos should be compressed and served as separate files rather than encoded directly into the HTML wherever feasible, since embedded media counts towards the size of the HTML file while separately fetched media does not.
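One way to see how much of a page's HTML is taken up by embedded media is to look for `data:` URIs in its tags. The sketch below uses Python's standard `html.parser` module. The class name `DataUriCounter`, the function `inline_media_report` and the set of attributes checked (`src`, `href`, `poster`) are illustrative assumptions, and the sketch does not cover every way media can be inlined (CSS `url(data:...)` values, for instance, are not counted).

```python
from html.parser import HTMLParser


class DataUriCounter(HTMLParser):
    """Counts data: URIs found in common resource attributes."""

    def __init__(self):
        super().__init__()
        self.count = 0
        self.data_uri_bytes = 0

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # Assumption: only these attributes are inspected.
            if name in ("src", "href", "poster") and value and value.startswith("data:"):
                self.count += 1
                self.data_uri_bytes += len(value.encode("utf-8"))


def inline_media_report(html: str) -> dict:
    """Return the total HTML size and the bytes spent on data: URIs."""
    counter = DataUriCounter()
    counter.feed(html)
    counter.close()
    return {
        "total_bytes": len(html.encode("utf-8")),
        "data_uri_count": counter.count,
        "data_uri_bytes": counter.data_uri_bytes,
    }


if __name__ == "__main__":
    sample = '<img src="data:image/png;base64,AAAA"><img src="/logo.png">'
    print(inline_media_report(sample))
```

A large `data_uri_bytes` value relative to `total_bytes` suggests that moving those assets to separate files would shrink the HTML that Googlebot has to read.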
Many websites will be unaffected by this limit, since current SEO best practices encourage keeping HTML pages under 100 KB. Google PageSpeed Insights can be used to check the size of a page. The idea that some material on a page might not be used for indexing may seem alarming, but in reality 15MB of HTML is a very large amount of data. Images and videos, for example, are fetched separately, as stated by Google, and according to Google's wording the 15MB limit applies to the HTML file itself.
According to digital marketing professionals, unless you are publishing whole volumes' worth of content on a single page, you are very unlikely to exceed that HTML limit.
The Latest Update
Google has clarified the 15MB crawl limit in a blog post, in response to the confusion the documentation generated. With the average HTML file currently at about 30KB, the vast majority of sites will not be affected by this limit, according to the post. Google does add a suggestion for owners of oversized pages: if your HTML page is over the limit, you could at least move some of its inline scripts and CSS to external files.
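To judge how much a page would shrink if its inline scripts and styles were moved to external files, it helps to measure how many bytes they occupy. The following is a rough Python sketch using the standard `html.parser` module. The class name `InlineAssetCounter` and the function `inline_asset_bytes` are hypothetical, and the sketch ignores inline `style="..."` attributes and event handler attributes.

```python
from html.parser import HTMLParser


class InlineAssetCounter(HTMLParser):
    """Sums the bytes inside <style> blocks and <script> blocks without src."""

    def __init__(self):
        super().__init__()
        self._current = None  # tag whose inline content is being counted
        self.bytes_by_tag = {"script": 0, "style": 0}

    def handle_starttag(self, tag, attrs):
        # A <script src="..."> is already external, so it is skipped.
        if tag == "style" or (tag == "script" and "src" not in dict(attrs)):
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current:
            self.bytes_by_tag[self._current] += len(data.encode("utf-8"))


def inline_asset_bytes(html: str) -> dict:
    """Return the bytes of inline script and style content in the HTML."""
    counter = InlineAssetCounter()
    counter.feed(html)
    counter.close()
    return dict(counter.bytes_by_tag)


if __name__ == "__main__":
    sample = (
        "<style>p{color:red}</style>"
        '<script src="app.js"></script>'
        "<script>var a=1;</script><p>hi</p>"
    )
    print(inline_asset_bytes(sample))
```

The totals are only an estimate of the savings, since each externalised file still needs a short `<script src>` or `<link>` tag in the HTML.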
The Bottom Line
If your HTML pages are more than 15MB, you certainly have underlying problems that need to be addressed. Connect with one of the best SEO services providers, DigiMore, to learn more.