Search Engine Crawlers and Dynamic Web Pages


 by: Jerry Yu

There is a good deal of misunderstanding and confusion in the Search Engine Optimization (SEO) world about how search engines index dynamic web pages.

It has been claimed that search engine spiders don't crawl or index dynamic web pages well. This statement is only half true. A more accurate statement is: search engines don't crawl or index dynamic web pages well if the page URL contains a "?" character. Search engines index dynamic web pages very well if the page URL contains no "?" character.

URLs that contain "?" are called dynamic URLs.

What web pages are dynamic?

If you know HTML, you know that the web pages you create normally have an .htm or .html file extension. These files are static because the HTML code doesn't change on the fly when the page is requested, and the web server does not process the file before sending it. Static pages can be viewed without a web server.

A web page is said to be dynamic if it is created with a server-side scripting technology such as PHP, ASP, JSP, or Perl (often run through CGI). These languages resemble general-purpose programming languages such as C++ and Java. The major difference is that the page is typically not compiled into its final form beforehand: the web server processes it on the fly when a visitor requests it. Dynamic pages can't be viewed without a web server.

When a dynamic page is requested, the web server first looks at the page's source code. If any server-side scripting code exists, the server processes it and generates static HTML as the result. Once the full page has been processed, the web server sends only pure HTML to the visitor's browser.
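The request cycle described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular web server works: the template, the PRODUCTS table, and the render_product_page function are all hypothetical names made up for this example.

```python
from string import Template

# Hypothetical page template; $name and $price are filled in per request.
PAGE_TEMPLATE = Template(
    "<html><body><h1>$name</h1><p>Price: $price</p></body></html>"
)

# Hypothetical stand-in for a product database.
PRODUCTS = {
    "12345": {"name": "Example Widget", "price": "9.99 USD"},
}


def render_product_page(product_id: str) -> str:
    """Return the plain HTML a browser would receive for one product."""
    product = PRODUCTS[product_id]
    return PAGE_TEMPLATE.substitute(name=product["name"], price=product["price"])


# The visitor's browser only ever sees the generated HTML, never the script.
print(render_product_page("12345"))
```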

Using scripting languages to create web pages gives you the power to do nearly anything you want. If a dynamic page has no "?" character in its URL, search engine spiders treat it the same as a normal static HTML page.

Query string parameters

When a "?" character is used, the page's full URL changes whenever the values after the "?" change. The portion after the "?" is called the page's query string, and it holds one or more query parameters. Every time a parameter value changes, the resulting page will be different.
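To make the terminology concrete, here is a minimal Python sketch that separates a URL into the part before the "?" and its query parameters, using the standard urllib.parse module. The function name split_dynamic_url and the sample URL are assumptions chosen for illustration.

```python
from urllib.parse import urlsplit, parse_qsl


def split_dynamic_url(url: str):
    """Split a URL into its base (everything before "?") and its query parameters."""
    parts = urlsplit(url)
    base = f"{parts.scheme}://{parts.netloc}{parts.path}"
    # parse_qsl returns a list of (name, value) pairs in the order they appear.
    params = parse_qsl(parts.query, keep_blank_values=True)
    return base, params


base, params = split_dynamic_url("http://www.examplesite.com/product.asp?id=12345")
print(base)    # http://www.examplesite.com/product.asp
print(params)  # [('id', '12345')]
```

A URL with no "?" yields an empty parameter list, which is how a script could tell a static-looking URL from a dynamic one.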

A page URL can contain more than one query parameter. When this happens, search engine spiders have a difficult time indexing the resulting page. If the URL has only one query parameter, major search engine spiders can crawl the page well. For example, Google can index and store a page's URL as http://www.examplesite.com/product.asp?id=12345. But if the same page's URL is

http://www.examplesite.com/product.asp?id=12345&category=23&page=3

most search engines will not be able to index it well, even though Googlebot and Yahoo! Slurp may be able to.

(Note: Googlebot is Google's web-crawling robot. Yahoo! Slurp is Yahoo's web-crawling robot. Search engine robots collect documents from the web to build a searchable index.)

Yahoo! help says:

"Yahoo! does index dynamic pages, but for page discovery, our crawler mostly follows static links. We recommend you avoid using dynamically generated links except in directories that are not intended to be crawled/indexed (e.g., those should have a /robots.txt exclusion)."

Google's Webmaster Guidelines:

"If you decide to use dynamic pages (i.e. the URL contains a "?" character), be aware that not every search engine spider crawls dynamic pages as well as static pages. It helps to keep the parameters short and the number of them small."

Let's analyze what Google has stated above.

1. the URL contains a "?" character: this means Google defines dynamic pages as those whose URLs contain a "?" character.

2. keep the parameters short: this means the number of characters in each individual parameter should be small. Google gives no quantitative limit, but we can look at some web forums for examples. My search engine friendly article (http://www.webactionguide/action-guide/build-site/se-friendly.php) referenced a black hat SEO discussion thread on Cre8ASiteForums. Its URL is http://www.cre8asiteforums.com/viewtopic.php?t=8386

This page was crawled by Google. The value of its query parameter is 4 characters long. There are many other examples on the internet that have more characters and were crawled successfully. The maximum number of characters Google will accept is unknown.

3. keep the number of them small: this means we should keep the number of parameters in each URL as small as possible. The above Cre8ASiteForums example has one parameter.

At the very least, we can now say that Googlebot is able to crawl dynamic pages that have one query parameter whose value is 4 characters long.
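If you want to check your own URLs against the "short and few" advice, a small Python sketch like the one below can count the parameters and measure the longest value. The function name summarize_query is made up for this example, and the sketch makes no claim about what limits any search engine actually applies.

```python
from urllib.parse import urlsplit, parse_qsl


def summarize_query(url: str) -> dict:
    """Count the query parameters in a URL and find the longest parameter value."""
    params = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    return {
        "parameter_count": len(params),
        "longest_value": max((len(value) for _, value in params), default=0),
    }


print(summarize_query("http://www.cre8asiteforums.com/viewtopic.php?t=8386"))
# {'parameter_count': 1, 'longest_value': 4}

print(summarize_query("http://www.examplesite.com/product.asp?id=12345&category=23&page=3"))
# {'parameter_count': 3, 'longest_value': 5}
```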

How to get your pages crawled if query parameters are unavoidable?

Query parameters are often used in database calls that retrieve stored information by the primary keys of one or more tables. A database management system (DBMS) makes this otherwise tedious work easy to manage. When query parameters must be used on your site, consider building a site map page and hard coding each page's URL. For example, the previous URL can be hard coded as

http://www.examplesite.com/product12345-23-3.asp
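Producing these hard-coded URLs for a site map can be scripted. The following Python sketch converts the query-string form into the hyphenated form shown above. The function name static_style_url and the assumed parameter order (id, category, page) are taken from this one example and would need adjusting for a real site.

```python
import posixpath
from urllib.parse import urlsplit, parse_qs


def static_style_url(url: str, order=("id", "category", "page")) -> str:
    """Fold the query parameter values into the file name, joined by hyphens."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    # "/product.asp" becomes ("/product", ".asp").
    stem, extension = posixpath.splitext(parts.path)
    values = "-".join(query[name][0] for name in order)
    return f"{parts.scheme}://{parts.netloc}{stem}{values}{extension}"


print(static_style_url(
    "http://www.examplesite.com/product.asp?id=12345&category=23&page=3"
))
# http://www.examplesite.com/product12345-23-3.asp
```

The generated URL only works if the server knows how to map it back to the real script.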

Hand coding every dynamic page is time-consuming. If you use the Apache web server, its mod_rewrite module (http://httpd.apache.org/docs/mod/mod_rewrite.html) can help by rewriting the requested URL on the fly, so that visitors and spiders see a URL with no "?" character embedded.

Another mod rewrite resource site is www.modrewrite.com.
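The idea behind URL rewriting is a pattern match followed by a substitution. The Python sketch below is not Apache configuration; it only imitates that mapping with a regular expression, assuming the hypothetical /product12345-23-3.asp naming scheme from the earlier example. The names REWRITE_PATTERN and rewrite_path are made up for illustration.

```python
import re

# Assumed URL scheme: /product<id>-<category>-<page>.asp
REWRITE_PATTERN = re.compile(r"^/product(\d+)-(\d+)-(\d+)\.asp$")


def rewrite_path(path: str) -> str:
    """Map a static-looking path back to the real script and its query string."""
    match = REWRITE_PATTERN.match(path)
    if match is None:
        # Paths that don't fit the scheme are passed through unchanged.
        return path
    product_id, category, page = match.groups()
    return f"/product.asp?id={product_id}&category={category}&page={page}"


print(rewrite_path("/product12345-23-3.asp"))
# /product.asp?id=12345&category=23&page=3
```

The visitor and the spider only ever see the path without a "?"; the query string exists only inside the server.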

An interesting article on weberblog.com described a practical example of Google successfully indexing a dynamic page after the mod_rewrite module was applied. The page's original query parameter value was 17 characters long.

Before rewrite: http://www.weberblog.com/article.php?sroty=20040419170030157

After rewrite: http://www.weberblog.com/article.php/20040419170030157

So, if your site is experiencing the same problem, hurry up and implement mod_rewrite now.

About The Author

Jerry Yu is an experienced internet marketer and web developer. Visit his site http://www.WebActionGuide.com for FREE "how-to" step-by-step action guide, tips, knowledge base articles, and more.
