Did you know that the visible portion of an iceberg represents only about 10% of its actual size? The other 90% lies beneath the water's surface. Imagine if the captain of the Titanic had had that tidbit of information. Well, the Internet is similar in many ways. The portion of the Internet that is still inaccessible to the search engines and their crawlers is quite amazing. Even though Google indexed its one trillionth (with a T) web address last summer, it appears that there is so much more out there.
A New York Times article introduces the concept this way:
Beyond those trillion pages lies an even vaster Web of hidden data: financial information, shopping catalogs, flight schedules, medical research and all kinds of other material stored in databases that remain largely invisible to search engines.