Before the internet, there was the ARPANET, a computer network used by the U.S. government to share sensitive data in the 1970s. About a decade later, the ARPANET’s limited networks gave way to a single, worldwide network we call the Internet.
However, the Internet as we know it didn’t materialise until 1991. That’s when an English programmer introduced the World Wide Web as a place to store information, not just send and receive it. Gone were the days of only using the Internet to send e-mails or post articles in forums. Now users could create and find web pages for just about anything.
As the World Wide Web grew in popularity, users faced a new problem: learning to navigate it. Along came Google (and its predecessors) to give users a starting point for their web search. With the help of search engines, users could finally explore cyberspace without getting completely lost.
Today’s biggest search engines are much more adept than they were 20 years ago. They can predict your search, interpret multi-word queries, and serve trillions (yes, we said trillions) of webpages.
However, despite Google’s web prowess, it and other search engines have a very limited view of what’s out there. (Some researchers say that search engines only show about 1% of what’s actually available online!) Search engines work by “crawling” links on a website. If a site owner doesn’t want a page on their site to be found, they won’t include a direct link to that page. If no link points to a web page, it can’t be crawled or indexed in Google’s massive search library, so the page won’t appear as a result on a search engine.
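To make the crawling idea concrete, here is a minimal sketch. It assumes a made-up site stored as a Python dictionary (the page names and the `crawl` function are hypothetical, and no real network requests are made). It is only meant to show why a page that nothing links to never gets discovered, not how any real search engine is built.

```python
from collections import deque

# Hypothetical site: each page maps to the pages it links to.
SITE_LINKS = {
    "/home": ["/about", "/blog"],
    "/about": ["/home"],
    "/blog": ["/blog/post-1"],
    "/blog/post-1": [],
    "/private-report": [],  # exists on the server, but no page links to it
}


def crawl(start, links):
    """Breadth-first crawl: return every page reachable by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


indexed = crawl("/home", SITE_LINKS)
unindexed = set(SITE_LINKS) - indexed

print("Indexed:", sorted(indexed))
print("Never found:", sorted(unindexed))
```

In this toy example the crawler starts at `/home` and follows every link it sees, yet `/private-report` stays out of the index simply because nothing points to it.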
Because search engines skim the surface of what’s available online, the websites they show on their results pages are part of what’s called the Surface Web. Using Google is like scanning the horizon with your naked eye. Sure, there’s a lot to take in, but you’re only seeing a tiny bit of what’s happening in the world.
When you find web pages that a typical search engine can’t access, you’re using the Deep Web. We know the Deep Web sounds intimidating, but believe it or not, you use it every day. When you search for a place on Airbnb or compare plane flights on Expedia, you’re using the Deep Web. When you log in to your email account, online bank account, or Amazon account, you’re using the Deep Web.
Anytime you log in to an account or search for information directly on a web page, you’re getting access to Deep Web content that won’t show up on a search engine. And that’s a good thing. If someone Googled your name, you wouldn’t want your banking information showing up in the results. That information is meant to be private, so those sensitive web pages aren’t crawled by search engines.
Using the Deep Web is like looking at the world from an aeroplane. At such high elevations, you’ll have a much broader vantage point than your friends on Earth.
The Deep Web is enormous in comparison to the surface Web. Today’s Web has more than 555 million registered domains. Each of those domains can have dozens, hundreds or even thousands of sub-pages, many of which aren’t catalogued, and thus fall into the category of Deep Web.
Although nobody really knows for sure, the Deep Web may be 400 to 500 times bigger than the surface Web. And both the surface and Deep Web grow bigger and bigger every day. Most people use the internet daily, yet most of us know only a fraction of it. To put that fraction in perspective, we see only the tip of the iceberg: most of the ice is submerged, invisible except to those who know how to find it. This submerged network is known as the Deep Web (also called the Deepnet, Invisible Web, or Hidden Web).
Usually, we use the term “Surface Web” to refer to the “normal” internet. This is the information and pages you can easily find by searching on any search engine such as Google or Yahoo. These search engines can only collect static pages, not dynamic ones, so what they index is estimated to hold only 0.03% of the information on the World Wide Web.
The rest is hidden in the so-called “Deep Web”, also known as the invisible web or the deep Internet. This huge unknown space of the World Wide Web contains all the information that cannot be found with a simple Google search. Nobody knows exactly how big the Deep Web is, but BrightPlanet estimated that it could be around 500 times larger than the surface Internet. Considering that Google by itself covers around 8 billion pages, that is truly amazing.
The vast majority of invisible web pages contain valuable information. A report published in 2001 estimated that 54% of these sites are records of valuable information or secret documents, such as reports from NASA or NOAA.
Automatically determining if a Web resource is a member of the surface Web or the Deep Web is difficult. If a resource is indexed by a search engine, it is not necessarily a member of the Surface Web, because the resource could have been found using another method instead of traditional crawling.
If a search engine provides a backlink for a resource, one may assume that the resource is in the surface Web. Unfortunately, search engines do not always provide all backlinks to resources. Furthermore, a resource may reside in the surface Web even though it has yet to be found by a search engine.
Most of the work of classifying search results has been in categorising the surface Web by topic. For classification of Deep Web resources, Ipeirotis et al. presented an algorithm that classifies a Deep Web site into the category that generates the largest number of hits for some carefully selected, topically-focused queries.
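The idea of picking the category with the most hits can be sketched in a few lines. This is a rough illustration under stated assumptions, not the algorithm as published: the probe queries, the sample documents, and the `count_hits` helper (a stand-in for submitting a query to a site’s search form and reading back the hit count) are all hypothetical.

```python
# Hypothetical probe queries per category (illustrative only).
PROBE_QUERIES = {
    "health": ["diabetes", "vaccine", "cardiology"],
    "travel": ["flight", "hotel", "itinerary"],
    "automobiles": ["sedan", "engine", "transmission"],
}


def count_hits(documents, query):
    """Stand-in for querying a site's search form and reading the hit count."""
    return sum(1 for doc in documents if query in doc.lower())


def classify_site(documents, probe_queries):
    """Return the category whose probe queries produce the most hits, plus all totals."""
    totals = {
        category: sum(count_hits(documents, q) for q in queries)
        for category, queries in probe_queries.items()
    }
    best = max(totals, key=totals.get)
    return best, totals


# Hypothetical database hidden behind a search form.
site_documents = [
    "Cheap flight deals to Lisbon",
    "Hotel reviews for Kyoto",
    "Three-day itinerary with hotel and flight included",
    "Travel vaccine checklist",
]

category, totals = classify_site(site_documents, PROBE_QUERIES)
print(category, totals)
```

Note that the classifier never sees the site’s full database. It only sees hit counts, which is all a Deep Web search form typically exposes.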
Deep Web directories under development include OAIster at the University of Michigan, Intute at the University of Manchester, Infomine at the University of California at Riverside, and DirectSearch (by Gary Price). This classification poses a challenge while searching the Deep Web whereby two levels of categorisation are required. The first level is to categorise sites into vertical topics (e.g., health, travel, automobiles) and sub-topics according to the nature of the content underlying their databases.
The more difficult challenge is to categorise and map the information extracted from multiple Deep Web sources according to end-user needs. Deep Web search reports cannot display URLs like traditional search reports. End users expect their search tools not only to find what they are looking for, but also to be intuitive and user-friendly. To be meaningful, the search reports have to offer some depth on the nature of the content that underlies the sources, or else the end user will be lost in a sea of URLs that do not indicate what content lies beneath them.
The format in which search results are to be presented varies widely by the particular topic of the search and the type of content being exposed. The challenge is to find and map similar data elements from multiple disparate sources so that search results may be exposed in a unified format on the search report irrespective of their source.
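As a small illustration of mapping similar data elements into one format, the sketch below assumes two hypothetical flight sources that describe the same kind of record with different field names. The source names, fields, and values are invented for the example; a real system would need far more robust matching.

```python
# Hypothetical raw records from two sources with different field names.
SOURCE_A = [{"carrier": "AirOne", "fare_usd": 320.0, "dep": "LHR", "arr": "JFK"}]
SOURCE_B = [{"airline_name": "SkyTwo", "price": "299.50", "from": "LHR", "to": "JFK"}]

# Per-source mapping: unified field name -> field name used by that source.
FIELD_MAPS = {
    "source_a": {"airline": "carrier", "price": "fare_usd",
                 "origin": "dep", "destination": "arr"},
    "source_b": {"airline": "airline_name", "price": "price",
                 "origin": "from", "destination": "to"},
}


def normalise(record, source):
    """Translate one source-specific record into the unified format."""
    mapping = FIELD_MAPS[source]
    unified = {field: record[src_field] for field, src_field in mapping.items()}
    unified["price"] = float(unified["price"])  # sources disagree on the type
    unified["source"] = source
    return unified


results = [normalise(r, "source_a") for r in SOURCE_A]
results += [normalise(r, "source_b") for r in SOURCE_B]
results.sort(key=lambda r: r["price"])

for row in results:
    print(row)
```

Once every record shares the same fields, the search report can sort and display results together, regardless of where each one came from.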
However, not everything is as cool as it may sound. The deep internet has a dark side, and it is as illegal and dangerous as it gets.
This dangerous and illegal part of the web is called the Dark Web.
The Dark Web still falls under the Deep Web umbrella; it’s just a much, much smaller portion of the Deep Web. The Dark Web, or “Darknet,” uses masked IP addresses to intentionally hide web pages from search engines, web page search forms, and even standard web browsers. In fact, the Dark Web accounts for less than 0.01% of the Deep Web.
Often you can only access these sites by using special browser software. This software keeps both the site and the people visiting it private and secure. Once you are securely in, you enter a world you never thought existed. Here you will find everything from human kidneys for sale to prostitution, weapons, and drugs. Anonymity allows the transfer, legal or illegal, of information, goods, and every type of service you can imagine, all around the world.
Dark Web sites are so bent on anonymity that they require a special web browser to access them. The majority of Dark Web sites in America use the TOR Network (short for The Onion Router). The TOR network is a collection of “volunteer” computer networks that send users’ encrypted traffic to multiple servers before pulling up content. That way, a user’s browsing session is so jumbled up, their identity and location are almost untraceable.
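The “onion” in the name refers to layers: the sender wraps a message in one layer of encryption per relay, and each relay peels off only its own layer. The sketch below is a toy model of that layering only. It is not TOR’s actual protocol, the XOR cipher is not secure, and the relay keys and function names are hypothetical. Real onion routing relies on proper cryptography and keys negotiated with each relay.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Derive a toy keystream from a key (illustration only, NOT secure)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor_layer(data: bytes, key: bytes) -> bytes:
    """Add or remove one toy layer (XOR with the same key undoes itself)."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


# Hypothetical relay keys, in the order the traffic passes through the relays.
RELAY_KEYS = [b"entry-relay-key", b"middle-relay-key", b"exit-relay-key"]


def wrap(message: bytes, relay_keys) -> bytes:
    """Sender adds the exit relay's layer first, so the entry layer is outermost."""
    data = message
    for key in reversed(relay_keys):
        data = xor_layer(data, key)
    return data


def peel(data: bytes, key: bytes) -> bytes:
    """One relay removes only its own layer."""
    return xor_layer(data, key)


message = b"GET /hidden-page"
onion = wrap(message, RELAY_KEYS)

after_entry = peel(onion, RELAY_KEYS[0])         # still two layers left
after_middle = peel(after_entry, RELAY_KEYS[1])  # still one layer left
delivered = peel(after_middle, RELAY_KEYS[2])    # plaintext at the exit

print(delivered)
```

One caveat of this toy: XOR layers can be removed in any order, whereas a real layered scheme forces each relay to peel its layer in sequence. The point here is simply that no single relay in the middle ever holds the readable message.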
Because the TOR network allows users to browse anonymously, it’s used by secret service agents, law enforcement, activists, researchers, whistleblowers, and users who are banned from Internet access.
WikiLeaks is a notorious Dark Web site that allows whistleblowers to anonymously upload classified information to the press. While the legality of leaking classified information is a hot topic in the U.S., no formal charges have been made against WikiLeaks founder, Julian Assange.
Even Facebook has a Dark Web site. Last October, the social media giant launched a Tor hidden service so users could avoid surveillance or censorship.
Anonymity, however, has a dark side. The TOR network can also be used to hide the identities of users involved in criminal activity.
The types of illegal operations you could find on the TOR network include: sale of unlicensed firearms; child pornography; sale of malware, pirated software, and hacking guides; sale of illegal drugs; sale of stolen credit card information and user accounts; sale of forged documents and currency; hiring hitmen; gambling; money laundering; and insider trading.
Among the most famous TOR darknet content is a collection of secret websites whose addresses end in “.onion”. TOR activity is very hard to track because traffic is bounced between different TOR-compatible machines worldwide.
The Silk Road is the best-known source of nefarious activity on the Dark Web. Known as the “Amazon of drugs,” the site sold high-grade, illegal drugs—that is until it got shut down by the FBI. Evolution, Agora Marketplace, and Nucleus Marketplace are three black market sites that are still active.