Info Hash Torrent Searching Technique - CE55
COMPUTER ENGINEERING
ABSTRACT: In this paper, we look closely at the BitTorrent P2P protocol. We identify problems with the
protocol that have already been studied and discuss them. We propose a system for efficient searching
that indexes torrents from multiple sources so that users can access a large number of torrents
from a single place.
ISSN: 0975 – 6760| NOV 12 TO OCT 13 | VOLUME – 02, ISSUE – 02 Page 273
JOURNAL OF INFORMATION, KNOWLEDGE AND RESEARCH IN
COMPUTER ENGINEERING
being distributed, and therefore a large number of users can be supported with relatively limited tracker bandwidth. By reducing dependency on a centralized tracker, PEX increases the speed, efficiency, and robustness of the BitTorrent protocol.

Within BitTorrent, a torrent file is a computer file that contains metadata about the files to be shared and about the tracker, the computer that coordinates the file distribution. A seeder is a client that has a complete copy of the torrent and still offers it for upload. The more seeders there are, the better the chances of getting a higher download speed. A downloader/leecher is any peer that does not have the entire file and is still downloading it. Bram Cohen chose the term downloader over leech because BitTorrent's tit-for-tat policy ensures that downloaders also upload and thus do not unfairly qualify as leeches.

With the adoption of DHT (Distributed Hash Tables), the BitTorrent protocol becomes more than a semi-centralized distribution network around a single resource: it becomes more decentralized and removes the static point of control, the tracker. This is done by relying on DHTs and the PEX extension, which enable a volatile peer to operate also as a tracker. Even though this addresses the need for static tracker servers, the network is still centralized around the content. Peers do not have any default ability to contact each other outside of that context.

2. CHARACTERISTICS OF BITTORRENT

Permanent DHT tracking:
With the PEX implementation and reliance on the distributed hash table (DHT), the evolution toward a real P2P overlay network that is completely serverless was the next logical step. The DHT takes information not only from old trackers but also from the PEX implementation, creating something like a distributed database of shared torrents that acts as a backup tracker when all other trackers are down or cannot deliver enough peers, and that also enables trackerless torrents. The DHT acts as a pseudo-tracker and is added to torrents if the client has the option enabled; DHT trackers can be enabled and disabled per torrent just like regular trackers. Clients using this permanent DHT tracking form a fully connected decentralized P2P network, each entering the DHT as a new node. This makes it necessary for private trackers (or non-public distributions) to exclude themselves from participating.

Magnet links:
The Magnet URI scheme refers to resources available for download via peer-to-peer networks. Such a link typically identifies a file not by location but by content, more precisely by the content's cryptographic hash value. Although it could be used for other applications, it is particularly useful in a peer-to-peer context, because it allows resources to be referred to without the need for a continuously available host. Traditionally, .torrent files are downloaded from torrent sites, but several clients also support the Magnet URI scheme. A magnet link provides the torrent hash needed to find the nodes sharing the file in the DHT, and may also include a tracker for the file.

Message digest:
A message digest is a hash (fingerprint) computed from a plaintext block. All the information in the message is used to construct the message digest, but the message cannot be recovered from the hash. For this reason, message digests are also known as one-way hash functions.

SHA-1:
SHA-1 is the most widely used of the existing SHA hash functions and is employed in several widely used applications and protocols. SHA-1 produces a 160-bit message digest. SHA-1 and SHA-2 are the secure hash algorithms required by law for use in certain U.S. Government applications, including use within other cryptographic algorithms.
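As an illustration of how the digest above relates to torrents: in BitTorrent, the info hash is the SHA-1 digest of the bencoded info dictionary inside the torrent file, as described in the protocol specification [1], and this 160-bit value is what a magnet link carries. The following Python sketch is only an illustration and not part of the system described in this paper. It assumes a well-formed torrent file, and the helper names bdecode, parse_torrent and magnet_link, as well as the example tracker URL, are hypothetical.

```python
import hashlib
from urllib.parse import quote


def bdecode(data, i=0):
    """Decode one bencoded value at data[i]; return (value, next_index)."""
    lead = data[i:i + 1]
    if lead == b"i":                      # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if lead == b"l":                      # list: l<items>e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if lead == b"d":                      # dictionary: d<key><value>...e
        i += 1
        result = {}
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            value, i = bdecode(data, i)
            result[key] = value
        return result, i + 1
    colon = data.index(b":", i)           # byte string: <length>:<bytes>
    length = int(data[i:colon])
    start = colon + 1
    return data[start:start + length], start + length


def parse_torrent(data):
    """Return (info_hash_hex, name) for the raw bytes of a torrent file."""
    if data[:1] != b"d":
        raise ValueError("not a bencoded dictionary")
    i = 1
    while data[i:i + 1] != b"e":
        key, i = bdecode(data, i)
        value_start = i
        value, i = bdecode(data, i)
        if key == b"info":
            # Hash the exact bytes of the info dictionary as stored in the file.
            raw_info = data[value_start:i]
            digest = hashlib.sha1(raw_info).hexdigest()
            name = value[b"name"].decode("utf-8", errors="replace")
            return digest, name
    raise ValueError("torrent has no info dictionary")


def magnet_link(info_hash_hex, name=None, tracker=None):
    """Build a magnet URI carrying the info hash, with optional name and tracker."""
    link = "magnet:?xt=urn:btih:" + info_hash_hex
    if name:
        link += "&dn=" + quote(name, safe="")
    if tracker:
        link += "&tr=" + quote(tracker, safe="")
    return link


if __name__ == "__main__":
    # Hand-written sample torrent (hypothetical content, one 20-byte piece hash).
    sample = (b"d8:announce20:http://example.com/a"
              b"4:infod6:lengthi12345e4:name8:demo.iso"
              b"12:piece lengthi16384e6:pieces20:" + b"\x00" * 20 + b"ee")
    digest, name = parse_torrent(sample)
    print(digest, name)
    print(magnet_link(digest, name, "http://example.com/a"))
```

The digest is taken over the original byte span of the info dictionary rather than over a re-encoded copy, so the result does not depend on how an encoder would order or format the keys.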
This requirement, which also covers use within protocols, is intended for the protection of sensitive unclassified information.

Disadvantages:
The main disadvantage of the BitTorrent network is that many torrents are not accessible to the users participating in the file sharing process. There is no single place that gives access to all the torrents in the system. The websites that host or cache torrent files impose restrictions or index the available files incompletely.

3. PROPOSED SYSTEM

The major disadvantage of the BitTorrent system as a whole is that there is no access to all of the available torrents, and thus there is not much sharing among the peers. Although there are some hamsters/bots that collect torrent information from a considerable number of websites that host torrent files, this approach has its limits. Another option is the use of torrent caching sites, which cache torrent files on their servers and make them accessible only through their hash. Many torrent sites provide a torrent cache, but one cannot search through them without already having the hash of the wanted torrent. This makes these sites very inconvenient for a naive user to search.

One way to address this is to map the info hash value of each torrent to the name of the torrent by parsing the torrent file. The hash would be mapped to the torrent name along with a set of URLs and magnet links from which the torrent file can be downloaded, and these records would be stored in a database that the user can search by torrent name. This can be implemented in client software that interacts with the database on a server, or as a web based search.

The prerequisites for such a system would be a strong database capable of handling a large number of records at a given point of time, a high bandwidth internet connection (possibly the bandwidth of a server), and some knowledge of the BitTorrent protocol.

The database can first be populated by mapping the hash value of each torrent to its other key properties and inserting these records into the database. After this step, a search function needs to be implemented that searches the database for the keywords specified by the user and returns the links from which the torrent file can be downloaded. If the system is implemented as standalone application software, the software may accept a search string from the user, query the database on the server, and return the results related to that search string to the user. If it is implemented as a web based system, the system can accept a search string from the client and return the results to the client browser.

Through this system, users gain access to a large number of torrents on the network, through which they can share more data, and each torrent becomes accessible to a large number of users in the BitTorrent network.

Advantages:
The proposed system allows a user to access any torrent uploaded on a website that the user is not familiar with. Its major advantage is therefore access to any remote torrent, which makes the required search efficient.

Also, a torrent uploaded on multiple sites will be shown as a single result in our proposed system, unlike other search engines, which provide multiple results for a single torrent.

4. CONCLUSION

In this paper, we have presented the terms and characteristics of the BitTorrent protocol. The BitTorrent protocol covers the disadvantages of the Direct Connect protocol; still, as every coin has two sides, the BitTorrent protocol has disadvantages of its own. As shown above, our proposed system gives access to a large number of torrents that might not be accessible from familiar websites, thus allowing efficient torrent searching for everyone, including naive users.

5. REFERENCES

[1] Bram Cohen, The BitTorrent Protocol Specification, http://www.bittorrent.org/beps/bep_0003.html
[2] http://wiki.theory.org/BitTorrentSpecification#Bencoding
[3] http://en.wikipedia.org/wiki/Glossary_of_BitTorrent_terms
[4] John Hoffman, HTTP Seeding, http://www.bittorrent.org/beps/bep_0017.html
[5] J.A. Pouwelse, P. Garbacki, D.H.J. Epema, H.J. Sips, The Bittorrent P2P File-Sharing System: Measurements And Analysis, http://www.cs.unibo.it/babaoglu/courses/cas04-05/papers/bittorrent.pdf
[6] http://en.wikipedia.org/wiki/Comparison_of_BitTorrent_sites
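The index proposed in Section 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: it assumes SQLite as the database, and the table names (torrents, sources), the function names (open_index, add_torrent, search) and the example URLs in the usage are hypothetical, since the paper does not prescribe a schema. Keying each record by info hash means that a torrent found on several sites collapses into a single search result with several download links.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS torrents (
    info_hash TEXT PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS sources (
    info_hash TEXT NOT NULL REFERENCES torrents(info_hash),
    url       TEXT NOT NULL,
    UNIQUE (info_hash, url)
);
"""


def open_index(path=":memory:"):
    """Open (or create) the index database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_torrent(conn, info_hash, name, url):
    """Record that the torrent with this info hash is available at url."""
    info_hash = info_hash.lower()
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO torrents (info_hash, name) VALUES (?, ?)",
            (info_hash, name))
        conn.execute(
            "INSERT OR IGNORE INTO sources (info_hash, url) VALUES (?, ?)",
            (info_hash, url))


def _like_pattern(word):
    """Wrap a keyword for a substring LIKE match, escaping LIKE wildcards."""
    escaped = (word.replace("\\", "\\\\")
                   .replace("%", "\\%")
                   .replace("_", "\\_"))
    return "%" + escaped + "%"


def search(conn, query):
    """Return one result per torrent whose name contains every keyword."""
    words = query.split()
    if not words:
        return []
    where = " AND ".join("t.name LIKE ? ESCAPE '\\'" for _ in words)
    rows = conn.execute(
        "SELECT t.info_hash, t.name, s.url "
        "FROM torrents t JOIN sources s ON s.info_hash = t.info_hash "
        "WHERE " + where + " ORDER BY t.name, t.info_hash, s.url",
        [_like_pattern(w) for w in words])
    results = {}
    for info_hash, name, url in rows:
        entry = results.setdefault(
            info_hash, {"info_hash": info_hash, "name": name, "urls": []})
        entry["urls"].append(url)
    return list(results.values())


if __name__ == "__main__":
    conn = open_index()
    add_torrent(conn, "aa" * 20, "Demo Linux ISO", "http://site-a.example/demo.torrent")
    add_torrent(conn, "aa" * 20, "Demo Linux ISO", "http://site-b.example/demo.torrent")
    for result in search(conn, "linux iso"):
        print(result["name"], result["urls"])
```

A standalone client or a web front end, as discussed in Section 3, would call a function like search with the user's search string and present the returned links.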