Monday, July 19, 2010

On Data Transfer in Perceptio

There are two methods of data transfer available today. One is direct transfer, which includes FTP, telephony and HTTP. Direct transfer is efficient for small files and when only a small number of users are downloading simultaneously. As file size and the number of users increase, server capacity requirements increase and transfer efficiency decreases. However, a direct transfer begins at the maximum available transfer speed.

The other method of transfer is a distributed approach. Although it is relatively new, a lot of research has gone into improving the efficiency of distributed data transfer. Initially developed from peer-to-peer technologies, distributed data transfer algorithms are now used in industrial-scale Content Distribution Networks (CDNs). The best-known protocol for distributed data transfer is BitTorrent. The technology has matured to the point that BitTorrent traffic is estimated to account for about 40% of Internet traffic. Other technologies that use distributed data transfer are the Gnutella network and the Pando CDN. Distributed data transfer is the opposite of direct data transfer: the larger the file and the more users downloading it, the higher the data availability and the faster the download becomes. However, the transfer is slow at first while the system connects to other hosts.

Data transfer in Perceptio uses both methods, each where it is most efficient. Small files called Ghost Files, similar to torrent files, are created for each file the user wishes to make available on the network. These files can then be propagated, routed and/or indexed by other hosts and/or services.
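To make the idea of using each method where it is most efficient a little more concrete, here is a minimal sketch of how such a choice could be made. It is not Perceptio's actual logic: the function name, the 1 MiB size threshold and the minimum peer count are all assumptions picked purely for illustration.

```python
# Hypothetical thresholds, chosen only for illustration.
SMALL_FILE_BYTES = 1 * 1024 * 1024  # files up to 1 MiB count as "small"
MIN_PEERS = 3                       # fewer peers than this: not worth a swarm


def choose_transfer_method(size_bytes: int, peers: int) -> str:
    """Return "direct" or "distributed" for a single transfer.

    Direct transfer starts at full speed and suits small files or
    few downloaders. Distributed transfer starts slowly but scales
    with file size and with the number of hosts sharing the file.
    """
    if size_bytes <= SMALL_FILE_BYTES or peers < MIN_PEERS:
        return "direct"
    return "distributed"
```

With these assumed values, a 10 KiB file goes direct no matter how many hosts have it, while a 700 MiB file shared by 50 hosts goes distributed.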

A Ghost File contains the following:

  • Routing information
  • Owner of the data
  • SHA1 checksum information
  • Restrictions
  • Category
  • Propagate request
  • Destination (if any)
  • Digital signature (if any)
  • Description of the data (if any)
