The BitTorrent protocol has downloaders help send the file to other downloaders, reducing the burden on the original owner. This usually results in all participants downloading the file faster.
The way this is achieved might be best explained with an illustrated example using the worst illustrations ever made (I’m not a graphic designer).
Moe, Larry, and Curly each want to download a copy of a 120-megabyte file. Each of their computers can download 3 megabytes each minute (written as 3 Mb/min below) and can upload at the same rate. Currently, the file only exists on Shemp’s computer, which can download and upload at the same speeds.
The traditional approach is for each of Moe, Larry, and Curly to download the full file from Shemp. One way Shemp can do this is to upload the entire file exclusively to Moe, and then to Larry, and then to Curly. Since Shemp can upload at 3 Mb/min and each can download at 3 Mb/min, it takes 40 minutes to get it to Moe, another 40 minutes to get it to Larry, and a final 40 minutes to get it to Curly. The entire process takes two hours. Another way Shemp can do this is to upload the file simultaneously to all three by splitting his 3 Mb/min connection to 1 Mb/min to each of the three. The end result is the same: it takes two hours before all three have a copy.
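The arithmetic for both traditional options can be checked with a minimal sketch. The function and variable names here are made up for illustration; the numbers come straight from the example.

```python
FILE_SIZE = 120    # megabytes
UPLOAD_RATE = 3    # Shemp's upload speed, in megabytes per minute
N_DOWNLOADERS = 3  # Moe, Larry, and Curly


def sequential_minutes(file_size, upload_rate, n_downloaders):
    """Shemp uploads the whole file at full speed to one person at a time."""
    return n_downloaders * (file_size / upload_rate)


def simultaneous_minutes(file_size, upload_rate, n_downloaders):
    """Shemp splits his upload speed evenly, so everyone finishes together."""
    return file_size / (upload_rate / n_downloaders)


sequential = sequential_minutes(FILE_SIZE, UPLOAD_RATE, N_DOWNLOADERS)
simultaneous = simultaneous_minutes(FILE_SIZE, UPLOAD_RATE, N_DOWNLOADERS)
print(sequential, simultaneous)  # both come out to 120 minutes
```

Either way, Shemp’s single 3 Mb/min upload connection is the bottleneck.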
There’s something important to notice about this traditional case: a significant amount of bandwidth is wasted. If you compare the total available download and upload bandwidth with how much is actually being used, you’ll see that the network isn’t being used to its fullest. Since each of the four machines can download and upload at 3 Mb/min, they have a total of 12 Mb/min of download bandwidth and 12 Mb/min of upload bandwidth, but in both of the above traditional cases only 3 Mb/min of each is being used at a time. The network saturation here is 25% of download bandwidth and 25% of upload bandwidth.
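Here is a rough sketch of that saturation figure, using a snapshot of the sequential case while Moe is being served. The `saturation` helper and the dictionaries are illustrative names, not part of any real tool.

```python
CAP = 3  # each machine's download and upload limit, in megabytes per minute

# Snapshot of the sequential case while Shemp is uploading to Moe.
upload_in_use = {"Shemp": 3, "Moe": 0, "Larry": 0, "Curly": 0}
download_in_use = {"Shemp": 0, "Moe": 3, "Larry": 0, "Curly": 0}


def saturation(in_use, cap_per_machine):
    """Fraction of the network's total bandwidth that is actually busy."""
    return sum(in_use.values()) / (cap_per_machine * len(in_use))


upload_saturation = saturation(upload_in_use, CAP)
download_saturation = saturation(download_in_use, CAP)
print(upload_saturation, download_saturation)  # 0.25 0.25
```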
Seeing this, Shemp decides on the following plan:
- Shemp splits the file into four pieces.
- First, he will send the first piece to Moe, the second piece to Larry, and the third piece to Curly.
- Then, he will send out the fourth piece to each of them, but also instruct them to get the other two pieces from the other two participants at the same time.
Let’s see what happens when they put this plan into action:
- In Phase 1, Shemp sends each of the three a different 30 Mb piece. He splits his 3 Mb/min upload bandwidth between the three of them, so each piece uploads at 1 Mb/min.
- Shemp isn’t downloading anything (because he already has the file), but his upload connection is going full blast: he is uploading to each of Moe, Curly, and Larry at 1 Mb/min, for a total upload speed of 3 Mb/min (his maximum).
- Moe is only downloading at 1 Mb/min from Shemp, and he isn’t uploading anything. So he has 2 Mb/min of download bandwidth and 3 Mb/min of upload bandwidth going to waste. Same goes for Curly and Larry.
Phase 1 takes 30 minutes. When Phase 1 finishes, Shemp has the full file, Moe has piece #1 of 4, Larry has piece #2 of 4, and Curly has piece #3 of 4.
- In Phase 2, Shemp sends each of the three a second 30 Mb piece (piece #4), and each of the three also sends the piece they got in Phase 1 to the other two.
- As before, Shemp isn’t downloading anything, but continues to upload at top speed: 1 Mb/min for each of three clients, for his maximum speed of 3 Mb/min total.
- Moe already has piece #1 of 4 and is downloading piece #4 from Shemp at 1 Mb/min, piece #2 from Larry at 1 Mb/min, and piece #3 from Curly at 1 Mb/min. At the same time, he is uploading the piece he already has (piece #1) to Curly and Larry at 1 Mb/min each. So Moe’s connection is almost going at full blast; he is downloading at his maximum of 3 Mb/min and uploading at 2 Mb/min.
- Larry already has piece #2 of 4, and he’s receiving each of the other three pieces at 1 Mb/min each while sending out piece #2 to Moe and Curly at 1 Mb/min each.
- Curly already has piece #3 of 4, and he’s receiving each of the other three pieces at 1 Mb/min each while sending out piece #3 to Moe and Larry at 1 Mb/min each.
Phase 2 takes 30 minutes. When it finishes, everyone has the full file.
Both phases took a total of one hour, which is half the time the traditional approach took.
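The two phases can be replayed as a small simulation. This is only a sketch of the example, not of the real protocol: it assumes every transfer runs at exactly 1 Mb/min, as in the walkthrough, and the `run_phase` helper and transfer lists are names invented here.

```python
from collections import Counter

PIECE_SIZE = 30  # megabytes per piece
CAP = 3          # per-machine upload and download limit, megabytes per minute

# Each transfer is (sender, receiver, piece number).
phase1 = [("Shemp", "Moe", 1), ("Shemp", "Larry", 2), ("Shemp", "Curly", 3)]
phase2 = [
    ("Shemp", "Moe", 4), ("Shemp", "Larry", 4), ("Shemp", "Curly", 4),
    ("Moe", "Larry", 1), ("Moe", "Curly", 1),
    ("Larry", "Moe", 2), ("Larry", "Curly", 2),
    ("Curly", "Moe", 3), ("Curly", "Larry", 3),
]


def run_phase(transfers, holdings, rate=1):
    """Apply one phase of transfers and return how many minutes it takes."""
    uploads = Counter(sender for sender, _, _ in transfers)
    downloads = Counter(receiver for _, receiver, _ in transfers)
    busiest = max(list(uploads.values()) + list(downloads.values()))
    if busiest * rate > CAP:
        raise ValueError("a machine would exceed its bandwidth limit")
    # Senders may only share pieces they held when the phase started.
    for sender, _, piece in transfers:
        if piece not in holdings[sender]:
            raise ValueError(f"{sender} does not have piece {piece}")
    for _, receiver, piece in transfers:
        holdings[receiver].add(piece)
    return PIECE_SIZE / rate


holdings = {"Shemp": {1, 2, 3, 4}, "Moe": set(), "Larry": set(), "Curly": set()}
total_minutes = run_phase(phase1, holdings) + run_phase(phase2, holdings)
print(total_minutes)  # 60 minutes in total
```

After both calls, every entry in `holdings` contains all four pieces.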
By splitting up the file, sending different parts to clients, and having them help in sharing the pieces, the time was cut down to half of the traditional approach. The clients don’t need to be able to download or upload any faster; the network is simply better saturated. During Phase 2, 75% of the download bandwidth (9 Mb/min out of the available 12 Mb/min) and 75% of the upload bandwidth (9 Mb/min out of the available 12 Mb/min) are in use.
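As a quick check on those numbers, the sketch below works out the saturation of each phase and the time-weighted average across both. The `phases` list is just an illustrative way to hold the figures from the walkthrough; the same values apply to upload and download.

```python
CAPACITY = 4 * 3  # four machines at 3 megabytes per minute each

# Bandwidth in use during each phase (identical for upload and download).
phases = [
    {"name": "Phase 1", "minutes": 30, "used": 3},
    {"name": "Phase 2", "minutes": 30, "used": 9},
]

per_phase = {p["name"]: p["used"] / CAPACITY for p in phases}

total_minutes = sum(p["minutes"] for p in phases)
megabytes_moved = sum(p["used"] * p["minutes"] for p in phases)
average = megabytes_moved / (CAPACITY * total_minutes)

print(per_phase)  # {'Phase 1': 0.25, 'Phase 2': 0.75}
print(average)    # 0.5
```

Averaging 50% instead of 25% is what halves the total time: the same 360 megabytes get moved in 60 minutes rather than 120.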
And this is what BitTorrent does, except with much smaller pieces, redundancy in case some clients vanish without uploading their pieces, and sending unequal amounts of pieces to clients based on how fast their download and upload speeds are looking.
This example can also help explain some commonly-used terms.
A seeder is someone who has a complete copy of the file, but is staying in the network anyway in order to help in the task of uploading the file. In the above example, Shemp is the only seeder throughout Phases 1 & 2.
Let’s say that after Moe, Curly, and Larry finish downloading the file, they stick around on the network. Soon enough, Joe comes along and wants a copy of the file, but Joe’s internet connection can download at up to a whopping 12 Mb/min. He could download the complete file from any of Moe, Larry, Curly, or Shemp directly at 3 Mb/min (since that’s the fastest any of their upload connections will go), or he can download separate pieces from each of the four at 3 Mb/min from each person, fully saturating his download bandwidth and receiving the file in 1/4 the time. In this scenario, Moe, Larry, and Curly are also seeders: they all have the full file and therefore don’t strictly need to be on the network anymore, but they stick around to help upload the file to others (Shemp started off and continues to be a seeder the whole time).
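Joe’s two options boil down to one rule of thumb: his download rate is limited by the smaller of his own download cap and the combined upload speed of whoever is sending to him. A minimal sketch of that rule, with a made-up `download_minutes` helper:

```python
def download_minutes(file_size, download_cap, seeder_upload_rates):
    """Time to fetch the file when each seeder sends a different part."""
    rate = min(download_cap, sum(seeder_upload_rates))
    return file_size / rate


from_one_seeder = download_minutes(120, 12, [3])
from_four_seeders = download_minutes(120, 12, [3, 3, 3, 3])
print(from_one_seeder, from_four_seeders)  # 40.0 10.0
```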
A user’s share ratio compares the amount of data he has uploaded to others with the amount of data he has downloaded.
After Phases 1 & 2 finish, Moe, Larry, and Curly each have a share ratio of 0.5, because each downloaded four 30 Mb pieces (120 Mb total) but only uploaded two copies of a single 30 Mb piece (60 Mb total). Shemp’s share ratio is undefined, since he hasn’t downloaded any part of the file (resulting in a division by 0 when determining his share ratio).
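A share ratio is just uploaded divided by downloaded, with the division-by-zero case handled separately. The sketch below returns `None` for "undefined"; that is a choice made here for illustration, and real clients may display something else. Shemp’s 180 Mb of uploads is the six 30 Mb transfers he made across both phases.

```python
def share_ratio(uploaded, downloaded):
    """Uploaded / downloaded, or None when nothing has been downloaded."""
    if downloaded == 0:
        return None  # undefined: avoids dividing by zero
    return uploaded / downloaded


moe_ratio = share_ratio(uploaded=60, downloaded=120)
shemp_ratio = share_ratio(uploaded=180, downloaded=0)
print(moe_ratio, shemp_ratio)  # 0.5 None
```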
Generally speaking, it is considered good etiquette to remain on the network after you finish downloading (i.e. be a seeder) until your share ratio is at least 1 (meaning you’ve uploaded the same amount that you’ve downloaded). Some private torrent trackers keep track of your average share ratio and give you perks for keeping it high, or blacklist you if it’s too low.
If nobody else joins the network to download the file, it is impossible for Moe, Curly, and Larry to improve their share ratios, because they have nobody else to upload to. On the other hand, if Joe does come along as in the seeder example above, Moe, Curly, and Larry’s share ratios will improve to 0.75 (because each will have sent another 30 Mb piece; 90 Mb/120 Mb = 0.75), but Joe’s share ratio will be 0 because he has no one to upload to. Eventually, Joe cannot stay around on the network trying to seed anymore, so he is forced to stop with a share ratio of 0, which is unfortunate but not uncommon. Hopefully, his tracker is intelligent enough to see that he tried seeding it for a long time and cut him some slack.
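To see how Joe’s arrival changes everyone’s numbers, here is a small ledger sketch. The `totals` dictionary is an illustrative structure holding each person’s running upload and download totals in megabytes; Joe takes one 30 Mb piece from each of the four seeders.

```python
# Running totals (in megabytes) after Phases 1 & 2, before Joe arrives.
totals = {
    "Shemp": {"up": 180, "down": 0},
    "Moe": {"up": 60, "down": 120},
    "Larry": {"up": 60, "down": 120},
    "Curly": {"up": 60, "down": 120},
    "Joe": {"up": 0, "down": 0},
}

# Joe downloads a different 30-megabyte piece from each seeder.
for seeder in ["Shemp", "Moe", "Larry", "Curly"]:
    totals[seeder]["up"] += 30
    totals["Joe"]["down"] += 30

# None stands for an undefined ratio (nothing downloaded).
ratios = {
    name: (t["up"] / t["down"] if t["down"] else None)
    for name, t in totals.items()
}
print(ratios)
```

Moe, Larry, and Curly end up at 0.75, Joe at 0, and Shemp’s ratio stays undefined.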
A leecher is someone who does a poor job of helping to upload the file, often by leaving as soon as he has finished downloading it and thereby preventing himself from being a resource to the rest of his peers.
Let’s reconsider Phases 1 & 2 if Shemp and Moe both had much faster internet connections:
Phase 1 happens a bit faster, but the end result is the same: each client has 1 of 4 pieces of the file.
In Phase 2, however, only Larry and Curly are constrained to downloading individual pieces at 1 Mb/min. In fact, because they have 1 Mb/min of upload bandwidth to spare, they can send their pieces to Moe twice as quickly. And Shemp can also send the fourth piece to Moe twice as quickly. So Moe ends up with a complete file halfway through Phase 2, and he can leave immediately if he wants to.
If Moe leaves, this is a major annoyance for everyone else. Since Moe left before he could finish uploading piece #1 to Larry and Curly, they have to get it from Shemp after they finish getting #4 from Shemp. So, in leaving, Moe has put a greater burden on Shemp and made the download take longer for Larry and Curly.
Another more tragic situation, which happens rather commonly in practice, is for all seeders to leave before every piece is available to non-seeders. For example, if Shemp leaves as soon as Moe does, figuring he has finished uploading all 4 pieces of the file, then piece #4 no longer exists in full on the network (Shemp only got halfway through uploading it to Larry and Curly before he left). Larry and Curly can share the pieces they have with each other, but this will only leave them with two complete pieces (#2 and #3), which is half of the file. The torrent is now dead, and can never be revived unless one of Shemp or Moe comes back to resume seeding. (This is what users are asking when they say things like “Seed!” or “Stuck at 99%! Seed, please!”)
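Whether a torrent is dead comes down to whether every piece still exists somewhere on the network. The sketch below checks that by pooling the complete pieces held by whoever is still connected. It deliberately ignores half-finished pieces, and `missing_pieces` is a hypothetical helper written for this example.

```python
ALL_PIECES = {1, 2, 3, 4}


def missing_pieces(complete_pieces_by_peer):
    """Pieces that no connected peer holds in full."""
    available = set().union(*complete_pieces_by_peer.values())
    return ALL_PIECES - available


# Shemp and Moe have both left; only Larry and Curly remain.
after_seeders_leave = missing_pieces({"Larry": {2}, "Curly": {3}})
print(after_seeders_leave)  # {1, 4}: the torrent is dead

# If Shemp comes back to seed, nothing is missing anymore.
after_shemp_returns = missing_pieces(
    {"Larry": {2}, "Curly": {3}, "Shemp": {1, 2, 3, 4}}
)
print(after_shemp_returns)  # set()
```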
A torrent file is a file that describes the files involved, their sizes, which server (the tracker) helps the participants find one another, and other information. Because it does not actually contain the files being sent around, the torrent file tends to be very small. It is more accurate to think of a torrent file as an announcement that there are people willing to upload the described files, rather than as an archive that actually contains those files. So a torrent upload site / tracker (like http://linuxtracker.org) doesn’t need to actually have disk space for every file in every torrent it is hosting.
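To make "describes the files without containing them" concrete, here is a toy sketch of a single-file torrent. It assumes the commonly documented metainfo layout: a bencoded dictionary with an `announce` URL and an `info` section holding the name, length, piece length, and one SHA-1 hash per piece. The tracker URL, file name, and the tiny 30-byte piece size are made up for illustration, and real torrents carry more fields than this.

```python
import hashlib


def bencode(value):
    """Encode ints, bytes, strings, lists, and dicts in bencode form."""
    if isinstance(value, int):
        return b"i" + str(value).encode() + b"e"
    if isinstance(value, bytes):
        return str(len(value)).encode() + b":" + value
    if isinstance(value, str):
        return bencode(value.encode("utf-8"))
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are written in sorted order.
        items = sorted(value.items(), key=lambda kv: kv[0].encode("utf-8"))
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def make_metainfo(name, data, piece_length, tracker_url):
    """Describe `data` by its size and per-piece hashes, not its contents."""
    hashes = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    return {
        "announce": tracker_url,
        "info": {
            "name": name,
            "length": len(data),
            "piece length": piece_length,
            "pieces": hashes,  # 20 bytes of SHA-1 per piece
        },
    }


data = bytes(120)  # stand-in for the shared file: 120 zero bytes
metainfo = make_metainfo("stooges.bin", data, 30, "http://tracker.example/announce")
encoded = bencode(metainfo)
print(len(metainfo["info"]["pieces"]) // 20)  # 4 pieces
```

The description grows by only 20 bytes per piece, however large each piece is, which is why torrent files stay small.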
In fact, this is a distinction that has made shutting down torrent-hosting sites (like The Pirate Bay) much more complicated: they don’t actually have the copyrighted work; all they have is a note from some stranger that he has the file and is willing to share it.
Shemp likely used any common BitTorrent client to create a torrent file describing what he wanted to upload and then sent that torrent file to a tracker of his choosing, where Moe, Larry, and Curly found it.
This question originally appeared on Quora.