Document piece length selection algorithm
Add a page to the book discussing factors in piece length selection, and
Intermodal's piece length selection algorithm.

type: documentation
pr: https://github.com/casey/intermodal/pull/392
fixes:
- https://github.com/casey/intermodal/issues/367
parent 3ed449ce93
commit 09b0ee316c
| @ -4,7 +4,8 @@ Changelog | ||||
| 
 | ||||
| UNRELEASED - 2020-04-19 | ||||
| ----------------------- | ||||
| - :books: [`xxxxxxxxxxxx`](https://github.com/casey/intermodal/commits/master) Generate reference sections with `bin/gen` - _Casey Rodarmor <casey@rodarmor.com>_ | ||||
| - :books: [`xxxxxxxxxxxx`](https://github.com/casey/intermodal/commits/master) Document piece length selection algorithm ([#392](https://github.com/casey/intermodal/pull/392)) - Fixes [#367](https://github.com/casey/intermodal/issues/367) - _Casey Rodarmor <casey@rodarmor.com>_ | ||||
| - :books: [`3ed449ce9325`](https://github.com/casey/intermodal/commit/3ed449ce932509ac88bd4837d74c9cbbb0729da9) Generate reference sections with `bin/gen` - _Casey Rodarmor <casey@rodarmor.com>_ | ||||
| - :art: [`a6bf75279181`](https://github.com/casey/intermodal/commit/a6bf7527918178821e080db10e65b057f427200d) Use `invariant` instead of `unwrap` and `expect` - Fixes [#167](https://github.com/casey/intermodal/issues/167) - _Casey Rodarmor <casey@rodarmor.com>_ | ||||
| - :white_check_mark: [`faf46c0f0e6f`](https://github.com/casey/intermodal/commit/faf46c0f0e6fd4e4f8b504d414a3bf02d7d68e4a) Test that globs match torrent contents - Fixes [#377](https://github.com/casey/intermodal/issues/377) - _Casey Rodarmor <casey@rodarmor.com>_ | ||||
| - :books: [`0a754d0bcfcf`](https://github.com/casey/intermodal/commit/0a754d0bcfcfd65127d7b6e78d41852df78d3ea2) Add manual Arch install link - Fixes [#373](https://github.com/casey/intermodal/issues/373) - _Casey Rodarmor <casey@rodarmor.com>_ | ||||
|  | ||||
| @ -6,9 +6,10 @@ Summary | ||||
| {{commands}} | ||||
| 
 | ||||
| - [Bittorrent](./bittorrent.md) | ||||
|   - [Distributing Large Data Sets](./bittorrent/distributing-large-data-sets.md) | ||||
|   - [Piece Length Selection](./bittorrent/piece-length-selection.md) | ||||
|   - [BEP Support](./bittorrent/bep-support.md) | ||||
|   - [Metainfo Utilities](./bittorrent/metainfo-utilities.md) | ||||
|   - [Distributing Large Data Sets](./bittorrent/distributing-large-data-sets.md) | ||||
|   - [UDP Tracker Protocol](./bittorrent/udp-tracker-protocol.md) | ||||
| 
 | ||||
| {{references}} | ||||
|  | ||||
| @ -15,9 +15,10 @@ Summary | ||||
|   - [`imdl torrent verify`](./commands/imdl-torrent-verify.md) | ||||
| 
 | ||||
| - [Bittorrent](./bittorrent.md) | ||||
|   - [Distributing Large Data Sets](./bittorrent/distributing-large-data-sets.md) | ||||
|   - [Piece Length Selection](./bittorrent/piece-length-selection.md) | ||||
|   - [BEP Support](./bittorrent/bep-support.md) | ||||
|   - [Metainfo Utilities](./bittorrent/metainfo-utilities.md) | ||||
|   - [Distributing Large Data Sets](./bittorrent/distributing-large-data-sets.md) | ||||
|   - [UDP Tracker Protocol](./bittorrent/udp-tracker-protocol.md) | ||||
| 
 | ||||
| - [References](./references.md) | ||||
|  | ||||
							
								
								
									
book/src/bittorrent/piece-length-selection.md (127 lines, Normal file)
| @ -0,0 +1,127 @@ | ||||
| BitTorrent Piece Length Selection | ||||
| ================================= | ||||
| 
 | ||||
| BitTorrent `.torrent` files contain so-called metainfo that allows BitTorrent | ||||
| peers to locate, download, and verify the contents of a torrent. | ||||
| 
 | ||||
| This metainfo includes the piece list, a list of SHA-1 hashes of fixed-size | ||||
| pieces of the torrent data. The size of these pieces is chosen by the torrent | ||||
| creator. | ||||
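
As a rough illustration of how a piece list is built, the sketch below splits a byte string into fixed-size pieces and hashes each one with SHA-1. The function names `piece_hashes` and `piece_list` are hypothetical and are not taken from Intermodal; a real torrent creator would stream file contents rather than hold them in memory.

```python
import hashlib


def piece_hashes(data: bytes, piece_length: int) -> list[bytes]:
    """Return the SHA-1 digest of each fixed-size piece of `data`.

    The final piece may be shorter than `piece_length`.
    """
    if piece_length <= 0:
        raise ValueError("piece_length must be positive")
    return [
        hashlib.sha1(data[offset:offset + piece_length]).digest()
        for offset in range(0, len(data), piece_length)
    ]


def piece_list(data: bytes, piece_length: int) -> bytes:
    """Concatenate the 20-byte digests into a single piece list."""
    return b"".join(piece_hashes(data, piece_length))
```

Each SHA-1 digest is 20 bytes, so the piece list grows by 20 bytes for every piece.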
| 
 | ||||
| Intermodal has a simple algorithm that attempts to pick a reasonable piece | ||||
| length for a torrent given the size of the contents. | ||||
| 
 | ||||
| For compatibility with the | ||||
| [BitTorrent v2 specification](http://bittorrent.org/beps/bep_0052.html), the | ||||
| algorithm chooses piece lengths that are powers of two, and that are at least | ||||
| 16 KiB. | ||||
| 
 | ||||
| The maximum automatically chosen piece length is 16 MiB, as piece lengths larger | ||||
| than 16 MiB have been reported to cause issues for some clients. | ||||
| 
 | ||||
| Beyond these constraints, several other factors affect the choice of piece | ||||
| length. | ||||
| 
 | ||||
| 
 | ||||
| Factors favoring smaller piece length | ||||
| ------------------------------------- | ||||
| 
 | ||||
| - To avoid uploading bad data, peers only upload data from full pieces, which | ||||
|   can be verified by hash. Decreasing the piece size allows peers to more | ||||
|   quickly obtain a full piece, which decreases the time before they begin | ||||
|   uploading, and receiving data in return. | ||||
| 
 | ||||
| - Decreasing the piece size decreases the amount of data that must be thrown | ||||
|   away in case of corruption. | ||||
| 
 | ||||
| 
 | ||||
| Factors favoring larger piece length | ||||
| ------------------------------------ | ||||
| 
 | ||||
| - Increasing the piece size decreases the protocol overhead from requesting | ||||
|   many pieces. | ||||
| 
 | ||||
| - Increasing the piece size decreases the number of pieces, decreasing the | ||||
|   size of the metainfo. | ||||
| 
 | ||||
| - Increasing piece length decreases the proportion of disk seeks to disk | ||||
|   reads, which can be beneficial for spinning disks. | ||||
| 
 | ||||
| 
 | ||||
| Intermodal's Algorithm | ||||
| ---------------------- | ||||
| 
 | ||||
| In Python, the algorithm used by Intermodal is: | ||||
| 
 | ||||
| ```python | ||||
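| import math | ||||
| 
 | ||||
| MIN = 16 * 1024         # 16 KiB, the smallest piece length chosen | ||||
| MAX = 16 * 1024 * 1024  # 16 MiB, the largest automatically chosen piece length | ||||
| 
 | ||||
| def piece_length(content_length): | ||||
|   # Roughly 16 times the square root of the content length, rounded down to | ||||
|   # a power of two and clamped to [MIN, MAX]. | ||||
|   exponent = math.log2(content_length) | ||||
|   length = 1 << int(exponent / 2 + 4) | ||||
|   return min(max(length, MIN), MAX) | ||||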
| ``` | ||||
| 
 | ||||
| This gives the following piece lengths: | ||||
| 
 | ||||
| ``` | ||||
| Content -> Piece Length x Count    = Piece List Size | ||||
| 16 KiB  -> 16 KiB       x 1        = 20 bytes | ||||
| 32 KiB  -> 16 KiB       x 2        = 40 bytes | ||||
| 64 KiB  -> 16 KiB       x 4        = 80 bytes | ||||
| 128 KiB -> 16 KiB       x 8        = 160 bytes | ||||
| 256 KiB -> 16 KiB       x 16       = 320 bytes | ||||
| 512 KiB -> 16 KiB       x 32       = 640 bytes | ||||
| 1 MiB   -> 16 KiB       x 64       = 1.25 KiB | ||||
| 2 MiB   -> 16 KiB       x 128      = 2.5 KiB | ||||
| 4 MiB   -> 32 KiB       x 128      = 2.5 KiB | ||||
| 8 MiB   -> 32 KiB       x 256      = 5 KiB | ||||
| 16 MiB  -> 64 KiB       x 256      = 5 KiB | ||||
| 32 MiB  -> 64 KiB       x 512      = 10 KiB | ||||
| 64 MiB  -> 128 KiB      x 512      = 10 KiB | ||||
| 128 MiB -> 128 KiB      x 1024     = 20 KiB | ||||
| 256 MiB -> 256 KiB      x 1024     = 20 KiB | ||||
| 512 MiB -> 256 KiB      x 2048     = 40 KiB | ||||
| 1 GiB   -> 512 KiB      x 2048     = 40 KiB | ||||
| 2 GiB   -> 512 KiB      x 4096     = 80 KiB | ||||
| 4 GiB   -> 1 MiB        x 4096     = 80 KiB | ||||
| 8 GiB   -> 1 MiB        x 8192     = 160 KiB | ||||
| 16 GiB  -> 2 MiB        x 8192     = 160 KiB | ||||
| 32 GiB  -> 2 MiB        x 16384    = 320 KiB | ||||
| 64 GiB  -> 4 MiB        x 16384    = 320 KiB | ||||
| 128 GiB -> 4 MiB        x 32768    = 640 KiB | ||||
| 256 GiB -> 8 MiB        x 32768    = 640 KiB | ||||
| 512 GiB -> 8 MiB        x 65536    = 1.25 MiB | ||||
| 1 TiB   -> 16 MiB       x 65536    = 1.25 MiB | ||||
| 2 TiB   -> 16 MiB       x 131072   = 2.5 MiB | ||||
| 4 TiB   -> 16 MiB       x 262144   = 5 MiB | ||||
| 8 TiB   -> 16 MiB       x 524288   = 10 MiB | ||||
| 16 TiB  -> 16 MiB       x 1048576  = 20 MiB | ||||
| 32 TiB  -> 16 MiB       x 2097152  = 40 MiB | ||||
| 64 TiB  -> 16 MiB       x 4194304  = 80 MiB | ||||
| 128 TiB -> 16 MiB       x 8388608  = 160 MiB | ||||
| 256 TiB -> 16 MiB       x 16777216 = 320 MiB | ||||
| 512 TiB -> 16 MiB       x 33554432 = 640 MiB | ||||
| 1 PiB   -> 16 MiB       x 67108864 = 1.25 GiB | ||||
| ``` | ||||
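
The rows of the table can be recomputed with a short script. The sketch below restates the algorithm and adds a hypothetical helper, `row`, that returns the piece length, piece count, and piece list size in bytes for a given content length. It assumes 20 bytes per piece, the size of a SHA-1 digest.

```python
import math

MIN = 16 * 1024         # 16 KiB
MAX = 16 * 1024 * 1024  # 16 MiB
HASH_SIZE = 20          # bytes in a SHA-1 digest


def piece_length(content_length):
    exponent = math.log2(content_length)
    length = 1 << int(exponent / 2 + 4)
    return min(max(length, MIN), MAX)


def row(content_length):
    """Return (piece length, piece count, piece list size), all in bytes or pieces."""
    length = piece_length(content_length)
    count = -(-content_length // length)  # ceiling division
    return length, count, count * HASH_SIZE


# Walk the same content sizes as the table: 16 KiB (2**14) through 1 PiB (2**50).
for exponent in range(14, 51):
    length, count, list_size = row(1 << exponent)
    print(f"{1 << exponent} -> {length} x {count} = {list_size} bytes")
```

The script prints raw byte counts rather than the KiB, MiB, and GiB units used in the table.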
| 
 | ||||
| 
 | ||||
| References | ||||
| ---------- | ||||
| 
 | ||||
| ### Articles | ||||
| 
 | ||||
| - [Vuze Wiki](https://wiki.vuze.com/w/Torrent_Piece_Size) | ||||
| 
 | ||||
| - [TorrentFreak](https://torrentfreak.com/how-to-make-the-best-torrents-081121/) | ||||
| 
 | ||||
| ### Implementations | ||||
| 
 | ||||
| - [libtorrent](https://github.com/arvidn/libtorrent/blob/a3440e54bb7f65ac6100c3d993c53f887025d660/src/create_torrent.cpp#L367) | ||||
| 
 | ||||
| - [libtransmission](https://github.com/transmission/transmission/blob/a482100f0cbae8050fd7e954af2cb1311205916e/libtransmission/makemeta.c#L89) | ||||
| 
 | ||||
| - [dottorrent](https://github.com/kz26/dottorrent/blob/fea5714efe0cde2a55eabfb387295781a78d84bb/dottorrent/__init__.py#L154) | ||||
| 
 | ||||
| - [Torrent File Editor](https://github.com/torrent-file-editor/torrent-file-editor/blob/811e401b38f26b6d94c4808c54ae2dcc7bbc27dd/mainwindow.cpp#L1210) | ||||
							
								
								
									
book/src/bittorrent/piece-length.md (127 lines, Normal file)
| @ -0,0 +1,127 @@ | ||||
| Piece Length Selection | ||||
| ====================== | ||||
| 
 | ||||
| BitTorrent `.torrent` files contain so-called metainfo that allows BitTorrent | ||||
| peers to locate, download, and verify the contents of a torrent. | ||||
| 
 | ||||
| This metainfo includes the piece list, a list of SHA-1 hashes of fixed-size | ||||
| pieces of the torrent data. The size of these pieces is chosen by the torrent | ||||
| creator. | ||||
| 
 | ||||
| Intermodal has a simple algorithm that attempts to pick a reasonable piece | ||||
| length for a torrent given the size of the contents. | ||||
| 
 | ||||
| For compatibility with the | ||||
| [BitTorrent v2 specification](http://bittorrent.org/beps/bep_0052.html), the | ||||
| algorithm chooses piece lengths that are powers of two, and that are at least | ||||
| 16 KiB. | ||||
| 
 | ||||
| The maximum automatically chosen piece length is 16 MiB, as piece lengths larger | ||||
| than 16 MiB have been reported to cause issues for some clients. | ||||
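
The constraints above can be expressed as a small check. This is only a sketch: `check_piece_length` and `is_power_of_two` are hypothetical helpers, not part of Intermodal, and the messages are illustrative.

```python
MIN = 16 * 1024         # 16 KiB, the minimum piece length described above
MAX = 16 * 1024 * 1024  # 16 MiB, the largest automatically chosen piece length


def is_power_of_two(n: int) -> bool:
    """True when exactly one bit of `n` is set."""
    return n > 0 and (n & (n - 1)) == 0


def check_piece_length(length: int) -> list[str]:
    """Return a list of problems; an empty list means every constraint holds."""
    problems = []
    if not is_power_of_two(length):
        problems.append("not a power of two")
    if length < MIN:
        problems.append("smaller than 16 KiB")
    if length > MAX:
        problems.append("larger than 16 MiB, which some clients reportedly mishandle")
    return problems
```

For example, `check_piece_length(48 * 1024)` reports that the length is not a power of two, while `check_piece_length(256 * 1024)` returns an empty list.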
| 
 | ||||
| Beyond these constraints, several other factors affect the choice of piece | ||||
| length. | ||||
| 
 | ||||
| 
 | ||||
| Factors favoring smaller piece length | ||||
| ------------------------------------- | ||||
| 
 | ||||
| - To avoid uploading bad data, peers only upload data from full pieces, which | ||||
|   can be verified by hash. Decreasing the piece size allows peers to more | ||||
|   quickly obtain a full piece, which decreases the time before they begin | ||||
|   uploading, and receiving data in return. | ||||
| 
 | ||||
| - Decreasing the piece size decreases the amount of data that must be thrown | ||||
|   away in case of corruption. | ||||
| 
 | ||||
| 
 | ||||
| Factors favoring larger piece length | ||||
| ------------------------------------ | ||||
| 
 | ||||
| - Increasing the piece size decreases the protocol overhead from requesting | ||||
|   many pieces. | ||||
| 
 | ||||
| - Increasing the piece size decreases the number of pieces, decreasing the | ||||
|   size of torrent metainfo. | ||||
| 
 | ||||
| - Increasing piece length decreases the proportion of disk seeks to disk | ||||
|   reads, which can be beneficial for spinning disks. | ||||
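
To make the metainfo trade-off concrete, the sketch below holds the content size fixed at 1 GiB and shows how the piece count and piece list size shrink as the piece length doubles. The helpers `piece_count` and `piece_list_size` are hypothetical names, and the 20 bytes per piece assumes SHA-1 digests.

```python
HASH_SIZE = 20  # bytes in a SHA-1 digest


def piece_count(content_length: int, piece_length: int) -> int:
    """Number of pieces needed to cover the content (ceiling division)."""
    return -(-content_length // piece_length)


def piece_list_size(content_length: int, piece_length: int) -> int:
    """Size in bytes of the piece list for the given content and piece length."""
    return piece_count(content_length, piece_length) * HASH_SIZE


content = 1 << 30  # 1 GiB
for exponent in range(14, 25):  # 16 KiB through 16 MiB
    length = 1 << exponent
    print(length, piece_count(content, length), piece_list_size(content, length))
```

Each doubling of the piece length halves the piece list, at the cost of larger units of verification and retransmission.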
| 
 | ||||
| 
 | ||||
| Intermodal's Algorithm | ||||
| ---------------------- | ||||
| 
 | ||||
| In Python, the algorithm used by Intermodal is: | ||||
| 
 | ||||
| ```python | ||||
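| import math | ||||
| 
 | ||||
| MIN = 16 * 1024         # 16 KiB, the smallest piece length chosen | ||||
| MAX = 16 * 1024 * 1024  # 16 MiB, the largest automatically chosen piece length | ||||
| 
 | ||||
| def piece_length(content_length): | ||||
|   # Roughly 16 times the square root of the content length, rounded down to | ||||
|   # a power of two and clamped to [MIN, MAX]. | ||||
|   exponent = math.log2(content_length) | ||||
|   length = 1 << int(exponent / 2 + 4) | ||||
|   return min(max(length, MIN), MAX) | ||||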
| ``` | ||||
| 
 | ||||
| This gives the following piece lengths: | ||||
| 
 | ||||
| ``` | ||||
| Content -> Piece Length x Count    = Piece List Size | ||||
| 16 KiB  -> 16 KiB       x 1        = 20 bytes | ||||
| 32 KiB  -> 16 KiB       x 2        = 40 bytes | ||||
| 64 KiB  -> 16 KiB       x 4        = 80 bytes | ||||
| 128 KiB -> 16 KiB       x 8        = 160 bytes | ||||
| 256 KiB -> 16 KiB       x 16       = 320 bytes | ||||
| 512 KiB -> 16 KiB       x 32       = 640 bytes | ||||
| 1 MiB   -> 16 KiB       x 64       = 1.25 KiB | ||||
| 2 MiB   -> 16 KiB       x 128      = 2.5 KiB | ||||
| 4 MiB   -> 32 KiB       x 128      = 2.5 KiB | ||||
| 8 MiB   -> 32 KiB       x 256      = 5 KiB | ||||
| 16 MiB  -> 64 KiB       x 256      = 5 KiB | ||||
| 32 MiB  -> 64 KiB       x 512      = 10 KiB | ||||
| 64 MiB  -> 128 KiB      x 512      = 10 KiB | ||||
| 128 MiB -> 128 KiB      x 1024     = 20 KiB | ||||
| 256 MiB -> 256 KiB      x 1024     = 20 KiB | ||||
| 512 MiB -> 256 KiB      x 2048     = 40 KiB | ||||
| 1 GiB   -> 512 KiB      x 2048     = 40 KiB | ||||
| 2 GiB   -> 512 KiB      x 4096     = 80 KiB | ||||
| 4 GiB   -> 1 MiB        x 4096     = 80 KiB | ||||
| 8 GiB   -> 1 MiB        x 8192     = 160 KiB | ||||
| 16 GiB  -> 2 MiB        x 8192     = 160 KiB | ||||
| 32 GiB  -> 2 MiB        x 16384    = 320 KiB | ||||
| 64 GiB  -> 4 MiB        x 16384    = 320 KiB | ||||
| 128 GiB -> 4 MiB        x 32768    = 640 KiB | ||||
| 256 GiB -> 8 MiB        x 32768    = 640 KiB | ||||
| 512 GiB -> 8 MiB        x 65536    = 1.25 MiB | ||||
| 1 TiB   -> 16 MiB       x 65536    = 1.25 MiB | ||||
| 2 TiB   -> 16 MiB       x 131072   = 2.5 MiB | ||||
| 4 TiB   -> 16 MiB       x 262144   = 5 MiB | ||||
| 8 TiB   -> 16 MiB       x 524288   = 10 MiB | ||||
| 16 TiB  -> 16 MiB       x 1048576  = 20 MiB | ||||
| 32 TiB  -> 16 MiB       x 2097152  = 40 MiB | ||||
| 64 TiB  -> 16 MiB       x 4194304  = 80 MiB | ||||
| 128 TiB -> 16 MiB       x 8388608  = 160 MiB | ||||
| 256 TiB -> 16 MiB       x 16777216 = 320 MiB | ||||
| 512 TiB -> 16 MiB       x 33554432 = 640 MiB | ||||
| 1 PiB   -> 16 MiB       x 67108864 = 1.25 GiB | ||||
| ``` | ||||
| 
 | ||||
| 
 | ||||
| References | ||||
| ---------- | ||||
| 
 | ||||
| ### Articles | ||||
| 
 | ||||
| - [Vuze Wiki](https://wiki.vuze.com/w/Torrent_Piece_Size) | ||||
| 
 | ||||
| - [TorrentFreak](https://torrentfreak.com/how-to-make-the-best-torrents-081121/) | ||||
| 
 | ||||
| ### Implementations | ||||
| 
 | ||||
| - [libtorrent](https://github.com/arvidn/libtorrent/blob/a3440e54bb7f65ac6100c3d993c53f887025d660/src/create_torrent.cpp#L367) | ||||
| 
 | ||||
| - [libtransmission](https://github.com/transmission/transmission/blob/a482100f0cbae8050fd7e954af2cb1311205916e/libtransmission/makemeta.c#L89) | ||||
| 
 | ||||
| - [dottorrent](https://github.com/kz26/dottorrent/blob/fea5714efe0cde2a55eabfb387295781a78d84bb/dottorrent/__init__.py#L154) | ||||
| 
 | ||||
| - [Torrent File Editor](https://github.com/torrent-file-editor/torrent-file-editor/blob/811e401b38f26b6d94c4808c54ae2dcc7bbc27dd/mainwindow.cpp#L1210) | ||||
| @ -1,21 +1,5 @@ | ||||
| // The piece length picker attempts to pick a reasonable piece length
 | ||||
| // for a torrent given the size of the torrent's contents.
 | ||||
| //
 | ||||
| // Constraints:
 | ||||
| // - Decreasing piece length increases protocol overhead.
 | ||||
| // - Decreasing piece length increases torrent metainfo size.
 | ||||
| // - Increasing piece length increases the amount of data that must be thrown
 | ||||
| //   away in case of corruption.
 | ||||
| // - Increasing piece length increases the amount of data that must be
 | ||||
| //   downloaded before it can be verified and uploaded to other peers.
 | ||||
| // - Decreasing piece length increases the proportion of disk seeks to disk
 | ||||
| //   reads. This can be an issue for spinning disks.
 | ||||
| // - The BitTorrent v2 specification requires that piece sizes be larger than 16
 | ||||
| //   KiB.
 | ||||
| //
 | ||||
| // These constraints could probably be exactly defined and optimized
 | ||||
| // using an integer programming solver, but instead we just copy what
 | ||||
| // libtorrent does.
 | ||||
| //! See [the book](https://imdl.io/book/bittorrent/piece-length.html) for more
 | ||||
| //! information on Intermodal's automatic piece length selection algorithm.
 | ||||
| 
 | ||||
| use crate::common::*; | ||||
| 
 | ||||
|  | ||||