End-to-end versus hop-by-hop: They're not equivalent. Hop-by-hop reliability is not end-to-end reliable. Even if every link on a path is reliable, that doesn't mean a sent packet will reach its destination. Intermediate nodes (routers/switches) can have buffers overflows or crash or lose power, links can be cut, etc. You can however be sure that the message *has* reached the destination if you receive an acknowledgement from the destination. Hop-by-hop reliability, therefore, is optional. It can improve performance when used on links where losses are not uncommon (like wireless links), because the retransmissions will come from the previous node, rather than from the original source. On the other hand, the overhead of hop-by-hop reliability can degrade performance if used on links where losses are very rare (like wires and optical fibers). Retransmission protocols: The sink sends acknowledgements, saying what has been received. The source waits for acknowledgements, and if too much time passes (a timer expires) sends a retransmission. A protocol is correct if it delivers every message exactly once. The efficiency of a protocol is the average rate at which it delivers messages (number per second) divided by the packet rate of the underlying unreliable channel (also number per second, so efficiency is unitless). Stop-and-fail: Assume there is a bound T on round-trip time (which includes forward transmission time and propagation delay, processing time at the sink, reverse transmission time and propagation delay, processing time at the source). source: M = next message while M exists send M wait up to T for ack if received ack then M = next message endwhile sink: while true wait for message send ack deliver message endwhile Assume the link is empty at the beginning. Since the source waits T before resending, it always sends in a fresh state, with no other packets in the system. So an ack can only mean the last packet was received. So every packet is delivered. But a packet can be delivered more than once, if an ack is lost. Solution: one-bit sequence number in messages. Stop-and-wait: source: S = 0 M = next message while M exists send S,M wait up to T for ack if received ack then M = next message S = !S endif endwhile sink: R = 0 while true wait for any seq,message send ack if seq == R then deliver message R = !R endif endwhile Now when an ack is lost, the sink sees that it has the same sequence number as the previous message, and does not deliver it again. Efficiency: Suppose a packet gets acknowledged with probability p, and the actual round-trip time is r. Call the average time to send a packet (as many times as necessary) and get an acknowledgement x. This time is r with probability p, and T+x with probability 1-p, so the expected time can also be expressed as r*p + (T+x)*(1-p). x = r*p + (T+x)*(1-p) x = r*p + T*(1-p) + x*(1-p) x*p = r*p + T*(1-p) x = r + T*(1-p)/p The throughput is 1/x. If the transmission time of a packet is tau, then the packet rate is 1/tau, and the efficiency is (1/x)/(1/tau) = tau/x. Assuming a maximum round-trip delay is dangerous. But if the source sends a message while an ack is still on its way, how can it know whether the ack acknowledges the packet it just sent, or the one before? Solution: Add a one-bit sequence number to the ack. Alternating bit: source: S = 0 M = next message while M exists send S,M wait up to T for S,ack if received S,ack then M = next message S = !S endif endwhile sink: R = 0 while true wait for any seq,message send seq,ack if seq == R then deliver message R = !R endif endwhile Now the T timer can expire while acks are still in transit from the sink, causing the source to send a duplicate message, causing the sink to send a duplicate ack. But now the source can see that the second ack is a duplicate, and avoid advancing too far. Notice that we can now tolerate the channel duplicating packets. But giving up bounded round-trip time requires us to assume a FIFO channel. If the channel could reorder ack0, ack0, ack1 to ack0, ack1, ack0, we'd be in trouble, because the second ack0 now looks like it counts. The efficiency is the same expression as for stop-and-wait (assuming we can abort an outgoing retransmission when we receive an acknowledgement for it), but now T can be made small, so the efficiency approaches tau/r. This is more efficient from the point of view of the source trying to use as much of the link as possible. However, the source is sending lots of duplicate packets, because it's not waiting for the acknowledgements before retransmitting. If the link is shared, this is inefficient from the point of view of the other users.