               |Burst1|       |Burst2|    |Burst3|
    dq : ...ZZ d1d2d3d4 ZZ ZZ D1D2D3D4 ZZ D1D2D3D4 ZZZZ...
    dqs: ...00 10101010 ZZ 00 10101010 00 10101010 00ZZ...
            ^              ^           ^           ^
            preamble       preamble    preamble    preamble
The diagram shows DDR bursts in half-cycles. Note the high-impedance state (ZZ) on the data line between bursts. The 00 on DQS is the preamble that starts every new burst.
To gain bandwidth, reads can be issued earlier so that the bursts fill the gaps between them:
    dq : ----DdDdDdDd DdDdDdDd DdDdDdDd DdDdDdDd --...
    dqs: --0010101010 10101010 10101010 10101010 --...
           ^ single preamble per continuous stream of bursts
This works because every chip has a centrally controlled DQS driver. If a new Read is issued to the chip at the right time, while the previous one is still in progress, the preamble is omitted and a back-to-back read results.
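The preamble rule for a single chip can be sketched as a toy model in Python. This is only an illustration under simplifying assumptions, not anything taken from the JEDEC spec: time is counted in whole clock cycles, a burst is 4 cycles (8 half-cycles), the postamble is ignored, and the names `dqs_for_chip` and `BURST_CYCLES` are made up for this example.

```python
BURST_CYCLES = 4  # assumption: 8 half-cycles of data per burst, as in the diagrams


def dqs_for_chip(data_starts, total_cycles):
    """Return one chip's DQS, one two-character symbol per clock cycle.

    'ZZ' = not driven, '00' = preamble, '10' = strobe toggling with data.
    """
    line = ["ZZ"] * total_cycles
    for start in data_starts:
        for cycle in range(start, min(start + BURST_CYCLES, total_cycles)):
            line[cycle] = "10"
    for start in data_starts:
        # The preamble goes one cycle ahead of the data, unless the chip is
        # already strobing there because its previous burst is still running.
        if start > 0 and line[start - 1] == "ZZ":
            line[start - 1] = "00"
    return " ".join(line)


print(dqs_for_chip([1, 7], 12))  # spaced reads: each burst gets a preamble
print(dqs_for_chip([1, 5], 12))  # back-to-back reads: a single preamble
```

In this model the second call produces one `00` followed by eight `10` cycles, because the chip sees that it is still driving DQS when the second burst begins.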
JEDEC also specifies how the chips on a DIMM are organized into data-parallel ranks. That is, you may have 8 chips receiving the Read at the same time, each responding on its own DQS. The problem emerges when you decide to switch to another rank: that is a different chip, connected to the same data line.
While the rank1 chips are still finishing their data on the bus, you issue a Read to rank2. This time, however, you cannot fill the ZZ gaps with back-to-back data: Chip Select is used to choose the rank, so the chips do not listen to what their counterparts are doing. The rank2 chips are therefore not aware that rank1 is finishing its burst with a last dqs=10, and they start their output with a preamble, creating contention on DQS if you try a consecutive read:
    rank_1 dq : ... Dd Dd Dd Dd -- -- ...
    rank_1 dqs: ... 10 10 10 10 -- -- ...
    rank_2 dqs: ... -- -- -- 00 10 10 10 10 10 ...
    rank_2 dq : ... -- -- -- -- Dd Dd Dd Dd Dd ...
                             ^ Here is the DQS conflict
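The overlap can also be checked with a few lines of Python. Again, this is a rough sketch under assumptions of my own: whole clock cycles, a 1-cycle preamble, a 4-cycle burst, no postamble, and hypothetical helper names (`dqs_driven`, `dqs_conflicts`). Real devices have more detailed timing, so the numbers are only meaningful inside this model.

```python
BURST_CYCLES = 4     # assumption: data cycles per burst (8 half-cycles)
PREAMBLE_CYCLES = 1  # assumption: the '00' cycle driven ahead of the data


def dqs_driven(data_start):
    """Cycles in which a rank drives DQS for one read: preamble plus data."""
    return set(range(data_start - PREAMBLE_CYCLES, data_start + BURST_CYCLES))


def dqs_conflicts(rank1_start, rank2_start):
    """Cycles in which both ranks drive the shared DQS line."""
    return sorted(dqs_driven(rank1_start) & dqs_driven(rank2_start))


print(dqs_conflicts(1, 5))  # rank 2 data immediately follows rank 1 data
print(dqs_conflicts(1, 6))  # one idle data cycle between the two ranks
```

With rank 2 data placed immediately after rank 1 data, the rank 2 preamble lands on the last strobe cycle of rank 1 (cycle 4 here). Pushing rank 2 out by one cycle removes the overlap in this model.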
You can handle the postamble cycle by overcomplicating the HW design and introducing a rank-switch latency. What are the advantages?
P.S. This is called the tRTRS timing (rank-to-rank switching time).