This event counts the number of retired instructions that caused split loads. A split load occurs when a data value is read and part of the data is located in one cache line and part in another. Split loads reduce performance because they force the processor to read two cache lines separately and then merge the two parts of the data. Reading data that spans two cache lines is several times slower than reading data from a single cache line, even when the single-line read is itself not properly aligned.
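The condition described above can be expressed as simple address arithmetic. The following is a minimal sketch, not part of the event definition: the helper name spans_two_lines and the 64-byte CACHE_LINE_SIZE are assumptions made for illustration, so check your processor documentation for the actual line size.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. Adjust for the target processor. */
#define CACHE_LINE_SIZE 64u

/* Hypothetical helper: returns 1 if an access of `size` bytes starting at
   address `addr` touches two cache lines, and 0 if it stays within one.
   Assumes 0 < size <= CACHE_LINE_SIZE. */
static int spans_two_lines(uintptr_t addr, size_t size)
{
    uintptr_t first_line = addr / CACHE_LINE_SIZE;              /* line holding the first byte */
    uintptr_t last_line  = (addr + size - 1) / CACHE_LINE_SIZE; /* line holding the last byte */
    return first_line != last_line;
}
```

Under these assumptions, an 8-byte load at address 60 covers bytes 60 through 67 and therefore splits across the boundary at 64, while the same load at address 56 ends at byte 63 and stays within one line.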
Generally, the compiler aligns data to avoid placing values across cache-line boundaries. However, if you cast a C or C++ pointer to a pointer to a larger data type (for example, when using SIMD intrinsics), or otherwise manipulate a pointer's address, the chance of an access crossing a boundary increases. You can reduce the likelihood of splits by using the __declspec(align(n)) attribute to align arrays or structures of small data values that will later be accessed as larger values via casting. See your compiler documentation for information on using this attribute.
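As an illustration of the alignment advice above, the following sketch declares an array of small values with 16-byte alignment so that it can later be read in 16-byte blocks. The names ALIGN16, samples, and is_aligned are hypothetical, and the __attribute__((aligned(16))) branch is an assumed equivalent for GCC-compatible compilers; the __declspec(align(n)) form is the one named in the text.

```c
#include <stddef.h>
#include <stdint.h>

/* Select the alignment attribute for the compiler in use. */
#if defined(_MSC_VER)
#define ALIGN16 __declspec(align(16))
#else
#define ALIGN16 __attribute__((aligned(16)))  /* assumed GCC/Clang equivalent */
#endif

/* Array of 4-byte values that will later be accessed 16 bytes at a time. */
static ALIGN16 float samples[8];

/* Hypothetical helper: returns 1 if pointer `p` is a multiple of `n` bytes. */
static int is_aligned(const void *p, size_t n)
{
    return ((uintptr_t)p % n) == 0;
}
```

Because a typical cache-line size is a multiple of 16, a 16-byte access that starts on a 16-byte boundary cannot straddle two lines. Accessing the same array at an offset that is not a multiple of 16 bytes, such as (char *)samples + 4, gives up that guarantee.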
On the Pentium
MOB Load Replays Retired (Blocked Store-to-Load Forwards Retired)