I'd guess it's that the sound file has silence at the end.
I wish it were that simple (unless it is and I haven't noticed). I've been running many tests to see whether there is a problem with the sound file, and there is one: its length. (In some cases, at least. It applies to short lines of text; longer lines always have delay.)
You see... I've tested with this line of dialogue:
I created a sine-wave waveform and made it last an exact amount of time, then saved the file and tested it in the editor to see whether there was any delay.
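For anyone who wants to repeat the test without a sound program, here is a minimal sketch that writes a sine-wave .wav of an exact length using only the Python standard library. The function names, the 440 Hz tone, the 22050 Hz sample rate and the mono 16-bit format are my own assumptions for illustration; I don't know which format OW actually expects, so adjust them to match a file that already works in your mod.

```python
import math
import struct
import wave


def write_sine_wav(path, seconds, freq_hz=440.0, sample_rate=22050, amplitude=0.5):
    """Write a mono 16-bit PCM sine wave lasting `seconds` and return the frame count.

    All default values are assumptions, not settings confirmed for OW.
    """
    n_frames = round(seconds * sample_rate)
    frames = bytearray()
    for i in range(n_frames):
        sample = amplitude * math.sin(2.0 * math.pi * freq_hz * i / sample_rate)
        # Scale to the signed 16-bit range and pack as little-endian.
        frames += struct.pack("<h", int(sample * 32767))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return n_frames


def wav_duration(path):
    """Return the length of a .wav file in seconds, read from its header."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


if __name__ == "__main__":
    # Hypothetical file names; one test file per length tried in this post.
    for seconds in (1, 2, 3, 4, 5, 6, 7, 7.5, 8):
        name = f"test_{seconds}s.wav"
        write_sine_wav(name, seconds)
        print(name, wav_duration(name))
```

The signal is constant from the first sample to the last, so any delay seen in the editor cannot come from silence inside the file.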
My results for this specific line are:
1 sec (apparently works); 2 sec (apparently works); 3 sec (works); 4 sec (works); 5 sec (works); 6 sec (works); 7 sec (works); 7.5 sec (doesn't work); 8 sec (doesn't work).
I've also tested with the following line, which is longer than the previous one:
"The moustache gives me luck like this, soldier! It has saved my life many times! Are you ready for your training, soldier?"
3 sec, 4 sec, 5 sec and 6 sec (apparently works). At 7 sec it works. At 7.5 sec and beyond (8 sec, for example) it doesn't work.
From what I can see, the problem isn't whether there is a "small silence" at the end. For example, this:
That is the original waveform (Not the sine-wave test waveform) of the first line I tried: "Squadron! ATTEN-TION!"
It works perfectly without any delay. As you can see, it lasts 2.35 sec, which is one of the "acceptable" lengths for the dialogue according to my "research".
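If you'd rather check the "silence at the end" idea with numbers instead of by eye, a rough sketch like the one below measures how much of the tail of a .wav file stays under a loudness threshold. The function name and the threshold of 500 (out of 32767) are arbitrary choices of mine, and the sketch only handles uncompressed 16-bit PCM files.

```python
import array
import sys
import wave


def trailing_silence_seconds(path, threshold=500):
    """Return how many seconds at the end of a 16-bit PCM .wav stay below `threshold`.

    `threshold` is an arbitrary amplitude cut-off (assumption), not a value from OW.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        samples = array.array("h", wav.readframes(wav.getnframes()))
    # WAV data is little-endian; array uses the machine's native byte order.
    if sys.byteorder == "big":
        samples.byteswap()

    n_frames = len(samples) // channels
    last_loud = -1  # stays -1 if the whole file is below the threshold
    for frame in range(n_frames - 1, -1, -1):
        chunk = samples[frame * channels:(frame + 1) * channels]
        if any(abs(s) > threshold for s in chunk):
            last_loud = frame
            break
    return (n_frames - 1 - last_loud) / rate
```

A file that is loud right up to its last sample returns 0.0, so a dialogue line that still shows delay with a result near zero is not being delayed by trailing silence.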
Now, try to use a dialog like THIS:
That text, paired with a waveform that lasts just 7 seconds (the acceptable limit), will have delay. Note: when I say waveform I mean the entire length of the sound file. I tested with a pure waveform without any silence, constant from beginning to end.
For this line you'll need about 15 seconds of recording (I "recorded" Powell's voice with my sound program using the "Stereo Mix" input, and the recording lasted around 15 seconds). In a regular mod it will have delay, unlike in the "original" game! (I took this dialogue from Mission Am 03.)
But that's not all!!
Look at this line of dialogue: "Sir! May I ask why everybody has been cut their hair and is bald unlike me?"
Pretty "short", isn't it? The "waveform" lasts 5.23 seconds. But sadly the dialogue has delay. Not a good surprise. This makes me think there is something "else" producing this unwanted delay.
I've tried to find a simple mathematical relation between the text and the sound length, but I didn't reach any conclusive result.
I've also tried leaving the dialogue without sound, that is, removing the .wav file from the directory and checking how many seconds OW made the dialogue line last. After measuring, I generated a waveform of that same length, put the sound in the directory and tested. Delay was present.
I think it's worth stating a conclusion (but not a solution, since I haven't found one yet), so here it goes:
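To show the kind of check I mean, here is a small sketch that takes four of the results quoted in this post and asks whether one cut-off value on a simple measure (sound length, text length, or characters per second) could split the "delay" cases from the "no delay" cases. The measures and function names are only my guesses at what might matter, and four data points are far too few to prove anything either way.

```python
MOUSTACHE = ("The moustache gives me luck like this, soldier! It has saved my "
             "life many times! Are you ready for your training, soldier?")
BALD = "Sir! May I ask why everybody has been cut their hair and is bald unlike me?"

# (dialogue text, sound length in seconds, delay observed), as reported in this post.
OBSERVATIONS = [
    ("Squadron! ATTEN-TION!", 2.35, False),
    (MOUSTACHE, 7.0, False),
    (MOUSTACHE, 7.5, True),
    (BALD, 5.23, True),
]

# Candidate measures (assumptions): each maps (text, seconds) to one number.
METRICS = {
    "sound length (s)": lambda text, seconds: seconds,
    "text length (chars)": lambda text, seconds: len(text),
    "chars per second": lambda text, seconds: len(text) / seconds,
}


def single_threshold_separates(no_delay_values, delay_values):
    """True if one cut-off value puts every no-delay case on one side of it."""
    return (max(no_delay_values) < min(delay_values)
            or max(delay_values) < min(no_delay_values))


def separable(metric):
    """Check whether `metric` alone splits the observations by delay."""
    no_delay = [metric(t, s) for t, s, delayed in OBSERVATIONS if not delayed]
    delay = [metric(t, s) for t, s, delayed in OBSERVATIONS if delayed]
    return single_threshold_separates(no_delay, delay)


if __name__ == "__main__":
    for name, metric in METRICS.items():
        print(f"{name}: separable by one threshold = {separable(metric)}")
```

With these four cases none of the three measures separates the groups on its own, which matches what I found by hand. More measured lines could be appended to OBSERVATIONS to test other ideas.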
My conclusion is that very short texts like "Sir yes sir!" or "Affirmative, sir!" or "Form up!" etc. can last up to 7 seconds without any delay. But longer texts like "Very well, soldiers! We are going to proceed with the Training now!" have delay even if they last less than 7 seconds.
Therefore more research is required to find a proper solution. I hope my work and results are enough motivation for someone to finish what I tried to research, or at least to bring the rest of the results needed to arrive at a solution.
Edit: It's very nice to see how in the original game the mission's dialogues have no delay. No matter how long or short they are.
Edit 2: The attachment has my results and math trials.