@simone, thanks for those papers, which are interesting (even if the quality of their editing left a bit to be desired) :-)

However, neither of them really addresses my concern. I suspect (without being enough of a mathematician to prove it) that mapping a small amount of input through a hash function with a larger output size risks compromising the one-way nature of the function.

> remember that the probability of collisions is ideally proportional to about 1/(2^n), where n is the length of digest.

The unspoken assumption there is that the amount of input is large. When the input is known to be small, an attacker can simply try every possible input, so the situation approaches a chosen-plaintext scenario.
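To illustrate what I mean, here is a rough Python sketch (my own, not taken from the papers) of recovering a short input by exhaustive search. The function name `invert_small_input` and the choice of SHA-256 with a two-byte input are just assumptions to keep the example quick; the point is only that a large digest does not help when the input space is tiny.

```python
import hashlib

def invert_small_input(digest: bytes, n_bytes: int = 2):
    """Try every n_bytes-long input and return the one whose SHA-256 matches."""
    for candidate in range(256 ** n_bytes):
        data = candidate.to_bytes(n_bytes, "big")
        if hashlib.sha256(data).digest() == digest:
            return data
    return None  # no input of that length produces this digest

secret = b"\x12\x34"                       # hypothetical short input
target = hashlib.sha256(secret).digest()   # 32-byte digest of a 2-byte input
recovered = invert_small_input(target)
print(recovered == secret)
```

The search cost grows as 256^n_bytes, so four bytes is still within reach of a desktop machine and eight bytes is not, at least not by this naive loop.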

Poking around with Google, I came across

https://crypto.stackexchange.com/questions/16219/cryptographic-hash-function-for-32-bit-length-input-keys where somebody asks a similar question. One suggestion was to use something like MD5 but truncate the output (which is presumably better than XORing portions of the output together, in case that exposes patterns). Alternatively, the person answering suggests and discusses a fairly simple function, while emphasising that the important thing is to get the input bits diffusing thoroughly across the output.
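For comparison, the two ways of shortening the digest mentioned there might look something like this in Python. This is only my sketch of the idea, not code from that answer; the function names and the four-byte output size are assumptions.

```python
import hashlib

def truncated_md5(data: bytes, out_bytes: int = 4) -> bytes:
    """Keep only the first out_bytes bytes of the 16-byte MD5 digest."""
    return hashlib.md5(data).digest()[:out_bytes]

def xor_folded_md5(data: bytes, out_bytes: int = 4) -> bytes:
    """XOR successive out_bytes-sized chunks of the MD5 digest together."""
    digest = hashlib.md5(data).digest()
    folded = bytearray(out_bytes)
    for i, b in enumerate(digest):
        folded[i % out_bytes] ^= b
    return bytes(folded)

key = (0x12345678).to_bytes(4, "big")   # hypothetical 32-bit input key
print(truncated_md5(key).hex(), xor_folded_md5(key).hex())
```

Either way the result is no longer than the 32-bit input, which is the property I am after.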

I think a fair example of the sort of thing I'm considering is a short message (four to eight bytes) sent over a reliable data path, with an appended hash value that includes an unknown "salt" as an IV. My feeling is that the size of the hash should not be substantially larger than that of the data.
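Something along these lines, as a minimal Python sketch. Note that I have swapped in a truncated HMAC-SHA256 keyed with the salt rather than literally using the salt as the hash's IV, since the standard library does not expose the IV. The names `make_frame` and `check_frame`, the salt value and the four-byte tag size are all assumptions for illustration.

```python
import hashlib
import hmac

TAG_BYTES = 4  # assumed tag size, comparable to the 4-8 byte payload

def make_frame(message: bytes, salt: bytes) -> bytes:
    """Append a truncated keyed hash of the message to the message itself."""
    tag = hmac.new(salt, message, hashlib.sha256).digest()[:TAG_BYTES]
    return message + tag

def check_frame(frame: bytes, salt: bytes) -> bool:
    """Recompute the tag from the payload and compare it with the received one."""
    message, tag = frame[:-TAG_BYTES], frame[-TAG_BYTES:]
    expected = hmac.new(salt, message, hashlib.sha256).digest()[:TAG_BYTES]
    return hmac.compare_digest(tag, expected)

salt = b"shared-secret-salt"              # hypothetical value known to both ends
frame = make_frame(b"\x01\x02\x03\x04", salt)
print(len(frame), check_frame(frame, salt))
```

With a four-byte message and a four-byte tag the frame is eight bytes in total, so the hash is no larger than the data it protects.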

MarkMLl