I just want to explain why intersection types are well suited to characterizing classes of normalization (strong, head or weak), whereas other type systems (simply typed or System F) are not.
The key property you need is subject expansion: "if I can type M2 and M1→M2, then I can type M1". This often fails in systems without intersection types, because reduction can duplicate a term:
(λx.Mxx)N→MNN
and typing MNN means that you can type both occurrences of N, but not necessarily with the same type. For example:
M : T1→T2→T3,   N : T1,   N : T2
With intersection types you can transform this into:
M : (T1∧T2)→(T1∧T2)→T3,   N : T1∧T2
and then the crucial step is now really easy:
(λx.Mxx) : (T1∧T2)→T3,   N : T1∧T2
so (λx.Mxx)N can be typed with intersection types.
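To make the expansion step concrete, here is a sketch of one possible derivation. It assumes the usual introduction and elimination rules for ∧ and → (the rule names below are my labels, not something fixed above), a context Γ in which M : T1→T2→T3, and that x is not free in M. Unlike the version above, it keeps M at its original type and splits the intersection on x instead of weakening the argument types of M; both routes give the same type for the redex.

```latex
% Assumptions: \Gamma \vdash M : T_1 \to T_2 \to T_3, x not free in M,
% \Gamma \vdash N : T_1 and \Gamma \vdash N : T_2 (from typing M N N).
\begin{align*}
&\Gamma,\ x : T_1 \wedge T_2 \vdash x : T_1
  && (\wedge E) \\
&\Gamma,\ x : T_1 \wedge T_2 \vdash x : T_2
  && (\wedge E) \\
&\Gamma,\ x : T_1 \wedge T_2 \vdash M\,x\,x : T_3
  && (\to E)\ \text{twice} \\
&\Gamma \vdash \lambda x.\,M\,x\,x : (T_1 \wedge T_2) \to T_3
  && (\to I) \\
&\Gamma \vdash N : T_1 \wedge T_2
  && (\wedge I)\ \text{from } N : T_1 \text{ and } N : T_2 \\
&\Gamma \vdash (\lambda x.\,M\,x\,x)\,N : T_3
  && (\to E)
\end{align*}
```

The only step that has no counterpart in simple types is the (∧I) on N: it is what lets a single argument carry both types that the two copies of N need after reduction.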
Now about union types: suppose you can type (λx.xx)(λy.y) with some union type. Then you can also type λx.xx, and so, for some types S, T1, …, Tn, you get
x:T1∨T2∨⋯∨Tn⊢xx:S
But you still have to prove that for every i, x:Ti⊢xx:S, which seems impossible even if S is a union type.
This is why I don't think there is an easy characterization of normalization in terms of union types.