Information-theoretic Multimodal Representation Learning for Electrocardiogram Signals
MERIT, a multimodal pretraining framework, combines masked ECG modeling with ECG–text contrastive alignment to learn cardiac representations. On PTB-XL, it improves F1 by 3% (All) and 5% (SubClass), and improves zero-shot AUC by 2.66%. It also improves clinical text generation with LLMs.
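To make the two pretraining objectives concrete, the following is a minimal sketch of how a masked-reconstruction term and an ECG–text contrastive term might be combined into one loss. It is not MERIT's implementation: the summary above does not specify the loss forms, so this sketch assumes mean squared error on masked samples and a symmetric InfoNCE loss, both common choices. The function names, the `temperature` value, and the `alpha` weighting are all hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(ecg_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE (assumed form): row i of each input is a matched pair."""
    e = [l2_normalize(v) for v in ecg_emb]
    t = [l2_normalize(v) for v in text_emb]
    n = len(e)
    # Cosine similarity between every ECG and every text, scaled by temperature.
    logits = [
        [sum(a * b for a, b in zip(e[i], t[j])) / temperature for j in range(n)]
        for i in range(n)
    ]

    def cross_entropy_diag(rows):
        # Cross-entropy where the correct class for row i is column i.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)  # subtract the max for numerical stability
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    transposed = [list(col) for col in zip(*logits)]
    # Average the ECG-to-text and text-to-ECG directions.
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(transposed))


def masked_mse(signal, reconstruction, mask):
    """Mean squared error over masked positions only (mask value 1 = masked)."""
    errs = [(s - r) ** 2 for s, r, m in zip(signal, reconstruction, mask) if m]
    return sum(errs) / len(errs) if errs else 0.0


def combined_loss(signal, reconstruction, mask, ecg_emb, text_emb, alpha=1.0):
    """Hypothetical weighted sum of the two objectives; alpha is an assumption."""
    return masked_mse(signal, reconstruction, mask) + alpha * info_nce(ecg_emb, text_emb)
```

In this sketch, correctly paired embeddings yield a contrastive loss near zero, while swapped pairs yield a large loss; the reconstruction term is scored only on samples hidden from the encoder. In practice these terms would be computed on batched tensors in an autodiff framework rather than on Python lists.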