Edition of 2026-05-22

Federated learning delivers at two real clinical sites, but generalization remains the invisible wall of medical ML

By the editorial team

Today's 5 picks

Embedding-Based Federated Learning with Runtime Governance for Iron Deficiency Prediction

Real-world deployment of a federated learning pipeline for iron deficiency prediction from full blood count data. The pipeline combines DeepCBC (a frozen haematology foundation model) with FedMAP (personalised aggregation). Tested across two clinical sites (AUMC, NHSBT) with non-IID data. Versus local-only training, FedMAP improves ROC-AUC from 0.947 to 0.959 at AUMC and from 0.856 to 0.867 at NHSBT.
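To make the setup concrete, here is a minimal sketch of federated training on top of frozen embeddings. It uses plain sample-weighted federated averaging (FedAvg), not FedMAP, whose personalised aggregation this summary does not describe. The two synthetic sites, the logistic-regression head, and every function name are assumptions for illustration, not the paper's pipeline.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def local_update(weights, embeddings, labels, lr=0.1, epochs=5):
    """Run a few epochs of SGD on a logistic-regression head.

    The embeddings are treated as fixed inputs (the foundation model is frozen),
    so only the small head is trained and shared.
    """
    w = list(weights)
    for _ in range(epochs):
        for x, y in zip(embeddings, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            for i, xi in enumerate(x):
                w[i] -= lr * (p - y) * xi
    return w


def fed_avg(site_weights, site_sizes):
    """Sample-size-weighted average of per-site head weights (plain FedAvg)."""
    total = sum(site_sizes)
    dim = len(site_weights[0])
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(dim)
    ]


def make_site(n, shift, seed):
    """Synthetic stand-in for one site's embeddings; the last feature is a bias term."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x0 = rng.gauss(shift, 1.0)
        x1 = rng.gauss(0.0, 1.0)
        xs.append([x0, x1, 1.0])
        ys.append(1 if x0 + 0.5 * x1 > shift else 0)
    return xs, ys


# Two hypothetical sites with different sizes and feature distributions (non-IID).
sites = [make_site(200, 0.0, seed=1), make_site(80, 1.5, seed=2)]
global_w = [0.0, 0.0, 0.0]
for round_idx in range(10):
    updates = [local_update(global_w, xs, ys) for xs, ys in sites]
    global_w = fed_avg(updates, [len(xs) for xs, _ in sites])
print("global head weights:", global_w)
```

Only the head weights leave each site in this sketch; raw records and embeddings stay local. A personalised scheme would replace `fed_avg` with a per-site aggregation rule.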

Embeddings Benchmarks

arXiv cs.LG·SIG 75

Calibration, Uncertainty Communication, and Deployment Readiness in CKD Risk Prediction: A Framework Evaluation Study

Comparative study of five classifiers (logistic regression, random forest, XGBoost, SVM, naive Bayes) for chronic kidney disease risk prediction. All reach AUROC 1.00 on internal validation (UCI, 400 patients) but collapse to roughly chance level on external MIMIC-IV data (AUROC 0.48-0.58). Calibration and conformal coverage also degrade severely. No model meets clinical deployment criteria.
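The conformal coverage check mentioned above can be sketched as follows. This is a generic split-conformal procedure with the common "one minus probability of the true class" score, not necessarily the study's exact protocol; the toy probabilities and function names are assumptions for illustration.

```python
import math


def conformal_quantile(cal_scores, alpha):
    """Finite-sample-corrected quantile of calibration nonconformity scores."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: every label must be included.
        return float("inf")
    return sorted(cal_scores)[k - 1]


def prediction_set(probs, q):
    """All labels whose nonconformity score (1 - predicted probability) is within q."""
    return [label for label, p in enumerate(probs) if 1.0 - p <= q]


def empirical_coverage(prob_rows, labels, q):
    """Fraction of cases whose true label falls inside the prediction set."""
    hits = sum(
        1 for probs, y in zip(prob_rows, labels) if y in prediction_set(probs, q)
    )
    return hits / len(labels)


# Hypothetical predicted probabilities from an internal calibration split.
cal_probs = [[0.9, 0.1], [0.2, 0.8], [0.85, 0.15], [0.3, 0.7], [0.95, 0.05]]
cal_labels = [0, 1, 0, 1, 0]
cal_scores = [1.0 - probs[y] for probs, y in zip(cal_probs, cal_labels)]
q = conformal_quantile(cal_scores, alpha=0.2)

# Hypothetical predictions on an external cohort where the model is often wrong.
ext_probs = [[0.6, 0.4], [0.55, 0.45], [0.9, 0.1], [0.4, 0.6]]
ext_labels = [1, 1, 0, 0]

print("internal coverage:", empirical_coverage(cal_probs, cal_labels, q))
print("external coverage:", empirical_coverage(ext_probs, ext_labels, q))
```

The coverage guarantee of split conformal prediction rests on calibration and test data being exchangeable, which is why a threshold tuned on one cohort can under-cover on another.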

Evals AI safety

Simon Willison·SIG 75

FTC to Require Cox Media Group, Two Other Firms to Pay Nearly $1 Million to Settle Charges They Deceived Customers About “Active Listening” AI-Powered Marketing Service

FTC requires Cox Media Group and two other firms to pay nearly $1 million to settle charges they deceived customers about an "Active Listening" AI marketing service. The service claimed to listen to conversations via smart devices for ad targeting, but actually used no voice data at all.

Regulation AI safety Business

arXiv cs.LG·SIG 72

Leveraging Self-Paced Curriculum Learning for Enhanced Modality Balance in Multimodal Conversational Emotion Recognition

Self-Paced Curriculum Learning (SPCL) framework for multimodal emotion recognition in conversations. A dual-level Difficulty Measurer (utterance and conversation level) guides training from easier to harder instances. Reported F1 gains range from +1.2% to +6.6% on IEMOCAP and reach +10.4% on MELD. The method targets modality imbalance.
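The easy-to-hard idea can be sketched with a classic hard-threshold self-paced scheme. The summary does not say how the paper's Difficulty Measurer combines its two levels, so the weighted mix below, the linear pacing schedule, and all names and toy scores are assumptions for illustration only.

```python
def combined_difficulty(utterance_score, conversation_score, mix=0.5):
    """Hypothetical blend of utterance-level and conversation-level difficulty in [0, 1]."""
    return mix * utterance_score + (1.0 - mix) * conversation_score


def pace_schedule(epoch, total_epochs, start=0.3, end=1.0):
    """Difficulty threshold that grows linearly from start to end over training."""
    frac = epoch / max(total_epochs - 1, 1)
    return start + (end - start) * frac


def self_paced_mask(difficulties, threshold):
    """1 for samples easy enough to train on this epoch, 0 for the rest."""
    return [1 if d <= threshold else 0 for d in difficulties]


# Toy difficulty scores for five utterances and the conversations they belong to.
utterance_scores = [0.1, 0.4, 0.8, 0.3, 0.9]
conversation_scores = [0.2, 0.2, 0.7, 0.7, 0.9]
difficulties = [
    combined_difficulty(u, c) for u, c in zip(utterance_scores, conversation_scores)
]

total_epochs = 5
for epoch in range(total_epochs):
    threshold = pace_schedule(epoch, total_epochs)
    mask = self_paced_mask(difficulties, threshold)
    # In a real training loop the mask would multiply each sample's loss.
    print(f"epoch {epoch}: threshold={threshold:.2f}, samples used={sum(mask)}")
```

Early epochs see only the easiest samples; by the final epoch the threshold admits everything.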

Reasoning Benchmarks

Latent Space·SIG 65

[AINews] New AI Infra unicorns: Exa, Modal, TurboPuffer

Three AI infrastructure startups reach unicorn status: Exa (web search API for AI applications), Modal (serverless cloud compute platform), and TurboPuffer (vector and full-text search database). Major funding rounds confirm consolidation in the AI infrastructure market.

Infrastructure Funding Vector search