Anton Xue

New methods enhance the trustworthiness of machine learning model predictions.

2025-10-20T17:43:18+00:00 ― 5 min read

A study of how language models can ignore instructions, and what that implies.

2025-07-26T00:20:06+00:00 ― 7 min read

Examining the role of feature extraction in improving machine learning interpretability.

2025-06-08T23:33:24+00:00 ― 7 min read