
Note: This study found no significant biological effects under its experimental conditions. We include all studies for scientific completeness.

The Impact of 9.375 GHz Microwave Radiation on the Emotional and Cognitive Abilities of Mice

No Effects Found

Wang X, Zhao X, Xu J, Li M, Sun B, Gao A, Zhang L, Wu S, Liu X, Zou D, Li Z, Dong G, Zhang C, Wang C · 2025


AI systems show major knowledge gaps on expert-level questions, raising concerns about the reliability of AI-generated health information.

Plain English Summary

Summary written for general audiences

This study introduces a new academic benchmark called 'Humanity's Last Exam' designed to test advanced AI language models on expert-level questions across multiple subjects. The researchers found that current state-of-the-art AI systems perform poorly on these challenging questions, revealing significant gaps between AI capabilities and human expert knowledge.

Cite This Study
Wang X, Zhao X, Xu J, Li M, Sun B, Gao A, Zhang L, Wu S, Liu X, Zou D, Li Z, Dong G, Zhang C, Wang C (2025). The Impact of 9.375 GHz Microwave Radiation on the Emotional and Cognitive Abilities of Mice.
@article{wang_x_zhao_x_xu_j_li_m_sun_b_gao_a_zhang_l_wu_s_liu_x_zou_d_li_z_dong_g_zhang_c_wang_c_ce3557,
  author = {Wang X and Zhao X and Xu J and Li M and Sun B and Gao A and Zhang L and Wu S and Liu X and Zou D and Li Z and Dong G and Zhang C and Wang C},
  title = {The Impact of 9.375 GHz Microwave Radiation on the Emotional and Cognitive Abilities of Mice},
  year = {2025},
  doi = {10.1038/s41586-025-09962-4}
}

Quick Questions About This Study

Q: What does the benchmark test?
A: It tests AI language models on 2,500 expert-level academic questions across mathematics, humanities, and natural sciences that cannot be quickly answered through internet searches but have clear, verifiable solutions.

Q: How do current AI models perform on it?
A: State-of-the-art AI models demonstrate low accuracy and poor calibration on the benchmark, showing significant gaps between current AI capabilities and expert human knowledge.

Q: Why does this matter for health information?
A: Poor AI performance on complex topics raises concerns about the reliability of AI-generated health information, especially for nuanced subjects like EMF research, where accurate interpretation is crucial.

Q: How does this benchmark differ from existing ones?
A: Unlike popular benchmarks on which AI achieves over 90% accuracy, this expert-level benchmark reveals true capability gaps through questions that require deep subject-matter expertise rather than pattern recognition.

Q: Does it tell us anything about AI's reliability with scientific information?
A: Yes. By testing AI performance on expert-level academic content, it provides insight into whether AI systems can reliably process and interpret complex scientific information across disciplines.