<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[PublMe - Space: Posted Reaction by PublMe bot in PublMe]]></title>
	<link>https://publme.space/reactions/v/41052</link>
	<atom:link href="https://publme.space/reactions/v/41052" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://publme.space/reactions/v/41052</guid>
	<pubDate>Sat, 22 Jun 2024 22:00:00 +0200</pubDate>
	<link>https://publme.space/reactions/v/41052</link>
	<title><![CDATA[Posted Reaction by PublMe bot in PublMe]]></title>
	<description><![CDATA[
<p>Uncovering ChatGPT Usage in Academic Papers Through Excess Vocabulary</p>
<div><img width="800" height="282" src="https://hackaday.com/wp-content/uploads/2024/06/excess_vocabulary_chatgpt_2024_kobak_et_al_2024.jpg?w=800" alt="" srcset="https://hackaday.com/wp-content/uploads/2024/06/excess_vocabulary_chatgpt_2024_kobak_et_al_2024.jpg 1094w, https://hackaday.com/wp-content/uploads/2024/06/excess_vocabulary_chatgpt_2024_kobak_et_al_2024.jpg?resize=250, 88 250w, https://hackaday.com/wp-content/uploads/2024/06/excess_vocabulary_chatgpt_2024_kobak_et_al_2024.jpg?resize=400, 141 400w, https://hackaday.com/wp-content/uploads/2024/06/excess_vocabulary_chatgpt_2024_kobak_et_al_2024.jpg?resize=800, 282 800w" data-attachment-id="692427" data-permalink="https://hackaday.com/2024/06/22/uncovering-chatgpt-usage-in-academic-papers-through-excess-vocabulary/excess_vocabulary_chatgpt_2024_kobak_et_al_2024/" data-orig-file="https://hackaday.com/wp-content/uploads/2024/06/excess_vocabulary_chatgpt_2024_kobak_et_al_2024.jpg" data-orig-size="1094,386" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="excess_vocabulary_chatgpt_2024_kobak_et_al_2024" data-image-description="" data-image-caption="" data-medium-file="https://hackaday.com/wp-content/uploads/2024/06/excess_vocabulary_chatgpt_2024_kobak_et_al_2024.jpg?w=400" data-large-file="https://hackaday.com/wp-content/uploads/2024/06/excess_vocabulary_chatgpt_2024_kobak_et_al_2024.jpg?w=800"></div><figure aria-describedby="caption-attachment-692428"><a rel="nofollow" href="https://hackaday.com/wp-content/uploads/2024/06/excess_vocabulary_chatgpt_vs_natural_change.jpg"><img data-attachment-id="692428" 
data-permalink="https://hackaday.com/2024/06/22/uncovering-chatgpt-usage-in-academic-papers-through-excess-vocabulary/excess_vocabulary_chatgpt_vs_natural_change/" data-orig-file="https://hackaday.com/wp-content/uploads/2024/06/excess_vocabulary_chatgpt_vs_natural_change.jpg" data-orig-size="518,518" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="excess_vocabulary_chatgpt_vs_natural_change" data-image-description="" data-image-caption="&lt;p&gt;Frequencies of PubMed abstracts containing certain words. Black lines show counterfactual extrapolations&lt;br /&gt; from 2021–22 to 2023–24. The first six words are affected by&lt;br /&gt; ChatGPT; the last three relate to major events that influenced&lt;br /&gt; scientific writing and are shown for comparison. 
(Credit: Kobak et al., 2024)&lt;/p&gt;" data-medium-file="https://hackaday.com/wp-content/uploads/2024/06/excess_vocabulary_chatgpt_vs_natural_change.jpg?w=400" data-large-file="https://hackaday.com/wp-content/uploads/2024/06/excess_vocabulary_chatgpt_vs_natural_change.jpg?w=518" src="https://hackaday.com/wp-content/uploads/2024/06/excess_vocabulary_chatgpt_vs_natural_change.jpg?w=400" alt="" width="400" height="400" srcset="https://hackaday.com/wp-content/uploads/2024/06/excess_vocabulary_chatgpt_vs_natural_change.jpg 518w, https://hackaday.com/wp-content/uploads/2024/06/excess_vocabulary_chatgpt_vs_natural_change.jpg?resize=250, 250 250w, https://hackaday.com/wp-content/uploads/2024/06/excess_vocabulary_chatgpt_vs_natural_change.jpg?resize=400, 400 400w"></a><figcaption>Frequencies of PubMed abstracts containing certain words. Black lines show counterfactual extrapolations<br />from 2021–22 to 2023–24. The first six words are affected by<br />ChatGPT; the last three relate to major events that influenced<br />scientific writing and are shown for comparison. (Credit: Kobak et al., 2024)</figcaption></figure><p>That students these days love to use ChatGPT for assistance with reports and other writing tasks is hardly a secret, but its use is becoming ever more prevalent in academia as well. This raises the question of whether ChatGPT-assisted academic writing can be distinguished somehow. According to [Dmitry Kobak] and colleagues this is the case, with a strong sign of ChatGPT use being the presence of a lot of flowery excess vocabulary in the text. As detailed in <a rel="nofollow" href="https://arxiv.org/abs/2406.07016" target="_blank">their prepublication paper</a>, the sharp rise in the frequency of certain style words marks a remarkable change in the vocabulary of the published works examined.</p><p>For their study they looked at over 14 million biomedical abstracts from 2010 to 2024 obtained via PubMed. 
These abstracts were then analyzed for word usage and frequency. The analysis shows both natural increases in word frequency (e.g. from the SARS-CoV-2 pandemic and the Ebola outbreak) and massive spikes in excess vocabulary that coincide with the public availability of ChatGPT and similar LLM-based tools.</p><p>In total, 774 unique excess words were annotated. Here ‘excess’ means ‘outside of the norm’, following the pattern of ‘excess mortality’, where mortality during one period noticeably deviates from patterns established during previous periods. In this regard the bump in words like <em>respiratory</em> is logical, but the surge in style words like <em>intricate</em> and <em>notably</em> would seem to be due to LLMs having a penchant for such flowery, overly dramatized language.</p><p>The researchers have made the <a rel="nofollow" href="https://github.com/berenslab/chatgpt-excess-words" target="_blank">analysis code available</a> for those interested in giving it a try on another corpus. The main author also <a rel="nofollow" href="https://twitter.com/hippopedoid/status/1804127070642950331" target="_blank">addressed</a> the question of whether ChatGPT might be influencing people to write more like an LLM. At this point it’s still an open question whether people will be more inclined to use ChatGPT-like vocabulary or actively seek to avoid sounding like an LLM.</p>]]></description>
	<dc:creator>PublMe bot</dc:creator>
</item>

</channel>
</rss>