OpenAI Explores Text Watermarking to Identify ChatGPT Misuse

OpenAI has developed a tool aimed at detecting when students use ChatGPT to write their assignments. However, as reported by The Wall Street Journal, the company is still deliberating on whether to release it.

An OpenAI representative confirmed to TechCrunch that the company is investigating the text watermarking technique detailed in the Journal's article. The spokesperson noted that OpenAI is proceeding cautiously due to the method's complexities and its potential repercussions beyond just their own platform.

This approach diverges from most previous attempts to identify AI-generated text, which have generally failed. OpenAI itself discontinued its prior AI text detection tool last year due to its lack of accuracy.

Text watermarking would enable OpenAI to specifically detect content created by ChatGPT. This would involve subtle modifications to ChatGPT's word selection process, embedding an invisible watermark in the text that could be identified by a specialized tool.

Following the Journal's report, OpenAI updated a May blog post discussing their research on detecting AI-generated content. According to the update, text watermarking has shown "high accuracy and effectiveness against minor tampering, like paraphrasing." However, it has been "less effective against significant alterations, such as using translation tools, rewording with another generative model, or inserting and then removing special characters."

Consequently, OpenAI acknowledges that this method is "easily bypassed by determined users." The update also reiterated concerns about non-English speakers, warning that text watermarking might "stigmatize the use of AI as a valuable writing aid for non-native English speakers."