A vector representation of a confused brain.
This article is a condensed version of a longer research paper I wrote a year ago about a growing problem: the increasing difficulty of distinguishing human-written content from AI-generated text. Recent events in schools and other institutions have shown that people often misidentify genuine human work as AI output. What motivated me to bring this up again was a claim I saw circulating that the Constitution itself gets flagged by AI detectors. Whether it's true or not, I found the subject interesting.
To access the full research paper, please contact me.
When the Line Between Human and AI Becomes Blurred
My interest in this problem began when I was suspected of using AI to write an assignment, even though I had written everything myself. Multiple detection tools classified my work as more than 90% AI-generated. Similar issues occurred when I ran texts written by other people through the same detection tools.
These repeated misunderstandings pushed me to investigate why AI detectors often mislabel human writing.
Real-World Cases and the Scale of the Problem
Increasingly, teachers rely on AI detection tools to evaluate student work, trusting their verdicts almost blindly. Given how quickly AI has emerged and spread, we can't really blame them. However, these tools frequently produce false positives. Online discussions reveal numerous cases where students were unfairly penalized for work they created themselves. Even controlled experiments, in which evaluators judged essays without knowing their origin, show that trained teachers cannot reliably distinguish human-written from AI-generated essays.
How Writers Can Reduce False Positives
To avoid being incorrectly flagged by AI detectors, writers can adapt their style with a few simple strategies:
- Vary sentence structure and avoid overly rigid formatting (see the sketch below).
- Use vocabulary appropriate to personal knowledge rather than excessively polished phrasing.
- Include natural imperfections or small stylistic quirks.
- Support ideas with personal experiences or citations.
- Keep drafts to demonstrate the writing process to teachers.
- Discuss concerns openly with instructors if needed.
These methods strengthen writing clarity and help avoid misclassification, even though I find it pretty crazy that we have to change our sentence structures to stay out of trouble in an academic environment...
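To make the first tip concrete: some detectors are reported to use "burstiness", the variation in sentence length across a text, as a signal, since AI output tends to be more uniform than human writing. Here is a toy sketch of that idea; it's my own simplified illustration, not the actual metric any specific detector uses.

```python
import re
import statistics

def burstiness(text: str) -> float:
    """Toy proxy for 'burstiness': standard deviation of sentence
    lengths (in words). Very uniform sentences give a low score,
    which some detectors reportedly treat as a machine-like signal."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths)

uniform = ("The model is fast. The model is small. "
           "The model is cheap. The model is neat.")
varied = ("I hesitated. Then, after rereading the prompt three times, "
          "I rewrote the whole thing from scratch. It worked.")

print(burstiness(uniform))  # low: very regular sentence lengths
print(burstiness(varied))   # higher: lengths vary a lot
```

A low score alone obviously proves nothing, which is exactly why heuristics like this misfire on concise human writers.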
Technical Reasons for False Positives
AI detectors rely on classification algorithms such as logistic regression, SVMs, random forests, and neural networks (I cover these in depth in my research paper). They analyze linguistic features of a text and compare them against large datasets of human-written and AI-generated text. However, several technical limitations make them unreliable:
- Biased or incomplete training data that do not represent the full diversity of human writing.
- Outdated datasets that cannot keep up with rapidly evolving AI models.
- Overly strict thresholds that lead to unnecessary false positives (see the sketch after this list).
- Domain mismatches between training data and real-world writing.
- Lack of transparency that prevents users from understanding why a text was flagged.
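To see how the threshold point plays out, here is a small sketch; the probability scores below are invented for illustration and do not come from any real detector. The stricter the cutoff for flagging, the more genuinely human texts get caught.

```python
# Hypothetical detector scores: P(AI-generated) for ten texts that
# were all actually written by humans. Values invented for illustration.
human_scores = [0.12, 0.35, 0.48, 0.55, 0.61, 0.22, 0.71, 0.09, 0.58, 0.66]

for threshold in (0.9, 0.7, 0.5):
    flagged = sum(score > threshold for score in human_scores)
    rate = flagged / len(human_scores)
    print(f"threshold={threshold}: {flagged}/10 human texts flagged "
          f"(false positive rate {rate:.0%})")
```

With these numbers, dropping the cutoff from 0.9 to 0.5 takes the false positive rate from 0% to 50%. A detector marketed as "catching more AI" is often just flagging more of everything, humans included.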
I also used my own project, ZoryaTrace, as an example of how simple TF-IDF and Naïve Bayes classifiers operate, and of how small biases in a dataset can severely affect results.
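To give a flavor of what such a classifier looks like, here is a minimal scikit-learn sketch with invented toy data; it is not ZoryaTrace's actual code. Notice that the pipeline can only learn whatever patterns, and biases, its training set contains.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training data, invented for illustration. Note the built-in
# bias: every "ai" example is formal and every "human" example is
# casual, so the model ends up learning formality, not authorship.
texts = [
    "In conclusion, the results demonstrate significant improvements.",
    "Furthermore, it is important to note the following considerations.",
    "lol that exam was brutal, barely finished in time",
    "honestly no idea what the teacher wanted there",
]
labels = ["ai", "ai", "human", "human"]

model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(texts, labels)

# A formal but entirely human sentence gets pulled toward "ai"
# purely because of the dataset bias above.
test = "In conclusion, I believe my argument demonstrates this point."
print(model.predict([test]))
print(model.predict_proba([test]))  # class order follows model.classes_
```

Swap in a different training set and the verdict can flip; that fragility is the dataset-bias problem from the list above, just at toy scale.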
Improving Detection Systems
To reduce false positives and improve reliability, several steps can be taken:
- Expand and diversify datasets used for training.
- Regularly update detection models.
- Use ensembles combining several algorithms.
- Increase transparency in how detectors work.
- Adopt interpretability tools such as LIME and SHAP (see the sketch after this list).
- Encourage open-source collaboration to advance the field.
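As a taste of what interpretability adds, here is a minimal LIME sketch, reusing the same kind of invented toy data as before; it shows which words push a classifier's score toward "ai", and it is an illustration of the technique rather than anything from a production detector.

```python
from lime.lime_text import LimeTextExplainer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Same toy, deliberately biased training data as before (invented).
texts = [
    "In conclusion, the results demonstrate significant improvements.",
    "Furthermore, it is important to note the following considerations.",
    "lol that exam was brutal, barely finished in time",
    "honestly no idea what the teacher wanted there",
]
labels = [1, 1, 0, 0]  # 1 = "ai", 0 = "human"

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

explainer = LimeTextExplainer(class_names=["human", "ai"])
explanation = explainer.explain_instance(
    "In conclusion, I believe my argument demonstrates this point.",
    model.predict_proba,  # LIME perturbs the text and queries this
    num_features=5,
)
# Word-level weights toward the "ai" class. With data this biased,
# expect connective words like "conclusion" to dominate.
print(explanation.as_list())
```

Seeing that a flag rests on a few connective words makes it far easier to contest than an opaque "92% AI" score.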
I am currently experimenting with my own system, ZoryaTrace, mentioned above, which attempts to incorporate some of these improvements.
Sources
- My research notes and tests performed in 2024