An AILabs Project

Effective Deepfake Detection begins with the models

Our initial release of Fakespot Deep Fake Detection will analyze any text you highlight to determine whether it shows signs of AI manipulation. To ensure the most reliable detection, we feature our superior ApolloDFT Model alongside three additional open-source detection engines, providing you with a range of robust options for comprehensive analysis.

ApolloDFT

In-House
Developed in-house by the Fakespot team, the ApolloDFT Model is versatile across domains and covers a wide range of LLMs. It provides robust defense against adversarial attacks and performs effectively on text samples of varying lengths.

Binoculars

Open Source
Binoculars is a state-of-the-art method for detecting AI-generated text. It requires no training data, relying instead on the large overlap in the datasets used to train causal language models. Binoculars takes a little longer to analyze and performs less well on short samples, but it is very versatile and does well with longer samples.
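For readers who want to see the mechanics, below is a minimal sketch of the perplexity-ratio idea behind Binoculars: score the text with an "observer" model, score it again against a closely related "performer" model's predictions, and compare. The Falcon-7B model pair follows the published Binoculars paper, but the exact models, batching, and decision threshold used in our extension are not shown here and may differ.

```python
# Minimal sketch of a Binoculars-style perplexity-ratio score.
# Model names follow the Binoculars paper; they are illustrative,
# not the configuration shipped in the extension.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

observer_name = "tiiuae/falcon-7b"            # "observer" LM
performer_name = "tiiuae/falcon-7b-instruct"  # "performer" LM (must share the tokenizer)

tok = AutoTokenizer.from_pretrained(observer_name)
observer = AutoModelForCausalLM.from_pretrained(observer_name)
performer = AutoModelForCausalLM.from_pretrained(performer_name)

def binoculars_score(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        obs_logits = observer(ids).logits[:, :-1]   # predictions for tokens 1..n
        perf_logits = performer(ids).logits[:, :-1]
    targets = ids[:, 1:]
    log_obs = torch.log_softmax(obs_logits, dim=-1)
    # log-perplexity of the text under the observer model
    log_ppl = -log_obs.gather(-1, targets.unsqueeze(-1)).squeeze(-1).mean()
    # cross-perplexity: expected observer log-loss under the performer's distribution
    perf_probs = torch.softmax(perf_logits, dim=-1)
    x_ppl = -(perf_probs * log_obs).sum(-1).mean()
    # lower ratios suggest machine-generated text
    return (log_ppl / x_ppl).item()
```

Because the score is a ratio of two perplexity-like quantities, no labelled training data is needed; the method simply asks whether the text is "surprising" to one LM in the same way it is surprising to another.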

UAR

Open Source
This model works by analyzing the text using a pre-trained system called LUAR (Learning Universal Authorship Representations). The highlighted text is compared to the closest matches in its training data. The LUAR model is quick to analyze and performs about average across domains and on short text, but it performs less well on longer text.
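As a rough illustration of the nearest-neighbors idea, the sketch below embeds the highlighted text and votes over its closest matches in a labelled reference set. The embed() function and the tiny reference corpus are placeholders; the real detector uses the pre-trained LUAR authorship-representation model and a much larger labelled index.

```python
# Rough nearest-neighbour sketch in the spirit of the UAR detector.
# embed() and the reference corpus are placeholders, not LUAR itself.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder for a LUAR-style authorship embedding (unit vector)."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=512)
    return v / np.linalg.norm(v)

# Reference corpus: (embedding, label) pairs, label 1 = AI-generated, 0 = human.
reference = [(embed(t), y) for t, y in [
    ("sample human-written paragraph ...", 0),
    ("sample LLM-generated paragraph ...", 1),
    # ... many more labelled examples in practice
]]

def ai_likelihood(text: str, k: int = 5) -> float:
    q = embed(text)
    # rank reference texts by cosine similarity (vectors are unit-normalized)
    sims = sorted(((float(q @ e), y) for e, y in reference), reverse=True)
    top = sims[:k]
    return sum(y for _, y in top) / len(top)   # fraction of AI neighbours
```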

ZipPy

Open Source
A research model for fast AI detection using compression. The ZipPy model works by making a statistical comparison between LLM-generated text and the sample. This model is speedy and does an average job with short and long text across domains.
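The toy example below shows the general compression trick: text that resembles a corpus of known LLM output compresses better when appended to that corpus. The prelude string and the scoring interpretation are stand-ins, not the curated dictionary or tuned cutoffs that ZipPy ships with.

```python
# Toy illustration of a compression-based detector in the ZipPy style.
# LLM_PRELUDE is a placeholder; ZipPy uses its own curated LLM-text dictionary.
import lzma

LLM_PRELUDE = "a large blob of known LLM-generated text ..."

def compressed_size(s: str) -> int:
    return len(lzma.compress(s.encode("utf-8")))

def zippy_like_score(sample: str) -> float:
    base = compressed_size(LLM_PRELUDE)
    combined = compressed_size(LLM_PRELUDE + sample)
    sample_alone = compressed_size(sample)
    # Extra bytes needed to encode the sample after "seeing" the LLM text;
    # samples that resemble the prelude need fewer extra bytes.
    delta = combined - base
    return delta / max(sample_alone, 1)   # lower ratio suggests LLM-like text
```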

How ApolloDFT compares with other open-source models

ApolloDFT

Many LLM detection products report nearly 100% accuracy on their evaluation datasets, but users of these products know the detectors make mistakes. This is likely because these products are evaluated on text that is very similar to their training dataset, as seen in the “InDistrib.” column of our results table.


Our training dataset includes as much data as we could collect, but in order to simulate APOLLO’s performance on text encountered by our users, i.e. text potentially far from our training set, we evaluated APOLLO in the out-of-domain, out-of-LLM, and out-of-attack settings on the RAID dataset.


For these tests, we hold out one key slice of data and use what remains for training. The table above reports the average AUROC score across all domains, LLMs, and attacks, and should give an idea of how well the APOLLO method generalizes to documents far from our training set. The domain, LLM, and attack scores for APOLLO, while higher than those of the other methods, give a more realistic, if perhaps pessimistic, view of how the model will generalize to the text our users will see in the wilds of the internet.
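Conceptually, each out-of-X evaluation looks like the sketch below, shown here for domains: hold out one slice, train on the rest, score the held-out slice, and average AUROC across slices. The column names and the train_and_score callable are illustrative stand-ins, not the actual RAID benchmark harness.

```python
# Hedged sketch of hold-one-slice-out evaluation (illustrated for domains).
# Assumed schema: df has columns text, label (1 = AI), domain.
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score

def out_of_domain_auroc(df: pd.DataFrame, train_and_score) -> float:
    scores = []
    for held_out in df["domain"].unique():
        train = df[df["domain"] != held_out]   # fit on everything else
        test = df[df["domain"] == held_out]    # evaluate on the unseen slice
        preds = train_and_score(train, test["text"])
        scores.append(roc_auc_score(test["label"], preds))
    return float(np.mean(scores))              # average AUROC across held-out slices
```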


UAR, ZipPy, and Binoculars are the other open-source methods available in our extension. UAR is a nearest-neighbors-based method and leverages a different open-sourced dataset. ZipPy is a compression-based method and leverages its own dataset as a dictionary. Binoculars is a SOTA zero-shot method that leverages perplexity; notice that Binoculars does particularly well on the “no_attack” datasets. Finally, APOLLO is our method; it is an ensemble of both supervised and zero-shot methods, delivering superior detection of AI-generated text across diverse scenarios. We have made the supervised component of the APOLLO method available as open source. For more details and to view a full report of our benchmark results, please visit our GitHub page.

Average AUROC across different out-of-Attack/Domain/LLM evaluations on the RAID dataset

  • The APOLLO AI detection model leverages state-of-the-art perplexity and supervised learning methods, enabling it to detect text generated by both popular and niche LLMs.
  • When evaluated on new and unseen text domains, LLMs, and attack strategies, APOLLO proved to be the most robust model. It is designed to handle the diverse content found on the internet.
  • We have made the supervised component of the APOLLO method available as open-source. For more details and to view a full report of our benchmark results, please visit our GitHub page.
  • Add Deepfake Text Detector to your browser

    Download for Chrome