Wednesday, May 22, 2024

AI Tools to Analyze Complex Documents

Generative AI is useful for condensing and simplifying large quantities of information. Use cases for marketers and business executives include:

  • Explaining complicated documents, such as patents.
  • Summarizing long narratives into key points and definitions.
  • Extracting important sections of ebooks and whitepapers for blogs and promotional materials.

What follows are four gen AI tools to accomplish these tasks. To test, I instructed the tools to analyze a 2023 patent from Google for Search Generative Experience (SGE), with three prompts:

  • What is this document about?
  • How does SGE select external sources in its answers? Do the sources originate from organic search results?
  • Define essential terms from the document.

My goals were to evaluate the accuracy of the responses and check for “hallucinating,” or inventing facts.


Home page of ChatGPT


ChatGPT can now process uploaded documents, including PDF files. It provided a generic overview in response to the prompt, “What is this document about?”

But it was the only tool that answered the second question, “Did the sources originate from organic search results?”

According to the patent document, not all documents are selected solely from organic search results. The selection process involves different categories of search result documents:

Query-Responsive Search Result Documents: These are documents obtained in response to the original query submitted by the user.

Related-Query-Responsive Search Result Documents: These documents are derived from searches performed for queries related to the original query.

Recent-Search-Responsive Search Result Documents: These are documents that responded to recent queries, possibly related to the current query or user’s search history.

ChatGPT made up the category names (the labels that begin each of the three items above)! The patent doesn’t use these terms.

So I prompted, “I don’t see these category names in the original document. Did you make them up?”

ChatGPT’s response:

You are correct, and I apologize for the confusion. The category names I used (“Query-Responsive Search Result Documents,” “Related-Query-Responsive Search Result Documents,” and “Recent-Search-Responsive Search Result Documents”) were indeed my own phrasing to summarize the concepts described in the patent. They are not directly quoted from the document.

The patent actually discusses the process of selecting documents based on their responsiveness to a given query, related queries, and recent searches. These selections can include documents that responded to the original query, as well as those responding to related or recent queries. My intent was to clarify this process, but I should have stated clearly that these category names were my own summarizations, not direct quotations from the document.

Thank you for pointing out this discrepancy.

ChatGPT’s response is an example of hallucinating, a weakness of many gen-AI tools.
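
A check like the one I did by hand, looking for the category names in the patent, can be roughly automated. The following is a minimal sketch, not a method any of these tools use. It assumes you have already extracted the document's text into a string, and the function names and the sample text are hypothetical (the sample is a stand-in, not the patent's actual wording).

```python
import re


def normalize(text: str) -> str:
    """Lowercase, treat hyphens as spaces, and collapse runs of whitespace."""
    text = text.lower().replace("-", " ")
    return re.sub(r"\s+", " ", text).strip()


def find_unsupported_terms(document_text: str, claimed_terms: list[str]) -> list[str]:
    """Return the claimed terms that never appear verbatim in the document."""
    doc = normalize(document_text)
    return [term for term in claimed_terms if normalize(term) not in doc]


# Hypothetical stand-in text, NOT a quotation from the patent.
sample_text = (
    "The system selects search result documents that are responsive "
    "to the query, to related queries, and to recent queries."
)

# Terms an AI tool attributed to the document.
claimed = [
    "Query-Responsive Search Result Documents",
    "search result documents",
]

unsupported = find_unsupported_terms(sample_text, claimed)
print(unsupported)
```

Any term the function returns is worth a follow-up question to the tool, since it may be the tool's own phrasing rather than the document's. An exact-match check is deliberately strict: it flags paraphrases as well as outright inventions, so treat the result as a prompt for manual review.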


Home page of Claude


Claude is an independent generative AI model from Anthropic that doesn’t use ChatGPT, unlike the other tools on this list. In my testing, Claude was the most cautious in responding.

For example, the other three tools answered “Yes” to “Are the documents selected from the organic search results?” Only Claude provided a nuanced response:

The patent does not explicitly state if the documents are selected from organic search results.

However, there are a few contextual clues that suggest they are referring to organic search results:

    • The background describes providing query-based summaries of “search result documents” without specifying if these are organic or paid results. Generally “search results” refers to organic results returned by a search engine.
    • There is no mention of filtering out or handling paid/promoted result documents differently than organic documents when selecting documents to summarize.
    • The goal seems to be summarizing the most relevant documents to the search query to provide an overview for users. Organic search results are generally more optimized for relevancy than promotional or paid results.
    • Related queries are used to expand document selection. Related/expanded queries are typically used with organic document sets, not paid advertising results.

So while not definitively stated, the patent language strongly suggests the query-responsive documents being selected to generate summaries are organic search results rather than paid or promotional results.

PDF Pals

Home page of PDF Pals

PDF Pals is a desktop application that offers 30 free questions for testing. It requires a ChatGPT API key for these tests.

Once installed, PDF Pals can process uploaded PDF documents.

The tool helpfully includes page numbers with its responses, making it easier to verify the facts. Clicking any page number will take you to that section in the document.

In my testing, PDF Pals did not simplify the patent to my level of understanding. Its responses were too technical, despite my prompting it otherwise. Still, the summaries were useful, albeit complicated.


Home page of AskYourPDF


AskYourPDF is a web app requiring no API key for testing. After scanning a document, AskYourPDF suggests optional follow-up questions. Like PDF Pals, it includes page numbers, although they are not clickable.

AskYourPDF’s responses were easier to understand than PDF Pals’ but less comprehensive. And it didn’t extract definitions from the PDF patent, stating incorrectly that none were there.

Thus AskYourPDF in my testing was helpful for higher-level overviews but not for details. A benefit of that approach, however, is likely fewer hallucinations.

Interestingly, all four tools analyzed the Google PDF patent slightly differently. Each provided unique explanations. The key is verifying the facts. All of the tools made errors.
