Last updated: April 19, 2026
Application No. 18/636,555
PROCESSING MULTI-TYPE DOCUMENT FOR MACHINE LEARNING COMPREHENSION

Non-Final OA: §101, §103
Filed: Apr 16, 2024
Examiner: HUYNH, LINDA TANG
Art Unit: 2172
Tech Center: 2100 (Computer Architecture & Software)
Assignee: International Business Machines Corporation
OA Round: 1 (Non-Final)
This examiner grants 36% of cases overall; an interview adds a +31.9% lift. A telephonic interview to clarify the technical implementation could significantly improve the outcome.
Based on 274 resolved cases, 2023–2026
Examiner Intelligence

HUYNH, LINDA TANG
Career Allow Rate: 36% (100 granted / 274 resolved; -18.5% vs TC avg)
Interview Lift: +31.9% (strong)
Avg Prosecution: 3y 8m (30 currently pending)
Total Applications: 304 across all art units
Statute-Specific Performance

§101: 9.8% (-30.2% vs TC avg)
§103: 53.4% (+13.4% vs TC avg)
§102: 13.4% (-26.6% vs TC avg)
§112: 18.6% (-21.4% vs TC avg)
Tech Center average is an estimate. Based on career data from 274 resolved cases
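The statute-specific figures above can be cross-checked with a little arithmetic. The sketch below assumes that each "vs TC avg" offset is a difference in percentage points (the page does not say so explicitly) and recovers the Tech Center average that each row implies. The variable names are illustrative only.

```python
# Statute-specific rates and their offsets from the Tech Center average,
# copied from the figures above. Assumption: offsets are percentage points.
rows = {
    "101": (9.8, -30.2),
    "103": (53.4, 13.4),
    "102": (13.4, -26.6),
    "112": (18.6, -21.4),
}

# Implied Tech Center average = examiner rate minus the offset.
implied_tc_avg = {
    statute: round(rate - offset, 1) for statute, (rate, offset) in rows.items()
}

for statute, avg in implied_tc_avg.items():
    print(f"§{statute}: implied TC average {avg}%")
```

Under that assumption, all four rows imply the same Tech Center average estimate, which suggests a single baseline is being used for every statute.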
Office Action

§101 §103
DETAILED ACTION
This Office Action is sent in response to Applicant's Communication received 04/16/2024 for 18636555. Claims 1-20 are presented.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 04/16/2024 and 06/16/2025 were filed before the mailing date of a first action. The submissions are in compliance with the provisions of 37 CFR 1.97. Accordingly, the IDS are being considered by the examiner.

Specification
The disclosure is objected to because of the following informalities: reference character "704" has been used to designate both "large language model" (Figure 7, paragraph 0059-0060) and "tree of thoughts" (paragraph 0059-0060).
Appropriate correction is required.

Drawings
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(4) because reference character “703” has been used to designate "natural language text" (Figure 7), "text format", and "text description" (paragraph 0059), and reference character "704" has been used to designate both "large language model" (Figure 7, paragraph 0059-0060) and "tree of thoughts" (paragraph 0059-0060). Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application.
Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
The drawings are objected to under 37 CFR 1.83(a). The drawings must show every feature of the invention specified in the claims. Therefore, the feature "wherein the result output by the language machine learning model is presented on a user interface" (claims 4, 11, 18) must be shown or the feature(s) canceled from the claim(s). No new matter should be entered.
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Claim Objections
Claim 5 is objected to because of the following informalities.
Claim 5 recites "the large language model" which lacks antecedent basis and has been interpreted as "the [[large]] language --machine learning-- model".
Appropriate correction is required.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claim(s) does/do not fall within at least one of the four categories of patent eligible subject matter because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more.
Claim 1 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
The claim recites receiving, by a processor set, a digital image of a document and a workflow describing an automation task; converting, by the processor set, the digital image of the document into rich text that includes layout information in the document; creating, by the processor set, based on the rich text and the workflow, a tree of thoughts that includes nodes and edges connecting the nodes and that binds at least some nodes representing the rich text with a task node representing the automation task; converting, by the processor set, the nodes and edges of the tree of thoughts into a natural language text; and inputting, by the processor set, the natural language text into a language machine learning model with attention given to a token in the natural language text representing the task node, the language machine learning model, in response, outputting a result of completing the automation task.
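For orientation, the tree-building and tree-to-text steps recited in claim 1 can be pictured with a minimal sketch. This is not the applicant's implementation and is not taken from the specification: the `Node` and `Tree` classes, the `"uses"` relation, the layout tags, and the sentence template are all hypothetical, and the image-to-rich-text conversion and the language model call are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node: a piece of rich text or the automation task (hypothetical)."""
    node_id: str
    kind: str          # "task" or "text"; illustrative labels only
    content: str
    layout: str = ""   # e.g. "heading" or "table cell"; illustrative tag


@dataclass
class Tree:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # (parent_id, child_id, relation)

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def add_edge(self, parent_id, child_id, relation):
        self.edges.append((parent_id, child_id, relation))


def build_tree(rich_text_blocks, task_description):
    """Bind every rich-text node to a single task node."""
    tree = Tree()
    tree.add_node(Node("task", "task", task_description))
    for i, (layout, text) in enumerate(rich_text_blocks):
        node_id = f"text{i}"
        tree.add_node(Node(node_id, "text", text, layout))
        tree.add_edge("task", node_id, "uses")
    return tree


def tree_to_text(tree):
    """Serialize the nodes and edges into natural-language sentences."""
    lines = [f"Task: {tree.nodes['task'].content}."]
    for parent_id, child_id, relation in tree.edges:
        child = tree.nodes[child_id]
        lines.append(
            f"The {parent_id} {relation} the {child.layout} '{child.content}'."
        )
    return "\n".join(lines)


# Rich text is given here as (layout, text) pairs; a real system would
# derive these from the document image.
blocks = [("heading", "Invoice 1001"), ("table cell", "Total due: 250.00")]
prompt = tree_to_text(build_tree(blocks, "extract the total amount due"))
print(prompt)
```

The resulting `prompt` string is what would be passed to a language model in the final claimed step; how attention is directed to the task token is outside the scope of this sketch.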
The limitation of converting the digital image of the document into rich text that includes layout information in the document, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, "converting" in the context of this claim encompasses a user observation of a digital image of a document and a user evaluation or judgement of rich text in the observed digital document image.
The limitation of creating based on the rich text and the workflow, a tree of thoughts that includes nodes and edges connecting the nodes and that binds at least some nodes representing the rich text with a task node representing the automation task, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. For example, "creating" in the context of this claim encompasses a user observation or evaluation of mental thoughts and a user evaluation of relationships between the observed thoughts and an observation or an evaluation of a task based on an observed rich text and observed workflow.
The limitation of converting the nodes and edges of the tree of thoughts into a natural language text, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. For example, "converting" in the context of this claim encompasses a user evaluation or judgement of natural language text based on observing or evaluating mental thoughts and relationships.
The limitation of inputting, by the processor set, the natural language text into a language machine learning model with attention given to a token in the natural language text representing the task node, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. For example, "inputting into a language machine learning model" in the context of this claim encompasses a user evaluation of mathematical calculations with a weighted value toward observed natural language text.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the "Mental Processes" grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application. In particular, the claim recites a processor set; receiving, by a processor set, a digital image of a document and a workflow describing an automation task; and outputting a result of completing the automation task. The processor set, digital document image, workflow automation task, and language machine learning model are recited at a high level of generality and recited so generically that they represent no more than mere instructions to apply the judicial exception on a computer [MPEP 2106.05(f)]. These limitations can also be viewed as nothing more than an attempt to generally link the use of the judicial exception to the technological environment of a computerized workflow [MPEP 2106.05(h)].
The receiving and outputting represent mere data gathering and data output necessary for use of the recited judicial exception, as the obtained digital image and workflow are used in the abstract mental process of converting and creating. The receiving and outputting are recited at a high level of generality and are therefore insignificant extra-solution activity. Even when viewed in combination, the additional elements in this claim do no more than automate the mental processes that the user performs, using the computer components as a tool.
Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of a processor set amount to no more than mere instructions to apply the exception using generic computer components. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept.
The receiving and outputting, as discussed above, represent mere data gathering and mere data output, provide nominal or tangential additions to the claim, and are insignificant extra-solution activity. Further, both of these elements are well-understood, routine and conventional.
With respect to the "receiving", the courts have found limitations directed to obtaining information electronically, recited at a high level of generality, to be well-understood, routine, and conventional [MPEP 2106.05(d)(II), "electronic recordkeeping," and "storing and retrieving information in memory"]. With respect to the "outputting" limitation, the courts have similarly found limitations directed to displaying a result, recited at a high level of generality, to be well-understood, routine, and conventional [MPEP 2106.05(d)(II), "presenting offers and gathering statistics"].
Considering the additional elements individually and in combination and the claim as a whole, the additional elements do not provide significantly more than the abstract idea. The claim is not patent eligible.
The dependent claims also recite limitations of iteratively inputting the natural language text into the language machine learning model (claims 2, 9, 16) and feeding the output result into another automation task (claims 5, 12, 19). Under their broadest reasonable interpretation, these limitations cover performance in the mind but for the recitation of generic computer components, encompassing user evaluations of mathematical calculations of a model using previously observed or evaluated natural language text or results, and thus fall within the "Mental Processes" grouping of abstract ideas.
This judicial exception is not integrated into a practical application. The dependent claims recite additional limitations including computer vision and optical character recognition (claims 3, 10, 17), layout information including images and associated captions (claims 6, 13, 20), and layout information including document content formatting (claims 7, 14) that are recited at a high level of generality and recited so generically that they represent no more than mere instructions to apply the judicial exception on a computer [MPEP 2106.05(f)] and generally link the use of the judicial exception to the technological environment of electronic document systems [MPEP 2106.05(h)] and do not impose any meaningful limits on practicing the abstract idea. The dependent claims also recite additional limitations of presenting the output result on a user interface (claims 4, 11, 18) that represent insignificant extra-solution activity including nominal or tangential additions to the claim, amounting to mere data output [MPEP 2106.05(g)]. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose meaningful limits on practicing the abstract idea. The claims are directed to an abstract idea.
The dependent claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional elements of data display are recited at a high level of generality, are well-understood, routine, or conventional activities [MPEP 2106.05(d)(II), "presenting offers and gathering statistics"], and remain insignificant extra-solution activity even upon reconsideration [MPEP 2106.05(g)]. Mere instructions to apply an exception using generic computer components, linking the use of an exception to a technological field of use, and insignificant extra-solution activity cannot provide an inventive concept. The claims are not patent eligible.
Claim 8 recites method steps substantially similar to those recited in claim 1 and recites an abstract idea. While the claim recites additional elements of a computer program product, storage media, and program instructions, the elements are recited at a high level of generality and recited so generically that they represent no more than mere instructions to apply the judicial exception on a computer [MPEP 2106.05(f)] and do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of a computer program product, storage media, and program instructions amount to no more than mere instructions to apply the exception using generic computer components. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept. Considering the additional elements individually and in combination and the claim as a whole, the additional elements do not provide significantly more than the abstract idea. The claim is not patent eligible.
Claim 15 recites method steps substantially similar to those recited in claim 1 and recites an abstract idea. While the claim recites additional elements of a system, processor set, storage media, and program instructions, the elements are recited at a high level of generality and recited so generically that they represent no more than mere instructions to apply the judicial exception on a computer [MPEP 2106.05(f)] and do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of a system, processor set, storage media, and program instructions amount to no more than mere instructions to apply the exception using generic computer components. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept. Considering the additional elements individually and in combination and the claim as a whole, the additional elements do not provide significantly more than the abstract idea. The claim is not patent eligible.
Claims 8-14 are also directed to a "computer program product".  The specification discloses a computer program product as including open ended language and thus it is reasonable to interpret it to include all possible media, including non-statutory media ["computer program product embodiment (“CPP embodiment” or “CPP”) is a term used in the present disclosure to describe any set of one, or more, storage media (also called “mediums”) collectively included in a set of one, or more, storage devices that collectively include machine readable code corresponding to instructions and/or data for performing computer operations specified in a given CPP claim", Specification, para 0025]. The words "product" and/or "media" are insufficient to convey only statutory embodiments to one of ordinary skill in the art absent an explicit and deliberate limiting definition or clear differentiation between storage media and transitory media in the disclosure.
As such, the claim(s) is/are drawn to a form of energy. Energy is not one of the four categories of invention and therefore this/these claim(s) is/are not statutory. Energy is not a series of steps or acts and thus is not a process. Energy is not a physical article or object and as such is not a machine or manufacture. Energy is not a combination of substances and therefore not a composition of matter. Applicants are advised to insert the phrase "non-transitory" prior to "computer program product" or "computer-readable storage media" to overcome rejection of claims 8-14 under 35 U.S.C. 101.
Claims 15-20 are also directed to a "computer system" comprising a "processor set". The specification discloses the term "processor set" as including open-ended language ("PROCESSOR SET 110 includes one, or more, computer processors of any type now known or to be developed in the future" [0028]) and thus it is reasonable to interpret the term to include all possible processing, including non-structural processing per se, and the claims reasonably read on the corresponding software portion of the disclosure.
Therefore, a reasonable interpretation in light of the specification leads to the conclusion that the claim as a whole is directed to entirely a software embodiment, i.e., encompasses pure software, not a hardware embodiment, which does not fall within the definition of a process, machine, manufacture, or composition of matter. Applicants are advised to insert structural processor hardware to overcome rejection of claims 15-20 under 35 U.S.C. 101.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over He et al. (US 20240362286 A1) in view of Contryman et al. (US 12056771 B1).

As to claim 1, He discloses a computer-implemented method comprising:
receiving, by a processor set, a digital image of a document and a workflow describing an automation task [para 0065, 0192-0194, 0201, system including processor receives document image and query for document search (read: automation task)];
converting, by the processor set, the digital image of the document into rich text that includes layout information in the document [para 0111-0114, 0195-0196, process document image into document components including text organized in format (read: layout information)];
creating, by the processor set, based on the rich text and the workflow, a tree … that includes nodes and edges connecting the nodes and that binds at least some nodes representing the rich text with a task node representing the automation task [para 0116, 0138, generate embedding (read: tree) including vector representations of document and search query];
… a natural language text [para 0133, search query comprising text in natural language representation]; and
inputting, by the processor set, the natural language text into a language machine learning model with attention given to a token in the natural language text representing the task node, the language machine learning model, in response, outputting a result of completing the automation task [para 0133, 0138-0140, search model (read: language machine learning model) computes resulting scores of searching relevant content within document using attention mechanism weighting document token and query embedding].
However, He does not specifically disclose creating, by the processor set, based on the rich text and the workflow, a tree of thoughts that includes nodes and edges connecting the nodes and that binds at least some nodes representing the rich text with a task node representing the automation task; and converting, by the processor set, the nodes and edges of the tree of thoughts into a natural language text.
Contryman discloses:
creating, by the processor set, based on the rich text and the workflow, a tree of thoughts that includes nodes and edges connecting the nodes and that binds at least some nodes representing the rich text with a task node representing the automation task [cols. 10:5-11:24, system processor generates tree-based explainability graph (read: tree of thoughts) based on extracted document feature text (read: rich text) and underwriting decision making (read: workflow), where graph includes nodes and edges joining nodes with feature node associated with document feature text and decision node (read: task node) of model making determination (read: automation task)]; and
converting, by the processor set, the nodes and edges of the tree of thoughts into a natural language text [cols. 10:54-11:24, 12:23-43, generate verbose explanation of decision (read: natural language text) from explainability graph nodes and edges].
He and Contryman are analogous art to the claimed invention being from a similar field of endeavor of document management systems. Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the tree including nodes and edges binding nodes representing rich text with a task node representing the automation task and determined natural language text as disclosed by He with the tree of thoughts including nodes and edges and converting nodes and edges into natural language text as disclosed by Contryman with a reasonable expectation of success.
One of ordinary skill in the art would be motivated to modify He as described above to ensure model compliance [Contryman, cols. 10:54-11:24].

As to claim 2, He discloses the computer-implemented method of claim 1, wherein the inputting the natural language text into the language machine learning model includes iteratively inputting the natural language text to complete the automation task [para 0119, perform search to iterate over search results of search query to generate additional search results].

As to claim 3, He discloses the computer-implemented method of claim 1.
However, He does not specifically disclose wherein computer vision and optical character recognition techniques are used to convert the digital image of the document into the rich text.
Contryman discloses wherein computer vision and optical character recognition techniques are used to convert the digital image of the document into the rich text [col. 10:27-53, perform text segmentation and text recognition on document image to extract document feature text].
He and Contryman are analogous art to the claimed invention being from a similar field of endeavor of document management systems. Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the document conversion as disclosed by He with the document conversion techniques including computer vision and optical character recognition as disclosed by Contryman with a reasonable expectation of success.
One of ordinary skill in the art would be motivated to modify He as described above to handle forms of varying quality, skew, language, orientation [Contryman, cols. 10:54-11:24].

As to claim 4, He discloses the computer-implemented method of claim 1, wherein the result output by the language machine learning model is presented on a user interface [para 0222, render summary on graphical user interface].

As to claim 5, He discloses the computer-implemented method of claim 1, wherein the result output by the large language model is fed into another automation task [para 0119, perform search to iterate over search results of search query to generate results of additional search (read: another automation task)].

As to claim 6, He discloses the computer-implemented method of claim 1, wherein the layout information includes images and associated captions contained in the document [para 0111-0114, document components include figures and caption of figures].

As to claim 7, He discloses the computer-implemented method of claim 1, wherein the layout information includes formatting of document content contained in the document [para 0111-0114, document components include formatted and structured text].

As to claim 8, He and Contryman are combined at least for the reasons above. He discloses a computer program product comprising: a set of one or more computer-readable storage media; and program instructions, collectively stored in the set of one or more storage media, for causing a processor set to perform the following computer operations [para 0055, memory stores instructions executed by processing circuitry]. The claim comprises limitations substantially similar to those recited in claim 1 and is rejected under similar rationale.

As to claims 9-14, He and Contryman, combined at least for the reasons above, disclose the computer program product of claim 8. The claims comprise limitations substantially similar to those recited in claims 2-7, respectively, and are rejected under similar rationale.

As to claim 15, He and Contryman are combined at least for the reasons above. He discloses a computer system comprising: a processor set; a set of one or more computer-readable storage media; and program instructions, collectively stored in the set of one or more computer-readable storage media, for causing the processor set to perform the following computer operations [para 0053, 0055, device includes memory storing instructions executed by processing circuitry]. The claim comprises limitations substantially similar to those recited in claim 1 and is rejected under similar rationale.

As to claims 16-20, He and Contryman, combined at least for the reasons above, disclose the computer system of claim 15. The claims comprise limitations substantially similar to those recited in claims 2-6, respectively, and are rejected under similar rationale.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Long et al. (US 20250232187 A1) generally discloses utilizing tree-of-thought search algorithms.
Xu (CN 119884357 A) generally discloses generating a thought guide from analyzing document content.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LINDA HUYNH whose telephone number is (571)272-5240 and email is linda.huynh@uspto.gov. The examiner can normally be reached M-F between 9am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Adam Queler, can be reached at (571) 272-4140. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/LINDA HUYNH/Primary Examiner, Art Unit 2172
Prosecution Timeline

Apr 16, 2024: Application Filed
Feb 17, 2026: Non-Final Rejection, §101, §103 (current)
Precedent Cases

Applications granted by this same examiner with similar technology

18/367,977, Patent 12578837: USER INTERFACES FOR MANAGING SHARING OF CONTENT IN THREE-DIMENSIONAL ENVIRONMENTS (2y 5m to grant; granted Mar 17, 2026)
17/956,135, Patent 12547310: INFORMATION PROCESSING DEVICE (2y 5m to grant; granted Feb 10, 2026)
18/553,223, Patent 12541287: INTEGRATED ENERGY DATA SCIENCE PLATFORM (2y 5m to grant; granted Feb 03, 2026)
17/694,349, Patent 12524136: EVENT TRANSCRIPT PRESENTATION (2y 5m to grant; granted Jan 13, 2026)
17/901,611, Patent 12524124: RECORDING FOLLOWING BEHAVIORS BETWEEN VIRTUAL OBJECTS AND USER AVATARS IN AR EXPERIENCES (2y 5m to grant; granted Jan 13, 2026)
Study what changed to get past this examiner. Based on 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 36%
With Interview: 68% (+31.9%)
Median Time to Grant: 3y 8m
PTA Risk: Low
Based on 274 resolved cases by this examiner. Grant probability derived from career allow rate.
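The projection figures can be reproduced from the counts reported on this page. The sketch below assumes that the interview lift is added to the career allow rate in percentage points; the page does not state the formula, but that reading is consistent with the 36% and 68% shown. Variable names are illustrative.

```python
# Counts and lift as reported on this page.
granted = 100
resolved = 274
interview_lift_points = 31.9  # assumption: additive percentage points

# Career allow rate as a percentage of resolved cases.
career_allow_rate = 100 * granted / resolved

# Projected grant probability when an interview is held.
with_interview = career_allow_rate + interview_lift_points

print(f"Career allow rate: {career_allow_rate:.1f}%")
print(f"With interview:    {with_interview:.1f}%")
```

Rounded to whole percentages, these values match the 36% and 68% figures reported above.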