Prosecution Insights
Last updated: April 19, 2026
Application No. 17/249,584

KNOWLEDGE-BASED MANAGEMENT OF RECOGNITION MODELS IN ARTIFICIAL INTELLIGENCE SYSTEMS

Final Rejection (§103)

Filed: Mar 05, 2021
Examiner: WU, NICHOLAS S
Art Unit: 2148
Tech Center: 2100 — Computer Architecture & Software
Assignee: Huawei Cloud Computing Technologies Co. Ltd.
OA Round: 4 (Final)

Grant Probability: 47% (Moderate)
OA Rounds: 5-6
To Grant: 3y 9m
With Interview: 90%

Examiner Intelligence

Career Allow Rate: 47% (grants 18 of 38 resolved cases; -7.6% vs TC avg)
Interview Lift: +43.1% for resolved cases with interview (strong)
Typical Timeline: 3y 9m avg prosecution; 44 applications currently pending
Career History: 82 total applications across all art units
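The headline figures above are internally consistent. A minimal sketch (Python) reproduces them from the underlying counts; note that the Tech Center average is back-solved from the displayed delta, an inference rather than a figure the tool reports directly:

```python
# Reproduce the examiner's headline statistics from the counts shown above.
granted, resolved = 18, 38

# Career allowance rate: granted cases over resolved cases.
allow_rate = 100 * granted / resolved
print(round(allow_rate, 1))  # 47.4, displayed above as 47%

# The dashboard reports this rate as -7.6% versus the Tech Center
# average, which implies (assumption) a TC 2100 average of roughly:
tc_avg = allow_rate + 7.6
print(round(tc_avg, 1))  # 55.0
```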

Statute-Specific Performance

§101: 26.7% (-13.3% vs TC avg)
§103: 52.6% (+12.6% vs TC avg)
§102: 3.1% (-36.9% vs TC avg)
§112: 17.4% (-22.6% vs TC avg)
Black line = Tech Center average estimate • Based on career data from 38 resolved cases
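The per-statute deltas can be cross-checked against the displayed rates. Assuming each delta is the examiner's rate minus the Tech Center average, every statute back-solves to the same TC average estimate, consistent with a single "black line" baseline; a short sketch:

```python
# Per-statute rates and deltas vs the TC average, as displayed above:
# statute -> (examiner rate %, delta vs TC avg %).
stats = {
    "101": (26.7, -13.3),
    "103": (52.6, +12.6),
    "102": (3.1, -36.9),
    "112": (17.4, -22.6),
}

# If delta = rate - tc_avg, each statute recovers the same TC average.
for statute, (rate, delta) in stats.items():
    print(statute, round(rate - delta, 1))  # 40.0 in each case
```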

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments

Applicant's arguments filed 09/03/2025 have been fully considered but are not fully persuasive.

Regarding the 101 rejections, after further consideration and taking the claim as a whole, applicant's arguments are persuasive and overcome the previous 101 rejections. Specifically, applicant argues that the limitations "compare the at least one parameter to resources available to the one or more processors to determine that the available resources are insufficient to execute the at least one recognition model; obtain access to additional resources sufficient to execute the at least one recognition model; and process the data set using the at least one recognition model and the additional resources based on the indicated computing resources to provide an indication of whether the data set includes the at least one target object." provide a technical improvement because the claimed invention improves the functioning of a computer system by providing an automated mechanism for resource management in the execution of machine learning models. See pg. 12-13 of "Remarks":

"The Claim Provides a Specific Technological Improvement

The claim integrates the alleged abstract ideas into a practical application by providing a concrete technological solution for managing computing resources in AI systems. The invention automatically determines when additional computing resources are needed and obtains access to those resources, thereby improving the functioning of computer systems executing recognition models. This represents a technological improvement to how computers handle resource-intensive AI applications, not merely the application of abstract ideas using generic computer components.
The Integration Creates a Practical Application

The claim elements work together as an integrated device that addresses the practical problem of resource management in AI systems:

1. Knowledge-based selection mechanism: The system uses a knowledge database with entity labels and links to select appropriate recognition models
2. Resource assessment and acquisition: The system compares required computing resources against available resources and automatically obtains additional resources when needed
3. Optimized execution: The system processes datasets using both the selected models and additional resources based on indicated computing requirements

This integration creates a practical application that goes beyond merely implementing abstract ideas on a computer."

Applicant's arguments that the claimed invention provides a technical improvement to the field of machine learning are persuasive. Therefore, the 101 rejections are withdrawn.

Regarding the 103 rejections, applicant's arguments filed with respect to the prior art rejections have been fully considered but they are not persuasive.

Alleged no teaching of associating a query that describes a target object by comparing a target object description with entity labels in a knowledge database

In Remarks pg. 15, applicant contends:

"There is no teaching in Ambardekar that a query that describes a target object with a target object description is associated with labels in a knowledge base even if one were to broadly interpret the WordNet hierarchy of Abdollahpour as an entity knowledge base. Ambardekar appears to both enhance visual words in a semantic ontology and perform image classification using the semantic ontology and execution of multiple models. Neither reference associates a query "by comparing the target object description with one or more entity labels" as claimed, nor identify a target object in a data set as claimed.

While Ambardekar Paragraph [0022] does reference comparing text descriptions to the query, it still references the "confidence level" calculated in Paragraph [0021] for making recommendations of models. No knowledge base of entity labels is disclosed, as even the text descriptions are not described in Ambardekar as entity labels. The Office Action asserts: "The ranking of models based on these queries is interpreted as comparing the target object descriptions as under the broadest reasonable interpretation, a ranking is a comparison of elements." This assertion is respectfully traversed. The ranking, as described in Ambardekar Paragraph [0021], is based on light weight classifications, not by comparing descriptions. The processes of Ambardekar execute a large number of models just to classify an image. There is no concept of associating a "query that describes the target object" or "comparing the target object description with one or more entity labels in a knowledge database" as claimed."

The relevant claim limitations appear to be: "associate the query that describes the target object by comparing the target object description with one or more entity labels in a knowledge database that includes a plurality of entities and identifies corresponding recognition models for recognizing multiple different objects;" in claim 1.

Ambardekar and Abdollahpour teach:

(Ambardekar, ⁋51, "The image recognition program may prompt her to submit a query, so Ariana may interact with a digital keyboard by using a touchscreen on her smartphone to type 'red coffee mug.'").

(Ambardekar, ⁋22, "The image recognition models 22 may also include text descriptions that may be compared to the query 28.").

(Ambardekar, ⁋20, "the image recognition program 24 may rank the image recognition models 22 by confidence level for performing the search based on the query 28").

(Abdollahpour, pg. 695-696, "For hierarchical image classification, we use the semantic relations between classes based on WordNet hierarchy and then assign a set of linear Support Vector Machines (SVM) to each semantic node. Each semantic node is responsible for discriminating among its immediate object subcategories. WordNet is a lexical database for the English language. It groups English words into sets of synonyms called synsets. All synsets are connected to other synsets by means of semantic relations like hypernyme/hyponymy or is-a relation (like the relation between vehicle and airplane) and meronym/holonymy or part-of relation (like the relation between building and window).").

In other words, Ambardekar teaches "associate the query that describes the target object by comparing the target object description" as Ambardekar takes in a user query that describes the target object, like Ariana describing the red coffee mug that she wants to find. Ambardekar then teaches that queries with target object descriptions, like Ariana's query, are compared to text descriptions that are associated with each machine learning model. The ranking of models based on these queries is interpreted as comparing the target object descriptions because, under the broadest reasonable interpretation, a ranking is a comparison of elements.

Applicant argues that the ranking cannot be interpreted as a comparison since the ranking is based on confidence values, not text descriptions. The examiner maintains that the ranking of models still teaches "associate the query that describes the target object by comparing the target object description" because the models are compared using text descriptions of each model to the query, as seen in Ambardekar ⁋22 above. The examiner notes that, in this particular limitation, Ambardekar teaches the first half of the limitation: comparing a user's target object description query to other text descriptions associated with the models in the model storage.
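The examiner's reading of Ambardekar ⁋20 and ⁋22 (models ranked by comparing the user's query to each model's text description) can be illustrated with a minimal sketch. The model names, descriptions, and word-overlap scoring below are hypothetical illustrations, not drawn from the reference:

```python
# Hypothetical sketch of ranking recognition models by comparing a
# user's target-object query against each model's text description.

def rank_models(query: str, models: dict[str, str]) -> list[str]:
    """Rank model names by word overlap between query and description."""
    q = set(query.lower().split())
    overlap = lambda desc: len(q & set(desc.lower().split()))
    return sorted(models, key=lambda name: overlap(models[name]), reverse=True)

models = {
    "mug_detector": "red coffee mug cup detector",
    "car_detector": "car vehicle detector",
}
print(rank_models("red coffee mug", models)[0])  # mug_detector
```

Under this reading, the ranking step is itself a comparison of the target object description against each stored description.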
Ambardekar was not relied upon to teach "entity labels of a knowledge database," as Abdollahpour was used as the secondary reference to teach this concept. Abdollahpour teaches the second half of this limitation, "with one or more entity labels in a knowledge database that includes a plurality of entities and identifies corresponding recognition models for recognizing multiple different objects;" as seen by the cited portion above. Abdollahpour teaches using multiple classifiers that are linked to each other based on the ontology structure of WordNet and therefore teaches entity labels that correspond to multiple classifiers. Under the broadest reasonable interpretation, comparing a user query to text descriptions of the plurality of models can be interpreted as a query being associated with labels. Therefore, one of ordinary skill in the art would reasonably be motivated to extend associating to a label to associating to an entity label for the semantic relationship benefits that a knowledge graph/ontology provides.

Alleged no teaching of selecting models corresponding to entity labels in Abdollahpour

In Remarks pg. 16-17, applicant contends:

"This assertion is respectfully traversed, as Ambardekar does not select a model based on text descriptions, but based on calculated confidence levels. While text descriptions may identify possible models, the models are not selected based on the text descriptions. Even the provision of entity labels by Abdollapour cannot cure the deficiencies of Ambardekar. Abdollapour appears to evaluate multiple nodes as described on page 696 and Algorithm I: "Image classification is performed as illustrated in Algorithm I. Starting from the root node and using the decision functions of linear SVM classifier, subsequent level nodes are evaluated. The nodes with higher confidence value are recursively explored until reaching leaf node."

Thus, Abdollapour also evaluates multiple models in a manner similar to Ambardekar and does not select a model "corresponding to the one or more entity labels" as claimed. Models must be executed in order for them to determine confidence values and confidence values are used to arrive at the leaf node. Claim 1 recites: "select, from the corresponding recognition models, at least one recognition model corresponding to the one or more entity labels associated with the target object description of the query via one or more links in the entities..." Instead of recursively exploring subsequent levels of nodes from a root node, as in Abdollahpour, based on confidence values from executing support vector models (SVMs) at each node, claim 1 describes a particular manner of determining which at least one recognition model to use based on associating the query to labels in the knowledge base that has entities corresponding to multiple different objects. Claim 1 thus provides a way to identify corresponding recognition models more efficiently using the knowledge base, without the recursive use of SVMs at each node as described in Abdollahpour to classify images, and not in response to a query that describes the target object in a data set such as an image. The combination of references does not teach or suggest amended claim 1 and the rejection should be withdrawn."

The relevant claim limitations appear to be: "select, from the corresponding recognition models, at least one recognition model corresponding to the one or more entity labels associated with the target object description of the query via one or more links in the entities…" in claim 1.

Abdollahpour teaches:

(Abdollahpour, pg. 695-696, "For hierarchical image classification, we use the semantic relations between classes based on WordNet hierarchy and then assign a set of linear Support Vector Machines (SVM) to each semantic node. Each semantic node is responsible for discriminating among its immediate object subcategories. WordNet is a lexical database for the English language. It groups English words into sets of synonyms called synsets. All synsets are connected to other synsets by means of semantic relations like hypernyme/hyponymy or is-a relation (like the relation between vehicle and airplane) and meronym/holonymy or part-of relation (like the relation between building and window).").

(Abdollahpour, pg. 696 col. 1-2, "In this paper the CIFAR-10 data set is used. We use WordNet to generate the semantic hierarchy for this data set. The resulting sub-graph contains 18 nodes shown in Fig. 1. We associate a linear SVM to each semantic node which is responsible for discriminating among direct children of the current node. At each node of the semantic taxonomy, We use linear SVM in a one-vs-all manner over the sub-concepts of the current node. At each node N, ρ numbers of linear SVM classifiers are trained where ρ is the number of sub concepts of the current node… Image classification is performed as illustrated in Algorithm 1. Starting from the root node and using the decision functions of linear SVM classifier, subsequent level nodes are evaluated. The nodes with higher confidence value are recursively explored until reaching leaf node.").

In other words, applicant argues that Abdollahpour does not teach "select, from the corresponding recognition models, at least one recognition model corresponding to the one or more entity labels associated with the target object description of the query via one or more links in the entities". The examiner maintains that Abdollahpour teaches this selection of at least one recognition model corresponding to entity labels associated with a target description query via links in the entities.
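Abdollahpour's Algorithm 1, as characterized above (start at the root, score each child with its classifier, follow the highest-confidence child until reaching a leaf), can be sketched as follows. The toy hierarchy, the stub confidence function, and all names are illustrative assumptions, not the paper's actual 18-node CIFAR-10 graph or trained SVMs:

```python
# Illustrative sketch of the descent attributed to Abdollahpour's
# Algorithm 1 (written here as an iterative descent): at each node a
# per-child classifier yields a confidence, and the highest-confidence
# child is explored until a leaf (a concrete class) is reached.

# Toy WordNet-style hierarchy (not the paper's CIFAR-10 sub-graph).
children = {
    "entity": ["vehicle", "animal"],
    "vehicle": ["airplane", "automobile"],
    "animal": ["dog", "cat"],
}

def confidence(node: str, image: dict) -> float:
    """Stub for the per-node linear SVM decision function."""
    return image.get(node, 0.0)

def classify(image: dict, node: str = "entity") -> str:
    while node in children:  # descend until a leaf node
        node = max(children[node], key=lambda c: confidence(c, image))
    return node

# A fake "image" whose features simply encode stub confidences.
image = {"animal": 0.9, "vehicle": 0.2, "dog": 0.8, "cat": 0.3}
print(classify(image))  # dog
```

The point of dispute is whether this confidence-driven descent, which executes a classifier at every visited node, can be read as "selecting" a model corresponding to entity labels via links between entities.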
As seen in the cited portions, Abdollahpour teaches using a recursive search algorithm to select/execute classifier models, SVMs, that are semantically related to the input query, thereby teaching the selecting limitation. Applicant also argues that the claimed invention is more efficient and does not need to execute multiple SVMs in a recursive approach to select models. However, under the broadest reasonable interpretation, the claim only requires that a classifier, from a plurality of classifiers, is selected based on a label corresponding to the target description. The claimed limitations do not require that models are not executed during the selection process. While Abdollahpour's recursive search algorithm has more steps than what is claimed, the algorithm still includes steps that teach selecting a classifier, from a plurality of classifiers, based on a label corresponding to the target description. Therefore, the applicant's arguments are not persuasive.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-7, 22-28, and 30-32 are rejected under 35 U.S.C.
103 as being unpatentable over Ambardekar, et al., US Pre-Grant Publication 2015/0213058A1 ("Ambardekar") in view of Abdollahpour, et al., Non-Patent Literature "Image Classification Using Ontology Based Improved Visual Words" ("Abdollahpour"), and further in view of Khare, et al., US Pre-Grant Publication 2019/0278640A1 ("Khare").

Regarding claim 1 and analogous claims 22 and 30, Ambardekar discloses:

A device for identifying at least one target object in a data set, (Ambardekar, ⁋3, "A computing device having adaptable image search and methods for operating an image recognition [A device for identifying at least one target object in a data set,] program on the computing device are disclosed herein. One disclosed embodiment may include non-volatile memory configured to store a plurality of image recognition models and the image recognition program executed by a processor of the computing device.").

the device comprising: a memory comprising instructions; and one or more processors in communication with the memory, wherein the one or more processors execute the instructions to: (Ambardekar, ⁋61, "Computing system 600 includes a logic subsystem 606 [and one or more processors in communication with the memory,] and a storage subsystem 608 [the device comprising: a memory comprising instructions;]. Computing system 600 may optionally include a display subsystem 610, input subsystem 612, communication subsystem 614, and/or other components not shown in FIG. 6. Server 602 may have an additional communication subsystem 616, logic subsystem 618, and storage system 620 and be configured to host a web service 622 as described above." [wherein the one or more processors execute the instructions to:]).

receive a data set and a query that describes the target object with a target object description to identify the at least one target object in the data set; (Ambardekar, ⁋18, "The image recognition program 24 may be configured to receive a query 28 from a user [receive a data set and a query]. An input device 30 of computing device 10 may include a microphone, a keyboard, a touchscreen, etc. The query 28 may be, for example, text that is typed on the keyboard or touchscreen, converted from speech captured by the microphone, converted via optical character recognition (OCR) from an image that may be, for instance, captured by the camera 34 or stored in the non-volatile memory 20, or produced by other techniques. Audio, text, etc. may also be stored in the non-volatile memory 20 in advance and then used to form a query 28. Alternatively, the query 28 may be an image or video of a target object the user is interested in finding. Multiple images or frames of video may depict different viewpoints of the same target object [to identify the at least one target object in the data set;]. The user may optionally select a bounding box within the query image to help the image recognition program 24 to locate the target object, especially if there are many irrelevant objects in the image." and Ambardekar, ⁋51, "The image recognition program may prompt her to submit a query, so Ariana may interact with a digital keyboard by using a touchscreen on her smartphone to type 'red coffee mug.' [that describes the target object with a target object description]").

associate the query that describes the target object by comparing the target object description… (Ambardekar, ⁋22, "The image recognition models 22 may also include text descriptions that may be compared to the query 28 [associate the query that describes the target object]." and Ambardekar, ⁋20, "the image recognition program 24 may rank the image recognition models 22 by confidence level for performing the search based on the query 28"; ranking the models based on the query is interpreted as comparing the descriptions (i.e. by comparing the target object description)).

and process the data set using the at least one recognition model…to provide an indication of whether the data set includes the at least one target object. (Ambardekar, ⁋20, "Next, the image recognition program 24 may rank the image recognition models 22 by confidence level for performing the search based on the query 28 within the target image 12, then determine whether any of the image recognition models 22 is above a confidence threshold for performing the search locally on the processor 26 of the computing device 10. Upon determining that at least one of the image recognition models 22 is above the confidence threshold, the image recognition program may select at least one highly ranked image recognition model 22′ and perform the search within the target image 12 for a target region of the target image 12 using at least one selected image recognition model 22′ [and process the data set using the at least one recognition model…]." and Ambardekar, ⁋23, "Finally, the image recognition program 24 may return a search result 18 to the user, which may include displaying it on display 32 and ending the search [to provide an indication of whether the data set includes the at least one target object.]. The visual displaying of the search result 18 may be accompanied by an audio alert or reporting of the search result 18, or a vibration, for example.
At any time prior to receiving the search result 18, the user may indicate to the image recognition program 24 that she wishes to end the search.").

While Ambardekar teaches a model repository that retrieves models using queries based on target object and object descriptions, Ambardekar does not explicitly teach:

…with one or more entity labels in a knowledge database that includes a plurality of entities and identifies corresponding recognition models for recognizing to multiple different objects; select, from the corresponding recognition models, at least one recognition model corresponding to the one or more entity labels…via one or more links in the entities, wherein each recognition model of the corresponding recognition models includes at least one parameter indicating computing resources to be used to process the data set using the recognition model; compare the at least one parameter to resources available to the one or more processors to determine that the available resources are insufficient to execute the at least one recognition model; obtain access to additional resources sufficient to execute the at least one recognition model; …and the additional resources based on the indicated computing resources…

Abdollahpour teaches:

…with one or more entity labels in a knowledge database that includes a plurality of entities and identifies corresponding recognition models for recognizing to multiple different objects; (Abdollahpour, pg. 695-696, "For hierarchical image classification, we use the semantic relations between classes based on WordNet hierarchy"; WordNet is interpreted as a knowledge database with multiple object labels (i.e. …with one or more entity labels in a knowledge database that includes a plurality of entities); "and then assign a set of linear Support Vector Machines (SVM) to each semantic node. Each semantic node is responsible for discriminating among its immediate object subcategories [and identifies corresponding recognition models for recognizing to multiple different objects;]. WordNet is a lexical database for the English language. It groups English words into sets of synonyms called synsets. All synsets are connected to other synsets by means of semantic relations like hypernyme/hyponymy or is-a relation (like the relation between vehicle and airplane) and meronym/holonymy or part-of relation (like the relation between building and window).").

select, from the corresponding recognition models, at least one recognition model corresponding to the one or more entity labels…via one or more links in the entities… (Abdollahpour, pg. 696 col. 1-2, "In this paper the CIFAR-10 data set is used. We use WordNet to generate the semantic hierarchy for this data set. The resulting sub-graph contains 18 nodes shown in Fig. 1. We associate a linear SVM to each semantic node [select, from the corresponding recognition models,] which is responsible for discriminating among direct children of the current node. At each node of the semantic taxonomy, We use linear SVM in a one-vs-all manner over the sub-concepts of the current node [at least one recognition model corresponding to the one or more entity labels…]. At each node N, ρ numbers of linear SVM classifiers are trained where ρ is the number of sub concepts of the current node… Image classification is performed as illustrated in Algorithm 1. Starting from the root node and using the decision functions of linear SVM classifier, subsequent level nodes are evaluated. The nodes with higher confidence value are recursively explored until reaching leaf node."; recursively exploring through leaf nodes is interpreted as one or more links in the entities, as leaf nodes are connected to the root node (i.e. via one or more links in the entities…)).

Ambardekar and Abdollahpour are both in the same field of endeavor (i.e.
classification/recognition). It would have been obvious for a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Ambardekar and Abdollahpour to teach the above limitation(s). The motivation for doing so is that the combination improves recognition accuracy (cf. Abdollahpour, pg. 694 col. 1, "A modified iterative approach for clustering of local features to visual words is proposed. Then we design a set of binary classifiers in a hierarchical structure corresponding to semantic taxonomic relations between classes. Experimental results show that the proposed method improves the classification accuracy by an acceptable amount.").

While Ambardekar in view of Abdollahpour teaches using a knowledge database to retrieve models for classification tasks, the combination does not explicitly teach:

…wherein each recognition model of the corresponding recognition models includes at least one parameter indicating computing resources to be used to process the data set using the recognition model; compare the at least one parameter to resources available to the one or more processors to determine that the available resources are insufficient to execute the at least one recognition model; obtain access to additional resources sufficient to execute the at least one recognition model; …and the additional resources based on the indicated computing resources…

Khare teaches:

…wherein each recognition model of the corresponding recognition models includes at least one parameter indicating computing resources to be used to process the data set using the recognition model; (Khare, ⁋45, "FIG. 3 illustrates embodiments of formats of listings. An algorithm listing format 301 includes one or more of: a category:subcategory (subcategories) of the algorithm, an API definition (input/output format), suggested resource requirements to train the algorithm [includes at least one parameter indicating computing resources to be used to process the data set using the recognition model;], relative usage of the algorithm in the category:subcategory (subcategories), and a storage location of the algorithm (so that it can be hosted/used by the requester)."; Figure 3 shows the parameters that are given to each model (i.e. …wherein each recognition model of the corresponding recognition models)).

compare the at least one parameter to resources available to the one or more processors to determine that the available resources are insufficient to execute the at least one recognition model; obtain access to additional resources sufficient to execute the at least one recognition model; (Khare, ⁋81, "in those cases, resources are allocated for the pipeline if it is ready for execution (for example, contains only models). For example, a request to perform one or more of these acts is received by the model/algorithm/data API frontend 109 which calls on the publishing/listing agent 125 to provide necessary information to an execution service 111 (such as a location of the selected algorithm, model, data, pipeline, and/or notebook in algorithm/model/data store 123) which then allocates execution resources 113 including compute resources 117 and storage 115."; allocating resources once the model is ready to be executed is interpreted as determining that there were insufficient resources available to execute the model and therefore resources were found and allocated to execute the model (i.e. to determine that the available resources are insufficient to execute the at least one recognition model;); and Khare, ⁋82, "For example, execution service 111 causes execution of a pipeline have a selected model, trains a selected algorithm using (selected) training data, etc. In some embodiments, different resources are allocated for different stages of the pipeline. These different resources are selected based on the information of the listing (such as suggested resource requirements, latency, etc.) [compare the at least one parameter to resources available to the one or more processors]. Further, in some embodiments, different resources are selected based on what hardware resources are available to the requesting user [obtain access to additional resources sufficient to execute the at least one recognition model;].").

…and the additional resources based on the indicated computing resources… (Khare, ⁋82, "Once the resources have been allocated the selected algorithm, model, data, notebook, and/or pipeline (and pipeline being used as needed) is trained or executed as desired at 813 using the allocated resources […and the additional resources based on the indicated computing resources…].").

Ambardekar, in view of Abdollahpour, and Khare are in the same field of endeavor (i.e. machine learning model databases). It would have been obvious for a person having ordinary skill in the art before the effective filing date of the claimed invention to combine Ambardekar, in view of Abdollahpour, with Khare to teach the above limitation(s). The motivation for doing so is that including the required computing requirements of each model improves the selection of the optimal model for the given task/query (cf. Khare, see ⁋35-36).

Regarding claim 2 and analogous claims 23 and 31, Ambardekar in view of Abdollahpour and Khare teaches the device of claim 1. Abdollahpour further teaches:

wherein the one or more processors execute further instructions to: receive the corresponding recognition models, each recognition model including at least one annotation; (Abdollahpour, pg. 695 col. 2, "For hierarchical image classification, we use the semantic relations between classes based on WordNet hierarchy and then assign a set of linear Support Vector Machines (SVM) to each semantic node.
Each semantic node is responsible for discriminating among its immediate object subcategories.”; each category/subcategory is interpreted as an annotation (i.e. wherein the one or more processors execute further instructions to: receive the corresponding recognition models, each recognition model including at least one annotation;)). and for each recognition model: identify at least one entity of the knowledge database that corresponds to the recognition model, based on the at least one annotation of the recognition model; (Abdollahpour, pg. 695 col. 2, “For hierarchical image classification, we use the semantic relations between classes based on WordNet hierarchy and then assign a set of linear Support Vector Machines (SVM) to each semantic node. Each semantic node is responsible for discriminating among its immediate object subcategories.”; each category is interpreted as an annotation, thus there is a SVM for each subclass (i.e. and for each recognition model: identify at least one entity of the knowledge database that corresponds to the recognition model, based on the at least one annotation of the recognition model;)). and link the recognition model to the at least one entity in the knowledge database. (Abdollahpour, pg. 696 col. 1, “At each node of the semantic taxonomy, We use linear SVM in a one-vs-all manner over the sub-concepts of the current node. At each node N, ρ numbers of linear SVM classifiers are trained where p is the number of sub concepts of the current node; one-vs-all for the sub-concepts is interpreted as linking the recognition model to one entity as there is a SVM created for each sub-concept (i.e. and link the recognition model to the at least one entity in the knowledge database.). The training data of each node is recursively obtained from the leaf nodes. Therefore, for each target node, samples of all images associated with all of its child leaf nodes are taken as positive samples. 
So, if an image is classified as ‘dog’ it will also serve to train the classifiers for ‘carnivore’, ‘mammal’, etc. Negative samples are all images of sibling nodes (see Fig. 2).”).

Regarding claim 3 and analogous claim 24, Ambardekar in view of Abdollahpour and Khare teaches the device of claim 1. Khare further teaches:

wherein: each recognition model of the corresponding recognition models includes at least one parameter indicating computing resources to be used to process the data set using the recognition model; (Khare, ¶ 45 and Figure 3; Figure 3 shows the parameters that are given to each model (i.e. wherein: each recognition model of the corresponding recognition models): “FIG. 3 illustrates embodiments of formats of listings. An algorithm listing format 301 includes one or more of: a category:subcategory (subcategories) of the algorithm, an API definition (input/output format), suggested resource requirements to train the algorithm [includes at least one parameter indicating computing resources to be used to process the data set using the recognition model;], relative usage of the algorithm in the category:subcategory (subcategories), and a storage location of the algorithm (so that it can be hosted/used by the requester).”).

and the at least one recognition model indicating computing resources that are compatible with resources available to the one or more processors. (Khare, ¶ 82, “For example, execution service 111 causes execution of a pipeline have a selected model, trains a selected algorithm using (selected) training data, etc. In some embodiments, different resources are allocated for different stages of the pipeline. These different resources are selected based on the information of the listing (such as suggested resource requirements [and the at least one recognition model indicating computing resources that are compatible with resources available to the one or more processors.], latency, etc.).”).
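A rough sketch of the resource-management flow that the claim recites and that the examiner maps onto Khare: compare a model's declared resource parameter against locally available resources, obtain additional resources when they fall short, then execute. All names and figures below are hypothetical and come from none of the cited references.

```python
from dataclasses import dataclass

@dataclass
class RecognitionModel:
    name: str
    required_memory_mb: int  # "at least one parameter indicating computing resources"

def process_with_resource_check(model, data_set, available_mb, acquire_extra):
    """Execute the model, obtaining additional resources if local ones are insufficient."""
    if model.required_memory_mb > available_mb:
        # available resources insufficient -> obtain access to additional resources
        available_mb += acquire_extra(model.required_memory_mb - available_mb)
    # stand-in for actually running the recognition model over the data set
    return [f"{model.name} processed {item}" for item in data_set]

# usage: a model needing 2048 MB on a host with only 512 MB free;
# the acquire_extra callback stands in for, e.g., a cloud resource pool
model = RecognitionModel("vehicle-detector", 2048)
results = process_with_resource_check(
    model, ["img1", "img2"], available_mb=512,
    acquire_extra=lambda deficit_mb: deficit_mb,
)
print(results)  # -> ['vehicle-detector processed img1', 'vehicle-detector processed img2']
```

The callback keeps the sketch agnostic about where the additional resources come from, which mirrors the breadth of the claim language rather than any one reference's architecture.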
Regarding claim 4 and analogous claim 25, Ambardekar in view of Abdollahpour and Khare teaches the device of claim 1. Ambardekar further teaches wherein the one or more processors execute the instructions to obtain additional resources from a network-connected service before selecting the at least one recognition model. (Ambardekar, ¶ 44, “Alternatively to or in conjunction with step 400, new, popular, promoted, or otherwise selected image recognition models may be delivered to the computing device on a basis other than case-by-case. For example, new or improved image recognition models may be packaged as updates for the image recognition program and the user may be prompted to download the updates at regular intervals, the image recognition program may be configured to download the updates automatically, or the user may be able to choose among model packages based on what appeals to him [wherein the one or more processors execute the instructions to obtain additional resources from a network-connected service before selecting the at least one recognition model.]. Updating the image recognition models may include updating, replacing, or adding to the algorithms included with the image recognition models, for example. Image recognition models may also be downloaded or updated on the device based on secondary signals such as GPS data, SSIDs associated with known geographic locations, ambient noise, etc.”).

Regarding claim 5 and analogous claims 26 and 32, Ambardekar in view of Abdollahpour and Khare teaches the device of claim 1. Abdollahpour further teaches:

wherein: the knowledge database is an entity knowledge graph database; (Abdollahpour, pg. 695 col. 2, “WordNet is a lexical database for the English language. It groups English words into sets of synonyms called synsets; WordNet is interpreted as an entity knowledge graph database as it has nodes that represent entities and where relationships to other entities are represented as edges between the entities (i.e.
wherein: the knowledge database is an entity knowledge graph database;). All synsets are connected to other synsets by means of semantic relations like hypernyme/hyponymy or is-a relation (like the relation between vehicle and airplane) and meronym/holonymy or part-of relation (like the relation between building and window).”).

entities of the entity knowledge graph database are ontologically coupled nodes of the entity knowledge graph database such that a recognition model which is directly linked to one node in the entity knowledge graph database is linked to all nodes in the entity knowledge graph database that are ontologically coupled to the one node; (Abdollahpour, pg. 695-696, “For hierarchical image classification, we use the semantic relations between classes based on WordNet hierarchy and then assign a set of linear Support Vector Machines (SVM) to each semantic node. Each semantic node is responsible for discriminating among its immediate object subcategories. WordNet is a lexical database for the English language. It groups English words into sets of synonyms called synsets. All synsets are connected to other synsets by means of semantic relations like hypernyme/hyponymy or is-a relation [entities of the entity knowledge graph database are ontologically coupled nodes of the entity knowledge graph database] (like the relation between vehicle and airplane) and meronym/holonymy or part-of relation (like the relation between building and window). In this paper the CIFAR-10 data set is used. We use WordNet to generate the semantic hierarchy for this data set. The resulting sub-graph contains 18 nodes shown in Fig. 1.
We associate a linear SVM to each semantic node which is responsible for discriminating among direct children of the current node.” [such that a recognition model which is directly linked to one node in the entity knowledge graph database is linked to all nodes in the entity knowledge graph database that are ontologically coupled to the one node;]).

the entity knowledge graph database includes a node corresponding to the at least one target object and the node corresponding to the at least one target object is not directly linked to any recognition model; (Abdollahpour, pg. 696, Figure 2; in Figure 2, the even-toed ungulate node is interpreted as a node related to a carnivore because they are both mammals but they are not directly linked to one another (i.e. the entity knowledge graph database includes a node corresponding to the at least one target object and the node corresponding to the at least one target object is not directly linked to any recognition model;)).

and the one or more processors execute the instructions to select, as the at least one recognition model, one or more recognition models directly linked to at least one node in the entity knowledge graph database that is ontologically coupled to the node corresponding to the at least one target object. (Abdollahpour, pg. 696 col. 1, “At each node of the semantic taxonomy, We use linear SVM in a one-vs-all manner over the sub-concepts of the current node. At each node N, ρ numbers of linear SVM classifiers are trained where p is the number of sub concepts of the current node [and the one or more processors execute the instructions to select, as the at least one recognition model,]. The training data of each node is recursively obtained from the leaf nodes. Therefore, for each target node, samples of all images associated with all of its child leaf nodes are taken as positive samples.
So, if an image is classified as ‘dog’ it will also serve to train the classifiers for ‘carnivore’, ‘mammal’, etc. Negative samples are all images of sibling nodes; sibling nodes are interpreted as being ontologically coupled to the target node as they share the same parent (i.e. one or more recognition models directly linked to at least one node in the entity knowledge graph database that is ontologically coupled to the node corresponding to the at least one target object.) (see Fig. 2).”).

Regarding claim 6 and analogous claim 27, Ambardekar in view of Abdollahpour and Khare teaches the device of claim 5. Abdollahpour further teaches:

wherein the one or more processors execute the instructions to: select, as the one or more recognition models associated with the ontologically coupled nodes, a plurality of recognition models linked to a respective plurality of nodes ontologically coupled to the node corresponding to the at least one target object; (Abdollahpour, pg. 696 col. 1, “At each node of the semantic taxonomy, We use linear SVM in a one-vs-all manner over the sub-concepts of the current node. At each node N, ρ numbers of linear SVM classifiers are trained where p is the number of sub concepts of the current node; creating a classifier for each sub concept is interpreted as a plurality of recognition models that are connected to nodes that are ontologically connected to a target node (i.e. wherein the one or more processors execute the instructions to: select, as the one or more recognition models associated with the ontologically coupled nodes, a plurality of recognition models linked to a respective plurality of nodes ontologically coupled to the node corresponding to the at least one target object;). The training data of each node is recursively obtained from the leaf nodes. Therefore, for each target node, samples of all images associated with all of its child leaf nodes are taken as positive samples.
So, if an image is classified as ‘dog’ it will also serve to train the classifiers for ‘carnivore’, ‘mammal’, etc. Negative samples are all images of sibling nodes (see Fig. 2).”).

process the data set using the selected plurality of recognition models; and combine results of processing the selected plurality of recognition models to provide the indication of whether the data set includes the at least one target object. (Abdollahpour, pg. 696 col. 1-2, “Image classification is performed as illustrated in Algorithm 1. Starting from the root node and using the decision functions of linear SVM classifier, subsequent level nodes are evaluated. The nodes with higher confidence value are recursively explored until reaching leaf node.”; recursively exploring the confidence values of the classifiers until the target node is found is interpreted as combining the results of a plurality of recognition models because results from parent level classifiers are processed before reaching the child/target node (i.e. process the data set using the selected plurality of recognition models; and combine results of processing the selected plurality of recognition models to provide the indication of whether the data set includes the at least one target object.)).

Regarding claim 7 and analogous claim 28, Ambardekar in view of Abdollahpour and Khare teaches the device of claim 1. Abdollahpour further teaches:

wherein: the knowledge database is a graph database; (Abdollahpour, pg. 695 col. 2, “WordNet is a lexical database for the English language. It groups English words into sets of synonyms called synsets; WordNet is interpreted as a graph database as it has nodes that represent entities and where relationships to other entities are represented as edges between the entities (i.e. wherein: the knowledge database is a graph database;).
All synsets are connected to other synsets by means of semantic relations like hypernyme/hyponymy or is-a relation (like the relation between vehicle and airplane) and meronym/holonymy or part-of relation (like the relation between building and window).”).

and the graph database includes a plurality of ontologically organized nodes corresponding to the at least one target object at different levels of generality (Abdollahpour, pg. 696, Figure 1; in Figure 1, the hierarchical structure is interpreted as a plurality of ontologically organized nodes at different levels because there are classes and subclasses like animal, mammal, and carnivore which show the different levels of generality (i.e. and the graph database includes a plurality of ontologically organized nodes corresponding to the at least one target object at different levels of generality)).

and the one or more processors execute the instructions to: select, as the at least one recognition model, a plurality of recognition models associated with the plurality of nodes corresponding to the different levels of generality of the at least one target object; (Abdollahpour, pg. 696 col. 1-2, “At each node of the semantic taxonomy, We use linear SVM in a one-vs-all manner over the sub-concepts of the current node. At each node N, ρ numbers of linear SVM classifiers are trained where p is the number of sub concepts of the current node [and the one or more processors execute the instructions to: select, as the at least one recognition model,]. The training data of each node is recursively obtained from the leaf nodes. Therefore, for each target node, samples of all images associated with all of its child leaf nodes are taken as positive samples. So, if an image is classified as ‘dog’ it will also serve to train the classifiers for ‘carnivore’, ‘mammal’, etc. Negative samples are all images of sibling nodes (see Fig. 2).
Image classification is performed as illustrated in Algorithm 1. Starting from the root node and using the decision functions of linear SVM classifier, subsequent level nodes are evaluated [a plurality of recognition models associated with the plurality of nodes corresponding to the different levels of generality of the at least one target object;]. The nodes with higher confidence value are recursively explored until reaching leaf node.”).

process the data set using the selected plurality of recognition models; and combine results of the processing of the selected plurality of recognition models to provide the indication of whether the data set includes the at least one target object. (Abdollahpour, pg. 696 col. 1-2, “Image classification is performed as illustrated in Algorithm 1. Starting from the root node and using the decision functions of linear SVM classifier, subsequent level nodes are evaluated. The nodes with higher confidence value are recursively explored until reaching leaf node.”; recursively exploring the confidence values of the classifiers until the target node is found is interpreted as combining the results of a plurality of recognition models because results from parent level classifiers are processed before reaching the child/target node (i.e. process the data set using the selected plurality of recognition models; and combine results of processing the selected plurality of recognition models to provide the indication of whether the data set includes the at least one target object.)).

Claims 21, 29, and 33 are rejected under 35 U.S.C. 103 as being unpatentable over Ambardekar, et al., US Pre-Grant Publication 2015/0213058A1 (“Ambardekar”) in view of Abdollahpour, et al., Non-Patent Literature “Image Classification Using Ontology Based Improved Visual Words” (“Abdollahpour”) and further in view of Khare, et al., US Pre-Grant Publication 2019/0278640A1 (“Khare”) and Subbian, et al., US Pre-Grant Publication 2019/0114362A1 (“Subbian”).
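The top-down traversal quoted above from Abdollahpour's Algorithm 1 (start at the root, score each child node with the current node's classifiers, and recursively descend into the most confident child until a leaf is reached) can be sketched roughly as follows. The toy taxonomy and scoring function are illustrative stand-ins, not Abdollahpour's WordNet sub-graph or trained SVMs.

```python
# Toy is-a hierarchy standing in for the WordNet-derived taxonomy;
# nodes absent from the dict are leaves.
TAXONOMY = {
    "entity": ["animal", "vehicle"],
    "animal": ["mammal", "bird"],
    "mammal": ["dog", "cat"],
}

def classify(image, score):
    """score(image, node) -> confidence that `image` falls under `node`.

    Descend from the root, always following the highest-confidence child,
    until a leaf label is reached (cf. Abdollahpour's Algorithm 1).
    """
    node = "entity"
    while node in TAXONOMY:
        node = max(TAXONOMY[node], key=lambda child: score(image, child))
    return node

# usage with a toy scorer that favors the animal -> mammal -> dog branch
toy_score = lambda image, node: {"animal": 0.9, "mammal": 0.8, "dog": 0.7}.get(node, 0.1)
print(classify("photo_of_dog", toy_score))  # -> dog
```

Because every intermediate node's classifier contributes a decision before the leaf is reached, the traversal illustrates why the examiner reads it as combining results from a plurality of recognition models.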
Regarding claim 21 and analogous claims 29 and 33, Ambardekar in view of Abdollahpour and Khare teaches the device of claim 1. While Ambardekar in view of Abdollahpour and Khare teaches wherein the one or more processors associate the query with one or more corresponding entity labels in a knowledge database by: attempting to match a word describing the target object in the query with an entity label; in claim 1, the combination does not explicitly teach and in response to no matching word and entity label: generating data set model embeddings; obtaining entity model embeddings; and performing a similarity analysis of the data set model embeddings to the entity model embeddings to associate the query with one or more entity labels.

Subbian teaches and in response to no matching word and entity label: generating data set model embeddings; obtaining entity model embeddings; and performing a similarity analysis of the data set model embeddings to the entity model embeddings to associate the query with one or more entity labels. (Subbian, ¶ 5, “Not all entities may comprise enough content. A page may have only few words, which may not be enough to properly represent the page. The social-networking system may use entity embedding inference techniques to generate an embedding for the page [and in response to no matching word and entity label:].” and Subbian, ¶ 10, “The social-networking system may generate, using the entity embedding model, a query embedding using the generated term embeddings corresponding to the search query [generating data set model embeddings;]. The social-networking system may identify one or more entities in the online social networking matching the one or more n-grams of the search query. The social-networking system may retrieve a plurality of entity embeddings corresponding to a plurality of entities [obtaining entity model embeddings;], respectively, from one or more production data stores.
Each entity embedding represents the corresponding entity as a point in the d-dimensional embedding space. The social-networking system may calculate a similarity metric between the query embedding and the entity embedding for each of the retrieved entity embeddings. The similarity metric measures a degree of similarity of the query embedding to the entity embedding [and performing a similarity analysis of the data set model embeddings to the entity model embeddings to associate the query with one or more entity labels.]. The similarity metric may be a cosine similarity. The social-networking system may rank the entities based on their respective calculated similarity metrics.”).

Ambardekar, in view of Abdollahpour and Khare, and Subbian are in the same field of endeavor (i.e., classification). It would have been obvious
Read full office action
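The embedding-similarity fallback that the examiner attributes to Subbian (embed the query, retrieve entity embeddings, and rank entities by a similarity metric such as cosine similarity) can be sketched roughly as follows. The vectors and labels are toy values, not the output of any trained embedding model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_entities(query_emb, entity_embs):
    """Return entity labels sorted by descending similarity to the query embedding."""
    return sorted(entity_embs,
                  key=lambda label: cosine(query_emb, entity_embs[label]),
                  reverse=True)

# usage: a 2-d query embedding that sits closest to the "dog" entity
entities = {"dog": [0.9, 0.1], "car": [0.1, 0.9]}
print(rank_entities([0.8, 0.2], entities))  # -> ['dog', 'car']
```

The ranking step corresponds to the claimed "similarity analysis … to associate the query with one or more entity labels"; everything else (dimensionality, how embeddings are produced) is left out as beyond what the quoted passages establish.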

Prosecution Timeline

Mar 05, 2021
Application Filed
May 01, 2024
Non-Final Rejection — §103
Aug 06, 2024
Response Filed
Oct 22, 2024
Final Rejection — §103
Nov 21, 2024
Applicant Interview (Telephonic)
Nov 21, 2024
Examiner Interview Summary
Jan 02, 2025
Request for Continued Examination
Jan 13, 2025
Response after Non-Final Action
May 30, 2025
Non-Final Rejection — §103
Sep 03, 2025
Response Filed
Dec 03, 2025
Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12488244
APPARATUS AND METHOD FOR DATA GENERATION FOR USER ENGAGEMENT
2y 5m to grant — Granted Dec 02, 2025
Patent 12423576
METHOD AND APPARATUS FOR UPDATING PARAMETER OF MULTI-TASK MODEL, AND STORAGE MEDIUM
2y 5m to grant — Granted Sep 23, 2025
Patent 12361280
METHOD AND DEVICE FOR TRAINING A MACHINE LEARNING ROUTINE FOR CONTROLLING A TECHNICAL SYSTEM
2y 5m to grant — Granted Jul 15, 2025
Patent 12354017
ALIGNING KNOWLEDGE GRAPHS USING SUBGRAPH TYPING
2y 5m to grant — Granted Jul 08, 2025
Patent 12333425
HYBRID GRAPH NEURAL NETWORK
2y 5m to grant — Granted Jun 17, 2025
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 47%
With Interview: 90% (+43.1%)
Median Time to Grant: 3y 9m
PTA Risk: High
Based on 38 resolved cases by this examiner. Grant probability derived from career allow rate.
