Prosecution Insights
Last updated: April 19, 2026
Application No. 17/656,781

COMPUTER VISION FRAMEWORK FOR REAL ESTATE

Non-Final OA — §101, §103
Filed: Mar 28, 2022
Examiner: CAMPEN, KELLY SCAGGS
Art Unit: 3691
Tech Center: 3600 — Transportation & Electronic Commerce
Assignee: Reai Inc.
OA Round: 5 (Non-Final)
Grant Probability: 50% (Moderate)
Expected OA Rounds: 5-6
To Grant: 4y
With Interview: 83%

Examiner Intelligence

Career Allow Rate: 50% (269 granted / 533 resolved; -1.5% vs TC avg)
Interview Lift: +32.2% for resolved cases with interview (strong)
Avg Prosecution: 4y (typical timeline)
Currently Pending: 18
Total Applications: 551 across all art units
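
The headline figures above are simple arithmetic over the raw counts and can be sanity-checked directly. A minimal sketch, assuming the "with interview" figure is the career allow rate plus the interview lift in percentage points (the page does not state its exact formula):

```python
# Sanity-checking the dashboard's headline figures from the raw counts.
# Assumption: "with interview" = career allow rate + lift in percentage
# points; the exact formula used by the page is not shown.

granted, resolved = 269, 533
pending, total_apps = 18, 551

allow_rate = granted / resolved                 # 0.5047 -> shown as "50%"
lift_pp = 32.2                                  # interview lift, percentage points
with_interview = allow_rate * 100 + lift_pp     # ~82.7 -> shown as "83%"

assert resolved + pending == total_apps         # 533 + 18 == 551
print(f"career allow rate: {allow_rate:.1%}")   # 50.5%
print(f"with interview:    {with_interview:.0f}%")
```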

Statute-Specific Performance

§101: 35.0% (-5.0% vs TC avg)
§103: 21.0% (-19.0% vs TC avg)
§102: 15.2% (-24.8% vs TC avg)
§112: 21.6% (-18.4% vs TC avg)
Deltas measured against a Tech Center average estimate • Based on career data from 533 resolved cases
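
Each per-statute delta resolves to the same baseline, suggesting the Tech Center average estimate is a single figure of about 40.0% rather than a per-statute average. A quick check (the single-baseline reading is an inference, not something the page states):

```python
# Each statute's rate minus its "vs TC avg" delta lands on 40.0%,
# consistent with one shared Tech Center baseline (an inference).

rates  = {"101": 35.0, "103": 21.0, "102": 15.2, "112": 21.6}
deltas = {"101": -5.0, "103": -19.0, "102": -24.8, "112": -18.4}

for statute in rates:
    tc_avg = rates[statute] - deltas[statute]   # e.g., 35.0 - (-5.0) = 40.0
    print(f"§{statute}: implied TC avg = {tc_avg:.1f}%")
```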

Office Action

§101 §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims

The following is in response to the amendments and arguments filed with the RCE entered 12/22/2025. Claims 1-2 and 4-20 are pending. Claim 3 has been canceled. Claims 17-20 have been withdrawn as being directed to a non-elected invention.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 12/22/2025 has been entered.

Specification

Applicant's amendments have overcome the previous objections to the abstract.

Claim Rejections - 35 USC § 101

Applicant's amendments and arguments (rem 8-12) overcome the prior rejections under 35 USC 101.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-2, 4, 7-8, 12, and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Ziaeefard et al. (US 11,599,749 B1) in view of Ploegert et al. (US 2021/0200912 A1), further in view of Chang et al. (US 2021/0263962 A1).
Regarding claim 1, Ziaeefard et al. (hereinafter Ziaeefard) discloses: A method implemented by a computing system including at least one processor and at least one memory, the method comprising (See an electronic device 100 suitable for use with one or more implementations of the disclosed methods, where the electronic device 100 comprises a processor 110 and a random access memory 130 Col. 13, lines 32-40, Fig. 1):

receiving, by the computing system, an image and a knowledge graph (See obtaining a given image from a set of images, processing the image to obtain a labeled set of objects… and accessing a knowledge graph Col. 14, lines 40-56, Fig. 2. Examiner interprets the obtaining of the image and the accessing of the knowledge graph as receiving the image and the knowledge graph), wherein the knowledge graph includes a first node, a second node, and a relationship between the first node and the second node (See that the database stores the knowledge graph 235, which is a representation of information in the form of a graph, the graph including a set of nodes connected by a set of edges (i.e., the relationships between the nodes). The knowledge graph 235 has been created based on an ontology defining the types of nodes in the set of nodes, and the type of edge relations. The knowledge graph 235 is stored in the form of triples, where each triple includes a head entity, a tail entity, and a predicate. The head entity corresponds to a given node, the tail entity to another given node, and the predicate corresponds to a relation between the head entity and the tail entity, which corresponds to an edge type in the knowledge graph 235. In one or more embodiments, the knowledge graph 235 comprises or is associated with at least semantic types of relations between entities. Col. 16, lines 16-30, Fig. 2. Examiner interprets the head and tail as the first and second nodes, and the predicate as the relationship between the first and second nodes);

detecting, using an object detection network applied to the image, a plurality of objects present in the image, wherein the plurality of objects includes the object in the room (see Col. 24, lines 1-18; Col. 17, lines 4-17);

extracting a subgraph from the knowledge graph based on the detection of the objects, wherein the subgraph includes the second node of the knowledge graph (see Col. 22, lines 19-32; Col. 17, lines 4-12 and 19-35);

encoding, using the at least one processor applying parameters of a convolutional neural network stored in the at least one memory, the image to obtain an image embedding representing the image (See obtaining a set of labeled objects 426 from the image, each object being associated with the respective region proposal 432, the respective feature vector 434, and the respective label 436…and generating a scene graph 440 of the image, the scene graph encoding spatial features, semantic features, and relational features of the labeled objects. The scene graph generation procedure 320 is performed by a scene graph generation ML model 280 performing encoding Col. 18, lines 15-29, Figs. 3-4. See that the machine learning models may comprise neural networks, including convolutional neural networks Col. 7, line 60-Col. 8, line 25. See that the database 230 (i.e., a memory) stores parameters related to the ML models Col. 16, line 50-Col. 17, line 12, Fig. 2. Examiner interprets the generation of the scene graph as the encoding of the image);

encoding, using the at least one processor applying parameters stored in the at least one memory, the knowledge graph to obtain an object embedding (See a scene graph augmentation procedure 330 obtaining at least a portion of the knowledge graph 235 from the database 230, and embedding, by using a word embedding procedure 340, at least a portion of the obtained knowledge graph to obtain an embedded knowledge graph 440. Col. 19, lines 25-47, Figs. 3-4. See that the database 230 (i.e., a memory) stores parameters related to the ML models Col. 16, line 50-Col. 17, line 12, Fig. 2. Examiner interprets this as the encoding of the knowledge graph);

combining, by the computing system, the image embedding representing the image and the object embedding representing the object to obtain a final representation of the image (See that the scene graph augmentation procedure 330 then concatenates the scene graph 440 (which is an embedded set of triples) with the embedded knowledge graph 450 (represented as an embedded set of additional triples) to obtain an augmented scene graph 460 Col. 19, line 48-Col. 20, line 5, Figs. 3-4. Examiner interprets the concatenation to obtain an augmented scene graph as the combining of the two embeddings to generate a final representation); and

generating, by the computing system, a natural language description based on the final representation of the image (See after generating the augmented scene graph at step 508, the server 220 accesses the word embedding ML model 285 to generate, for a given question 472 and corresponding answer 474, an embedded question 482 and associated answer 383 to obtain a set of embedded question and answer for a given image 412 Col. 24, line 60-Col. 25, line 5, Figs. 5-6. See that natural language processing techniques may be used to provide human-readable explanations for answers Col. 23, lines 1-3, Fig. 4. Examiner interprets this as generating a natural language description).

Ziaeefard does not explicitly disclose: wherein the image is an image of a property, and wherein the knowledge graph is a property knowledge graph, wherein the first node is a first node representing a room of the property, wherein the second node is a second node representing an object located within the room of the property, and wherein the relationship is a relationship representing a location of the object within the room; wherein the encoding of the knowledge graph is performed by applying parameters of a graph neural network, wherein the object embedding is an object embedding representing an object in the room; and wherein the natural language description is a natural language description of the property.

However, Ziaeefard does disclose systems and methods directed to training machine learning models to answer questions regarding images. This process involves the utilization of knowledge graphs and image embeddings (see Ziaeefard Col. 6, line 35-Col. 7, line 5; Col. 14, lines 40-56, Fig. 2). While Ziaeefard does not explicitly disclose that the images and knowledge graphs are related to real estate, one of ordinary skill in the art would understand that the methods of Ziaeefard could apply to images of any kind and contain knowledge graphs on any subject. Nevertheless, additional art has been cited. Ploegert et al. (hereinafter Ploegert) also discloses methods that utilize knowledge graphs to answer questions or queries.
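
For orientation before turning to the secondary references, the pipeline recited in claim 1 (detect objects, extract a matching subgraph from the triples, encode the image and the subgraph separately, concatenate, decode a description) can be sketched end to end. This is a minimal illustration of the claim language only, not Ziaeefard's, Ploegert's, or the applicant's actual implementation; the triples, detector stub, dimensions, and template decoder are all assumptions.

```python
# Illustrative sketch of the pipeline recited in claim 1 -- not any
# cited reference's or the applicant's implementation. Node names,
# shapes, and the detector stub are assumptions for demonstration.

import torch
import torch.nn as nn

# Knowledge graph as (head, predicate, tail) triples, per the Col. 16
# description quoted above: room node, location relation, object node.
TRIPLES = [
    ("kitchen", "contains", "refrigerator"),
    ("kitchen", "contains", "island"),
    ("bedroom", "contains", "bedside lamp"),
]

def extract_subgraph(triples, detected_labels):
    """Keep triples whose object (tail) node matches a detected object."""
    return [t for t in triples if t[2] in detected_labels]

class ImageEncoder(nn.Module):
    """Tiny CNN producing the claimed image embedding."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, dim),
        )
    def forward(self, img):
        return self.net(img)

class GraphEncoder(nn.Module):
    """One round of mean-neighbor message passing, standing in for the
    claimed graph neural network over the extracted subgraph."""
    def __init__(self, num_nodes, dim=64):
        super().__init__()
        self.emb = nn.Embedding(num_nodes, dim)
        self.lin = nn.Linear(dim, dim)
    def forward(self, adj):                      # adj: (N, N)
        h = self.emb.weight                      # (N, dim)
        deg = adj.sum(1, keepdim=True).clamp(min=1)
        h = torch.relu(self.lin(adj @ h / deg))  # aggregate neighbors
        return h.mean(0)                         # pooled object embedding

# Wire the steps together on toy inputs.
image = torch.randn(1, 3, 224, 224)
detected = {"refrigerator", "island"}            # object-detector stub output
sub = extract_subgraph(TRIPLES, detected)

nodes = sorted({n for h, _, t in sub for n in (h, t)})
idx = {n: i for i, n in enumerate(nodes)}
adj = torch.zeros(len(nodes), len(nodes))
for h, _, t in sub:
    adj[idx[h], idx[t]] = adj[idx[t], idx[h]] = 1.0

img_emb = ImageEncoder()(image)                          # (1, 64)
obj_emb = GraphEncoder(len(nodes))(adj).unsqueeze(0)     # (1, 64)
final = torch.cat([img_emb, obj_emb], dim=-1)            # combined representation

# Stub description head; a real system would decode text (RNN/transformer).
logits = nn.Linear(final.shape[-1], 3)(final)
templates = ["kitchen with an island and a refrigerator",
             "bedroom with a bedside lamp",
             "empty room"]
print(templates[logits.argmax().item()])
```

The extract_subgraph step, which keeps only triples whose tail matches a detected label, is one plausible reading of the "detection based extraction" disputed in the Response to Arguments below.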
Ploegert, on the other hand, teaches: wherein the knowledge graph is a property knowledge graph (See a building system of a building including a building graph representing information related to the building (i.e., real estate) [0044]), wherein the first node is a first node representing a room of the property, wherein the second node is a second node representing an object located within the room of the real estate property, and wherein the relationship is a relationship representing a location of the object within the room (See FIG. 13 depicting a knowledge graph. In the knowledge graph, see specifically nodes representing a building 1304 (building 120), a floor 1306 (floor 2) and a room 1308 (room 2023), with objects inside the room, including a light 1316 and a bedside lamp 1314. The edges, e.g., 1362, 1366, etc., all depict the objects (e.g., the light and the lamp) within the room [0327-0328], Fig. 13); wherein the object embedding is an object embedding representing an object in the room (See FIG. 13 depicting a knowledge graph. In the knowledge graph, see specifically nodes representing a building 1304 (building 120), a floor 1306 (floor 2) and a room 1308 (room 2023), with objects inside the room, including a light 1316 and a bedside lamp 1314. The edges, e.g., 1362, 1366, etc., all depict the objects (e.g., the light and the lamp) within the room [0327-0328], Fig. 13); and wherein the natural language description is a natural language description of the property (See the cloud platform 106 can provide a user and/or service an indication of the portion of the graph in response to determining that the policy indicates that the user and/or service has access to view the portion of the graph. The cloud platform 106 can cause a display device of the user device 176 to display the indication of the portion of the graph in some embodiments. In step 2008, the cloud platform 106 can receive a command for a piece of equipment. The command may be a command to operate the piece of equipment, in some embodiments. In some embodiments, the command is a command to perform an action on behalf of a user, e.g., send an email to a user, schedule a meeting with the user, etc. (i.e., a natural language description of the real estate property) [0371]).

Ziaeefard and Ploegert both disclose systems and methods directed to knowledge graphs. These two references are in the same field of endeavor and are therefore analogous. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system of Ziaeefard with the teachings of Ploegert. The claimed invention is merely a combination of existing elements, and in combination, each element would have functioned the same as it would have separately. One of ordinary skill in the art would have recognized that the results were repeatable and would have allowed for automatic validating within the schema and ontology of the knowledge graphs (see Ploegert [0280]).

The combination of Ziaeefard and Ploegert does not explicitly teach: wherein the image is an image of a property, and applying parameters of a graph neural network stored in the at least one memory.

Chang et al. (hereinafter Chang) also discloses systems and methods directed to the application of a component graph (i.e., a knowledge graph) (see Chang [0031]). Chang, on the other hand, teaches: wherein the image is an image of a property (See a graphical user interface including an image 704 of a room with various pieces of furniture when a user provides a query string of "chair" [0143], Figs. 7A-D), and wherein the encoding of the knowledge graph is performed by applying parameters of a graph neural network (See that a neural network may include a graph neural network [0060]).

Ziaeefard, Ploegert, and Chang all disclose systems and methods directed to knowledge graphs. These references are in the same field of endeavor and are therefore analogous. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system of Ziaeefard with the teachings of Chang. The claimed invention is merely a combination of existing elements, and in combination, each element would have functioned the same as it would have separately. One of ordinary skill in the art would have recognized that the results were repeatable and would have allowed for improved object recognition in natural language object selection requests (see Chang [0001-0007]).

Regarding claim 2, the combination of Ziaeefard, Ploegert, and Chang teaches the method of claim 1. Ziaeefard does not explicitly disclose: identifying a plurality of room types and a plurality of objects within the room types, wherein the nodes of the knowledge graph include the plurality of room types and the plurality of objects within the room types.

Ploegert, on the other hand, teaches: identifying a plurality of room types and a plurality of objects within the room types, wherein the nodes of the knowledge graph include the plurality of room types and the plurality of objects within the room types (See FIG. 13 depicting a knowledge graph. In the knowledge graph, see specifically nodes representing a building 1304 (building 120), a floor 1306 (floor 2) and a room 1308 (room 2023), with objects inside the room, including a light 1316 and a bedside lamp 1314. The edges, e.g., 1362, 1366, etc., all depict the objects (e.g., the light and the lamp) within the room [0327-0328], Fig. 13. See further identifying one or more entities and/or one or more relationships of a graph related to the event. The entities could be an indication of a location of the event, e.g., what room, what floor, what building, etc. (i.e., a plurality of rooms) [0348], Fig. 16).

Ziaeefard and Ploegert both disclose systems and methods directed to knowledge graphs. These two references are in the same field of endeavor and are therefore analogous. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system of Ziaeefard with the teachings of Ploegert. The claimed invention is merely a combination of existing elements, and in combination, each element would have functioned the same as it would have separately. One of ordinary skill in the art would have recognized that the results were repeatable and would have allowed for automatic validating within the schema and ontology of the knowledge graphs (see Ploegert [0280]).

Regarding claim 4, the combination of Ziaeefard, Ploegert, and Chang teaches the method of claim 1.
Ziaeefard further discloses: wherein encoding the image further comprises: performing a convolution operation on the image to obtain a convolution representation, wherein the image embedding is based on the convolution representation (See obtaining a set of labeled objects 426 from the image, each object being associated with the respective region proposal 432, the respective feature vector 434, and the respective label 436…and generating a scene graph 440 of the image, the scene graph encoding spatial features, semantic features, and relational features of the labeled objects. The scene graph generation procedure 320 is performed by a scene graph generation ML model 280 performing encoding Col. 18, lines 15-29, Figs. 3-4. See that the machine learning models may comprise neural networks, including convolutional neural networks, which use convolution in place of matrix multiplication (i.e., performing a convolution operation) Col. 7, line 60-Col. 8, line 25).

Regarding claim 7, the combination of Ziaeefard, Ploegert, and Chang teaches the method of claim 1. Ziaeefard further discloses: applying an RNN, wherein the RNN includes parameters stored in at least one memory (See that machine learning models include a recurrent neural network (RNN) Col. 7, line 63-Col. 8, line 37. See that the database 230 (i.e., a memory) stores parameters related to the ML models Col. 16, line 50-Col. 17, line 12, Fig. 2), to the embedded representation of the image to obtain the natural language description (See that machine learning models include a recurrent neural network (RNN) Col. 7, line 63-Col. 8, line 37. See that the database 230 (i.e., a memory) stores parameters related to the ML models Col. 16, line 50-Col. 17, line 12, Fig. 2).

Regarding claim 8, the combination of Ziaeefard, Ploegert, and Chang teaches the method of claim 1. Ziaeefard further discloses: receiving a search query that includes attributes of the real estate property; and retrieving the image (See obtaining a given image from a set of images, processing the image to obtain a labeled set of objects Col. 14, lines 40-56, Fig. 2) based on the search query and the embedded representation of the image (See that the scene graph augmentation procedure 330 then concatenates the scene graph 440 with the embedded knowledge graph 450 to obtain an augmented scene graph 460 Col. 19, line 48-Col. 20, line 5, Figs. 3-4. Examiner interprets the concatenation as the combining of the two embeddings).

Ziaeefard does not explicitly disclose: receiving a search query that includes attributes of the real estate property; and retrieving the image based on the search query and the embedded representation of the image.

Ploegert, on the other hand, teaches: receiving a search query that includes attributes of the real estate property (See receiving a query from a query manager 706, wherein the query is for information of a graph projection (i.e., information included in the knowledge graph) [0296]); and retrieving the image based on the search query and the embedded representation of the image (See retrieving the graph projection (i.e., the knowledge graphs) in response to the query [0296-0297]).

Ziaeefard and Ploegert both disclose systems and methods directed to knowledge graphs. These two references are in the same field of endeavor and are therefore analogous. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system of Ziaeefard with the teachings of Ploegert.
The claimed invention is merely a combination of existing elements, and in combination, each element would have functioned the same as it would have separately. One of ordinary skill in the art would have recognized that the results were repeatable and would have allowed for automatic validating within the schema and ontology of the knowledge graphs (see Ploegert [0280]).

Regarding claim 12, the combination of Ziaeefard, Ploegert, and Chang teaches the method of claim 1. Ziaeefard further discloses: generating a description based on the embedded representation of the image (See after generating the augmented scene graph at step 508, the server 220 accesses the word embedding ML model 285 to generate, for a given question 472 and corresponding answer 474, an embedded question 482 and associated answer 383 to obtain a set of embedded question and answer for a given image 412 Col. 24, line 60-Col. 25, line 5, Figs. 5-6. See that natural language processing techniques may be used to provide human-readable explanations for answers Col. 23, lines 1-3, Fig. 4).

Ziaeefard does not explicitly disclose: wherein the description is a description of a maintenance condition of the real estate property.

Ploegert, on the other hand, teaches: wherein the description is a description of a maintenance condition of the real estate property (See performing energy and space optimization, predictive maintenance, and/or remote operations of a building using a building data platform [0235]. See viewing the data associated with HVAC systems (e.g., the maintenance condition) [0272]) based on the embedded representation of the image.

Ziaeefard and Ploegert both disclose systems and methods directed to knowledge graphs. These two references are in the same field of endeavor and are therefore analogous. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system of Ziaeefard with the teachings of Ploegert. The claimed invention is merely a combination of existing elements, and in combination, each element would have functioned the same as it would have separately. One of ordinary skill in the art would have recognized that the results were repeatable and would have allowed for automatic validating within the schema and ontology of the knowledge graphs (see Ploegert [0280]).

Regarding claim 14, Ziaeefard discloses: A method implemented by a computing system including at least one processor and at least one memory, the method comprising (See an electronic device 100 suitable for use with one or more implementations of the disclosed methods, where the electronic device 100 comprises a processor 110 and a random access memory 130 Col. 13, lines 32-40, Fig. 1):

receiving, by the computing system, an image of a property and a knowledge graph (See obtaining a given image from a set of images, processing the image to obtain a labeled set of objects… and accessing a knowledge graph Col. 14, lines 40-56, Fig. 2), wherein the knowledge graph includes a first node representing a room of the property, a second node representing an object in the room, and a relationship between the first node and the second node representing a location of the object within the room (See that the database stores the knowledge graph 235, which is a representation of information in the form of a graph, the graph including a set of nodes connected by a set of edges (i.e., the relationships between the nodes).
The knowledge graph 235 has been created based on an ontology defining the types of nodes in the set of nodes, and the type of edge relations. The knowledge graph 235 is stored in the form of triples, where each triple includes a head entity, a tail entity, and a predicate. The head entity corresponds to a given node, the tail entity to another given node, and the predicate corresponds to a relation between the head entity and the tail entity, which corresponds to an edge type in the knowledge graph 235. In one or more embodiments, the knowledge graph 235 comprises or is associated with at least semantic types of relations between entities. Col. 16, lines 16-30, Fig. 2);

detecting, using an object detection network applied to the image, a plurality of objects present in the image, wherein the plurality of objects includes the object in the room (see Col. 24, lines 1-18; Col. 17, lines 4-17);

extracting a subgraph from the knowledge graph based on the detection of objects, wherein the subgraph includes the second node of the knowledge graph (see Col. 22, lines 19-32; Col. 17, lines 4-12 and 19-35);

encoding, using the at least one processor applying parameters of a convolutional neural network stored in the at least one memory, the image to obtain an image embedding representing the image (See obtaining a set of labeled objects 426 from the image, each object being associated with the respective region proposal 432, the respective feature vector 434, and the respective label 436…and generating a scene graph 440 of the image, the scene graph encoding spatial features, semantic features, and relational features of the labeled objects. The scene graph generation procedure 320 is performed by a scene graph generation ML model 280 performing encoding Col. 18, lines 15-29, Figs. 3-4. See that the machine learning models may comprise neural networks, including convolutional neural networks Col. 7, line 60-Col. 8, line 25. See that the database 230 (i.e., a memory) stores parameters related to the ML models Col. 16, line 50-Col. 17, line 12, Fig. 2);

encoding, using the at least one processor applying parameters stored in the at least one memory, the knowledge graph to obtain an object embedding (See a scene graph augmentation procedure 330 obtaining at least a portion of the knowledge graph 235 from the database 230, and embedding, by using a word embedding procedure 340, at least a portion of the obtained knowledge graph to obtain an embedded knowledge graph 440. Col. 19, lines 25-47, Figs. 3-4. See that the database 230 (i.e., a memory) stores parameters related to the ML models Col. 16, line 50-Col. 17, line 12, Fig. 2);

combining, by the computing system, the image embedding representing the image and the object embedding to obtain a final representation of the image (See that the scene graph augmentation procedure 330 then concatenates the scene graph 440 with the embedded knowledge graph 450 to obtain an augmented scene graph 460 Col. 19, line 48-Col. 20, line 5, Figs. 3-4. Examiner interprets the concatenation as the combining of the two embeddings);

receiving, by the computing system, a search query that includes attributes of the real estate property; and information that is the image (See obtaining a given image from a set of images, processing the image to obtain a labeled set of objects Col. 14, lines 40-56, Fig. 2) and a representation that is the final representation of the image (See that the scene graph augmentation procedure 330 then concatenates the scene graph 440 with the embedded knowledge graph 450 to obtain an augmented scene graph 460 Col. 19, line 48-Col. 20, line 5, Figs. 3-4. Examiner interprets the concatenation as the combining of the two embeddings).

Ziaeefard does not explicitly disclose: wherein the image is an image of a property, wherein the knowledge graph is a property knowledge graph, wherein the first node is a first node representing a room of the property, the second node is a second node representing an object in the room, and the relationship is a relationship representing a location of the object within the room; applying parameters of a graph neural network, wherein the object embedding is an object embedding representing the object in the room; receiving, by the computing system, a search query that includes attributes of the real estate property; and retrieving, by the computing system, information based on the search query and the representation.

However, Ziaeefard does disclose systems and methods directed to training machine learning models to answer questions regarding images. This process involves the utilization of knowledge graphs and image embeddings (see Ziaeefard Col. 6, line 35-Col. 7, line 5; Col. 14, lines 40-56, Fig. 2). While Ziaeefard does not explicitly disclose that the images and knowledge graphs are related to real estate, one of ordinary skill in the art would understand that the methods of Ziaeefard could apply to images of any kind and contain knowledge graphs on any subject. Nevertheless, additional art has been cited. Ploegert also discloses methods that utilize knowledge graphs to answer questions or queries.

Ploegert, on the other hand, teaches: wherein the knowledge graph is a property knowledge graph (See a building system of a building including a building graph representing information related to the building (i.e., real estate) [0044]), wherein the first node is a first node representing a room of the property, the second node is a second node representing an object in the room, and the relationship is a relationship representing a location of the object within the room (See FIG. 13 depicting a knowledge graph. In the knowledge graph, see specifically nodes representing a building 1304 (building 120), a floor 1306 (floor 2) and a room 1308 (room 2023), with objects inside the room, including a light 1316 and a bedside lamp 1314. The edges, e.g., 1362, 1366, etc., all depict the objects (e.g., the light and the lamp) within the room [0327-0328], Fig. 13); wherein the object embedding is an object embedding representing the object in the room (See FIG. 13 depicting a knowledge graph. In the knowledge graph, see specifically nodes representing a building 1304 (building 120), a floor 1306 (floor 2) and a room 1308 (room 2023), with objects inside the room, including a light 1316 and a bedside lamp 1314. The edges, e.g., 1362, 1366, etc., all depict the objects (e.g., the light and the lamp) within the room [0327-0328], Fig. 13); receiving, by the computing system, a search query that includes attributes of the property (See receiving a query from a query manager 706, wherein the query is for information of a graph projection (i.e., information included in the knowledge graph) [0296]); and retrieving, by the computing system, information based on the search query and the representation (See retrieving the graph projection (i.e., the knowledge graphs) in response to the query [0296-0297]).

Ziaeefard and Ploegert both disclose systems and methods directed to knowledge graphs. These two references are in the same field of endeavor and are therefore analogous. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system of Ziaeefard with the teachings of Ploegert. The claimed invention is merely a combination of existing elements, and in combination, each element would have functioned the same as it would have separately. One of ordinary skill in the art would have recognized that the results were repeatable and would have allowed for automatic validating within the schema and ontology of the knowledge graphs (see Ploegert [0280]).

The combination of Ziaeefard and Ploegert does not explicitly teach: wherein the image is an image of a property, and applying parameters of a graph neural network stored in the at least one memory.

Chang also discloses systems and methods directed to the application of a component graph (i.e., a knowledge graph) (see Chang [0031]). Chang, on the other hand, teaches: wherein the image is an image of a property (See a graphical user interface including an image 704 of a room with various pieces of furniture when a user provides a query string of "chair" [0143], Figs. 7A-D), and wherein the encoding of the knowledge graph is performed by applying parameters of a graph neural network (See that a neural network may include a graph neural network [0060]).

Ziaeefard, Ploegert, and Chang all disclose systems and methods directed to knowledge graphs. These references are in the same field of endeavor and are therefore analogous. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system of Ziaeefard with the teachings of Chang. The claimed invention is merely a combination of existing elements, and in combination, each element would have functioned the same as it would have separately. One of ordinary skill in the art would have recognized that the results were repeatable and would have allowed for improved object recognition in natural language object selection requests (see Chang [0001-0007]).

Regarding claim 16, the combination of Ziaeefard, Ploegert, and Chang teaches the method of claim 14. Ziaeefard does not explicitly disclose: wherein: the search query comprises an image, a text description, or both.

Ploegert, on the other hand, teaches: wherein: the search query comprises an image, a text description, or both (See a query such as "what space, build, floor, is that badge scanner in?" (i.e., a text query) [0252]).

Ziaeefard and Ploegert both disclose systems and methods directed to knowledge graphs. These two references are in the same field of endeavor and are therefore analogous. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system of Ziaeefard with the teachings of Ploegert.
The claimed invention is merely a combination of existing elements, and in combination, each element would have functioned the same as it would have separately. One of ordinary skill in the art would have recognized that the results were repeatable and would have allowed for automatic validating within the schema and ontology of the knowledge graphs (see Ploegert [0280]).

Claims 5-6 and 9 are rejected under 35 U.S.C. 103 as being unpatentable over Ziaeefard, Ploegert, and Chang as recited in the above rejection, further in view of Lin et al. (US 2022/0014807 A1).

Regarding claim 5, the combination of Ziaeefard, Ploegert, and Chang teaches the method of claim 1. Ziaeefard further discloses: applying a transformer network, wherein the transformer network includes parameters stored in the at least one memory (See that the database 230 (i.e., a memory) stores parameters related to the ML models Col. 16, line 50-Col. 17, line 12, Fig. 2), to the image to obtain a transformer representation, wherein the embedded representation of the image is based on the transformer representation.

Ziaeefard does not explicitly disclose: applying a transformer network, wherein the transformer network includes parameters stored in the at least one memory, to the image to obtain a transformer representation, wherein the embedded representation of the image is based on the transformer representation.

Lin, on the other hand, teaches: applying a transformer network, wherein the transformer network includes parameters stored in the at least one memory, to the image to obtain a transformer representation, wherein the embedded representation of the image is based on the transformer representation (See in addition to applying a feature extraction network to extract features of each region of an image, the system may further add encoders that learn relationships between features of the regions. The encoder may be implemented by a self-attention-based encoder (e.g., a transformer encoder) and a self-attention-based decoder (e.g., a transformer decoder) [0159]. See using a self-attention-based encoder to encode graph convolution features to perform encoder embeddings [0140]. Examiner interprets the transformer encoder and decoder as the transformer network).

Ziaeefard, Ploegert, Chang, and Lin all disclose systems and methods directed to knowledge graphs. These references are in the same field of endeavor and are therefore analogous. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system of Ziaeefard with the teachings of Lin. The claimed invention is merely a combination of existing elements, and in combination, each element would have functioned the same as it would have separately. One of ordinary skill in the art would have recognized that the results were repeatable and would have optimized image captioning beyond just an encoder-decoder structure (see Lin [0003]).

Regarding claim 6, the combination of Ziaeefard, Ploegert, and Chang teaches the method of claim 1. Ziaeefard further discloses: applying a knowledge transformer network, wherein the knowledge transformer network includes parameters stored in the at least one memory (See that the database 230 (i.e., a memory) stores parameters related to the ML models Col. 16, line 50-Col. 17, line 12, Fig. 2), to the knowledge graph to obtain an embedded representation of the knowledge graph, wherein the embedded representation of the image is based on the embedded knowledge representation.

Ziaeefard does not explicitly disclose: applying a knowledge transformer network, wherein the knowledge transformer network includes parameters stored in the at least one memory, to the extracted subgraph to obtain the object embedding.

Lin, on the other hand, teaches: applying a knowledge transformer network, wherein the knowledge transformer network includes parameters stored in the at least one memory, to the extracted subgraph to obtain the object embedding (See a self-attention-based intra-encoder (e.g., the intra-frame transformer encoder shown in FIG. 17b) can be used to separately perform the encoding operation on the graph convolution features of each frame, where the graph convolution features are encoded from a scene graph. The function of the self-attention-based intra-encoder is to learn the intra-frame information, that is, the self-attention mechanism can be used to further learn the association information between the objects in the frame [0179]. See that an encoder may be implemented by a self-attention-based encoder (e.g., a transformer encoder) and a self-attention-based decoder (e.g., a transformer decoder) [0159]. Examiner interprets the transformer encoder as the knowledge transformer network because the system is encoding the scene graph).

Ziaeefard, Ploegert, Chang, and Lin all disclose systems and methods directed to knowledge graphs. These references are in the same field of endeavor and are therefore analogous. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system of Ziaeefard with the teachings of Lin. The claimed invention is merely a combination of existing elements, and in combination, each element would have functioned the same as it would have separately. One of ordinary skill in the art would have recognized that the results were repeatable and would have optimized image captioning beyond just an encoder-decoder structure (see Lin [0003]).

Regarding claim 9, the combination of Ziaeefard, Ploegert, and Chang teaches the method of claim 1. Ziaeefard further discloses: performing the object detection on the image embedding to obtain an image tag corresponding to an object represented in the knowledge graph (See an object detection procedure 310 that has access to the object detection ML model 275 having been trained to generate a set of labelled objects 426 from an input image 412. The object detection ML model 275 may be a Faster R-CNN (i.e., a convolutional neural network) Col. 17, lines 3-18, Figs. 3-4. See that based on a portion of the scene graph, the server 220 obtains a set of additional objects from the knowledge graph 235 Col. 24, lines 18-29, Fig. 5).

Ziaeefard does not explicitly disclose: performing object detection on the image embedding to obtain an image tag corresponding to an object represented in the knowledge graph.

Lin, on the other hand, teaches: performing object detection on the image embedding (See the obtained relationship features and attribute features of the target region can be encoded into the target dimension of feature vector (i.e., embedded representation), and then a graph convolution network is applied to the encoded feature vectors to learn the relationship between adjacent nodes and edges in the scene graph, so as to obtain the graph convolved features of each node contained in the scene graph (i.e., the graph convolution features) [0101-0102]) to obtain an image tag corresponding to an object represented in the knowledge graph.

Ziaeefard, Ploegert, Chang, and Lin all disclose systems and methods directed to knowledge graphs. These references are in the same field of endeavor and are therefore analogous. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system of Ziaeefard with the teachings of Lin. The claimed invention is merely a combination of existing elements, and in combination, each element would have functioned the same as it would have separately. One of ordinary skill in the art would have recognized that the results were repeatable and would have optimized image captioning beyond just an encoder-decoder structure (see Lin [0003]).

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over previously cited Ziaeefard in view of previously cited Ploegert in view of newly cited Chang in view of previously cited Cervantes et al. (US 2022/0179882 A1).

Regarding claim 10, the combination of Ziaeefard, Ploegert, and Chang teaches the method of claim 1. Ziaeefard does not explicitly disclose: classifying the image based on a set of real estate property types based on the embedded representation of the image.

Cervantes, on the other hand, teaches: classifying the image based on a set of real estate property types based on the embedded representation of the image (See performing classification of a first location entity using machine learning [0064], Figs. 3-4. See that the system receives inputs from different sources, including image data 131 [0047], Fig. 1. See that the knowledge graph contains embeddings of multi-modal data, including the image data 131 [0077], Fig. 1).

Ziaeefard, Ploegert, Chang, and Cervantes all disclose systems and methods directed to knowledge graphs. These four references are in the same field of endeavor and are therefore analogous. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system of Ziaeefard with the teachings of Cervantes. The claimed invention is merely a combination of existing elements, and in combination, each element would have functioned the same as it would have separately. One of ordinary skill in the art would have recognized that the results were repeatable and would have allowed for improved reconciliation of variations between data sources (see Cervantes [0001]).

Claims 11 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Ziaeefard, Ploegert, and Chang, further in view of Bui et al. (US 2019/0080425 A1).

Regarding claim 11, the combination of Ziaeefard, Ploegert, and Chang teaches the method of claim 1. Ziaeefard further discloses: displaying the natural language description (See after generating the augmented scene graph at step 508, the server 220 accesses the word embedding ML model 285 to generate, for a given question 472 and corresponding answer 474, an embedded question 482 and associated answer 383 to obtain a set of embedded question and answer for a given image 412 Col. 24, line 60-Col. 25, line 5, Figs. 5-6.
See that natural language processing techniques may be used to provide human-readable explanations for answers Col. 23, lines 1-3, Fig. 4) of the real estate property on a website.

Ziaeefard does not explicitly disclose: displaying the natural language description of the real estate property on a website.

Bui et al. (hereinafter Bui), on the other hand, teaches: displaying the natural language description of the property on a website (See generating and displaying listing of real estate agents and associated information and/or parameters [0041]).

Ziaeefard, Ploegert, Chang, and Bui disclose systems and methods directed to utilizing artificial intelligence to analyze and classify objects in images. These references are in the same field of endeavor and are therefore analogous. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system of Ziaeefard with the real estate classifications of Bui. The claimed invention is merely a combination of existing elements, and in combination, each element would have functioned the same as it would have separately. One of ordinary skill in the art would have recognized that the results were repeatable and would have allowed users to receive recommended real estate agents and properties in accordance with attributes derived from the machine learning/artificial intelligence modules (see Bui [0058], [0061]).

Regarding claim 13, the combination of Ziaeefard, Ploegert, and Chang teaches the method of claim 1. Ziaeefard further discloses: encoding a user profile of a user to obtain an encoded user profile (See obtaining a set of labeled objects 426 from the image, each object being associated with the respective region proposal 432, the respective feature vector 434, and the respective label 436…and generating a scene graph 440 of the image, the scene graph encoding spatial features, semantic features, and relational features of the labeled objects. The scene graph generation procedure 320 is performed by a scene graph generation ML model 280 performing encoding Col. 18, lines 15-29, Figs. 3-4. See that the machine learning models may comprise neural networks, including convolutional neural networks Col. 7, line 60-Col. 8, line 25. See that the database 230 (i.e., a memory) stores parameters related to the ML models Col. 16, line 50-Col. 17, line 12, Fig. 2); generating a recommendation score based on the embedded representation of the image and the encoded user profile (See that the scene graph augmentation procedure 330 then concatenates the scene graph 440 with the embedded knowledge graph 450 to obtain an augmented scene graph 460 Col. 19, line 48-Col. 20, line 5, Figs. 3-4. Examiner interprets the concatenation as the combining of the two embeddings).

Ziaeefard does not explicitly disclose: encoding a user profile of a user to obtain an encoded user profile; generating a recommendation score based on the embedded representation of the image and the encoded user profile; and recommending the property to the user based on the recommendation score.

Bui, on the other hand, teaches: encoding a user profile of a user to obtain an encoded user profile (See that an account is created and relevant profile information is entered for the user, customer, service requesters, etc. [0031]); generating a recommendation score based on the embedded representation of the image and the encoded user profile (See recommending real estate properties having real estate classification values and neighborhood classification values closest to one or more standard searchable real estate property preferences, one or more user-based preferences, etc. The user-based preferences can be derived based on the user profile [0102]); and recommending the real estate property to the user based on the recommendation score (See recommending real estate properties having real estate classification values and neighborhood classification values closest to one or more standard searchable real estate property preferences, one or more user-based preferences, etc. The user-based preferences can be derived based on the user profile [0102]).

Ziaeefard, Ploegert, Chang, and Bui disclose systems and methods directed to utilizing artificial intelligence to analyze and classify objects in images. These references are in the same field of endeavor and are therefore analogous. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system of Ziaeefard with the real estate classifications of Bui. The claimed invention is merely a combination of existing elements, and in combination, each element would have functioned the same as it would have separately. One of ordinary skill in the art would have recognized that the results were repeatable and would have allowed users to receive recommended real estate agents and properties in accordance with attributes derived from the machine learning/artificial intelligence modules (see Bui [0058], [0061]).

Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Ziaeefard, Ploegert, and Chang as applied above, further in view of Liu et al. (US 2022/0036127 A1).

Regarding claim 15, the combination of Ziaeefard, Ploegert, and Chang teaches the method of claim 14. Ziaeefard does not explicitly disclose: encoding the search query in a same embedding space as the image to obtain an encoded search query; and generating a similarity score between the image and the search query, wherein the image is retrieved based on the similarity score.

Ploegert, on the other hand, teaches: encoding the search query (See receiving a query from a query manager 706, wherein the query is for information of a graph projection (i.e., information included in the knowledge graph) [0296]) in the same embedding space as the image to obtain an encoded search query (See receiving a query from a query manager 706, wherein the query is for information of a graph projection (i.e., information included in the knowledge graph) [0296]); and generating a similarity score between the image and the search query (See receiving a query from a query manager 706, wherein the query is for information of a graph projection (i.e., information included in the knowledge graph) [0296]), wherein the image is retrieved (See retrieving information in response to a query from the query manager 706 [0296-0297], Fig. 7) based on the similarity score.

Ziaeefard and Ploegert both disclose systems and methods directed to knowledge graphs. These two references are in the same field of endeavor and are therefore analogous. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system of Ziaeefard with the teachings of Ploegert.
The claimed invention is merely a combination of existing elements, and in combination, each element would have functioned the same as it would have separately. One of ordinary skill in the art would have recognized that the results were repeatable and would have allowed for automatic validating within the schema and ontology of the knowledge graphs (see Ploegert [0280]).

The combination of Ziaeefard, Ploegert, and Chang does not explicitly teach: encoding the search query in a same embedding space as the image to obtain an encoded search query; and generating a similarity score between the image and the search query, wherein the image is retrieved based on the similarity score.

Liu, on the other hand, teaches: encoding the search query in a same embedding space as the image to obtain an encoded search query (See using an image encoder and a text encoder to embed the image feature map and the textural feature vectors into a visual-semantic joint embedding space and generating a new image feature map that modifies the visual attributes of the image by manipulating the image feature map by the textural feature vectors within the joint embedding space [0006]); and generating a similarity score between the image and the search query, wherein the image is retrieved based on the similarity score (See normalizing the visual feature vector and the textual feature vector to determine a similarity score, where the joint embedding space is trained based on the similarity score [0075-0076]).

Ziaeefard, Ploegert, and Liu all disclose systems and methods directed to analyzing images and text. These three references are in the same field of endeavor. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system of Ziaeefard and Ploegert with the vector spaces and similarity scores of Liu. The claimed invention is merely a combination of existing elements, and in combination, each element would have functioned the same as it would have separately. One of ordinary skill in the art would have recognized that the results were repeatable and would have removed the need for training data when utilizing the semantic network (see Liu [0004]).

Response to Arguments

Applicant's arguments filed 12/22/2025 have been fully considered but they are not persuasive. Applicant's arguments with respect to claims 1-2 and 4-16 have been considered but are moot because of the new ground of rejection necessitated by amendment.

35 U.S.C. 101: Applicant's arguments (see rem 8-12) with respect to the rejection under 35 U.S.C. 101 have been fully considered and are persuasive.

35 U.S.C. 103: Regarding applicant's argument that the cited art does not teach the "claimed detection based extraction," Examiner respectfully disagrees (see rem 13). In response to applicant's argument that the references fail to show certain features of the invention, it is noted that the features upon which applicant relies (i.e., extracting a subgraph from a pre-existing knowledge graph and using detection results to determine which portion of a knowledge graph to extract (rem 13-14)) are not recited in the rejected claim(s). Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).

Ziaeefard is relied upon to teach: detecting, using an object detection network applied to the image, a plurality of objects present in the image, wherein the plurality of objects includes the object in the room (see Col. 24, lines 1-18; Col. 17, lines 4-17); and extracting a subgraph from the knowledge graph based on the detection of objects, wherein the subgraph includes the second node of the knowledge graph (see Col. 22, lines 19-32; Col. 17, lines 4-12 and 19-35).

In response to applicant's arguments against the references individually, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references. See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986). Accordingly, the claims remain rejected under 35 U.S.C. 103.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Li discloses systems and methods utilizing knowledge graphs/scene graphs to caption images. Wang et al. disclose systems and methods directed to matching user profiles to available real estate properties. Akiyama et al. disclose an evaluation unit which manages the images relating to the property together with image evaluation information expressing the object of the image, which is given by using the first evaluation model generated by machine learning. Lee discloses a system for searching a real estate area. Bly discloses a system for organizing, representing, finding, discovering, and accessing data. Tremblay et al. disclose a multi-service business platform.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Kelly Campen, whose telephone number is (571) 272-6740. The examiner can normally be reached Monday-Thursday, 6am-3pm. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Abhishek Vyas, can be reached at 571-270-1836. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/KELLY S. CAMPEN/
Kelly S. Campen
Primary Examiner, Art Unit 3691

Prosecution Timeline

Mar 28, 2022: Application Filed
Mar 08, 2024: Non-Final Rejection — §101, §103
Jun 17, 2024: Response Filed
Sep 25, 2024: Final Rejection — §101, §103
Nov 25, 2024: Response after Non-Final Action
Dec 23, 2024: Request for Continued Examination
Dec 31, 2024: Response after Non-Final Action
Jan 23, 2025: Non-Final Rejection — §101, §103
Apr 30, 2025: Response after Non-Final Action
Apr 30, 2025: Response Filed
Jun 03, 2025: Response Filed
Sep 17, 2025: Final Rejection — §101, §103
Dec 22, 2025: Request for Continued Examination
Jan 12, 2026: Response after Non-Final Action
Jan 24, 2026: Non-Final Rejection — §101, §103
Apr 06, 2026: Interview Requested
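
The round-trip times implied by this timeline can be checked with simple month arithmetic. A small sketch (the duration formatter this page uses is not shown, so the rendering convention is an assumption):

```python
# Elapsed time between timeline milestones, rendered "Ny Mm".
# The page's own duration formatter is not shown; this is an assumption.

from datetime import date

def span(start: date, end: date) -> str:
    months = ((end.year - start.year) * 12
              + (end.month - start.month)
              - (end.day < start.day))       # ignore partial months
    return f"{months // 12}y {months % 12}m"

filed     = date(2022, 3, 28)   # Application Filed
first_oa  = date(2024, 3, 8)    # first Non-Final Rejection
latest_oa = date(2026, 1, 24)   # current Non-Final Rejection

print(span(filed, first_oa))    # 1y 11m  (filing to first office action)
print(span(filed, latest_oa))   # 3y 9m   (filing to the current round)
```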

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12585729
VISUAL REPRESENTATION GENERATION FOR BIAS CORRECTION
2y 5m to grant • Granted Mar 24, 2026
Patent 12518314
METHOD AND SYSTEM FOR INTERACTIVE VIRTUAL CUSTOMIZED VEHICLE DESIGN, PURCHASE, AND FINAL ACQUISITION
2y 5m to grant • Granted Jan 06, 2026
Patent 12217315
SYSTEMS AND METHODS FOR GENERATING CONTEXTUALLY RELEVANT DEVICE PROTECTIONS
2y 5m to grant • Granted Feb 04, 2025
Patent 12190375
PROCESSING SYSTEM TO GENERATE RISK SCORES FOR ELECTRONIC RECORDS
2y 5m to grant • Granted Jan 07, 2025
Patent 12086882
FEE/REBATE CONTINGENT ORDER MATCHING SYSTEM AND METHOD
2y 5m to grant • Granted Sep 10, 2024
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 50%
With Interview (+32.2%): 83%
Median Time to Grant: 4y
PTA Risk: High
Based on 533 resolved cases by this examiner. Grant probability derived from career allow rate.
