Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-6 and 8-16 are rejected under 35 U.S.C. 103 as being unpatentable over Xu et al. (US 20220108546 A1), in view of Iyer et al. (US 20210248409 A1), and further in view of Ramprasaath et al. (NPL: Choose Your Neuron: Incorporating Domain Knowledge through Neuron-Importance).
Xu teaches part of the limitation “A method for operating at least one trained classifier for measurement data, the classifier including a neural network with at least one feature extraction section and at least one classification section,” (paragraph 20 “a detection result of the first object may be determined by comprehensively considering an initial image feature extracted from the first object and semantic information of the person and the road”, paragraph 26 “The graph structure information (or may be referred to as the graph structure) may include a plurality of nodes. Each node corresponds to one object category, there is a specific association relationship between object categories (or may be referred to as object classifications)”, paragraph 95 “In the object detection method provided in the embodiments of this application, input data (for example, pictures in this application) may be input into the trained neural network, to obtain output data (for example, detection results of the pictures in this application)”, and paragraph 110 “Many neural network structures end up with a classifier, for classifying an object in an image. The classifier generally includes a fully connected layer (fully connected layer) and a softmax function, and can output probabilities of different categories based on inputs.” Xu discloses a method of utilizing a trained neural network to obtain output data, wherein the neural network may be a trained classifier. The method extracts an initial image feature from the input data, which corresponds to the claimed feature extraction section, and the neural network produces classification outputs associated with object categories, which correspond to the claimed classification section.)
Xu teaches the limitation “processing a record of measurement data with at least the feature extraction section of the classifier” (paragraph 20 “a detection result of the first object may be determined by comprehensively considering an initial image feature extracted from the first object and semantic information of the person and the road”, and paragraph 106 “The convolutional neural network includes a feature extractor including a convolutional layer and a sub-sampling layer. The feature extractor may be considered as a filter.” Xu discloses that the trained neural network (the classifier) comprises a feature extractor as a layer within the architecture of the neural network, which corresponds to the claimed feature extraction section. The feature extractor is utilized to extract features of an object within an input such as an image, and because the captured image represents data describing characteristics of a scene or object, as understood by a person of ordinary skill in the art, the input image corresponds to the measurement data under the broadest reasonable interpretation.)
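For illustration only, the two-section structure mapped above can be sketched as follows. This is a minimal sketch under assumed layer sizes and an assumed three-channel 32×32 input, not Xu’s implementation:

```python
# Minimal sketch (assumed architecture, not Xu's): a classifier with a
# feature extraction section and a classification section.
import torch
import torch.nn as nn

class Classifier(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Feature extraction section: convolutional layer plus sub-sampling
        # (pooling) layer, per Xu's description of a feature extractor.
        self.feature_extraction = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Flatten(),
        )
        # Classification section: fully connected layer plus softmax,
        # per Xu's paragraph 110.
        self.classification = nn.Sequential(
            nn.Linear(16 * 16 * 16, num_classes),
            nn.Softmax(dim=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = self.feature_extraction(x)  # process the measurement record
        return self.classification(features)   # per-class probabilities

record = torch.rand(1, 3, 32, 32)  # e.g., one camera image as measurement data
scores = Classifier()(record)
```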
Xu teaches the limitation “determining a set of neurons in the feature extraction section that are activated by the processing” (paragraph 95 “In the object detection method provided in the embodiments of this application, input data (for example, pictures in this application) may be input into the trained neural network, to obtain output data (for example, detection results of the pictures in this application).”, and paragraph 99 “The neural network is a network constituted by connecting a plurality of single neurons together. To be specific, an output of a neuron may be an input of another neuron. An input of each neuron may be connected to a local receptive field of a previous layer to extract a feature of the local receptive field. The local receptive field may be a region including several neurons.” Xu discloses that input data may be input into the trained neural network to obtain output data, wherein the neural network comprises a plurality of neurons connected together, and outputs of neurons may serve as inputs to other neurons. Thus, when the input image data is processed by the neural network to extract features, neurons within the feature extraction layers produce outputs corresponding to the extracted features, which corresponds to determining a set of neurons in the feature extraction section that are activated by the processing, as claimed.)
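Continuing the sketch above (the forward hook and the strictly-positive-after-ReLU convention are assumptions for illustration, not Xu’s disclosure), the set of activated neurons can be read out during processing:

```python
# Sketch: capture activations of the feature extraction section with a
# forward hook; post-ReLU outputs above zero mark activated neurons.
activations = {}

def save_activation(name):
    def hook(_module, _inputs, output):
        activations[name] = output.detach()
    return hook

model = Classifier()
model.feature_extraction[1].register_forward_hook(save_activation("relu1"))
_ = model(record)
activated_neurons = (activations["relu1"] > 0).nonzero()  # indices of active units
```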
Xu teaches the limitation “comparing attributes to which classes are linked by a given knowledge graph with the determined set of attributes” (fig.6, paragraph 36 “With reference to the first aspect, in some implementations of the first aspect, the association relationship between different object categories corresponding to different objects in the to-be-detected image includes at least one of the following information: an attribute association relationship between different object categories”, paragraph 39 “The training image may generally include a plurality of to-be-detected objects. The object categories to which different objects in the training image belong may also be referred to as labeled data of the training image, and the labeled data may be (manually) pre-labeled data”, paragraph 41 “The initial knowledge graph information may include the association relationship between different object categories corresponding to different objects in the to-be-detected image”, paragraph 123 “The training device 120 performs object detection on an input training image, and compares an output object detection result with a pre-labeled detection result, until a difference between the object detection result output by the training device 120 and the pre-labeled detection result is less than a specific threshold, thereby completing training of the target model/rule 101” Xu discloses processing an input image to extract image features and utilizing those features during graph-based inference to produce a final classification result. Because the extracted image features represent semantic characteristics of the detected objects in the image, those features represent semantic attributes in the image under the broadest reasonable interpretation and correspond to the attributes as recited by Iyer below, such that the object detection result produced from the feature extraction corresponds to the determined set of attributes, as claimed. Xu then discloses constructing a graph structure based on a knowledge graph that defines relationships between object categories corresponding to objects in an image, including an attribute association relationship that constitutes pre-labeled data or a pre-labeled detection result. Xu further performs a comparison between an object detection result produced from feature extraction of the input image and a pre-labeled detection result associated with the object categories and attribute association relationships from the knowledge graph, which corresponds to comparing attributes to which classes are linked by a knowledge graph with the determined set of attributes from the feature extraction section, as claimed.)
Xu teaches the limitation “evaluating, from a result of the comparison, at least one estimated class as a class to which the scene captured by the record of measurement data is likely to belong” (fig.6, paragraph 201 “Determine a candidate frame and a classification of the to-be-detected object based on the initial image feature of the to-be-detected object and the enhanced image feature of the to-be-detected object” Xu discloses a comparison between an object detection result produced from feature extraction of the input image and a pre-labeled detection result associated with the object categories from the knowledge graph, and the result of this comparison is used to determine a final classification result for the to-be-detected object, which corresponds to the claimed evaluating of at least one estimated class as a class to which the scene captured by the record of measurement data is likely to belong based on the comparison result.)
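As a toy illustration of this comparison and evaluation (the class-attribute links and the overlap score are assumptions, not Xu’s knowledge graph), each candidate class can be scored by how well its linked attributes match the determined set:

```python
# Sketch: compare knowledge-graph attribute links against the determined
# attribute set and evaluate the best-matching class as the estimated class.
knowledge_graph = {
    "apple":      {"red", "round", "has_stem"},
    "strawberry": {"red", "has_seeds_outside"},
    "banana":     {"yellow", "elongated"},
}
determined_attributes = {"red", "round"}  # indicated by the activated neurons

def estimate_class(kg, attrs):
    # Fraction of each class's linked attributes found in the determined set.
    overlap = {c: len(linked & attrs) / len(linked) for c, linked in kg.items()}
    return max(overlap, key=overlap.get), overlap

estimated, overlap = estimate_class(knowledge_graph, determined_attributes)
# estimated == "apple" (2 of its 3 linked attributes are present)
```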
Xu does not teach part of the limitation “wherein activations of neurons in the feature extraction section indicate presence of features in the measurement data”. However, Iyer teaches this part of the limitation (paragraph 38 “the image processing device 102 extracts a plurality of target image attributes from the user input.”, and paragraph 41 “Herein, the mapping of each of the set of attributes to the associated set of activated neurons indicates corresponding neuron activations in the neural network.” Iyer discloses extracting target image attributes from the input image and mapping those attributes to corresponding sets of activated neurons in the neural network. Because each extracted image attribute corresponds to a feature present in the input image and the mapping identifies the neurons activated in response to those attributes, the activated neurons indicate the presence of corresponding features in the input data. Accordingly, the neuron activations disclosed by Iyer correspond to activations of neurons in the feature extraction section that indicate the presence of features in the measurement data under the broadest reasonable interpretation, as claimed.)
Xu does not teach the limitation “determining, from a given correspondence between the activated neurons and attributes, a set of the attributes whose presence in a scene captured by the measurement data is indicated by the activated neurons”. However, Iyer teaches this limitation (paragraph 38 “the image processing device 102 extracts a plurality of target image attributes from the user input.”, paragraph 39 “The step 306 further includes a step 306 (a), where the image processing device 102 may identify a set of activated neurons that are mapped to the matching attribute as neuron activations corresponding to the plurality of target image attributes”, and paragraph 41 “Herein, the mapping of each of the set of attributes to the associated set of activated neurons indicates corresponding neuron activations in the neural network.” Iyer discloses extracting a plurality of target image attributes from an input and identifying a set of activated neurons that are mapped to corresponding attributes in the neural network. Because the mapping establishes a correspondence between attributes and the associated activated neurons, the activated neurons indicate which of the attributes are present in the input data. Accordingly, the activated neurons identify a set of attributes whose presence is indicated by those neurons, which corresponds to determining, from a given correspondence between the activated neurons and attributes, a set of the attributes whose presence in a scene captured by the measurement data is indicated by the activated neurons, as claimed.)
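A toy sketch of such a correspondence (the neuron indices and attribute names are assumptions, not Iyer’s data) shows how a determined attribute set like the one used in the earlier comparison sketch can be derived:

```python
# Sketch: given a correspondence between neurons and attributes, the
# attributes whose mapped neurons fired are taken as present in the scene.
neuron_to_attribute = {0: "red", 1: "round", 2: "yellow", 3: "elongated"}
activated = {0, 1}  # indices of neurons activated by the processing

determined_attributes = {
    neuron_to_attribute[n] for n in activated if n in neuron_to_attribute
}
# -> {"red", "round"}
```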
Before the effective filing date, it would have been obvious to one of ordinary skill in the art to combine the teaching of a method for object detection utilizing feature extraction, a classifier, and a knowledge graph by Xu with the teaching of a method for identifying target image attributes based on activated neurons by Iyer. The motivation to do so is referred to in Iyer’s disclosure (paragraph 39 “Thereafter, based on the set of activated neurons identified as the neuron activations, the image processing device 102, at step 308, may identify a target image section within the target image processed through the neural network.”, paragraph 43 “the neural network may be trained for each attribute at distinct locations for each of the training images. The neuron activations may be analyzed for determining common features, by observing changes in the neuron activations as well as the excitation with the complementing attributes (such as blocked and non-blocked arteries)”, and paragraph 63 “The proposed method ensures consistent quality. Thus, irrespective of the whole image, the quality of the identified target image section is maintained. ... As a result, the proposed method reduces memory requirement, as only specific parts of a target image are enhanced or modified. Reduced memory requirement makes the proposed method feasible for portable devices. Thus, for example, radiologist, clinicians, etc., may use their mobile device to enhance required parts of an image of interest, unlike conventional methods and systems. Moreover, the proposed method reduces the overall processing time required to enhance input images and performs the image enhancement in real time.” Iyer discloses the benefits of the method, which include the technique of mapping neuron activations of the neural network to feature attributes at distinct locations of each image. By this mapping, the activated neurons are associated with attributes at distinct locations of each image, which is helpful for determining common features by observing changes in the neuron activations as well as the excitation with the complementing attributes. The overall method also ensures consistent quality of image input analysis, reduces the memory requirement, and reduces the overall processing time required to enhance input images while performing the image enhancement in real time. Therefore, one of ordinary skill in the art would have been motivated to incorporate Iyer’s method of mapping activated neurons to attributes within an image into the feature extraction unit of Xu, to further improve target object detection based on feature extraction within the image input, thereby improving the overall classification accuracy.)
Xu does not teach part of the limitation “the classification section is configured to compute a classification score with respect to at least one class out of a given set of classes from output of the feature extraction section,”. However, Ramprasaath teaches this part of the limitation (Page 5 fig 2a “Our Neuron Importance-Aware Weight Transfer (NIWT) approach can be broken down in to three stages. a) class-specific neuron importances are extracted for seen classes at a fixed layer”, and Page 5 section 3.2 “Consider a deep neural network NETS(·) trained for classification which predicts scores ... for seen classes” Ramprasaath discloses a zero-shot learning approach for classification of seen and unseen classes based on neuron importance. As illustrated in fig. 2a, an input image is processed by the convolutional neural network to extract features and produce class scores corresponding to classes. Thus, Ramprasaath teaches producing classification scores for classes from the neural network output, which corresponds to computing a classification score with respect to at least one class out of a given set of classes from the output of the feature extraction section, as claimed.)
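A minimal sketch of such a score computation (the feature dimension and class count are assumptions for illustration, not taken from Ramprasaath):

```python
# Sketch: compute a classification score per class from the output of the
# feature extraction section via a fully connected layer and softmax.
import torch
import torch.nn.functional as F

feature_output = torch.rand(1, 4096)   # output of the feature extraction section
head = torch.nn.Linear(4096, 3)        # classification head over 3 classes
class_scores = F.softmax(head(feature_output), dim=1)  # one score per class
```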
Before the effective filing date, it would have been obvious to one of ordinary skill in the art to combine the teaching of a method for object detection utilizing feature extraction, a classifier, and a knowledge graph by Xu, and the teaching of a method for identifying target image attributes based on activated neurons by Iyer, with the teaching of a zero-shot learning approach for classification of seen and unseen classes based on neuron importance by Ramprasaath. The motivation to do so is referred to in Ramprasaath’s disclosure (page 1 “Our approach, which we call Neuron Importance-Aware Weight Transfer (NIWT), learns to map domain knowledge about novel “unseen” classes onto this dictionary of learned concepts and then optimizes for network parameters that can effectively combine these concepts– essentially learning classifiers by discovering and composing learned semantic concepts in deep networks. Our approach shows improvements over previous approaches”, and page 12 section 8 “A high scoring ... emphasizes the relevance of that attribute for the corresponding class c. This helps us ground the class-score decisions made by the learnt unseen classifier head in the attribute space, thus, providing an explanation for the decision” Ramprasaath discloses how classification scores produced by the neural network indicate the relevance of attributes for corresponding classes and support the classification decision by emphasizing attributes associated with higher scores. Because such class scores provide a quantitative indication of the likelihood that extracted features correspond to particular classes, a person of ordinary skill in the art would have been motivated to incorporate the classification score teaching of Ramprasaath into the object detection and classification framework of Xu in view of Iyer’s neuron-attribute mapping. Doing so would allow the system to determine the most likely class corresponding to the detected features and improve the overall classification result. Ramprasaath further teaches zero-shot classification that enables recognition of both seen and unseen classes by utilizing the scores, and incorporating this capability into Xu’s detection framework would allow the classifier to identify objects whose classes were not present during training or in the knowledge graph.)
Regarding claim 2, which depends on claim 1, the rejection of claim 1 is incorporated.
Ramprasaath teaches the limitation “evaluating, from output delivered by the classifier after processing the record of measurement data, whether the scene captured by the record of measurement data belongs to a class seen by the classifier during its training” (Page 5 section 3.1 “For convenience, we use the subscripts S and U to indicate subsets corresponding to seen and unseen classes respectively ... Concisely, the goal of generalized zero-shot learning is then to learn a mapping ... from the input space X to the combined set of seen and unseen class labels using only the domain knowledge K and instances DS belonging to the seen classes”, and page 5 section 3.2 “Consider a deep neural network NETS(·) trained for classification which predicts scores ... for seen classes S”. Ramprasaath discloses a generalized zero-shot learning framework in which a classifier produces outputs corresponding to seen and unseen classes. Because the classifier predicts scores for classes based on the processed input data, the output of the classifier inherently indicates whether the input corresponds to a class that was included among the seen classes during training. Under the broadest reasonable interpretation, evaluating from the classifier output whether the scene belongs to a class seen during training corresponds to determining, from the classification scores produced by the neural network, whether the input data is associated with one of the seen classes used during training. Accordingly, the classifier output in Ramprasaath corresponds to the claimed evaluation.)
Ramprasaath teaches the limitation “in response to determining that the scene belongs to a seen class, determining the class to which the scene most likely belongs from the output of the classification section of the classifier” (Page 5 section 3.2 “Consider a deep neural network NETS(·) trained for classification which predicts scores ... for seen classes S”. Ramprasaath discloses a neural network classifier that predicts classification scores for seen classes based on the processed input data. Because the classifier produces scores corresponding to the seen classes, the class associated with the highest score represents the class to which the input most likely belongs, which corresponds to determining the class to which the scene most likely belongs from the output of the classification section, as claimed.)
Ramprasaath teaches the limitation “in response to determining that the scene belongs to an unseen class, determining the class to which the scene most likely belongs to be the estimated class” (page 7 section 3.4 “we modify NETS to extend the output space to include the unseen class– expanding the final fully-connected layer to include additional neurons ... for the unseen classes such that the network now additionally outputs scores”. Ramprasaath discloses extending the neural network classifier to include unseen classes so that the network additionally outputs scores for the unseen classes. Because the classifier produces scores corresponding to the unseen classes, the unseen class associated with the highest score represents the class to which the input most likely belongs, which under the broadest reasonable interpretation corresponds to determining the class to which the scene most likely belongs to be the estimated class, as claimed.)
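A toy sketch of this seen/unseen evaluation (the confidence threshold and score values are assumptions, not Ramprasaath’s NIWT procedure) could read:

```python
# Sketch: if the best seen-class score is confident enough, take that seen
# class; otherwise treat the scene as unseen and fall back on the class
# estimated from the knowledge-graph attribute comparison.
seen_classes = ["apple", "strawberry", "banana"]
seen_scores = [0.12, 0.08, 0.10]   # classifier output for the seen classes
SEEN_THRESHOLD = 0.5               # assumed confidence threshold

best = max(range(len(seen_scores)), key=seen_scores.__getitem__)
if seen_scores[best] >= SEEN_THRESHOLD:
    predicted = seen_classes[best]  # scene belongs to a seen class
else:
    predicted = estimated           # estimated class from the earlier sketch
```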
Regarding claim 3, which depends on claim 2, the rejection of claim 2 is incorporated.
Ramprasaath teaches the limitation “The method of claim 2, wherein the classifier is trained to compute an additional classification score when the scene captured by the record of measurement data belongs to an unseen class” (page 7 section 3.4 “we modify NETS to extend the output space to include the unseen class– expanding the final fully-connected layer to include additional neurons ... for the unseen classes such that the network now additionally outputs scores” Ramprasaath discloses extending the neural network classifier to include unseen classes so that the network additionally outputs scores for the unseen classes. Because the classifier produces scores corresponding to the unseen classes, the neural network is trained to output additional scores when the input corresponds to an unseen class, which under the broadest reasonable interpretation corresponds to the classifier being trained to compute an additional classification score when the scene captured by the record of measurement data belongs to an unseen class, as claimed.)
Regarding claim 4, which depends on claim 2, the rejection of claim 2 is incorporated.
Ramprasaath teaches the limitation “The method of claim 2, wherein it is evaluated from classification scores output by the classifier for seen classes whether the scene captured by the record of measurement data belongs to a seen class” (page 5 section 3.2 “Consider a deep neural network NETS(·) trained for classification which predicts scores ... for seen classes S”. Ramprasaath discloses a neural network classifier that predicts classification scores for seen classes based on the processed input data. Because the classifier outputs scores corresponding to the seen classes, evaluating these classification scores inherently indicates whether the input corresponds to one of the seen classes used during training, which under the broadest reasonable interpretation corresponds to evaluating from classification scores whether the scene captured by the record of measurement data belongs to a seen class, as claimed.)
Regarding claim 5, which depends on claim 1, the rejection of claim 1 is incorporated.
Ramprasaath teaches the limitation “evaluating, from at least one attribute in the set of determined attributes, a portion of the record of measurement data that has given rise to the at least one attribute” (page 5 section 3.2 “Class descriptions capture salient concepts about the content of corresponding images– for example, describing the coloration and shape of a bird’s head. Similarly, a classifier must also learn discriminative visual concepts in order to succeed; however, these concepts are not grounded in human interpretable language. In this stage, we identify neurons corresponding to these discriminative concepts before aligning them with domain knowledge” Ramprasaath discloses that class descriptions capture salient concepts about the content of images, such as the coloration or shape of a bird’s head. These concepts correspond to the claimed at least one attribute in the set of determined attributes. A concept such as the coloration or shape of the bird’s head originates from a specific region of the image, which corresponds to the claimed portion of the record of measurement data that has given rise to the at least one attribute.)
Ramprasaath teaches the limitation “determining the portion of the record of measurement data to be salient for a decision of the classifier” (page 5 section 3.2 “Class descriptions capture salient concepts about the content of corresponding images– for example, describing the coloration and shape of a bird’s head. Similarly, a classifier must also learn discriminative visual concepts in order to succeed; however, these concepts are not grounded in human interpretable language. In this stage, we identify neurons corresponding to these discriminative concepts before aligning them with domain knowledge” Ramprasaath discloses that class descriptions capture salient concepts about the content of images, such as the coloration or shape of a bird’s head. Ramprasaath further explains that the classifier learns discriminative visual concepts for classification, which indicates that the identified portion of the image is important for the classifier’s decision and corresponds to determining the portion of the record of measurement data to be salient for a decision of the classifier, as claimed.)
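One simple way to picture this (an assumed heuristic for illustration, not Ramprasaath’s method) is to threshold the spatial activation map of the neuron linked to an attribute to localize the image portion that gave rise to it:

```python
# Sketch: the spatial response map of the neuron mapped to one attribute
# marks the portion of the image deemed salient for the decision.
import numpy as np

activation_map = np.random.rand(16, 16)                # per-location neuron response
salient_mask = activation_map > activation_map.mean()  # crude saliency threshold
ys, xs = np.nonzero(salient_mask)                      # salient image locations
```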
Regarding claim 6, which depends on claim 1, the rejection of claim 1 is incorporated.
Iyer teaches the limitation “The method of claim 1, wherein neurons in a fully connected layer of the feature extraction section are examined as to whether they are activated by the processing” (paragraph 38 “the image processing device 102 extracts a plurality of target image attributes from the user input.”, and paragraph 41 “Herein, the mapping of each of the set of attributes to the associated set of activated neurons indicates corresponding neuron activations in the neural network.” Iyer discloses extracting target image attributes from the input image and mapping those attributes to corresponding sets of activated neurons in the neural network. Because the mapping identifies neurons that are activated in response to the extracted attributes, the mapping indicates whether neurons in a layer of the neural network are activated by the processing of the input image. Accordingly, the neuron activations disclosed by Iyer correspond to examining neurons in a fully connected layer of the feature extraction section as to whether they are activated by the processing, as claimed.)
Regarding claim 8, which depends on claim 1, the rejection of claim 1 is incorporated.
Xu teaches the limitation “a relationship that an entity corresponding to a class has and/or includes an entity corresponding to an attribute” (paragraph 36 “in some implementations of the first aspect, the association relationship between different object categories corresponding to different objects in the to-be-detected image includes at least one of the following information: an attribute association relationship between different object categories”, and paragraph 37 “The attribute association relationship between different object categories may specifically refer to whether objects of different categories have a same attribute. For example, if a color of an apple is red, and a color of a strawberry is also red, the apple and the strawberry have a same color attribute” Xu discloses an attribute association relationship between object categories and attributes. For example, an object category such as apple may have a color attribute. Thus, the object category corresponds to the entity corresponding to a class, as claimed. The color example corresponds to the entity corresponding to an attribute, and the attribute association relationship included in the object categories corresponds to the claimed relationship that an entity corresponding to a class has and/or includes an entity corresponding to an attribute.)
Regarding claim 9, which depends on claim 1, the rejection of claim 1 is incorporated.
Ramprasaath teaches the limitation “The method of claim 1, wherein the knowledge graph includes superset of classes that the classifier has seen during its training” (page 4 section 3.1 “Consider a dataset ... comprised of example input-output pairs from a set of seen classes S = {1,...,s}” Ramprasaath discloses a dataset comprised of example input-output pairs from a set of seen classes. The model further performs generalized zero-shot learning over both seen and unseen classes. Thus, the set of classes represented in the knowledge representation includes the seen classes and additional classes, which corresponds to the claimed knowledge graph including a superset of classes that the classifier has seen during its training under the broadest reasonable interpretation.)
Regarding claim 10, which depends on claim 1, the rejection of claim 1 is incorporated.
Xu teaches the limitation “The method of claim 1, wherein a likelihood that the scene captured by the record of measurement data belongs to a class is determined based on how many of the attributes linked to the class by the knowledge graph are in the determined set of attributes” (paragraph 37 “The attribute association relationship between different object categories may specifically refer to whether objects of different categories have a same attribute. For example, if a color of an apple is red, and a color of a strawberry is also red, the apple and the strawberry have a same color attribute”, paragraph 188 “Specifically, the attribute association relationship between different object categories may refer to whether objects of different categories have a same attribute”, and paragraph 240 “In step 3003, the graph structure may be constructed based on a manually designed knowledge graph or based on a feature similarity of a candidate area. An obtained graph structure records an association relationship between different nodes, for example, records an attribute relationship graph of different categories, a probability that different categories simultaneously occur, or an attribute similarity graph of different categories.” Xu discloses that the knowledge graph information includes an attribute association relationship between object categories, indicating whether objects of different categories share the same attribute, and that a graph structure based on the knowledge graph may record attribute relationships and probabilities that different categories simultaneously occur. Because such similarity and probability relationships are derived from attributes associated with object categories, they correspond to determining a likelihood that a scene belongs to a particular class: the similarity and probability relationships quantify the degree of correspondence between the attributes associated with object categories in the knowledge graph and the attributes within the detected features from an input image. In other words, a higher similarity or probability indicates that a greater or stronger set of attributes associated with that category is present in the image. Accordingly, under the broadest reasonable interpretation, these similarity and probability measures correspond to determining a likelihood, as claimed.)
Regarding claim 11, which depends on claim 1, the rejection of claim 1 is incorporated.
Xu teaches a part of the limitation “The method of claim 1, wherein: for each respective class of multiple classes, likelihoods that the scene captured by the record of measurement data belongs to the respective class are determined;” (paragraph 37 “The attribute association relationship between different object categories may specifically refer to whether objects of different categories have a same attribute. For example, if a color of an apple is red, and a color of a strawberry is also red, the apple and the strawberry have a same color attribute”, paragraph 188 “Specifically, the attribute association relationship between different object categories may refer to whether objects of different categories have a same attribute”, and paragraph 240 “In step 3003, the graph structure may be constructed based on a manually designed knowledge graph or based on a feature similarity of a candidate area. An obtained graph structure records an association relationship between different nodes, for example, records an attribute relationship graph of different categories, a probability that different categories simultaneously occur, or an attribute similarity graph of different categories.” Xu discloses that the knowledge graph information includes an attribute association relationship between object categories, indicating whether objects of different categories share the same attribute, and that a graph structure based on the knowledge graph may record attribute relationships and probabilities that different categories simultaneously occur. Because such similarity and probability relationships are derived from attributes associated with object categories, they correspond to determining, for each respective class, a likelihood that the scene belongs to that class: the similarity and probability relationships quantify the degree of correspondence between the attributes associated with each object category in the knowledge graph and the attributes within the detected features from an input image. Accordingly, under the broadest reasonable interpretation, these similarity and probability measures correspond to determining likelihoods for multiple classes, as claimed.)
Ramprasaath teaches a part of the limitation “the multiple classes are ranked according to the likelihoods” (page 5 section 3.2 “Consider a deep neural network NETS(·) trained for classification which predicts scores ... for seen classes”, and page 7 section 3.4 “we modify NETS to extend the output space to include the unseen class ... such that the network now additionally outputs scores” Ramprasaath discloses a classifier that predicts scores for multiple seen classes and unseen classes. Because such scores represent the importance associated with the respective classes, and Xu discloses determining similarity or probability relationships derived from attributes associated with object categories, which represent the likelihoods associated with the respective categories (classes), one of ordinary skill in the art would understand that the predicted scores may be compared according to their magnitude to determine which classes have higher importance with respect to the similarity or probability of attributes associated with each class, thereby allowing the classes to be ordered or ranked, which corresponds to the ranking as claimed.)
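Reusing the toy data from the earlier sketch (an illustration, not code from either reference), per-class likelihoods and the resulting ranking could be computed as:

```python
# Sketch: likelihood per class from how many of its knowledge-graph-linked
# attributes appear in the determined set, then rank classes by likelihood.
likelihoods = {
    c: len(linked & determined_attributes) / len(linked)
    for c, linked in knowledge_graph.items()
}
ranking = sorted(likelihoods, key=likelihoods.get, reverse=True)
# e.g., ["apple", "strawberry", "banana"]
```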
Regarding claim 12, which depends on claim 1, the rejection of claim 1 is incorporated.
Xu teaches the limitation “The method of claim 1, wherein the record of measurement data is processed with feature extraction sections of multiple classifiers, and the sets of attributes whose presence in the scene captured by the record of measurement data is indicated by the activated neurons of the multiple classifiers are pooled” (paragraph 20 “a detection result of the first object may be determined by comprehensively considering an initial image feature extracted from the first object and semantic information of the person and the road”, and paragraph 143 “Therefore, a pooling layer often needs to be periodically introduced after a convolutional layer. ... The pooling layer may include an average pooling operator and/or a maximum pooling operator, to perform sampling on the input image to obtain an image with a relatively small size.” Xu discloses determining a detection result by considering an initial image feature extracted from the input image via a feature extractor unit/feature extraction layers, indicating that the input image (the record of measurement data) is processed by feature extraction components of the classification framework, wherein one of ordinary skill in the art would have been able to configure one or more classification frameworks with the feature extraction unit; thus, Xu’s teaching corresponds to the claimed processing of the record of measurement data with feature extraction sections of multiple classifiers. Xu further discloses introducing a pooling layer to combine outputs of the convolutional layers. Because pooling layers combine activations produced by the feature extractor unit/feature extraction layers, the outputs from the feature extraction components, which capture the extracted features of the input, are pooled, which corresponds to the pooling process as claimed.)
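On one simple reading of this pooling (the per-classifier attribute sets are assumptions for illustration), the attribute sets indicated by multiple classifiers can be union-pooled:

```python
# Sketch: pool the attribute sets indicated by the activated neurons of
# multiple classifiers into one combined set.
attrs_from_classifier_a = {"red", "round"}
attrs_from_classifier_b = {"round", "has_stem"}
pooled_attributes = attrs_from_classifier_a | attrs_from_classifier_b
# -> {"red", "round", "has_stem"}
```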
Regarding claim 13, which depends on claim 1, the rejection of claim 1 is incorporated.
Xu teaches the limitation “The method of claim 1, wherein the record of measurement data includes at least one image, and/or at least one point cloud” (paragraph 6 “According to a first aspect, an object detection method is provided. The method includes: obtaining a to-be-detected image; performing convolution processing on the to-be-detected image to obtain an initial image feature of a to-be-detected object; determining an enhanced image feature of the to-be-detected object based on knowledge graph information” Xu discloses that the input comprises a to-be-detected image, which is analogous to the claimed record of measurement data that includes at least one image.)
Regarding claim 14, which depends on claim 1, the rejection of claim 1 is incorporated.
Xu teaches the limitation “The method of claim 1, further comprising: computing, from the class to which the scene captured by the record of measurement data most likely belongs, an actuation signal; and actuating a vehicle, and/or a system for quality inspection, and/or a surveillance system, and/or a medical imaging system, with the actuation signal” (paragraph 92 “A camera deployed on a roadside may take a picture of oncoming vehicles and people. After the picture is obtained, the picture may be uploaded to a control center device. The control center device performs object detection on the picture to obtain an object detection result. When an abnormal object occurs, the control center may give an alarm.” Xu discloses an implementation of the method in which a camera may capture images of vehicles and upload the images to a control center device, which performs the object detection method and detects when an abnormal object occurs, in which case the control center device sends an alarm. The alarm corresponds to computing an actuation signal from the determined class to which the scene captured by the record of measurement data most likely belongs, because the alarm is generated based on the detection result for the object captured in the image, whose attributes and features have been compared and analyzed based on the knowledge graph.)
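A toy sketch of such an actuation step (the class names and signal values are hypothetical, chosen only to mirror Xu’s roadside-alarm scenario in paragraph 92):

```python
# Sketch: compute an actuation signal from the most likely class, e.g., a
# surveillance alarm when an abnormal object is detected.
def actuation_signal(most_likely_class: str) -> str:
    normal_classes = {"car", "pedestrian", "bicycle"}
    return "no_action" if most_likely_class in normal_classes else "raise_alarm"

signal = actuation_signal("wild_animal")  # -> "raise_alarm"
```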
Regarding claim 15,
Xu teaches part of the limitation “A non-transitory machine-readable data carrier on which is stored a computer program including machine-readable instructions for operating” (paragraph 325 “the functions may be stored in a computer-readable storage medium. Based on such an understanding ... The computer software product is stored in a storage medium, and includes several instructions for instructing a computer device (which may be a personal computer, a server, or a network device) to perform all or some of the steps of the method described in the embodiments of this application. The storage medium includes any medium that can store program code such as a USB flash drive, a removable hard disk, a read-only memory (read-only memory, ROM), a random access memory (random access memory, RAM), a magnetic disk, or an optical disc” Xu discloses that the functions may be stored in a computer-readable storage medium such as a RAM or ROM, which constitutes the claimed non-transitory machine-readable data carrier, and that the computer software product carrying the functions is stored in a storage medium and includes several instructions for instructing a computer device to perform the functions as recited.)
The applicant is further directed to the rejection of claim 1 above; because claim 15 recites similar limitations and processing steps, claim 15 is rejected under the same rationale as claim 1.
Regarding claim 16,
The applicant is further directed to the rejection of claim 1 above; because claim 16 recites similar limitations and processing steps, claim 16 is rejected under the same rationale as claim 1.
Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Xu et al. (US 20220108546 A1), in view of Iyer et al. (US 20210248409 A1), further in view of Ramprasaath et al. (NPL: Choose Your Neuron: Incorporating Domain Knowledge through Neuron-Importance), and further in view of Yagi et al. (US 20130212053 A1).
Regarding claim 7, which depends on claim 1, the rejection of claim 1 is incorporated.
Xu, Iyer, and Ramprasaath do not teach the limitation “The method of claim 1, wherein a neuron is determined as an activated neuron in response to its activation value exceeding a predetermined threshold value”. However, Yagi teaches this limitation (paragraph 80 “each of the hierarchical layers includes a group of neurons which are calculation units for outputting feature quantities indicating features of input data. Each of the neurons is initially in a non-activated state, and is switched between an activated state and a non-activated state”, and paragraph 87 “a method of binarizing input data (outputting 1 as a first signal value in response to an input larger than or equal to a predetermined threshold value and outputting 0 as a first signal value in response to an input smaller than the predetermined threshold value or in the case of no input)”. Yagi discloses a neural network including neurons that output features of input data. Neurons are initially in a non-activated state and may switch to an activated state during processing. The neuron output values may be binarized to indicate when an input is greater than or equal to a predetermined threshold, and because the neuron outputs a signal indicating activation when the input value exceeds the predetermined threshold, the neuron is determined to be an activated neuron, which corresponds to the claimed process of determining a neuron as an activated neuron in response to its activation value exceeding a predetermined threshold value.)
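A minimal sketch of such threshold-based activation (the threshold value and activation values are assumptions for illustration, not Yagi’s data):

```python
# Sketch: a neuron is determined to be activated when its activation value
# exceeds a predetermined threshold.
import numpy as np

activation_values = np.array([0.03, 0.71, 0.45, 0.90])
THRESHOLD = 0.5                                      # predetermined threshold (assumed)
activated_mask = activation_values > THRESHOLD       # [False, True, False, True]
activated_indices = np.flatnonzero(activated_mask)   # neurons 1 and 3
```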
Before the effective filing date, it would have been obvious to one of ordinary skill in the art to combine the teaching of a method for object detection utilizing feature extraction, a classifier, and a knowledge graph by Xu, the teaching of a method for identifying target image attributes based on activated neurons by Iyer, and the teaching of a zero-shot learning approach for classification of seen and unseen classes based on neuron importance by Ramprasaath with the teaching of neuron activation based on threshold comparison by Yagi. The motivation to do so is referred to in Yagi’s disclosure (paragraph 245 “it is possible to select a kind of parallel logical operation to be performed by the matching network by means of the comparison units selecting the activation criteria for use in comparison processes without changing the structure of the matching network. As a result, it is possible to extract the features of the target data and to extract the features different from the features of the target data using the same feature extraction device 100.”, and paragraph 253 “For example, each neuron may select its output by comparing the total value of the inputs with a criterion based on an absolute value (that is, a predetermined threshold value), instead of a relative rank in the layer. In addition, without comparison units, the neurons in the layer may communicate their total values of input values with each other, and may thereby obtain their relative ranks based on the total values of the input values.” Yagi discloses that such an activation criterion for comparison operations enables features of target data to be extracted using the same feature extraction device without changing the network structure. A person of ordinary skill in the art would have been motivated to incorporate Yagi’s threshold-based neuron activation mechanism into the neural-network-based feature extraction framework of Xu in view of Iyer to obtain the same benefit of keeping the structure of the neural network unchanged while extracting features, as well as using a criterion such as a threshold to determine activated neurons only when input values are sufficiently significant, thereby reducing responses to weaker or less relevant inputs and improving the discrimination of extracted features from an input.)
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DUY TU DIEP whose telephone number is (703)756-1738. The examiner can normally be reached M-F 8-4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov can be reached at (571) 270-3428. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/DUY T DIEP/ Examiner, Art Unit 2123
/ALEXEY SHMATOV/ Supervisory Patent Examiner, Art Unit 2123