DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Style
In this action unitalicized bold is used for claim language, while italicized bold is used for emphasis.
Information Disclosure Statement
All information disclosure statements were submitted prior to the first action and are in compliance with the provisions of 37 C.F.R. § 1.97. Accordingly, they have been considered.
Applicant Reply
“The claims may be amended by canceling particular claims, by presenting new claims, or by rewriting particular claims as indicated in 37 CFR 1.121(c). The requirements of 37 CFR 1.111(b) must be complied with by pointing out the specific distinctions believed to render the claims patentable over the references in presenting arguments in support of new claims and amendments. . . . The prompt development of a clear issue requires that the replies of the applicant meet the objections to and rejections of the claims. Applicant should also specifically point out the support for any amendments made to the disclosure. See MPEP § 2163.06. . . . An amendment which does not comply with the provisions of 37 CFR 1.121(b), (c), (d), and (h) may be held not fully responsive. See MPEP § 714.” MPEP § 714.02. Generic statements or listing of numerous paragraphs do not “specifically point out the support for” claim amendments. “With respect to newly added or amended claims, applicant should show support in the original disclosure for the new or amended claims. See, e.g., Hyatt v. Dudas, 492 F.3d 1365, 1370, n.4, 83 USPQ2d 1373, 1376, n.4 (Fed. Cir. 2007) (citing MPEP § 2163.04 which provides that a ‘simple statement such as ‘applicant has not pointed out where the new (or amended) claim is supported, nor does there appear to be a written description of the claim limitation ‘___’ in the application as filed’ may be sufficient where the claim is a new or amended claim, the support for the limitation is not apparent, and applicant has not pointed out where the limitation is supported.’)” MPEP § 2163(II)(A).
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1, 11-15, 18, 20-21, and 23-31 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) and the claims as a whole, considering all claim elements both individually and in combination, do not amount to significantly more.
Step 1: Is the claim to a process, machine, manufacture, or composition of matter?
All claims are found to be directed to one of the four statutory categories, unless otherwise indicated in this action.
Step 2A Prongs One and Two (Alice Step 1): According to Office guidance, claims that read on math do not recite an abstract idea at step 2A1, when the claims fail to refer to the math by name.1 The MPEP also equates “recit[ing] a judicial exception” with “stat[ing]” or “describ[ing]” an abstract idea in the claims.2 Consistent with this guidance, an abstract idea may be first recited in a dependent claim, even though the independent claims may read on the abstract idea. Claim limitations which recite any of the abstract idea groupings set forth in the manual are found to be directed, as a whole, to an abstract idea unless otherwise indicated.3 The claims do not recite additional elements that integrate the abstract ideas into a practical application.4 To confer patent eligibility to an otherwise abstract idea, claims may recite a specific means or method of solving a specific problem in a technological field.5
1. A method of training a model for object attribute classification in a computer vision task, wherein the object is an object to be recognized in an image or a video, and wherein the object comprises a face, and the method comprises steps of: (Recognizing an object in an image or video reads on a mental process. The claim language merely recites implementing the mental process using generic computer components. Limiting to recognition of objects which “compris[e] a face” merely limits the data environment to a field of use. Further, this language is written as an intended use. Intended use language is explained in MPEP §§ 2103 and 2111.02. “Claim scope is not limited by claim language that suggests or makes optional but does not require steps to be performed, or by claim language that does not limit a claim to a particular structure.” MPEP § 2111.04.) acquiring, from image samples, binary class attribute data related to a to-be-classified object attribute on which an object attribute classification task is to be performed, wherein the object attribute is related to a part of the face, wherein acquiring the binary class attribute data, comprises: determining different categories for the object attribute, determining at least one class label from each of the different categories, and for each image sample, judging whether the to-be-classified object attribute is "Yes" or "No" for each of at least one class label, (Determining “binary class attribute data” by determining whether an object is “yes or no” for a given category reads on a mental process. Specifically, consistent with the detailed explanation of identification of various types of eyebrows in the specification, determining that an object may be an eyebrow and then judging whether or not the object is in fact an eyebrow reads on a mental process. Note that the claimed “acquiring of data” is claimed as “compris[ing]: determining . . .” Based on this claim language, “acquiring” here refers to the outcome of a determination. As such, this also reads on a mental process.) wherein the binary class attribute data comprises data indicating the judging results, (The claimed data is not distinguished from information in the mind. However, in the interest of compact prosecution, note that merely storing data reads on an instruction to apply the exception using conventional computer components in their ordinary capacities.) and wherein acquiring the binary class attribute data, further comprises, for each image sample: (Note that the “acquiring” step merely refers to the outcome of the following determinations. Specifically, this is not limited to require computer I/O.) determining semantic similarity between object attributes based on feature distribution of the object to be recognized, determining distances between object attributes in at least one additional image region proximate to an image region of the to-be-classified object attribute and the to-be-classified object attribute, determining at least one of an object attribute whose semantic similarity to the to-be-classified object attribute is equal to or smaller than a similarity threshold or an object attribute whose distance from the to-be-classified object attribute is equal to or smaller than a distance threshold, as at least one additional object attribute associated with the to-be-classified object attribute, (This reads on a mental process.) and judging whether each of the at least one additional object attribute is "Yes" or "No" for its corresponding class label, wherein the binary class attribute data comprises data indicating the judging results; (This reads on a mental process.)
and pre-training the model for object attribute classification based on the binary class attribute data related to the to-be-classified object attribute to obtain a pre-trained version of the model, wherein the model is based on a convolutional neural network, (This reads on an instruction to apply abstract mental processes on conventional components.) and wherein the pre-trained version is capable of obtaining object attribute binary classes according to class labels corresponding to the binary class attribute data as outputs, (First, this language recites an intended result. “Claim scope is not limited by claim language that suggests or makes optional but does not require steps to be performed, or by claim language that does not limit a claim to a particular structure.” MPEP § 2111.04. “[A] whereby clause in a method claim is not given weight when it simply expresses the intended result of a process step positively recited.” MPEP § 2111.04 (quotes omitted).) wherein the pre-training the model comprises: extracting object attribute features from the image samples, and optimizing weights for parameters of the model based on the extracted object attribute features and the binary class attribute data by using a cross entropy loss function. (Both feature extraction and weight optimization using a loss function read on mathematical processes.)
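For technical context only, the pre-training operation the claim recites (extracting features and optimizing model weights against binary "Yes"/"No" labels using a cross-entropy loss) is a routine procedure that can be sketched in a few lines. The sketch below is a minimal illustration with hypothetical names, and a single logistic classifier stands in for the claimed convolutional network; it is not the applicant's implementation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce_loss(y_true, y_pred):
    # Binary cross-entropy for one "Yes"/"No" attribute judgment.
    eps = 1e-12
    return -(y_true * math.log(y_pred + eps)
             + (1 - y_true) * math.log(1 - y_pred + eps))

def pretrain_step(w, b, features, labels, lr=0.1):
    # One gradient-descent step for a single binary attribute classifier.
    # 'features' stands in for the extracted object attribute features;
    # 'labels' are the 1/0 ("Yes"/"No") binary class attribute data.
    gw = [0.0] * len(w)
    gb = 0.0
    n = len(features)
    for x, y in zip(features, labels):
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        for i, xi in enumerate(x):
            gw[i] += (p - y) * xi / n
        gb += (p - y) / n
    return [wi - lr * gi for wi, gi in zip(w, gw)], b - lr * gb
```

Repeating `pretrain_step` drives the cross-entropy loss down on the labeled samples, which is the entirety of the "optimizing weights" operation at this level of generality.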
Step 2B (Alice Step 2): The rejected claims do not recite additional elements that amount to significantly more than the judicial exception.
All additional limitations that do not integrate the claimed judicial exception into a practical application also fail to amount to significantly more, for the reasons given at step 2A2. All limitations found to be extra-solution activity at step 2A2 are found to be well-understood, routine, and conventional (WURC), including limitations that read on mere data gathering, data storage, and data input/output/transfer. This finding is based on cases which have recognized that generic input-output operations, repetitive processing operations, and storage operations are WURC.6 Other aspects of generic computing have also been found to be WURC.7 Further, the description itself may provide support for a finding that claim elements are WURC. The analysis under § 112(a) as to whether a claim element is “so well-known that it need not be described in detail in the patent specification” is the same as the analysis as to whether the claim element is widely prevalent or in common use.8 Similarly, generic descriptions in the Specification of claimed components and features have been found to support a conclusion that the claimed components were conventional.9 Improvements to the relevant technology may support a finding that the claims include a patent eligible inventive concept. But some mechanism that results in any asserted improvements must be recited in the claim, and the Specification must provide sufficient details such that one of ordinary skill in the art would recognize the claimed invention as providing the improvement.10 This applies to the dependent claims below.
Dependent Claims
11. (Currently Amended) The method of claim 1, wherein the pre-trained version comprises a convolutional neural network model, a fully connected layer and binary class attribute classifiers corresponding to the class labels of the binary class attribute data one by one which are arranged sequentially. (This merely recites the parts used to implement the mental process using generic computer components, in this case a generic machine learning technique.)
12. (Currently Amended) The method of claim 1, further comprising: training the model for object attribute classification based on the class label data for the object attribute classification task and the pre-trained version. (Training “based on” class label data and a “pre-trained” version of a model reads on continuing to train a model that is already partly trained, based on labeled data. This reads on merely using generic computer components to implement the claimed abstract ideas.)
13. (Previously Presented) The method of claim 12, wherein the trained model comprises a convolutional neural network model and a multi-class fully connected layer corresponding to the class labels for the object attribute classification task which are arranged sequentially. (This reads on merely using generic computer components to implement the claimed abstract ideas. Note that a “multi-class” fully connected layer does not clearly limit to a particular structure.)
14. (Original) The method of claim 1, wherein the at least one class label comprises class labels of coarse classes which are greatly different from each other. (This merely limits to a data environment associated with a particular field of use.)
15. (Previously Presented) The method of claim 1, wherein the class labels involved in the object attribute classification task include class labels of fine class. (This merely limits to a data environment associated with a particular field of use.)
18. (Original) An electronic device comprising: a memory; and a processor coupled to the memory, the memory having stored therein instructions that, when executed by the processor, cause the electronic device to perform the method of claim 1. (This reads on an instruction to implement the abstract ideas of another claim using generic computer components.)
20. (Previously Presented) The electronic device of claim 18, wherein the instructions, when executed by the processor, cause the electronic device to further perform the method of claim 12. (This reads on an instruction to implement the abstract ideas of another claim using generic computer components.)
21. (Previously Presented) A non-transitory computer-readable storage medium having stored therein instructions that, when executed by a processor, cause the processor to perform the method of claim 1. (This reads on an instruction to implement the abstract ideas of another claim using generic computer components.)
23. (Previously Presented) The non-transitory computer-readable storage medium of claim 21, wherein the instructions, when executed by the processor, cause the processor to further perform the method of claim 12. (This reads on an instruction to implement the abstract ideas of another claim using generic computer components.)
24. (New) The electronic device of claim 18, wherein the instructions, when executed by the processor, cause the electronic device to further perform the method of claim 11. (This reads on an instruction to implement the abstract ideas of another claim using generic computer components.)
25. (New) The electronic device of claim 20, wherein the instructions, when executed by the processor, cause the electronic device to further perform the method of claim 13. (This reads on an instruction to implement the abstract ideas of another claim using generic computer components.)
26. (New) The electronic device of claim 18, wherein the instructions, when executed by the processor, cause the electronic device to further perform the method of claim 11. (This reads on an instruction to implement the abstract ideas of another claim using generic computer components.)
27. (New) The electronic device of claim 18, wherein the instructions, when executed by the processor, cause the electronic device to further perform the method of claim 11. (This reads on an instruction to implement the abstract ideas of another claim using generic computer components.)
28. (New) The non-transitory computer-readable storage medium of claim 21, wherein the instructions, when executed by the processor, cause the processor to further perform the method of claim 11. (This reads on an instruction to implement the abstract ideas of another claim using generic computer components.)
29. (New) The non-transitory computer-readable storage medium of claim 23, wherein the instructions, when executed by the processor, cause the processor to further perform the method of claim 13. (This reads on an instruction to implement the abstract ideas of another claim using generic computer components.)
30. (New) The non-transitory computer-readable storage medium of claim 21, wherein the instructions, when executed by the processor, cause the processor to further perform the method of claim 14. (This reads on an instruction to implement the abstract ideas of another claim using generic computer components.)
31. (New) The non-transitory computer-readable storage medium of claim 21, wherein the instructions, when executed by the processor, cause the processor to further perform the method of claim 15. (This reads on an instruction to implement the abstract ideas of another claim using generic computer components.)
All dependent claims are rejected as containing the material of the claims from which they depend.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1, 11-15, 18, 20-21, and 23-31 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or, for pre-AIA applications, the applicant regards as the invention.
Claim 1 recites “A method of training a model” in the preamble. The body of claim 1 only recites steps of “pre-training” without any operations designated as training of the model. This leaves claim 1 unclear, because it is not clear whether “training” and “pre-training” should be interpreted as distinct sets of operations. If they are distinct from one another, the distinction is unclear. Claim 12 depends from claim 1 and recites “further comprising: training the model for object attribute classification based on the class label data for the object attribute classification task and the pre-trained version.” While reciting further training of a model that is pre-trained indicates separate sets of operations and an intermediate model, the language fails to provide any objective measure that would allow one of ordinary skill to distinguish between a partially trained model and a pre-trained model. This issue is compounded because the language of claim 1 describes training of the pre-trained model using “binary class attribute data” which is acquired by “judging whether each of the at least one additional object attribute is "Yes" or "No" for its corresponding class label.” That is, the claims do not clearly distinguish the data used to create the claimed “pre-trained version” from data used to train the model. Since there is no objective distinction between the claimed “pre-trained version” of the model and a trained “model,” the claims are indefinite.
Claim 1 recites “pre-training the model for object attribute classification based on the binary class attribute data related to the to-be-classified object attribute to obtain a pre-trained version of the model, wherein the model is based on a convolutional neural network, and wherein the pre-trained version is capable of obtaining object attribute binary classes according to class labels corresponding to the binary class attribute data as outputs[.]” It is not clear whether the pre-trained model creates the “object attribute binary classes according to class labels” or acquires the “object attribute binary classes according to class labels” from an external source. This issue is partly due to the unconventional use of the word “obtaining,” which appears in the context of this claim to reference an output of a partially trained model. Generally “obtain” refers to getting something from an external source (e.g., “the information was difficult to obtain” or “we obtained a copy of the original letter”). The use of “capable” compounds this issue because this language merely asserts an intended result without reciting any corresponding operations which might clarify the meaning of obtain. As such, this claim language is indefinite.
Claim 1 recites “wherein acquiring the binary class attribute data, further comprises, for each image sample: [1] determining semantic similarity between object attributes based on feature distribution of the object to be recognized, [2] determining distances between object attributes in at least one additional image region proximate to an image region of the to-be-classified object attribute and the to-be-classified object attribute, [3] determining at least one of an object attribute whose semantic similarity to the to-be-classified object attribute is equal to or smaller than a similarity threshold or an object attribute whose distance from the to-be-classified object attribute is equal to or smaller than a distance threshold, as at least one additional object attribute associated with the to-be-classified object attribute, and judging whether each of the at least one additional object attribute is "Yes" or "No" for its corresponding class label, wherein the binary class attribute data comprises data indicating the judging results[.]” (Numbers in hard brackets added.) This language purports to make at least three determinations. But the Specification describes all of the claimed determinations as merely determining the distance between two objects in an image. There are at least three ways this claim language could be read. See infra. All seem incorrect in some way, and none is clearly the right way of reading the claim in light of the Specification. As such, the scope of the claim is unclear. Since this language appears to be right at the point of asserted novelty, this language must be clear to move forward. The Specification explains “semantic similarity” as attributes that “have strong relevance and [sic] close relationship” but only offers the example of distance between objects on a face, where objects that are physically closer to one another would be more proximate. See Spec. p. 10, emphasis added.
(“In some embodiments, attributes being similar semantically means that the attributes have strong relevance and close relationship, for example, they can jointly constitute a feature representing an object. For example, in the case where the object is a human face and the to-be-classified attribute is an eyebrow type, the attributes which are semantically similar with the eyebrow type can include attributes that can be used to characterize the human face and usually be recognized together with the eyebrows, for example, face parts near the eyebrows, such as eyes, eye bags, and so on. The conditions regarding semantic similarity between attributes, for example, indicating which features can be considered as being semantically similar therebetween, can be set appropriately, for example, it can be set by the user empirically, or it can be set depending on the feature distribution characteristics of the object to be recognized, which will not be described in detail here.”) The Specification indicates that both semantic similarity and physical distance can be separately considered, see Spec. p. 11, but nothing in the Specification indicates any measure for semantic similarity other than distance. This results in at least three inconsistent but similarly reasonable interpretations. First (and second), consistent with the Specification, “semantic” can merely refer to a measure of physical distance. Given this interpretation for semantic, the three “determining” steps could be read as verbose ways of claiming a single determination of the distance between two facial objects, or could be interpreted as repeated determinations of distances between other facial objects. Third, the meaning of “semantic” used in LLMs can be imported into the claims, but this interpretation is unsupported.
Further, using the LLM definition of semantic would not clarify the claims because there is no objective measure of the semantic similarity between different facial features on the same face (as opposed to two mouths on different faces, which would be semantically similar using the LLM definition). In any case, the claim language is indefinite because semantic similarity is only described in the specification in terms of distance and the claims purport to recite separate operations based on semantic similarity and distance.
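To make the collapse concrete: under the Specification's distance-based reading, the claimed selection of "additional object attributes" reduces to a single threshold test over one distance measure. The following is a minimal sketch with hypothetical coordinates and names, purely illustrative of that interpretation rather than the applicant's method.

```python
def euclidean(p, q):
    # Physical distance between two attribute regions in the image plane.
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

def additional_attributes(target_name, positions, threshold):
    """Select 'additional object attributes' near the target attribute.

    If 'semantic similarity' is itself measured by distance, as the
    Specification's eyebrow/eye example suggests, the separately claimed
    similarity and distance determinations collapse into this single
    threshold test.
    """
    target_pos = positions[target_name]
    return [name for name, pos in positions.items()
            if name != target_name and euclidean(target_pos, pos) <= threshold]
```

On a hypothetical face layout, an eyebrow's nearby attributes (e.g., an eye) are selected while distant ones (e.g., a mouth) are not; no second, independent "semantic" measure enters the computation.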
All dependent claims are rejected as containing the limitations of the claims from which they depend.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 11-15, 18, 20-21, and 23-31 are rejected under 35 U.S.C. 103 as being unpatentable over Sun (WO 2021/078157, published April 2021) in view of Zhou (“Application of semantic features in face recognition,” 2008).
A method of training a model for object attribute classification in a computer vision task, wherein the object is an object to be recognized in an image or a video, and wherein the object comprises a face, and the method comprises steps of: (“The attribute recognition technical solution of the existing image is mainly based on a traditional machine learning attribute recognition scheme and a convolutional neural network model-based attribute recognition scheme.” Sun ¶4. Further, note that this language is written as an intended use of a model “for object attribute classification . . .”) acquiring, from image samples, binary class attribute data related to a to-be-classified object attribute on which an object attribute classification task is to be performed, (“S101: Obtaining image data to be processed. . . . The image data may be an offline image file that has been downloaded in the electronic device in advance, or an online image file. In this embodiment of this application, the image data may be online image data, for example, may be an image acquired in real time.” Sun ¶¶39-40.) wherein the object attribute is related to a part of the face, (“Since each attribute label in the image data corresponds to one position in the image, taking the face image as an example, the position within the image corresponding to the attribute label of the color of the hair is the hair position, and the position within the image data corresponding to the attribute label of the eye color is the eye position, so that the position corresponding to each attribute label may be predetermined. In some implementations, a region identifier corresponding to each attribute tag may be set. Taking the face image as an example, the area identifier may be different areas in the face such as eyes, nose, and hair.” Sun ¶70.) 
wherein acquiring the binary class attribute data, comprises: determining different categories for the object attribute, determining at least one class label from each of the different categories, and for each image sample, judging whether the to-be-classified object attribute is "Yes" or "No" for each of at least one class label, wherein the binary class attribute data comprises data indicating the judging results, and (“Each specific network is used to determine an attribute tag corresponding to the image data, and attribute tags determined by the specific networks are different from each other.” Sun ¶45. “Specifically, when the specific network is learned in advance, the sample image data in the input-specific network includes a plurality of attribute tags, for example, the color of the hair in the face image is black, the white car in the vehicle image, etc. and the value of each attribute tag is 0 or 1, 0 indicates that there is no property, 1 indicates that there is no property, and the attribute tags are feature values of images preset in order to obtain an image recognition result, and the function of the specific network determines whether the image includes a preset attribute tag.” Sun ¶46. See also Sun ¶47. The typo in Sun, under which both 1 and 0 denote “there is no property,” is noted. Given the technical context and the goal of identification of facial features of Sun, based on a preponderance of the evidence, Examiner finds that one of ordinary skill in the art reading Sun would have identified the typo and inferred that a 1 is used to indicate the existence of the given property while a 0 indicates lack of existence of the property.)
wherein acquiring the binary class attribute data, further comprises, for each image sample: determining semantic similarity between object attributes based on feature distribution of the object to be recognized, (“In view of the above problems, face image ethnicity and gender recognition methods based on multi-task learning have emerged. It introduces a multi-task learning method into face image ethnicity and gender recognition, and proposes semantic-based multi-task feature selection by using different semantics as different tasks.” Sun ¶36. “[T]he sample image data in the input-specific network includes a plurality of attribute tags, for example, the color of the hair in the face image is black[.]” Sun ¶46. This teaches that semantic relationships are used for task selection, and that tasks, for example hair color, correspond to labels.
While the semantic similarity between object attributes reads on the semantic-based multi-task feature selection of the prior art, a secondary reference is used because the usage of “semantic” in the specification of this application is closer to that found in the secondary reference. The rejection over the combination applies the best available art, consistent with compact prosecution.
Zhou teaches “We propose a new face recognition strategy, which integrates the extraction of semantic features from faces with tensor subspace analysis. The semantic features consist of the eyes and mouth, plus the region outlined by the centers of the three components. A new objective function is generated to fuse the semantic and tensor models for finding similarity between a face and its counterpart in the database. Furthermore, singular value decomposition is used to solve the eigenvector problem in the tensor subspace analysis and to project the geometrical properties to the face manifold. Experimental results demonstrate that the proposed semantic feature-based face recognition algorithm has favorable performance with more accurate convergence and less computational efforts.” Zhou Abstract.
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the teaching of Zhou because using the semantic relationship between features and their locations allows recognition of facial components with more accurate convergence and less computational efforts.) determining distances between object attributes in at least one additional image region proximate to an image region of the to-be-classified object attribute and the to-be-classified object attribute, (“Taking the face image as an example, the area identifier may be different areas in the face such as eyes, nose, and hair. Specifically, nose different attribute tags may correspond to different region identifiers, or may correspond to a same region identifier. For example, the attribute label hair color and the hair length both correspond to the area identifier hair, and the attribute label pupil color corresponds to the area identifier eye. Since each pixel value in the image data corresponds to one pixel coordinate, after obtaining the area identifier corresponding to each attribute label, the position of the area identifier corresponding to each attribute label in the pixel coordinate can be determined, so that the position of the attribute label that can be recognized by each specific network in the image can be determined.” Sun ¶70. See also Sun ¶¶69-72 teaching more details of dividing face images into regions.) determining at least one of an object attribute whose semantic similarity to the to-be-classified object attribute is equal to or smaller than a similarity threshold or an object attribute whose distance from the to-be-classified object attribute is equal to or smaller than a distance threshold, as at least one additional object attribute associated with the to-be-classified object attribute, (“In some implementations, a region identifier corresponding to each attribute tag may be set.
Taking the face image as an example, the area identifier may be different areas in the face such as eyes, nose, and hair. Specifically, nose different attribute tags may correspond to different region identifiers, or may correspond to a same region identifier.” Sun ¶70.) and judging whether each of the at least one additional object attribute is "Yes" or "No" for its corresponding class label, wherein the binary class attribute data comprises data indicating the judging results; (See Sun ¶¶45-47 cited above.) and pre-training the model for object attribute classification based on the binary class attribute data related to the to-be-classified object attribute to obtain a pre-trained version of the model, (“Specifically, when the specific network is learned in advance, the sample image data in the input-specific network includes a plurality of attribute tags, for example, the color of the hair in the face image is black” Sun ¶46. “In this embodiment of this application, the specific network mainly functions to segment a target object from an image, and recognize the target object, that is, the specific network may also be a target detection network. . . . A GODET neural network is a target detection algorithm for offline training by using a convolutional neural network, which uses a CNN classification network pre-trained by existing large-scale classification datasets to extract features and recognize the features.” Sun ¶49. “Specifically, the shared network focuses on learning sharing information of all attribute tags, for example, when the attribute tag on the corner of the mouth occurs at the same time as the attribute tag of the rollover, the expressed emotion is thinking, and the correlation between the attribute tag on the corner of the mouth and the attribute tag of the rollover is identified through the shared network and obtained according to the identification result obtained by the correlation. 
That is, after the shared network is pre-trained, the correlation between the attribute tags can be identified, and the image recognition result is obtained according to the correlation.” Sun ¶51. “According to the image processing method and apparatus, the electronic device, and the storage medium provided in this embodiment of this application, a shared network and a plurality of specific networks are pre-trained, each of the specific networks is configured to determine an attribute tag corresponding to the image data, and attribute tags determined by each of the specific networks are different from each other” Sun ¶54. In other words, each of the shared network and plurality of specific networks are pre-trained: the specific networks for the attribute tags associated with image data, and the shared network for finding correlations between those tags.) wherein the model is based on a convolutional neural network, (“Each specific network and the shared network have the same network structure, and both contain five convolutional layers and two fully connected layers. At the same time, each convolutional layer and the fully connected layer are followed by a normalization layer and a ReLU (Rectified Linear Unit) activation layer.” Sun ¶116.) and wherein the pre-trained version is capable of obtaining object attribute binary classes according to class labels corresponding to the binary class attribute data as outputs, (This is written as an intended result (of pre-training). An intended result that does not require steps to be performed or limit to a particular structure is not given patentable weight. See MPEP § 2111.04.) wherein the pre-training the model comprises extracting object attribute features from the image samples, and optimizing weights for parameters of the model (See Sun ¶¶116-125 explaining the training of a neural network using gradient descent and a specific loss function. 
One of ordinary skill in the art in the area of machine learning would understand this teaching as referring to iteratively updating weights and biases in a model. “[I]n considering the disclosure of a reference, it is proper to take into account not only specific teachings of the reference but also the inferences which one skilled in the art would reasonably be expected to draw therefrom.” MPEP § 2144.01.) based on the extracted object attribute features and the binary class attribute data by using a cross entropy loss function. (See Sun ¶¶124-125 showing detail of a cross-entropy function. “[T]he present method uses cross entropy as a loss function for training, and the function serves as a standard for measuring the cross entropy between the target and the output, and the formula is as follows[.]” Sun ¶124.)
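As a point of reference only (the specific formula set out in Sun ¶124 is not restated in this action), the conventional binary cross-entropy loss for 0-or-1 attribute tags of the kind described in Sun ¶46 may be sketched as follows; the function name, array shapes, and clipping constant below are illustrative and are not drawn from the reference:

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    # Conventional binary cross-entropy over 0/1 attribute labels.
    # Clip predictions away from 0 and 1 so log() stays finite.
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(y_pred)
                          + (1.0 - y_true) * np.log(1.0 - y_pred)))
```

In the framework quoted above, the 0-or-1 attribute tags (Sun ¶46) would play the role of `y_true` and the classifier outputs in (0, 1) the role of `y_pred`; gradient descent (Sun ¶¶116-125) then iteratively updates the network weights to reduce this loss.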
11. The method of claim 1, wherein the pre-trained version comprises a convolutional neural network model, a fully connected layer and binary class attribute classifiers corresponding to the class labels of the binary class attribute data one by one which are arranged sequentially. (“Each specific network and the shared network have the same network structure, and both contain five convolutional layers and two fully connected layers. At the same time, each convolutional layer and the fully connected layer are followed by a normalization layer and a ReLU (Rectified Linear Unit) activation layer.” Sun ¶116. “each of the specific networks is capable of identifying at least one attribute tag, and the attribute tags that can be identified by each of the specific networks are different from each other.” Sun ¶136. “for example, the color of the hair in the face image is black, the white car in the vehicle image, etc. and the value of each attribute tag is 0 or 1, 0 indicates that there is no property, 1 indicates that there is no property, and the attribute tags are feature values of images preset in order to obtain an image recognition result, and the function of the specific network determines whether the image includes a preset attribute tag” Sun ¶46.)
12. The method of claim 1, further comprising: training the model for object attribute classification based on the class label data for the object attribute classification task and the pre-trained version. (“In addition, before performing S102 or S204, the specific network and the shared network need to be trained. Specifically, the training process may be before S101 and before S102 or after S201 and before S204, or before S101 and S201. In this embodiment of this application, the specific network and the shared network may be first trained before performing the image recognition method.” Sun ¶80.)
13. The method of claim 12, wherein the trained model comprises a convolutional neural network model and a multi-class fully connected layer corresponding to the class labels for the object attribute classification task which are arranged sequentially. (“Each specific network and the shared network have the same network structure, and both contain five convolutional layers and two fully connected layers. At the same time, each convolutional layer and the fully connected layer are followed by a normalization layer and a ReLU (Rectified Linear Unit) activation layer.” Sun ¶116. “each of the specific networks is capable of identifying at least one attribute tag, and the attribute tags that can be identified by each of the specific networks are different from each other.” Sun ¶136. “for example, the color of the hair in the face image is black, the white car in the vehicle image, etc. and the value of each attribute tag is 0 or 1, 0 indicates that there is no property, 1 indicates that there is no property, and the attribute tags are feature values of images preset in order to obtain an image recognition result, and the function of the specific network determines whether the image includes a preset attribute tag” Sun ¶46. Note that the claimed “multi-class” fully connected layer reads on the fully connected layer of the “shared network” in the reference.)
14. The method of claim 1, wherein the at least one class label comprises class labels of coarse classes which are greatly different from each other. (“Each specific network is used to determine an attribute tag corresponding to the image data, and attribute tags determined by the specific networks are different from each other.” Sun ¶45. “Specifically, when the specific network is learned in advance, the sample image data in the input-specific network includes a plurality of attribute tags, for example, the color of the hair in the face image is black, the white car in the vehicle image, etc. and the value of each attribute tag is 0 or 1, 0 indicates that there is no property, 1 indicates that there is no property, and the attribute tags are feature values of images preset in order to obtain an image recognition result, and the function of the specific network determines whether the image includes a preset attribute tag.” Sun ¶46. See also Sun ¶47. “In some implementations, a region identifier corresponding to each attribute tag may be set. Taking the face image as an example, the area identifier may be different areas in the face such as eyes, nose, and hair. Specifically, nose different attribute tags may correspond to different region identifiers, or may correspond to a same region identifier. For example, the attribute label hair color and the hair length both correspond to the area identifier hair, and the attribute label pupil color corresponds to the area identifier eye.” Sun ¶70. The coarse classes read on area identifiers, while attribute labels read on labels of fine classes. Further, note that claims 14 and 15 do not depend from one another, so the fine and coarse labels can also both read on attribute labels.)
15. The method of claim 1, wherein the class labels involved in the object attribute classification task include class labels of fine class. (See rejection of claim 14.)
18. An electronic device comprising: a memory; and a processor coupled to the memory, the memory having stored therein instructions that, when executed by the processor, (See Sun ¶11. The claimed “instructions” read on the “programs” in the reference.) cause the electronic device to perform the method of claim 1. (See rejection of claim 1.)
20. The electronic device of claim 18, (See rejection of claim 18.) wherein the instructions, when executed by the processor, cause the electronic device to further perform the method of claim 12. (See rejection of claim 12.)
21. A non-transitory computer-readable storage medium having stored therein instructions that, when executed by a processor, (See Sun ¶11. The claimed “instructions” read on the “programs” in the reference.) cause the processor to perform the method of claim 1. (See rejection of claim 1.)
For rejection of claim 23, see rejections of claims 12 and 21.
For rejection of claim 24, see rejections of claims 11 and 18.
For rejection of claim 25, see rejections of claims 13 and 20.
For rejection of claim 26, see rejections of claims 11 and 18.
For rejection of claim 27, see rejections of claims 11 and 18.
For rejection of claim 28, see rejections of claims 11 and 21.
For rejection of claim 29, see rejections of claims 13 and 23.
For rejection of claim 30, see rejections of claims 14 and 21.
For rejection of claim 31, see rejections of claims 15 and 21.
Response to Arguments
Applicant's arguments filed 11/07/2025 have been fully considered but they are not persuasive.
Rejections under 35 U.S.C. § 101:
Applicant asserts that pre-training improves models and posits that merely reciting pre-training in the claims should result in a technical improvement. Rem. 8-9. Absent from this argument is any technical explanation, or any specific operations that would distinguish the claimed pre-training from generic training of a machine learning model. Further, nothing in the Remarks distinguishes training of a model for a task on a computer from an instruction to implement a mental process using conventional computer components. As such, there does not appear to be any basis for a finding that pre-training would constitute a technical solution to a technical problem. While omitting any characterization of the claimed subject matter may be strategically sound, prosecution may be advanced by a clear articulation of some inventive concept supported by the Specification including specific operations which result in a technical solution to a technical problem.
Rejections under 35 U.S.C. § 112:
The amendments overcome the rejection under 35 U.S.C. § 112 from the previous Office Action.
Rejections under 35 U.S.C. § 103:
The Remarks fail to clearly articulate any specific distinctions between the art of record and the claimed subject matter. Applicant cites long sections of claim 1 while asserting various improvements over the prior art, then summarizes long sections from Chen and asserts that the combination of claimed subject matter is non-obvious in view of the reference. But the Remarks fail to explicitly identify the exact combination that is non-obvious or to articulate why Chen is different from the claimed subject matter as a whole. While omitting any characterization of the claimed subject matter may be strategically sound, prosecution may be advanced by a clear articulation of some inventive concept supported by the Specification which distinguishes the claims from the art of record.
As best understood, Applicant asserts that paragraph 206 of Chen is inconsistent with the claim language. The Chen patent does not include numbered paragraphs. Further, the language from the reference does not appear to have been cited in the most recent action for teaching the claimed subject matter.
Applicant states that Chen and Lampert fail to teach the newly claimed subject matter. The claims are rejected over a new combination of art. See rejection above.
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PAUL M KNIGHT whose telephone number is (571) 272-8646. The examiner can normally be reached Monday - Friday 9-5 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michelle Bechtold, can be reached on (571. The fax phone number for the organization where this application or proceeding is assigned is (571) 273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
PAUL M. KNIGHT
/PAUL M KNIGHT/Examiner, Art Unit 2148
1 This distinction between claims which read on math and claims which recite an abstract idea is based on official USPTO Guidance. The 2019 Subject Matter Eligibility (SME) Examples instruct examiners that a claim reciting “training the neural network” where the background describes training as “using stochastic learning with backpropagation which is a type of machine learning algorithm that uses the gradient of a mathematical loss function to adjust the weights of the network” “does not recite any mathematical relationships, formulas, or calculations.” See 2019 SME Example 39, pp. 8-9 (emphasis added). In this example, the plain meaning of “training the neural network” read in light of the disclosure reads on backpropagation using the gradient of a mathematical loss function. See MPEP § 2111.01. In contrast, the 2024 SME Examples instruct examiners that a claim reciting “training, by the computer, the ANN . . . wherein the selected training algorithm includes a backpropagation algorithm and a gradient descent algorithm” does recite an abstract idea because “[t]he plain meaning of [backpropagation algorithm and gradient descent algorithm] are optimization algorithms, which compute neural network parameters using a series of mathematical calculations.” 2024 PEG Example 47, pp. 4-6. The Memorandum of August 4, 2025, Reminders on evaluating subject matter eligibility of claims under 35 U.S.C. 101, p. 3 also directs examiners that “training the neural network” recited in Example 39 merely “involve[s] . . . mathematical concepts” and contrasts claim 2 of example 47 as “referring to [specific] mathematical calculations by name[.]” (Emphasis added.)
2 “For instance, the claims in Diehr . . . clearly stated a mathematical equation . . . and the claims in Mayo . . . clearly stated laws of nature . . . such that the claims ‘set forth’ an identifiable judicial exception. Alternatively, the claims in Alice Corp. . . . described the concept of intermediated settlement without ever explicitly using the words ‘intermediated’ or ‘settlement.’” MPEP § 2106.04(II)(A).
3 “By grouping the abstract ideas, the examiners’ focus has been shifted from relying on individual cases to generally applying the wide body of case law spanning all technologies and claim types. . . . If the identified limitation(s) falls within at least one of the groupings of abstract ideas, it is reasonable to conclude that the claim recites an abstract idea in Step 2A Prong One.” MPEP § 2106.04(a). See also MPEP 2104(a)(2).
4 Step 2A prongs one and two are evaluated individually, consistent with the framework in the MPEP. Evaluation of relationships between abstract ideas and additional elements in one location promotes clarity of the record.
5 “In short, first the specification should be evaluated to determine if the disclosure provides sufficient details such that one of ordinary skill in the art would recognize the claimed invention as providing an improvement. The specification need not explicitly set forth the improvement, but it must describe the invention such that the improvement would be apparent to one of ordinary skill in the art. Conversely, if the specification explicitly sets forth an improvement but in a conclusory manner (i.e., a bare assertion of an improvement without the detail necessary to be apparent to a person of ordinary skill in the art), the examiner should not determine the claim improves technology. Second, if the specification sets forth an improvement in technology, the claim must be evaluated to ensure that the claim itself reflects the disclosed improvement. That is, the claim includes the components or steps of the invention that provide the improvement described in the specification. . . . It should be noted that while this consideration is often referred to in an abbreviated manner as the ‘improvements consideration,’ the word ‘improvements’ in the context of this consideration is limited to improvements to the functioning of a computer or any other technology/technical field, whether in Step 2A Prong Two or in Step 2B.” MPEP 2106.04(d)(1). See also Koninklijke KPN N.V. v. Gemalto M2M GmbH, 942 F.3d 1143, 1150-1152 (Fed. Cir. 2019).
6 See MPEP § 2106.05(d)(II) listing operations including “receiving or transmitting data,” “storing and retrieving data in memory,” and “performing repetitive calculations” as WURC.
7 “But ‘[f]or the role of a computer in a computer-implemented invention to be deemed meaningful in the context of this analysis, it must involve more than performance of 'well-understood, routine, [and] conventional activities previously known to the industry.’ Content Extraction, 776 F.3d at 1347-48 (quoting Alice, 134 S. Ct at 2359). Here, the server simply receives data, ‘extract[s] classification information . . . from the received data,’ and ‘stor[es] the digital images . . . taking into consideration the classification information.’ See ‘295 patent, col. 10 ll. 1-17 (Claim 17). . . . These steps fall squarely within our precedent finding generic computer components insufficient to add an inventive concept to an otherwise abstract idea. Alice, 134 S. Ct. at 2360 (‘Nearly every computer will include a 'communications controller' and a 'data storage unit' capable of performing the basic calculation, storage, and transmission functions required by the method claims.’); Content Extraction, 776 F.3d at 1345, 1348 (‘storing information’ into memory, and using a computer to ‘translate the shapes on a physical page into typeface characters,’ insufficient confer patent eligibility); Mortg. Grader, 811 F.3d at 1324-25 (generic computer components such as an ‘interface,’ ‘network,’ and ‘database,’ fail to satisfy the inventive concept requirement); Intellectual Ventures I, 792 F.3d at 1368 (a ‘database’ and ‘a communication medium’ ‘are all generic computer elements’); BuySAFE v. Google, Inc., 765 F.3d 1350, 1355 (Fed. Cir. 2014) (‘That a computer receives and sends the information over a network—with no further specification—is not even arguably inventive.’).” TLI Commc'ns LLC v. AV Auto., LLC, 823 F.3d 607, 614 (Fed. Cir. 2016), Emphasis Added.
8 “The analysis as to whether an element (or combination of elements) is widely prevalent or in common use is the same as the analysis under 35 U.S.C. 112(a) as to whether an element is so well-known that it need not be described in detail in the patent specification. See Genetic Techs. Ltd. v. Merial LLC, 818 F.3d 1369, 1377, 118 USPQ2d 1541, 1546 (Fed. Cir. 2016) (supporting the position that amplification was well-understood, routine, conventional for purposes of subject matter eligibility by observing that the patentee expressly argued during prosecution of the application that amplification was a technique readily practiced by those skilled in the art to overcome the rejection of the claim under 35 U.S.C. 112, first paragraph)[.]” MPEP § 2106.05(d)(I).
9 “Similarly, claim elements or combinations of claim elements that are routine, conventional or well-understood cannot transform the claims. (Citing BSG Tech LLC v. BuySeasons, Inc., 899 F.3d 1281, 1290-1291 (Fed. Cir. 2018)). When the patent's specification ‘describes the components and features listed in the claims generically,’ it ‘support[s] the conclusion that these components and features are conventional.’ Weisner v. Google LLC, 51 F.4th 1073, 1083-84 (Fed. Cir. 2022); see also Beteiro, LLC v. DraftKings Inc., 104 F.4th 1350, 1357-58 (Fed. Cir. 2024).” Broadband iTV, Inc. v. Amazon.com, Inc., 113 F.4th 1359 (Fed. Cir. 2024)
10 “If it is asserted that the invention improves upon conventional functioning of a computer, or upon conventional technology or technological processes, a technical explanation as to how to implement the invention should be present in the specification. That is, the disclosure must provide sufficient details such that one of ordinary skill in the art would recognize the claimed invention as providing an improvement. The specification need not explicitly set forth the improvement, but it must describe the invention such that the improvement would be apparent to one of ordinary skill in the art. Conversely, if the specification explicitly sets forth an improvement but in a conclusory manner (i.e., a bare assertion of an improvement without the detail necessary to be apparent to a person of ordinary skill in the art), the examiner should not determine the claim improves technology.” MPEP § 2106.05(a).