Detailed Action
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Objections made to the specification and rejections made under 35 U.S.C. 112(b) are withdrawn.
Applicant’s arguments with respect to claims 1-16 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
[Applicant's arguments reproduced as images in the original Office action; image content not recoverable.]
On pages 12-14, Applicant argues,
[Applicant's arguments from pages 12-14 reproduced as images in the original Office action; image content not recoverable.]
Examiner respectfully disagrees. Claim 1, as drafted, is directed to an abstract idea of a mental process for performing a comparative visual search of images. The limitations of the claim recite steps of organizing information that can be performed entirely by the human mind. For example, a person can mentally observe a reference image to detect key features (e.g., limb position), then mentally filter a set of images to identify those sharing similar features. This process of image observation, comparison, and refinement includes simple decision making that the human mind is capable of performing. Therefore, claim 1 recites an abstract idea in the "mental process" grouping under Step 2A, Prong One.
Examiner notes the additional features of amended claim 1 recite limitations that correspond to the same mental process discussed above. The step of “receive a first input for changing a second extraction condition” is analogous to a person receiving an input, such as verbal instructions. The subsequent step “extracting a second reference image whose relationship with the target image satisfies the second extraction condition, from among the first set of reference images, based on the detected keypoints” is simply a repetition of the same mental filtering process described above, but based on a new condition. Therefore, these additional features do not add anything beyond the abstract idea.
Under Step 2A, Prong Two, the additional elements of claim 1, individually and in combination, do not integrate the exception into a practical application. The claim recites “at least one memory configured to store one or more instructions; and at least one processor configured to execute the one or more instructions” and “display apparatus.” These elements are merely generic computer components which perform generic computer functions such as detection and extraction. The claim does not specify how these components interact in any specific way to solve a technical problem.
The claim further recites “acquire a target image” and “display the second reference image on a display apparatus”. These limitations amount to insignificant extra-solution activity and therefore do not impose a meaningful limit on the judicial exception.
Viewed as a whole, claim 1 provides mere instructions to implement the abstract idea on a generic computer (see MPEP 2106.05(f)). The claim does not specify a particular data structure for the keypoints or a specific algorithm for the detection or filtering process that would improve the functioning of the computer itself. Therefore, the additional elements do not integrate the judicial exception into a practical application.
Under Step 2B, claim 1 does not provide an inventive concept. The combination of the abstract idea with the additional elements discussed above does not transform the claim into patent-eligible subject matter, because these elements recite steps that are well-understood, routine, and conventional in the field of computer vision and image processing. The functions of “detecting a plurality of keypoints…”, “extract a first set of images…”, “receive a first input…”, “extract a second reference image…”, and “display the second reference image…” are all standard computer-implemented steps. The claim merely applies these conventional steps to implement the abstract idea.
The claim limitations are recited at a high level of generality and do not specify any particular algorithm, data structure, or technical solution that would improve the functioning of the computer. Simply implementing the abstract idea using a generic computer which performs well-understood steps is insufficient to provide an inventive concept (see MPEP 2106.05(i)). Therefore, the claim is directed to a patent-ineligible abstract idea and is rejected under 35 U.S.C. 101.
Claim Objections
Claim 4 is objected to because of the following informalities:
“a condition that a predetermined number or more of the keypoints” in line 3 should read “a condition that a predetermined number
“the second degree is detected” in lines 7 and 8 should read “the second degree of similarity is detected”.
Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 3-6, 8-9, and 14-16 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 3-6, 8-9, and 14-16 recite “the keypoints”, which lacks antecedent basis. It is unclear whether the claim limitation is meant to refer to the previously recited “the detected keypoints” of independent claims 1, 12, and 13, or if it is meant to introduce a new element. For examination purposes, the claims will be interpreted as follows:
instances of “the keypoints” in claims 3-6 and 8-9 will be interpreted as “the detected keypoints”.
“detecting the keypoints of the human body included in the target image” in claims 14-16 will be interpreted as “detecting the plurality of keypoints of the human body included in the target image”.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-6 and 11-13 are rejected under 35 U.S.C. 101.
Claim 1 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea of detecting keypoints of a human body from an image for comparison with and selection of reference images, without significantly more.
The claim recites: “An image processing system comprising: at least one memory configured to store one or more instructions; and at least one processor configured to execute the one or more instructions to: acquire a target image; perform processing of detecting a plurality of keypoints of a human body included in the target image; extract a first set of reference images whose relationship with the target image satisfies a first extraction condition, from among a plurality of reference images, based on the detected keypoints; receive a first input for changing a second extraction condition; extract a second reference image whose relationship with the target image satisfies the second extraction condition, from among the first set of reference images, based on the detected keypoints; and display the second reference image on a display apparatus.”
The limitations, as drafted, are processes that, under their broadest reasonable interpretation, cover performance of the limitations in the human mind. A person can mentally observe an image to detect keypoints of a human body (e.g., the position of limbs or body parts) and mentally filter a set of images by applying conditions to identify features shared with the detected keypoints. The person could then receive a new instruction to change the search condition, and re-evaluate the set of images to find a new matching image.
The judicial exception is not integrated into a practical application. For example, the claim recites the additional elements “acquire a target image” and “display the second reference image on a display apparatus”, which can reasonably be interpreted as merely data gathering/output steps required to perform the method. Therefore, these additional elements do not add a meaningful limitation to the method, as they are insignificant extra-solution activity. Further, the claim recites the additional element “An image processing system comprising: at least one memory configured to store one or more instructions; and at least one processor configured to execute the one or more instructions”. This is recited at a level of generality that amounts to a generic device. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements are recited at a high level of generality. The claim is therefore directed to a judicial exception that is not integrated into a practical application and does not include additional elements that are sufficient to amount to significantly more than the judicial exception. This claim is not patent eligible.
Claim 2 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea of extracting images based on a degree of similarity being equal to or more than reference values. A person can set a reference value for each position of a limb or body part in the image for pose comparison.
Claim 3 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea of computing different extraction conditions based on a number and kind of keypoints. A person can consider different numbers and/or kinds of keypoints between the first evaluation of the images and the reevaluation of the images.
Claim 4 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea of extracting images based on the second extraction condition being that a predetermined number of keypoints are detected. A person can mentally filter a set of images by applying a condition that a predetermined number of keypoints are detected for comparison.
Claim 5 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea of setting different weights for each keypoint between the first and second computation methods. A person can mentally judge the position of a particular limb or body part as being more important than others for comparison when evaluating the images. These weights can be changed for each reevaluation (e.g., giving weight to different limbs or body parts) of the images.
Claim 6 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea of setting different weights for each keypoint between the first and second computation methods. A person can consider all limbs to be of the same importance in one evaluation and assign different weights to each limb in the reevaluation.
Claim 11 is rejected under 35 U.S.C. 101 because the claimed invention is directed to the additional elements of a server and a client terminal for transmitting images. The server and client terminal are recited generically such that they do not provide any meaningful limitations on performing the abstract idea. This claim is not patent eligible.
Claims 12 and 13 contain elements analogous to those of claim 1, with the addition of a “non-transitory storage medium”. The additional elements can be reasonably interpreted as merely using a generic computer as a tool to implement the abstract idea. Implementing an abstract idea on a generic computer does not integrate a judicial exception into a practical application. Thus, claims 12 and 13 are similarly rejected under 35 U.S.C. 101.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-4, 7, and 11-16 are rejected under 35 U.S.C. 103 as being unpatentable over Xie et al. (US 20220138249 A1) (hereinafter Xie) in view of Fey et al. (US 20150370833 A1) (hereinafter Fey).
Regarding claim 1, Xie teaches an image processing system comprising: at least one memory configured to store one or more instructions; and at least one processor (Xie, “Embodiments of the present disclosure may comprise or utilize a special purpose or general-purpose computer including computer hardware, such as, for example, one or more processors and system memory…”, pg. 18, paragraph 0180, lines 1-4, see Fig. 17, processor 1702 and memory 1704) configured to execute the one or more instructions to:
acquire a target image; perform processing of detecting a plurality of keypoints of a human body included in the target image (Xie, “As illustrated in FIG. 2, the pose search system 102 receives a digital image query that includes a digital image 202 together with a keyword query 204. For example, the pose search system 102 receives an indication of user input via the client device 108 to input the keyword query 204 and/or to select (or upload) the digital image 202. In response to receiving the digital image 202, the pose search system 102 determines a query pose of the digital image 202. Particularly, the pose search system 102 determines locations or arrangements of joints and segments associated with a human figure shown in the digital image 202. In certain cases, the pose search system 102 utilizes a pose neural network to process the digital image 202 and identify locations for joints and/or segments.”, pg. 6, paragraphs 0062 and 0063, To initiate a search, a user can select or upload a target digital image. The system then processes the image using a pose neural network to detect joint and limb positions corresponding to a human figure of the image. This detection is used to define a query pose.);
extract a first set of reference images whose relationship with the target image satisfies a first extraction condition, from among a plurality of reference images, based on the detected keypoints (Xie, “Based on the query pose of the digital image 202, the pose search system identifies digital images depicting human figures corresponding to the query pose from a digital image repository… In some embodiments, the pose search system 102 utilizes a pose neural network to generate candidate digital poses for stored digital images. In these or embodiments, the pose search system 102 compares candidate digital poses with the query pose to identify those candidate digital poses within a threshold similarity of the query pose. For instance, the pose search system 120 compares pose vectors for the query pose and the candidate digital poses to identify pose vectors within a threshold distance of the query pose in a pose feature space.”, pg. 6, paragraph 0065, The query pose extracted from the initial digital image is used to identify a set of similar images from a repository. This set of images is determined based on the query pose satisfying pose vector thresholds as compared to the stored images.);
receive a first input for changing a second extraction condition; extract a second reference image whose relationship with the target image satisfies the second extraction condition; and display the second reference image on a display apparatus (Xie, “In further response to receiving the digital image 202, the pose search system 102 generates a virtual mannequin 206. More specifically, based on determining the query pose for the digital image 202, the pose search system 102 generates the virtual mannequin 206 by arranging joints and segments according to the query pose.”, pg. 6, paragraph 0066, 1-6, “As also shown in FIG. 2, the pose search system 102 also (or alternatively) receives user interaction to modify or manipulate the virtual mannequin 206. Indeed, the pose search system 102 generates a modified virtual mannequin 212 based on user interaction to manipulate one or more joints or segments to change the pose of the virtual mannequin. In response to detecting the modification of the virtual mannequin 206, the pose search system 102 determines a modified query pose of the modified virtual mannequin 212. From the modified query pose, the pose search system 102 further identifies digital images depicting human figures in ( or within a threshold similarity of) the modified query pose”, pg. 7, paragraph 0071, lines 1-13, “Similarly, in one or more embodiments the pose search system 102 utilizes a digital image and query pose to identify an initial set of digital images and a virtual mannequin to with a query pose to identify a focused set of digital images.”, pg. 7, paragraph 0072, 9-13, see Fig. 2, After generating an initial set of images based on the query pose from the digital image, the system can further refine the results through user input. Specifically, a virtual mannequin depicting the pose of the initial digital image can be displayed to a user for modifications. 
The user is able to modify this mannequin to specify a particular pose to be searched, producing a focused set of images that fall within a threshold similarity of the modified mannequin. The focused set of images is displayed to the user as the final search results.).
Xie does not teach extract a second reference image whose relationship with the target image satisfies the second extraction condition, from among the first set of reference images, based on the detected keypoints.
However, Fey teaches extract a second reference image whose relationship with the target image satisfies the second extraction condition, from among the first set of reference images, based on the detected keypoints (Fey, “Image search results that are responsive to an initial search query (“initial query”) are presented in a results portion of an image results page. The image results page can also include a suggestion portion in which image query suggestions are presented. An image query suggestion specifies a refined search query (“refined query”) and includes an image representative of the image search results that are responsive to the refined query… The user can view additional results of the refined query without leaving the results page for the initial query (e.g., without initiating a request for another web page). For example, interaction with the image query suggestion can cause presentation of a preview window in which a set of image search results that are referenced by the image search results for the initial query, and are also responsive to the refined query.”, pg. 2, paragraphs 0021-0022, “In some implementations, the subset of image search results for the initial query that are also responsive to the refined query can be identified asynchronously relative to the presentation of the image search results. For example, when the initial search query is received, the search system can identify the image search results that are responsive to the initial query and provide those image search results and/or the image query suggestions to the user device irrespective of whether the images that are also responsive to the refined queries have been identified. The search system can separately perform a search of the images referenced by the image search results for images that are responsive to the refined query, and asynchronously provide data identifying those image search results for the initial search query that are also responsive to each of the refined queries specified by the image query suggestions.”, pg. 4, paragraph 0049, A first set of images is identified corresponding to an initial search query. The user then selects a refined query to narrow down the search results of the initial search query, identifying a subset of images within the first set of images which are responsive to both the initial and refined queries.).
Xie teaches a user-guided image search including detecting a pose from an initial image and applying similarity thresholding to identify a first set of images as search results (Xie, pg. 6, paragraphs 0062-0065). Xie further teaches providing a virtual mannequin to be modified by the user, and applying similarity thresholding to identify a second set of images in an updated query response (Xie, pg. 6, paragraph 0066, pg. 7, paragraph 0071, lines 1-13, see Figs. 6A and 6B). Xie does not teach identifying the second set of images from among the first set of images. Fey teaches an image search process including a user-selected refined query to extract a subset of images directly from the results of a prior query (see above). Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to have modified the virtual mannequin of Xie to serve as a refined query for extracting images from an initial set as taught by Fey (Fey, pg. 2, paragraphs 0021-0022, pg. 4, paragraph 0049). The motivation for doing so would have been to refine search results from the initial query rather than repeatedly accessing and comparing the entire database for each user modification, thereby reducing the computational load of the image search. Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine the teachings of Xie with Fey to obtain the invention as specified in claim 1.
Regarding claim 2, Xie in view of Fey teaches the image processing system according to claim 1, wherein the first extraction condition is a condition that a degree of similarity, computed by a first computation method, of a pose of a human body included in an image is equal to or more than a first reference value, and the second extraction condition is a condition that a degree of similarity, computed by a second computation method, of a pose of a human body included in an image is equal to or more than a second reference value (Xie, “In these or embodiments, the pose search system 102 compares candidate digital poses with the query pose to identify those candidate digital poses within a threshold similarity of the query pose. For instance, the pose search system 120 compares pose vectors for the query pose and the candidate digital poses to identify pose vectors within a threshold distance of the query pose in a pose feature space. In addition, the pose search system 102 provides the identified digital images for display on the client device 108.”, pg. 6, paragraph 0065, lines 8-19, “In response to detecting the modification of the virtual mannequin 206, the pose search system 102 determines a modified query pose of the modified virtual mannequin 212. From the modified query pose, the pose search system 102 further identifies digital images depicting human figures in ( or within a threshold similarity of) the modified query pose.”, pg. 7, paragraph 0071, lines 7-12, The system applies reference thresholds to compare a degree of similarity between input query poses and a repository of images. This thresholding is applied at each comparison, such as with the query pose detected from the input image and with the modified query pose produced as a result of a user modifying the virtual mannequin.).
Regarding claim 7, Xie in view of Fey teaches the image processing system according to claim 1, wherein the processor is further configured to execute the one or more instructions to:
receive a second input for changing the second extraction condition; newly extract, in response to reception of an input for changing the second extraction condition, the second reference image whose relationship with the target image satisfies the second extraction condition after a change, from among the first set of reference images, and change a content to be displayed on the display apparatus from the second reference image that satisfies the second extraction condition before a change to the second reference image that satisfies the second extraction condition after a change. (Xie, “As illustrated in FIG. 14, the pose search system 102 includes a virtual mannequin manager 1404. In particular, the virtual mannequin manager 1404 manages, maintains, generates, accesses, modifies, manipulates, or updates a virtual mannequin... the virtual mannequin manager 1404 modifies a virtual mannequin (two-dimensional or three-dimensional) based on user interaction to manipulate a joint, a segment, or a viewing angle.”, pg. 16, paragraph 0155, Once the virtual mannequin is generated, a user is able to manipulate its joints or segments to define the modified query pose for threshold comparison. Each modification triggers a new search, replacing the previous results. This system supports an open-ended process, allowing repeated updates to the mannequin, with each iteration producing an updated set of matching images which are displayed to the client.)
Regarding claim 11, Xie in view of Fey teaches the image processing system according to claim 1, further comprising: a server; and a client terminal, wherein the server extracts the first set of reference images and, transmits the extracted first set of reference images to the client terminal, and the client terminal extracts the second reference image from among the first set of reference images received from the server (Xie, “Indeed, the pose search system generates and provides a query response for display on a client device based on receiving a digital image query that includes a keyword query and/or a query pose (e.g., determined from a digital image or a virtual mannequin).”, pg. 5, paragraph 0049, lines 9-13, Each query pose is processed by the server to identify matching images, and those images are transmitted to the client for display.).
Claim 12 corresponds to claim 1, reciting a processing method comprising steps corresponding to the functions of the image processing system of claim 1. As indicated in the analysis of claim 1, Xie in view of Fey teaches all the limitations of claim 1 (see analysis of claim 1). Therefore, claim 12 is rejected for the same reasons of obviousness as claim 1.
Claim 13 corresponds to claim 1, additionally reciting a non-transitory storage medium storing a program causing a computer to perform steps corresponding to the functions of the image processing system of claim 1. Xie in view of Fey teaches the addition of a non-transitory storage medium (Xie, “The storage manager 1410 (e.g. via a non-transitory computer memory/one or more memory devices) stores and maintain data...”, pg. 16, paragraph 0158, lines 7-12). As indicated in the analysis of claim 1, Xie in view of Fey teaches all the limitations of claim 1. Therefore, claim 13 is rejected for the same reasons of obviousness as claim 1.
Regarding claim 14, Xie in view of Fey teaches the image processing system according to claim 1, wherein the image processing system includes a server and a client terminal, the server performs processing of detecting the keypoints of the human body included in the target image, extracts the first set of reference images whose relationship with the target image satisfies a first extraction condition, and transmits the extracted first set of reference images to the client terminal (Xie, “As shown in FIG. 1, the server(s) 104 also includes the pose search system 102 as part of a digital content management system 106. The digital content management system 106 communicates with the client device 108 to perform various functions associated with the client application 110 such as storing and managing a repository of digital images, determining or accessing labels for digital content depicted within the digital images, and retrieving digital images based on a digital image query.”, pg. 6, paragraph 0059, lines 1-9, see fig. 1, The system includes a client device in communication with a server. The client device’s primary role is to capture user input, such as query images and mannequin modifications, and to display search results. The server’s primary role is to perform computationally intensive tasks, such as keypoint pose detection and similarity thresholding, in order to provide a set of images as a search result back to the client device for display.), and the client terminal extracts the second reference image whose relationship with the target image satisfies the second extraction condition (Xie, “Although FIG. 1 illustrates a particular arrangement of the environment, in some embodiments, the environment has a different arrangement of components and/or may have a different number or set of components altogether.
For instance, in some embodiments, the pose search system 102 is implemented by (e.g., located entirely or in part on) the client device 108 and/or a third-party device. In addition, in one or more embodiments, the client device 108 communicates directly with the pose search system 102, bypassing the network 112.”, pg. 6, paragraph 0060, lines 1-10, Embodiments include the implementation of keypoint pose estimation and similarity thresholding directly on the client device itself for extracting images as search results.).
Claim 15 corresponds to claim 14, reciting a processing method comprising steps corresponding to the functions of the image processing system of claim 14. As indicated in the analysis of claims 12 and 14, Xie in view of Fey teaches all the limitations of claim 14. Therefore, claim 15 is rejected for the same reasons of obviousness as claim 14.
Claim 16 corresponds to claim 14, additionally reciting a non-transitory storage medium storing a program causing a computer to perform steps corresponding to the functions of the image processing system of claim 14 (see analysis of claim 13). As indicated in the analysis of claim 14, Xie in view of Fey teaches all the limitations of claim 14. Therefore, claim 16 is rejected for the same reasons of obviousness as claim 14.
Claims 3-6, 8, and 9 are rejected under 35 U.S.C. 103 as being unpatentable over Xie et al. (US 20220138249 A1) in view of Fey et al. (US 20150370833 A1) and further in view of Yoshida et al. (WO 2021084677 A1), (hereinafter, Yoshida).
Regarding claim 3, Xie in view of Fey teaches the image processing system according to claim 2, wherein, the first computation method computes a first degree of similarity of a pose of a human body based on the detected keypoints, the second computation method computes a second degree of similarity of a pose of a human body based on the detected keypoints (Xie, “The pose search system 102 thus selects, for including in a query response, those digital images whose pose vectors are within a threshold similarity (e.g., distance) of the pose vector of the query pose and whose label vectors are within a threshold similarity (e.g., distance) of the keyword vector. In certain embodiments, the pose search system 102 utilizes a pose neural network to generate and compare the pose vectors.”, pg. 8, paragraph 0081, lines 11-18, For each image search and pose comparison, thresholding is applied both to the keypoints detected from the query pose and to those of the modified query pose.).
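The thresholding operation quoted from Xie above may be expressed as a short sketch. This is the examiner's hypothetical illustration only: Xie generates pose vectors with a pose neural network, which is abstracted away here, and the use of Euclidean distance as the similarity measure is an assumption suggested by Xie's parenthetical “(e.g., distance)”.

```python
import math

def within_threshold(query_vec, candidate_vecs, tau):
    # Keep the candidate pose vectors whose Euclidean distance to the
    # query pose vector is within the threshold tau, mirroring "pose
    # vectors ... within a threshold similarity (e.g., distance)".
    return [vec for vec in candidate_vecs
            if math.dist(query_vec, vec) <= tau]
```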
Xie in view of Fey does not teach wherein the first computation method and the second computation method differ from each other in that: a number of the keypoints referred to for computing the first degree of similarity and the second degree of similarity is different, a kind of the keypoints referred to for computing the first degree of similarity and the second degree of similarity is different, or both a number and a kind of the keypoints referred to for computing the first degree of similarity and the second degree of similarity are different.
However, Yoshida teaches wherein the first computation method and the second computation method differ from each other in that: a number of the keypoints referred to for computing the first degree of similarity and the second degree of similarity is different, a kind of the keypoints referred to for computing the first degree of similarity and the second degree of similarity is different, or both a number and a kind of the keypoints referred to for computing the first degree of similarity and the second degree of similarity are different (Yoshida, “In the present embodiment, as in the classification method, various search methods can be used by searching based on the feature amount of the skeletal structure of the person. The search method may be preset or may be arbitrarily set by the user… (Search method 2) Partial search When a part of a person's body is hidden in an image, a search is performed using only the information of the recognizable part. For example, as in the skeletal structures 511 and 512 of FIG. 14, even if the key points of the left foot cannot be detected due to the hiding of the left foot, the features of the other key points that have been detected can be used for the search. Therefore, in the skeletal structures 511 and 512, it can be determined that the postures are the same at the time of searching (at the time of classification). That is, it is possible to perform classification and search using the features of some key points instead of all the key points. In the examples of the skeletal structures 521 and 522 in FIG. 15, although the directions of both feet are different, the feature quantities of the key points of the upper body (A1, A2, A31, A32, A41, A42, A51, A52) are used as the search query. Therefore, it can be judged that the posture is the same. Further, the portion (feature point) to be searched may be weighted and searched, or the threshold value for determining the similarity may be changed. 
When a part of the body is hidden, the hidden part may be ignored and the search may be performed, or the hidden part may be added to the search. By searching including the hidden part, it is possible to search for a posture in which the same part is hidden.”, pg. 15, lines 17-19 and 36-38 and pg. 16, lines 1-13, “As described above, in the present embodiment, it is possible to detect the skeletal structure of a person from a two-dimensional image and perform classification and search based on the feature amount of the detected skeletal structure. As a result, it is possible to classify by similar postures having a high degree of similarity… Since the user can specify the posture of the search query from the classification results, the desired posture can be searched even if the user does not know the posture to be searched in detail in advance. For example, since it is possible to perform classification and search on the condition of the whole or part of the skeleton structure of a person, flexible classification and search is possible.”, pg. 17, lines 27-38, A user is able to set weights for keypoints of all or part of a detected skeleton structure in an image. These weights influence the similarity calculations used for image matching, allowing the system to produce search results which emphasize or ignore specific body parts.).
Xie in view of Fey teaches an image search system that refines results based on user modification of a virtual mannequin. Specifically, a query pose is extracted from an input image and similarity thresholding is applied to produce initial search results. A user can then modify a virtual mannequin to further narrow these search results using the same similarity thresholding (Xie, “In addition, the server(s) 104 transmits data to the client device 108 to provide a query response that includes digital images portraying human figures in the query response (or in poses within a threshold similarity of the query response).”, pg. 5, paragraph 0058, lines 8-12, “Similarly, in one or more embodiments the pose search system 102 utilizes a digital image and query pose to identify an initial set of digital images and a virtual mannequin to with a query pose to identify a focused set of digital images.”, pg. 7, paragraph 072, lines 9-13). Yoshida teaches applying user-defined weights for keypoints of all or part of a detected human skeleton to produce image search results (see above). Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to have modified Xie in view of Fey by incorporating Yoshida’s keypoint weighting (Yoshida, pg. 15, lines 17-19 and 36-38 and pg. 16, lines 1-13) as an additional search condition. The motivation for doing so would have been to allow the user to control the influence of specific parts of the query pose, thereby enabling a more refined image search. In the combination of Xie in view of Fey and further in view of Yoshida, keypoint weights would be applied uniformly across the whole pose to produce the initial search results, and then adjusted for selected keypoints during the mannequin modifications to further narrow additional searches. 
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine the teachings of Xie in view of Fey with Yoshida to obtain the invention as specified in claim 3.
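The role of keypoint weights in the proposed combination may be illustrated as follows. This is the examiner's hypothetical sketch: the weighted-mean formulation and all names are assumptions, not a disclosure of Xie, Fey, or Yoshida.

```python
import math

def weighted_pose_distance(query_kps, ref_kps, weights):
    # Weighted mean of per-keypoint distances (hypothetical formulation).
    # Uniform weights reproduce a whole-pose comparison, as in the first
    # computation method; zeroing a weight ignores that body part, as in
    # Yoshida's partial search; raising a weight emphasizes that part.
    per_kp = [math.dist(q, r) for q, r in zip(query_kps, ref_kps)]
    return sum(w * d for w, d in zip(weights, per_kp)) / sum(weights)
```

In the combination as articulated above, the initial search would use uniform weights across the whole pose, and the second extraction would use weights adjusted by the user for selected keypoints.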
Regarding claim 4, Xie in view of Fey and further in view of Yoshida teaches the image processing system according to claim 3, wherein the second extraction condition includes one or both of: a condition that a predetermined number or more of the keypoints referred to for computing the second degree of similarity are detected, and a condition that a predetermined keypoint among the keypoints referred to for computing the second degree is detected (Xie, “As further shown in FIG. 7, the pose search system 102 performs an act 714 to receive a modification to the virtual mannequin. More specifically, the pose search system 102 receives user interaction selecting and moving one or more joints or segments of the virtual mannequin. Based on the manipulation of the virtual mannequin, the pose search system 102 generates a modified virtual mannequin and determines a query pose of the virtual mannequin as well. For example, the pose search system 102 determine a query pose defined by the locations of the joints and segments as modified from the user input. As also illustrated in FIG. 7, the pose search system 102 performs an act 716 to generate a query response from the modified virtual mannequin.”, pg. 11, paragraphs 0112-0113, The refined image search is conditioned according to the selection of predetermined keypoints modified by the user, from which a pose is detected and thresholding applied.).
Regarding claim 5, Xie in view of Fey and further in view of Yoshida teaches the image processing system according to claim 3, wherein, setting contents of a weight of each of the keypoints referred to for computing the first degree of similarity and the second degree of similarity are different from each other (Yoshida, “In the present embodiment, as in the classification method, various search methods can be used by searching based on the feature amount of the skeletal structure of the person. The search method may be preset or may be arbitrarily set by the user… (Search method 2) Partial search When a part of a person's body is hidden in an image, a search is performed using only the information of the recognizable part. For example, as in the skeletal structures 511 and 512 of FIG. 14, even if the key points of the left foot cannot be detected due to the hiding of the left foot, the features of the other key points that have been detected can be used for the search. Therefore, in the skeletal structures 511 and 512, it can be determined that the postures are the same at the time of searching (at the time of classification). That is, it is possible to perform classification and search using the features of some key points instead of all the key points. In the examples of the skeletal structures 521 and 522 in FIG. 15, although the directions of both feet are different, the feature quantities of the key points of the upper body (A1, A2, A31, A32, A41, A42, A51, A52) are used as the search query. Therefore, it can be judged that the posture is the same. Further, the portion (feature point) to be searched may be weighted and searched, or the threshold value for determining the similarity may be changed. When a part of the body is hidden, the hidden part may be ignored and the search may be performed, or the hidden part may be added to the search. By searching including the hidden part, it is possible to search for a posture in which the same part is hidden.”, pg. 15, lines 17-19 and 36-38 and pg. 16, lines 1-13, The combination of Xie in view of Fey and further in view of Yoshida would include applying keypoint weights uniformly across the whole pose to produce the initial search results, and then allowing the user to modify the weights of selected keypoints during mannequin modifications to further narrow additional searches).
Regarding claim 6, Xie in view of Fey and further in view of Yoshida teaches the image processing system according to claim 5, wherein, in the first computation method, a degree of similarity of a pose of a human body is computed by setting a same weight of all the keypoints, and, in the second computation method, a degree of similarity of a pose of a human body is computed based on a weight being set for each keypoint (Xie, “Similarly, in one or more embodiments the pose search system 102 utilizes a digital image and query pose to identify an initial set of digital images and a virtual mannequin to with a query pose to identify a focused set of digital images.”, pg. 7, paragraph 072, lines 9-13, As noted above, in the combination of Xie in view of Fey and further in view of Yoshida, keypoint weights would be applied uniformly across the whole pose to produce the initial query pose, and then adjusted on a keypoint basis during the mannequin modifications to further narrow additional searches.).
Regarding claim 8, Xie in view of Fey teaches the image processing system according to claim 7, wherein the processor is further configured to execute the one or more instructions to receive a keypoint selection input for selecting, on a human model formed of a plurality of the keypoints, one of the keypoints as a setting target (Xie, “As mentioned above, in one or more embodiments the pose search system also provide a virtual mannequin for searching digital images portraying a query pose. For example, in some embodiments the pose search system receives a reference digital image from a client device and determines a pose of a human figure shown in the selected digital image. The pose search system generates and provides a virtual mannequin with manipulable joints and segments in the determined pose.”, pg. 3, paragraph 0034, The user receives the virtual mannequin, which includes keypoints that are adjustable through user modifications. In this process, the user selects target keypoints for manipulation to further narrow the search. This mannequin can be iteratively adjusted to update the search.).
Xie in view of Fey does not teach receive a weight selection input for changing a weight of the selected keypoint via a setting screen.
However, Yoshida teaches receive a weight selection input for changing a weight of the selected keypoint via a setting screen (Yoshida, “In the present embodiment, as in the classification method, various search methods can be used by searching based on the feature amount of the skeletal structure of the person. The search method may be preset or may be arbitrarily set by the user… (Search method 2) Partial search When a part of a person's body is hidden in an image, a search is performed using only the information of the recognizable part. For example, as in the skeletal structures 511 and 512 of FIG. 14, even if the key points of the left foot cannot be detected due to the hiding of the left foot, the features of the other key points that have been detected can be used for the search. Therefore, in the skeletal structures 511 and 512, it can be determined that the postures are the same at the time of searching (at the time of classification). That is, it is possible to perform classification and search using the features of some key points instead of all the key points. In the examples of the skeletal structures 521 and 522 in FIG. 15, although the directions of both feet are different, the feature quantities of the key points of the upper body (A1, A2, A31, A32, A41, A42, A51, A52) are used as the search query. Therefore, it can be judged that the posture is the same. Further, the portion (feature point) to be searched may be weighted and searched, or the threshold value for determining the similarity may be changed. When a part of the body is hidden, the hidden part may be ignored and the search may be performed, or the hidden part may be added to the search. By searching including the hidden part, it is possible to search for a posture in which the same part is hidden.”, pg. 15, lines 17-19 and 36-38 and pg. 16, lines 1-13, “As described above, in the present embodiment, it is possible to detect the skeletal structure of a person from a two-dimensional image and perform classification and search based on the feature amount of the detected skeletal structure. As a result, it is possible to classify by similar postures having a high degree of similarity… Since the user can specify the posture of the search query from the classification results, the desired posture can be searched even if the user does not know the posture to be searched in detail in advance. For example, since it is possible to perform classification and search on the condition of the whole or part of the skeleton structure of a person, flexible classification and search is possible.”, pg. 17, lines 27-38, A user is able to set weights for all or part of a detected skeleton structure in an image. These weights influence the similarity calculations used for image matching, allowing the system to produce search results which emphasize or ignore specific body parts. This includes a setting screen to allow users to adjust weights for each keypoint.).
Xie in view of Fey teaches an image search system that refines results based on user modification of a virtual mannequin via a user setting screen (Xie, “Similarly, in one or more embodiments the pose search system 102 utilizes a digital image and query pose to identify an initial set of digital images and a virtual mannequin to with a query pose to identify a focused set of digital images.”, pg. 7, paragraph 072, lines 9-13, “Furthermore, the computing device 1700 can include an input device such as a touchscreen, mouse, keyboard, etc.”, pg. 19, paragraph 0189, lines 10-12, see Figs. 6A and 6B). Yoshida teaches applying user-defined weights to all or part of a detected human skeleton to produce image search results (see above). Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to have modified Xie in view of Fey by incorporating Yoshida’s keypoint weighting (Yoshida, pg. 15, lines 17-19 and 36-38 and pg. 16, lines 1-13) as an additional search condition. The motivation for doing so would have been to allow the user to control the influence of specific parts of the query pose, thereby enabling a more refined image search. In the combination of Xie in view of Fey and further in view of Yoshida, a user could adjust individual weights of keypoints via a setting screen to define targeted keypoints of the virtual mannequin for further searching. Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine the teachings of Xie in view of Fey with Yoshida to obtain the invention as specified in claim 8.
Regarding claim 9, Xie in view of Fey and further in view of Yoshida teaches the image processing system according to claim 8, wherein the processor is further configured to execute the one or more instructions to receive an input for changing the second extraction condition via the setting screen that emphasizes and displays the selected keypoint in the human model (Xie, “Thereafter, FIG. 6B illustrates the client device 108 displaying the digital image search interface 600 including an updated query result based on modifications to the virtual mannequin in accordance with one or more embodiments.”, pg. 10, paragraph 0101, lines 10-14, see Fig. 6B, The virtual mannequin is displayed along with the search results. This allows a user to modify the mannequin in real-time to receive new search results based on selecting and manipulating specific keypoints.).
Allowable Subject Matter
Claim 10 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CONNOR LEVI HANSEN whose telephone number is (703)756-5533. The examiner can normally be reached Monday-Friday 9:00-5:00 (ET).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sumati Lefkowitz can be reached at (571) 272-3638. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/CONNOR L HANSEN/Examiner, Art Unit 2672
/SUMATI LEFKOWITZ/Supervisory Patent Examiner, Art Unit 2672