Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-22 are presented.
Response to Arguments
Applicant's arguments presented in the Remarks filed 11/13/2025 have been fully considered, but they are not persuasive.
The arguments span primarily pages 10-13 of the Remarks and focus on the limitation of:
“computing a detected dominant class of the scene comprising
computing scores of the plurality of corresponding classes based on the segmentation map and a mapping between the plurality of classes and a plurality of previously determined class importance weights;”
The examiner asserts that Applicant has adopted a much narrower interpretation than the broad language of the claims permits under the broadest reasonable interpretation (BRI).
First, the term “a dominant class” of a scene is generally vague because the terminology alone, without further definition, does not dictate a particular interpretation, nor does the claim recite any standard or other basis against which dominance is to be compared.
Reference Dharur does in fact disclose at least a reasonable interpretation for determining a dominant class.
In ¶0161, the reference discloses “In one illustrative example, if a 10-dimensional output vector represents ten different classes of objects is [0 0 0.05 0.8 0 0.15 0 0 0 0], the vector indicates that there is a 5% probability that an object in the image is the third class of object (e.g., a dog), an 80% probability that the object in the image is the fourth class of object (e.g., a human), and a 15% probability that the object in the image is the sixth class of object (e.g., a kangaroo). The probability for a class can be considered a confidence level that the object is part of that class”.
Clearly, per the segmentation result, the system determines a plurality of possible classes mapped in the output vector. The calculated score of 0.8 for the human class (i.e., a confidence level score) is overwhelmingly compelling over the scores for the other listed classes, and as such reasonably constitutes the dominant class over the other possible choices (for example, the “dog” and “kangaroo” classes are scored significantly lower). This effectively contradicts Applicant’s assertion that reference Dharur does not disclose “determining the overall dominant class of the scene”.
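For clarity of record, the reading applied above amounts to a simple argmax over the confidence vector of ¶0161. The following is a minimal illustrative sketch of that reading, not code from the reference; the labels for the unnamed vector slots are placeholders:

```python
# Illustrative sketch only: the dominant class is the class with the
# highest confidence score in the 10-dimensional output vector of
# Dharur ¶0161. Placeholder names fill the unnamed class slots.
scores = [0, 0, 0.05, 0.8, 0, 0.15, 0, 0, 0, 0]
classes = ["c1", "c2", "dog", "human", "c5", "kangaroo", "c7", "c8", "c9", "c10"]

# argmax over the confidence vector selects the dominant class
dominant_index = max(range(len(scores)), key=lambda i: scores[i])
dominant_class = classes[dominant_index]
```

Under this reading, the 0.8 entry selects “human” as the dominant class over the lower-scored alternatives.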
Applicant’s interpretation also appears inconsistent with the instant Specification, which discloses “a hysteresis condition corresponds to a condition when the previously detected label is now the second ranked label in the ranked labels 642 and the difference in confidence (or score) of the first and second ranked labels of the ranked labels” per ¶0091. The instant Specification’s approach of computing scores to determine a so-called dominant class is thus very analogous to reference Dharur, in that both involve computing confidence scores to determine a dominant class (label) among possible choices.
Applicant further argues that Dharur does not disclose determining the differences in importance between these classes. The examiner respectfully asserts that the claims do not recite a step of determining differences in importance. The claims merely recite that the importance weights are considered in the computation; they do not involve any comparison of weights. Thus the argument is moot.
It is noted that the term “importance” is redundant because “weight” in this context inherently indicates a bias of one entity over another. Such bias can simply be read in terms of priority under BRI, unless further clarified.
Nor does the claim explicitly recite a specific role of the weights in the computation. The language “based on” merely suggests involvement, not a particular equation. Therefore, Applicant does not have a basis to allege a distinction over the references. In fact, reference Lim goes above and beyond to disclose such weights, namely that objects may be assigned priorities by the servers 200 according to the weights of the category to which the objects belong. Here, the weight may be set in advance by the administrator of the server 200 as a value giving a high priority to an object (main character) close to the purpose of the user, and that is more than sufficient to meet the loose requirements imposed by the ambiguous claim language.
In view of the above discussion, the examiner finds the arguments not persuasive.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Dharur (US 2020/0193609) in view of Lim (KR 20160125102, published 2016; reference of record).
As to claims 1 and 18:
Dharur discloses a method and a system comprising:
a processor; and memory storing instructions that, when executed by the processor, cause the processor to compute a dominant class of a scene (Abstract, ¶0012) by executing the method, comprising:
receiving an input image of a scene; generating a segmentation map of the input image (see Abstract, ¶0064-0065, inputting a frame/image of a scene captured by a camera device and performing segmentation to generate a segmentation map);
the segmentation map being labeled with a plurality of corresponding classes of a plurality of classes; (¶0065, 0068, the segmentation map includes indications/values for pixel groups representative of objects as to the classes to which they belong)
computing a detected dominant class of the scene comprising computing scores of the plurality of corresponding classes based on the segmentation map and a mapping between the plurality of classes and a plurality of previously determined class importance weights, based on the detected dominant class, identifying a subject of the scene.
(¶0068, 0160, 0161, computing a confidence score from the segmentation map, thereby determining a prominent class by mapping known features from the segmentation map. For example: “if a 10-dimensional output vector represents ten different classes of objects is [0 0 0.05 0.8 0 0.15 0 0 0 0], the vector indicates that there is a 5% probability that an object in the image is the third class of object (e.g., a dog), an 80% probability that the object in the image is the fourth class of object (e.g., a human), and a 15% probability that the object in the image is the sixth class of object (e.g., a kangaroo)”. In this example described in ¶0160, the dominant class is a human with an 80% dominance score. ¶0165 also computes the same for less important classes (background such as dogs, cats). Per ¶0167, “any number of classes can be detected, such as a bicycle, a dog, a cat, a person, a car, or other suitable object class. The confidence score for a bounding box and the class prediction are combined into a final score that indicates the probability that that bounding box contains a specific type of object”. Thus the subject of the frame is identified based on the score and the segmentation process, and is identified as the person of interest instead of unimportant objects such as pets.)
Dharur discloses assigning weights to the foreground content/objects as discussed above, wherein the DNN is configured to detect the known classes it was trained on (¶0119-0120, implying a previous correlation/training with such specific objects); however, Dharur is silent on assigning weights based on a previously established mapping between a plurality of classes and assigned weights.
Lim, in the related field of scene analysis, discloses determining prioritized (dominant) foreground content (objects) through an association of known classes with a previously established set of weights (see page 4, “the identified objects may be classified into the corresponding categories and stored in the storage device of the server 200. The objects may be assigned priorities to the servers 200 according to the weights in the category to which the objects belong. Here, the weight may be set in advance by the administrator of the server 200 as a value for giving a high priority to an object (main character) close to the purpose of the user viewing the panorama contents, and as another example, (For example, as feedback information of users) as being the object of greatest interest, the priority may be set through this, and the image processing may be performed”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate into the system of Dharur the assignment of weights to objects based on a previously established, predetermined mapping. This implementation advantageously can reduce computation (reducing the need to compute weights), or provide an alternative or additional weight-assigning algorithm via a mapping table, because in this alternative way the system can target and give priority to a particular class of objects specific to the desire/purpose of the user instead of objectively assigning weights.
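The combination rationale articulated above can be illustrated with a brief sketch, assuming (purely for illustration) that Dharur-style per-class confidence scores are modulated by a Lim-style predetermined importance-weight table before the dominant class is selected; the class names, scores, and weight values below are hypothetical and appear in neither reference:

```python
# Illustrative sketch only: per-class confidence scores (in the style
# of Dharur ¶0161) multiplied by a class-importance weight table set
# in advance (in the style of Lim). All values are hypothetical.
confidences = {"dog": 0.05, "human": 0.8, "kangaroo": 0.15}
importance_weights = {"dog": 0.2, "human": 1.0, "kangaroo": 0.1}  # predetermined mapping

# compute class scores based on the segmentation confidences and the
# mapping between classes and previously determined importance weights
weighted_scores = {c: confidences[c] * importance_weights[c] for c in confidences}
dominant_class = max(weighted_scores, key=weighted_scores.get)
```

In this sketch, the predetermined weight table biases the selection toward classes the administrator has prioritized, consistent with Lim's disclosure of weights set in advance.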
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Dharur (US 2020/0193609) in view of Lim (KR 20160125102), and further in view of Howard (US 2019/0147318).
As to claim 5:
Dharur in view of Lim discloses all limitations of claim 1; however, the combination is silent on using atrous spatial pyramid pooling at an output of a plurality of atrous convolutional layers, wherein the segmentation map is computed based on an output of the atrous spatial pyramid pooling.
Howard, in a related field of endeavor, discloses in at least ¶0133 using a neural network structure for producing a feature map (for example, DeepLabv3), in which an atrous spatial pyramid pooling module is used after a plurality of atrous layers to control the resolution of the output segmentation map.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate into the system of Dharur Howard's atrous spatial pyramid pooling module used after a plurality of atrous layers, which controls the resolution of the output segmentation map. Such an implementation provides the advantage of flexibly controlling the resolution of the segmentation map to a desired level.
Claims 10 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Dharur (US 2020/0193609) in view of Lim (KR 20160125102), and further in view of Ehmann (US 2015/0054974).
As to claims 10 and 21:
Dharur in view of Lim discloses all limitations of claims 1/18, wherein each pixel of the segmentation map is associated with one or more corresponding confidence values, each of the one or more corresponding confidence values corresponding to a different one of a plurality of classes (¶0085, Fig. 7B-E, each region of pixel(s) in the image is mapped to at least one of a plurality of classes); however, the combination is silent on wherein the method further comprises thresholding the segmentation map by selecting values from locations of the segmentation map where corresponding locations of the confidence map exceed a threshold corresponding to a class of the location of the segmentation map.
Ehmann, in a related field of endeavor, discloses a system/method for image segmentation in which an object label map of an image is further refined by applying one or more thresholds to confirm/correct the class of a pixel (see at least ¶0072-0073).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system/method of Dharur/Lim to perform refinement of the classification using thresholds. Such an implementation advantageously improves results and reduces loss (error) through the refinement process (¶0071 of Ehmann).
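The claimed thresholding, as read onto the Ehmann-style refinement discussed above, can be sketched as follows; the tiny 2x2 map, the class names, and the per-class threshold values are hypothetical illustrations, not data from any reference:

```python
# Illustrative sketch only: keep a pixel's predicted class when its
# confidence exceeds a threshold specific to that class; otherwise
# leave the pixel unlabeled. All values here are hypothetical.
seg_map = [["dog", "human"], ["human", "cat"]]       # predicted class per pixel
conf_map = [[0.4, 0.9], [0.7, 0.3]]                  # confidence per pixel
thresholds = {"dog": 0.5, "human": 0.6, "cat": 0.5}  # per-class thresholds

refined = [
    [seg_map[r][c] if conf_map[r][c] >= thresholds[seg_map[r][c]] else None
     for c in range(len(seg_map[r]))]
    for r in range(len(seg_map))
]
# pixels failing their class threshold are left unlabeled (None)
```

In this sketch, low-confidence labels are suppressed rather than corrected, one possible reading of refining a label map by thresholding.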
Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Dharur (US 2020/0193609) in view of Lim (KR 20160125102), and further in view of Cohen (US 2018/0322339).
As to claim 15:
Dharur in view of Lim discloses all limitations of claim 1; however, the combination is silent on the segmentation map being generated using a convolutional neural network trained to recognize a text class of a plurality of classes with training data comprising images of text and corresponding labels, wherein the corresponding labels comprise bounding boxes surrounding text.
Cohen, in a related field of endeavor, discloses the segmentation map being generated using a convolutional neural network trained (¶0024) to recognize a text class of a plurality of classes (¶0025, 0026) with training data comprising images of text and corresponding labels, wherein the corresponding labels comprise bounding boxes surrounding text (¶0065, using training data with ground truth associated with text and corresponding labels to train the CNN; ¶0089, bounding box class).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention that the system of Dharur/Lim can be used for other types of images. Dharur may differ in the particular application of the object classification (medical image vs. document analysis); however, the principle of object detection remains the same for either application. Learning models can be taught such tasks given appropriate training sets. The learning model of Dharur can be modified by feeding it text-oriented learning data and would be able to learn the task as well, thereby providing flexibility in usage.
Claims 17 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Dharur (US 2020/0193609) in view of Lim (KR 20160125102), and further in view of Boettiger (US 2015/0229889).
As to claims 17 and 22:
Dharur in view of Lim discloses all limitations of claims 1/18, including identifying a portion of the input image of the scene corresponding to the detected dominant class (see ¶0087); however, the combination is silent on configuring camera settings of a digital camera in accordance with the identified portion of the input image of the scene.
Boettiger, in a related field of endeavor, discloses classification of an object in a scene and configuring camera settings of a digital camera in accordance with the identified portion of the input image of the scene (¶0037, 0044, classifying a person’s face, a person, an airplane, etc., and automatically changing settings such as frame rate and/or resolution in response).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify an imaging device of Dharur to incorporate the feature of configuring camera settings of a digital camera in accordance with the identified portion of the input image of the scene. Such adaptive setting adjustment allows for optimal use of resources such as power and storage per ¶0003 of Boettiger.
Allowable Subject Matter
Claims 2-4, 6-9, 11-14, 16, and 19-20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Reference(s) considered relevant to the claimed invention include:
US 2015/0187076 - Disclosed herein are systems and methods for extracting person image data comprising: obtaining at least one frame of pixel data and corresponding image depth data; processing the at least one frame of pixel data and the image depth data with a plurality of persona identification modules to generate a corresponding plurality of persona probability maps; combining the plurality of persona probability maps to obtain an aggregate persona probability map; and generating a persona image by extracting pixels from the at least one frame of pixel data based on the aggregate persona probability map.
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to QUAN M HUA whose telephone number is (571)270-7232. The examiner can normally be reached 10:30-6:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Anthony Addy can be reached on 571-272-7795. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/QUAN M HUA/Primary Examiner, Art Unit 2645