DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. The Amendment filed 23 January 2026 (hereinafter “the Amendment”) has been entered and considered. Claims 1, 8, and 15 have been amended. Claims 4, 11, and 17 have been canceled. Claims 1-3, 5-10, 12-16, and 18-20, all the claims pending in the application, are rejected. All new grounds of rejection set forth in the present action were necessitated by Applicant’s claim amendments; accordingly, this action is made final.
Response to Amendment
Claim Interpretation – 35 U.S.C. § 112(f)
In the Remarks accompanying the Amendment, Applicant contends that the claims do not require interpretation under 35 U.S.C. 112(f). In support of this argument, Applicant asserts that the claims do not recite features requiring such interpretation. The Examiner respectfully disagrees.
According to MPEP § 2181(I):
“examiners will apply 35 U.S.C. 112(f) to a claim limitation if it meets the following 3-prong analysis:
(A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as "configured to" or "so that"; and
(C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.”
Each of independent claims 1, 8, and 15 recites “a display module for displaying at least one item”. This limitation recites the nonce word “module,” which is a generic placeholder for performing the claimed function, and the placeholder is modified by the functional language “for displaying at least one item”. Further, the “display module” is not modified by sufficient structure recited in the claim for performing the claimed function. Accordingly, the interpretation of this term as invoking 35 U.S.C. 112(f) is maintained.
Similarly, the claimed “imaging assembly configured to capture” of independent claims 8 and 15 recites the nonce word “assembly,” which is a generic placeholder for performing the claimed function, and the placeholder is modified by the functional language “configured to capture”. Further, the “imaging assembly” is not modified by sufficient structure recited in the claim for performing the claimed function. Accordingly, the interpretation of this term as invoking 35 U.S.C. 112(f) is maintained.
Prior Art Rejections
Independent claim 1 has been amended to recite the subject matter of now-canceled dependent claim 4, independent claim 8 has been amended to recite the subject matter of now-canceled dependent claim 11, and independent claim 15 has been amended to recite the subject matter of now-canceled dependent claim 17. Applicant argues that the proposed combination of Chaubard and Evans does not teach or suggest the newly added features of the independent claims – namely, performing the detecting step “by applying a localizer to the first image to generate a processed first image having bounding box coordinates associated with the detected at least one attribute of the object present in the first image”. In support of this argument, Applicant appears to acknowledge that Chaubard discloses a trained CNN which draws a labeling bounding polygon around each item in the images, but argues that Chaubard’s CNN is applied to each image, rather than one image (i.e., the claimed first image), thus concluding that the limitation in question is not taught (see pages 3-5 of the Remarks). The Examiner respectfully submits that Applicant’s arguments are not commensurate with the scope of the claims and that the limitation in question is indeed taught by the very portion of Chaubard cited by the Applicant.
In particular, the claim language does not require that only the first image is processed to have bounding box coordinates, as Applicant appears to imply. Rather, the claim merely requires the first image to have the bounding box coordinates. If each image is processed to have bounding box coordinates, as Applicant acknowledges, then certainly the first image (in addition to the others) is processed to have bounding box coordinates.
Accordingly, the Examiner maintains that Chaubard does indeed disclose “applying a localizer to the first image to generate a processed first image having bounding box coordinates associated with the detected at least one attribute of the object present in the first image”, contrary to Applicant’s assertions, since each image – including the claimed first image – is processed to have bounding box coordinates. As such, the prior art rejections are maintained.
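For purposes of illustration only, the Examiner's claim-scope point may be sketched as follows. All names below are hypothetical placeholders and form no part of Chaubard, Evans, or the claims; the sketch merely shows that when a localizer is applied to every image in a set, the first image is necessarily among the images processed to have bounding box coordinates.

```python
# Illustrative sketch only; "localize" is a hypothetical stand-in for any
# bounding-box detector (e.g., a trained CNN). Not code from either reference.

def localize(image):
    # Pretend detector: returns the image annotated with bounding-box coordinates.
    return {"image": image, "bbox": (0, 0, 10, 10)}

def process_all(images):
    # Chaubard-style processing: the localizer runs on EVERY image...
    return [localize(img) for img in images]

images = ["first_image", "second_image", "third_image"]
processed = process_all(images)

# ...so the first image, in particular, is processed and has bounding box
# coordinates, which is all the claim language requires.
first = processed[0]
print(first["image"], first["bbox"])
```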
Claim Interpretation – 35 U.S.C. § 112(f)
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and
(C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: “display module” in claims 1, 3, 5-8, 10, 12-15, and 18-20 (notably, each of claims 2, 9, and 16 recites structural features which modify the “display module”) and “imaging assembly” in claims 8-10, 12-16, and 18-20.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
In particular, the “display module” is interpreted as the “support structure” 102 described in [0024] of the specification and shown in Fig. 1 and equivalents thereof. The “imaging assembly” is interpreted as the “camera” described in [0035] of the specification and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-2, 6-9, 13-15, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Patent Application Publication No. 2022/0284384 to Chaubard (hereinafter “Chaubard”) in view of U.S. Patent Application Publication No. 2022/0076027 to Evans et al. (hereinafter “Evans”).
As to independent claim 1, Chaubard discloses a method (Abstract discloses that Chaubard is directed to a method for “inventory visibility management”), comprising: capturing a plurality of images of at least a portion of an object, the object being a display module for displaying at least one item ([0017-0018] discloses cameras 100 and 101 that capture images of a “set of products on opposing shelves for on-shelf inventory tracking”, wherein the shelves are part of a support structure equivalent to the claimed display module); detecting at least one attribute of the object present in a first image of the plurality of images by applying a localizer to the first image to generate a processed first image having bounding box coordinates associated with the detected at least one attribute of the object present in the first image ([0028-0030] discloses a bounding polygon detection module 170 which trains a bounding polygon convolutional neural network (CNN) 320 which identifies a bounding polygon around each product as well as a bounding polygon around each product identifier which is a “scannable code visible on each box”; [0022] discloses that the bounding boxes have “image coordinates” and “global coordinates”); extracting the at least one attribute of the object present in the first image from each image of the plurality of images ([0031] discloses that the bounding polygon detection module 170 further identifies and clusters bounding polygons that refer to the same product “across different images”); aligning the extracted at least one attribute from each image of the plurality of images with the extracted at least one attribute of the first image ([0033-0034] discloses an image merging module 175 that registers every image of the same product, wherein an ordinarily skilled artisan would recognize that such image registration is synonymous with image alignment, as evidenced by the processing described in [0033-0034]); and generating a reconstructed image based on 
the aligned at least one attribute of the plurality of images, wherein the resolution of the first image is different from the resolution of the reconstructed image ([0032, 0035, 0037] discloses that all the registered images are input to a trained “super-resolution” convolutional neural network CNN which outputs a final merged image which has a “higher resolution” than all the images).
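For purposes of illustration only, the mapping of the claimed method steps to Chaubard discussed above may be summarized as the following pipeline sketch. Every function name is a hypothetical placeholder, not an implementation from the reference; the cited paragraphs of Chaubard are noted in the comments.

```python
# Hypothetical sketch of the claimed method as mapped to Chaubard; every
# function below is a placeholder, not code from the reference.

def capture_burst(n=4):
    # Capture a plurality of images of the object ([0017]-[0018]).
    return [f"lowres_{i}" for i in range(n)]

def localize(image):
    # Detect an attribute and attach bounding-box coordinates
    # (Chaubard's bounding polygon CNN, [0028]-[0030]).
    return {"image": image, "bbox": (0, 0, 8, 8)}

def extract(images, bbox):
    # Extract the attribute (crop) from each image ([0031]).
    return [{"image": img, "crop": bbox} for img in images]

def align(crops, reference):
    # Register/align every crop to the first image's crop ([0033]-[0034]).
    return [dict(c, aligned_to=reference["image"]) for c in crops]

def super_resolve(aligned, scale=2):
    # Merge the aligned crops into one higher-resolution image
    # (Chaubard's super-resolution CNN, [0032], [0035], [0037]).
    w, h = aligned[0]["crop"][2], aligned[0]["crop"][3]
    return {"resolution": (w * scale, h * scale)}

burst = capture_burst()
first = localize(burst[0])
crops = extract(burst, first["bbox"])
aligned = align(crops, first)
reconstructed = super_resolve(aligned)
# The reconstructed image's resolution differs from (here, exceeds) that of
# the first image, as the claim requires.
```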
Chaubard does not expressly disclose that the captured images are a burst of images captured by a camera.
Evans, like Chaubard, is directed to monitoring inventory on a rack of store shelves using a camera in which “a high resolution image…is derived from plural low-resolution images, using super-resolution techniques” (Abstract and [0091]). Evans discloses that the sequence of images “captured by a single camera” may “be taken in a burst” ([0084, 0091]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chaubard to perform the super-resolution process using plural low-resolution images captured by a single camera in a burst, as taught by Evans, to arrive at the claimed invention discussed above. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. It is predictable that the proposed modification would have reduced the computation requirements of Chaubard’s system since only the images from a single camera would require alignment prior to super resolution.
As to claim 2, Chaubard as modified above further teaches that the object comprises at least one support surface, the at least one support surface being at least one of a shelf, a rack, a bay, and a bin for displaying the at least one item ([0017-0018] discloses cameras 100 and 101 that capture images of a “set of products on opposing shelves for on-shelf inventory tracking”; emphasis added), and the at least one attribute of the object present in the first image is a label associated with the at least one item ([0028-0030] discloses a bounding polygon detection module 170 which identifies a bounding polygon around each product as well as a bounding polygon around each product identifier which is a “scannable code visible on each box”, wherein the scannable code corresponds to the claimed label).
As to claim 6, Chaubard as modified above further teaches detecting at least one attribute present in the reconstructed image, wherein the at least one attribute is at least one of a barcode, a feature and text ([0039-0041] discloses, “following generation of a super-resolution image, a box identifier or scan code that was previously not legible from a low-resolution image now becomes legible” which “enables the device to decode the box identifier by applying a barcode reading algorithm 520 on the high-resolution crop”, wherein the trained barcode reading algorithm 520 inputs the super-resolution image and outputs an identification of a product or a decoding of the scan code in the image patch).
As to claim 7, Chaubard as modified above further teaches that detecting the at least one attribute present in the reconstructed image comprises applying at least one of a non-transitory computer-readable medium barcode reader and a deep learning model to the reconstructed image ([0039-0040] discloses the above-discussed barcode reading algorithm 520, and [0064] discloses implementation on a machine-readable medium; [0041] discloses an alternative in which “the barcode reading module 190 may train a deep learning model (DLM) to take as input the image patch located inside the bounding polygon surrounding the scan code in the merged image and output the decoded value”).
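For purposes of illustration only, Chaubard's point in [0039]-[0041] that a scan code illegible at low resolution may become decodable after super-resolution may be sketched as follows. The legibility threshold and the "decode_barcode" function are hypothetical placeholders, not Chaubard's barcode reading algorithm 520.

```python
# Illustrative only: a legibility threshold standing in for the observation
# that super-resolution can make a previously illegible code decodable.

LEGIBILITY_THRESHOLD = 100  # hypothetical minimum crop width (pixels) to decode

def decode_barcode(crop_width):
    # Pretend reader: succeeds only when the crop is high-resolution enough.
    return "0123456789" if crop_width >= LEGIBILITY_THRESHOLD else None

low_res_crop_width = 60
super_res_crop_width = low_res_crop_width * 2  # after 2x super-resolution

before = decode_barcode(low_res_crop_width)    # illegible before
after = decode_barcode(super_res_crop_width)   # decodable after
print(before, after)
```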
As to independent claim 8, Chaubard as modified by Evans teaches a device, comprising: an imaging assembly configured to capture a burst of images of at least a portion of an object, the object being a display module for displaying at least one item ([0017-0018] discloses cameras 100 and 101 that capture images of a “set of products on opposing shelves for on-shelf inventory tracking”, wherein the shelves are part of a support structure equivalent to the claimed display module; [0084, 0091] of Evans discloses a camera in which “a high resolution image…is derived from plural low-resolution images, using super-resolution techniques”, wherein the sequence of images “captured by a single camera” is “taken in a burst”, wherein Evans’s camera corresponds to the claimed imaging assembly; the reasons for combining the references are the same as those discussed above in conjunction with claim 1); one or more processors; and a non-transitory computer-readable memory coupled to the imaging assembly and the one or more processors, the memory storing instructions thereon that, when executed by the one or more processors, cause the one or more processors ([0064] of Chaubard discloses “a processor or a group of processors” which is/are “configured by software” that is “embodied on a machine-readable medium”) to perform the method steps recited in independent claim 1. Accordingly, claim 8 is rejected for reasons analogous to those discussed above in conjunction with claim 1.
Claims 9 and 13-14 recite features nearly identical to those recited in claims 2 and 6-7, respectively. Accordingly, claims 9 and 13-14 are rejected for reasons analogous to those discussed above in conjunction with claims 2 and 6-7, respectively.
As to independent claim 15, Chaubard as modified by Evans teaches a system, comprising: at least one device having an imaging assembly configured to capture a burst of images of at least a portion of an object, the object being a display module for displaying at least one item ([0017-0018] discloses cameras 100 and 101 that capture images of a “set of products on opposing shelves for on-shelf inventory tracking”, wherein the shelves are part of a support structure equivalent to the claimed display module; [0084, 0091] of Evans discloses a camera in which “a high resolution image…is derived from plural low-resolution images, using super-resolution techniques”, wherein the sequence of images “captured by a single camera” is “taken in a burst”, wherein Evans’s camera corresponds to the claimed imaging assembly; the reasons for combining the references are the same as those discussed above in conjunction with claim 1); a server having one or more processors; and a non-transitory computer-readable memory coupled to the server and the one or more processors, the memory storing instructions thereon that, when executed by the one or more processors, cause the one or more processors ([0064] of Chaubard discloses a “server computer system” comprising “a processor or a group of processors” which is/are “configured by software” that is “embodied on a machine-readable medium”) to perform the method steps recited in independent claim 1. Accordingly, claim 15 is rejected for reasons analogous to those discussed above in conjunction with claim 1.
Claims 19-20 recite features nearly identical to those recited in claims 6-7, respectively. Accordingly, claims 19-20 are rejected for reasons analogous to those discussed above in conjunction with claims 6-7, respectively.
Claims 3, 10, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Chaubard in view of Evans and further in view of U.S. Patent Application Publication No. 2024/0005451 to Patil et al. (hereinafter “Patil”).
As to claim 3, Chaubard as modified by Evans does not expressly disclose that the resolution of the reconstructed image is two to eight times greater than the resolution of the first image.
Patil, like Chaubard, is directed to “generating super-resolution images” using a CNN (Abstract and [0033-0035]). Patil discloses that the super-resolution output image is 2x the resolution of the input image ([0054]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the proposed combination of Chaubard and Evans to produce a super-resolution output image that is 2x the resolution of the input image, as taught by Patil, to arrive at the claimed invention discussed above. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. It is predictable that the proposed modification would have “increase[d] a confidence” in the recognition step performed using the super-resolution image ([0054] of Patil).
Claim 10 recites features nearly identical to those recited in claim 3. Accordingly, claim 10 is rejected for reasons analogous to those discussed above in conjunction with claim 3.
As to claim 16, the proposed combination of Chaubard, Evans, and Patil teaches that the object comprises at least one support surface, the at least one support surface being at least one of a shelf, a rack, a bay, and a bin for displaying the at least one item ([0017-0018] of Chaubard discloses cameras 100 and 101 that capture images of a “set of products on opposing shelves for on-shelf inventory tracking”; emphasis added), the at least one attribute of the object present in the first image is a label associated with the at least one item ([0028-0030] of Chaubard discloses a bounding polygon detection module 170 which identifies a bounding polygon around each product as well as a bounding polygon around each product identifier which is a “scannable code visible on each box”, wherein the scannable code corresponds to the claimed label), and the resolution of the reconstructed image is two to eight times greater than the resolution of the first image ([0054] of Patil discloses that the super-resolution output image is 2x the resolution of the input image; the reasons for combining the references are the same as those discussed above in conjunction with claim 3).
Claims 5, 12, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Chaubard in view of Evans and further in view of “BSRT: Improving Burst Super-Resolution with Swin Transformer and Flow-Guided Deformable Alignment” by Luo et al. (hereinafter “Luo”).
As to claim 5, Chaubard as modified by Evans further teaches that generating the reconstructed image based on the aligned at least one attribute of the burst of images comprises applying a super resolution deep learning network to the aligned at least one attribute of the burst of images ([0032, 0035, 0037] of Chaubard discloses that all the registered images are input to a trained “super-resolution” convolutional neural network CNN which outputs a final merged image which has a “higher resolution” than all the images; [0084, 0091] of Evans discloses a camera in which “a high resolution image…is derived from plural low-resolution images, using super-resolution techniques”, wherein the sequence of images “captured by a single camera” is “taken in a burst”, wherein Evans’s camera corresponds to the claimed imaging assembly; the reasons for combining the references are the same as those discussed above in conjunction with claim 1).
Chaubard as modified by Evans does not expressly disclose that the network is a transformer network. However, Luo discloses aligning and fusing a burst of images for input to a swin transformer that outputs a super-resolution image (Section 3 and Fig. 3).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the proposed combination of Chaubard and Evans to substitute Chaubard’s CNN-based architecture with the transformer-based architecture of Luo, to arrive at the claimed invention discussed above. Such a modification is the result of simple substitution of one known element for another producing a predictable result. More specifically, Chaubard’s CNN-based architecture and Luo’s transformer-based architecture perform the same general and predictable function, the predictable function being super-resolution. Since each individual element and its function are shown in the prior art, albeit in separate references, the difference between the claimed subject matter and the prior art rests not in any individual element or function but in the combination itself, that is, in the substitution of Luo’s transformer-based architecture for Chaubard’s CNN-based architecture. It is predictable that the proposed modification would have better “handle[d] misalignment and aggregate[d] the potential texture information in multiframes more efficiently” while also “captur[ing] long-range dependency to further improve the performance” (Luo Abstract). Thus, the simple substitution of one known element for another producing a predictable result renders the claim obvious.
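For purposes of illustration only, the simple-substitution rationale may be sketched as two interchangeable super-resolution backbones behind a single interface. Both classes below are hypothetical placeholders and do not represent the actual architectures of Chaubard or Luo; the sketch shows only that swapping one backbone for the other leaves the surrounding pipeline, and the kind of result produced, unchanged.

```python
# Sketch of the "simple substitution" rationale: two interchangeable
# super-resolution backbones behind one interface. Both classes are
# hypothetical placeholders, not the architectures of Chaubard or Luo.

class CNNSuperResolver:
    name = "cnn"
    def reconstruct(self, aligned_burst, scale=2):
        # Stand-in for a CNN-based super-resolution network.
        return {"backbone": self.name, "scale": scale}

class TransformerSuperResolver:
    name = "transformer"
    def reconstruct(self, aligned_burst, scale=2):
        # Stand-in for a transformer-based super-resolution network.
        return {"backbone": self.name, "scale": scale}

def pipeline(backbone, aligned_burst):
    # The surrounding pipeline is unchanged; only the backbone is swapped,
    # and both produce the same kind of result (a super-resolved image).
    return backbone.reconstruct(aligned_burst)

out_cnn = pipeline(CNNSuperResolver(), ["img0", "img1"])
out_tx = pipeline(TransformerSuperResolver(), ["img0", "img1"])
print(out_cnn["backbone"], out_tx["backbone"])
```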
Each of claims 12 and 18 recites features nearly identical to those recited in claim 5. Accordingly, each of claims 12 and 18 is rejected for reasons analogous to those discussed above in conjunction with claim 5.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SEAN M CONNER whose telephone number is (571)272-1486. The examiner can normally be reached 10 AM - 6 PM Monday through Friday, and some Saturday afternoons.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Greg Morse can be reached at (571) 272-3838. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SEAN M CONNER/Primary Examiner, Art Unit 2663