Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
This communication is a non-final Office action on the merits. Claims 1-28, as originally filed, are presently pending and have been elected and considered below.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 3/7/2023 and 4/11/2024 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-28 are rejected under 35 USC § 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more. Claims 1-28 are directed to an abstract idea.
Although claim 1 is directed to a statutory machine-implemented method, it recites abstract mathematical concepts, algorithms, or calculations, and data manipulation: dividing an input image into a first set and a second set of tokens; generating a first set and a second set of token representations; processing the first and second sets of token representations using a neural network and a transformer network; and processing, using a transformer neural network model, the first region of the input image according to the scale for the first region. These steps can be performed either manually or via a general computing machine, without integrating them into a practical application and without additional elements that amount to significantly more than the abstract idea.
Claim 15 is directed to a statutory machine reciting limitations similar to those of claim 1. It is therefore rejected for the same reasons as claim 1.
Dependent claims 2-14 do not add meaningful limitations to claim 1, and thus are rejected for the same reasons.
Dependent claims 16-28 do not add meaningful limitations to claim 15, and thus are rejected for the same reasons.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-7, 10-21, 24-28 are rejected under 35 U.S.C. 103 as being unpatentable over US 2023/0222623 A1, Ke et al. (hereinafter Ke).
As to claim 1, Ke discloses a processor-implemented method of processing image data, the method comprising:
dividing an input image into a first set of tokens having a first resolution and a second set of tokens having a second resolution (Fig 4B; pars 0009-0011, 0030, 0040, input image being partitioned into patches and/or tokens with different resolutions);
generating a first set of token representations for one or more tokens from the first set of tokens corresponding to a first region of the input image (Figs 6, 10; pars 0002-0005, 0009, 0024, 0028-0030, 0068-0069, 0078, a set of token representations being generated with a native resolution for each patch/region);
generating a second set of token representations for one or more tokens from the second set of tokens corresponding to the first region of the input image (Figs 6, 10; pars 0002-0005, 0009, 0024, 0028-0030, 0068-0069, 0078, a set of token representations being generated with a scaled or resized resolution for each patch/region);
processing, using a neural network model, the first set of token representations and the second set of token representations to determine the first resolution or the second resolution as a scale for the first region of the input image (Figs 3-4, 6, 10; pars 0002-0005, 0009, 0024, 0028-0030, 0034, 0068-0069, 0078, resolutions of patches in a region of the input image being identified using a transformer encoder); and
processing, using a transformer neural network model, the first region of the input image according to the scale for the first region (Figs 3-4; pars 0003, 0008-0009, 0024-0030, 0034, a multi-scale transformer being used to process patches and regions of the input image). Although Ke discloses or teaches the above functions and features in different examples/embodiments, considering Ke’s teachings as a whole, it would have been obvious to combine Ke’s teachings in the different examples to utilize a multi-scale transformer for providing a multiscale representation of tokens, patches, and regions in an input image.
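For illustration only, the "dividing" limitation discussed above can be pictured with the following minimal sketch. It is not taken from Ke or from the application: the function names, the use of average pooling to obtain the coarser resolution, and the square, evenly divisible region are all assumptions made for the example.

```python
# Illustrative sketch only. All names are hypothetical and appear in neither
# the claims nor Ke. A region is a square 2D list of pixel values.

def split_region(region, patch):
    """Split a square region into non-overlapping patch x patch tokens (row-major)."""
    n = len(region)
    tokens = []
    for r in range(0, n, patch):
        for c in range(0, n, patch):
            tokens.append(
                [region[r + i][c + j] for i in range(patch) for j in range(patch)]
            )
    return tokens


def downsample(region, factor):
    """Average-pool a square region by `factor` (assumed way to get a coarser view)."""
    n = len(region)
    out = []
    for r in range(0, n, factor):
        row = []
        for c in range(0, n, factor):
            block = [region[r + i][c + j] for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def tokenize_two_scales(region, patch):
    """Return (coarse_tokens, fine_tokens) covering the same region."""
    factor = len(region) // patch
    coarse = split_region(downsample(region, factor), patch)  # one token for the region
    fine = split_region(region, patch)                        # several tokens for the region
    return coarse, fine


region = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
coarse, fine = tokenize_two_scales(region, patch=2)
print(len(coarse), len(fine))
```

Under these assumptions the same 4x4 region yields a single coarse token and four fine tokens, which is one way to read "a first set of tokens having a first resolution and a second set of tokens having a second resolution" for a common region.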
As to claim 2, Ke discloses the processor-implemented method of claim 1, wherein: generating the first set of token representations comprises processing the first set of tokens using a linear neural network layer to generate a first set of embedding vectors (pars 0034, 0066, linear encoder and a convolutional neural network are linear operations); and generating the second set of token representations comprises processing the second set of tokens using the linear neural network layer to generate a second set of embedding vectors (Fig 10; pars 0002-0003, 0009-0010, 0030, 0034, 0055, 0066).
As to claim 3, Ke discloses the processor-implemented method of claim 1, wherein the first set of token representations includes a single token representation according to the first resolution, and wherein the second set of token representations includes a plurality of token representations according to the second resolution (see citations and rejection in claim 1).
As to claim 4, Ke discloses the processor-implemented method of claim 1, further comprising: concatenating the first set of token representations and the second set of token representations to generate a set of concatenated token representations (pars 0031, 0035, 0056, token representations being aggregated or concatenated); wherein processing the first set of token representations and the second set of token representations comprises processing, using the neural network model, the set of concatenated token representations to determine the first resolution or the second resolution as the scale for the first region of the input image (pars 0031, 0035, 0056, also see citations and rejection in claim 1).
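As a hedged illustration of the concatenation limitation addressed above, the sketch below concatenates two sets of token representations and passes the result through a small linear gate that scores each candidate resolution. The function names, the identity embedding weights, and the gate weights are hypothetical values chosen for the example; they are not drawn from Ke or from the application.

```python
# Illustrative sketch only. All names and weights are hypothetical.

def embed(token, weights):
    """Linear projection: one output value per weight row."""
    return [sum(w * x for w, x in zip(row, token)) for row in weights]


def concatenate(first_reps, second_reps):
    """Flatten both sets of token representations into one vector."""
    flat = []
    for rep in first_reps + second_reps:
        flat.extend(rep)
    return flat


def choose_scale(concat, gate_weights):
    """Return 0 for the first resolution or 1 for the second, from two gate logits."""
    logits = embed(concat, gate_weights)
    return 0 if logits[0] >= logits[1] else 1


identity = [[1.0, 0.0], [0.0, 1.0]]              # assumed embedding weights
first_reps = [embed([1.0, 2.0], identity)]       # single coarse representation
second_reps = [embed([3.0, 4.0], identity),      # plural fine representations
               embed([5.0, 6.0], identity)]

concat = concatenate(first_reps, second_reps)
gate = [[1, 1, 0, 0, 0, 0],                      # assumed gate: scores coarse part
        [0, 0, 1, 1, 1, 1]]                      # assumed gate: scores fine part
scale = choose_scale(concat, gate)
print(concat, scale)
```

With these example weights the gate logits are 3 and 18, so the second resolution is selected for the region. A trained model would learn the gate weights rather than fix them by hand.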
As to claim 5, Ke discloses the processor-implemented method of claim 1, further comprising: determining a respective scale for each respective region of the input image (Fig 10; pars 0003-0004, 0009-0010, 0078).
As to claim 6, Ke discloses the processor-implemented method of claim 5, further comprising: determining a respective positional encoding for each region of the input image (pars 0038, 0063, positional embedding being applied to each feature block, which corresponds to each region after image segmentation).
As to claim 7, Ke discloses the processor-implemented method of claim 6, wherein, for a region of the input image determined to have a scale corresponding to the second resolution (see rejection in claim 1).
As to claim 10, Ke discloses the processor-implemented method of claim 1, wherein the neural network model is shared across regions of the input image (par 0034, shared across all the input patches).
As to claim 11, Ke discloses the processor-implemented method of claim 1, wherein the neural network model includes a Softmax layer configured to determine a distribution over the first resolution and the second resolution (pars 0055-0057, Softmax layer for resolution distribution).
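For reference, a Softmax layer producing "a distribution over the first resolution and the second resolution" can be sketched generically as below. This is the standard softmax function applied to two hypothetical logits; it is not asserted to be the specific layer of Ke or of the application.

```python
import math

# Illustrative sketch only. The two logits are hypothetical scores for the
# first and second resolution of a single region.

def softmax(logits):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, -1.0])   # probability of first vs. second resolution
print(probs)
```

The two outputs are non-negative and sum to one, so they can be read as the probability of assigning each resolution to the region.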
As to claim 12, Ke discloses the processor-implemented method of claim 1, wherein the first resolution or the second resolution is determined as the scale for the first region of the input image based on one or more characteristics of the input image (Figs 6, 10; pars 0005, 0031, 0025, 0079).
As to claim 13, Ke discloses the processor-implemented method of claim 12, wherein the one or more characteristics of the input image include a smoothness value associated with the first region of the input image, a complexity value associated with the first region of the input image, how many colors are associated with the input image (pars 0005, 0031), or a contrast value associated with the first region of the input image (pars 0025, 0079).
As to claim 14, Ke discloses the processor-implemented method of claim 1, wherein the input image includes an image patch of an image (Figs 3, 10; pars 0008, 0010-0011, 0069, image patches).
As to claim 15, it is an apparatus claim encompassing the limitations of claim 1. The rejection of claim 1 is therefore incorporated herein.
As to claims 16-21, 24-28, they are rejected for the same reasons as set forth for claims 2-7, 10-14, respectively.
Claims 8-9, 22-23 are rejected under 35 U.S.C. 103 as being unpatentable over Ke in view of US 2002/0131507 A1, Orhand et al. (hereinafter Orhand).
As to claim 8, Ke discloses the processor-implemented method of claim 1, further comprising: generating a mask for the input image, the mask indicating a respective scale determined for each respective region of the input image as the first resolution or the second resolution (pars 0036, 0057-0058).
Orhand, in the same or similar field of endeavor, further teaches a mask being generated and utilized to reshape and segment zones or regions of the input image with a pixel-scale precise resolution (pars 0034, 0051, 0053, 0060, 0094-0095, 0103).
Therefore, considering Ke’s and Orhand’s teachings as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Orhand’s teachings in Ke’s method to identify and structure regions with corresponding resolutions.
As to claim 9, Ke as modified discloses the processor-implemented method of claim 8, wherein the transformer neural network model is configured to process adaptive mixed-resolution data based on the mask (Orhand: pars 0034, 0051, 0053, 0060, 0094, 0100; note that such mixed-resolution regions are adjusted through a masking operation, which is adaptive or updatable in nature).
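Purely as an illustration of the mask limitations of claims 8-9, the sketch below builds a per-region scale mask from hypothetical per-region probabilities and counts the tokens a downstream model would receive for the resulting mixed-resolution input. The names, the probabilities, and the assumption of one coarse token or four fine tokens per region are all made up for the example and are not taken from Ke, Orhand, or the application.

```python
# Illustrative sketch only. All names and values are hypothetical.

def build_scale_mask(region_probs):
    """Mask entry is 0 (first resolution) or 1 (second resolution) per region."""
    return [[0 if p[0] >= p[1] else 1 for p in row] for row in region_probs]


def token_count(mask, fine_per_region=4):
    """Tokens in the mixed-resolution sequence: 1 per coarse region, several per fine region."""
    return sum(fine_per_region if s == 1 else 1 for row in mask for s in row)


# Assumed (first, second) resolution probabilities for a 2x2 grid of regions.
region_probs = [[(0.9, 0.1), (0.2, 0.8)],
                [(0.5, 0.5), (0.3, 0.7)]]

mask = build_scale_mask(region_probs)
print(mask, token_count(mask))
```

In this example two regions keep the first resolution and two take the second, so the mixed-resolution sequence has 10 tokens rather than the 16 a uniformly fine tokenization would give.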
As to claims 22-23, they are rejected for the same reasons as set forth for claims 8-9, respectively.
Examiner’s Note
Examiner has cited particular columns, line numbers, paragraphs and/or figure(s) in the reference(s) as applied to the claims for the convenience of the Applicant. Although the specified citations are representative of the teachings of the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. The applicant is respectfully requested, in preparing responses, to fully consider the reference(s) in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the Examiner.
Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to QUN SHEN whose telephone number is (571)270-7927. The examiner can normally be reached on Mon-Fri 8:30-5:50 PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Amandeep Saini, can be reached on 571-272-3382. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/QUN SHEN/
Primary Examiner, Art Unit 2662
/AMANDEEP SAINI/
Supervisory Patent Examiner, Art Unit 2662