Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
This communication is a non-final Office action on the merits. Claims 1-20, as originally filed, are presently pending and have been elected and considered below.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 3/19/2024 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-5, 11-15, 20 are rejected under 35 U.S.C. 103 as being unpatentable over US 2022/0351379 A1, Lorsakul et al. (hereinafter Lorsakul) in view of US 2024/0379190 A1, Gross et al. (hereinafter Gross).
As to claim 1, Lorsakul discloses a method for feature aggregation comprising:
extracting, with an image encoder, an image feature representation of an input image, the image feature representation corresponding to a plurality of image patches of the input image (Fig 8: 805-815, input images being divided into image patches and encoded into discriminative features; pars 0004-0005, 0007-0008), and image feature elements of the image feature representation being logically organized in a plurality of dimensions (Fig 8: 820, Fig 9; pars 0007, 0012, 0016, 0097, 0100, feature representation being projected onto pixel space and classify each pixel space; pars 0080, 0088, 0090-0093, 0106, such pixels space being associated with foreground and/or background image feature elements);
for each image feature element set of a plurality of image feature element sets divided along a predetermined dimension of the plurality of dimensions in the image feature representation (pars 0005, 0007, 0010, 0012); and determining an aggregated image feature representation of the input image based on a plurality of aggregated image feature elements determined for the plurality of image feature element sets, respectively (pars 0041-0044, 0049, 0054, 0069, 0077, 0090, 0092, 0106).
Lorsakul does not expressly teach selecting a first number of image feature elements from the image feature element set based on a ranking of corresponding image feature elements in the image feature element set and determining an aggregated image feature element by aggregating the selected first number of image feature elements.
Gross, in the same or similar field of endeavor, further teaches selecting a first number of image feature elements from the image feature element set based on a ranking of corresponding image feature elements in the image feature element set (Fig 6; pars 0007, 0018, 0022-0023, 0025, 0029, features being aggregated by ranking and scores); determining an aggregated image feature element by aggregating the selected first number of image feature elements (Fig 6; pars 0007, 0018, 0022-0023, 0025, 0029, number of feature elements being selected based on ranking and score).
Therefore, considering the teachings of Lorsakul and Gross as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Gross’s teachings into Lorsakul’s method to provide a ranking criterion for proper feature aggregation.
As to claim 2, Lorsakul as modified discloses the method of claim 1, wherein for each image feature element set in the plurality of image feature element sets, the image feature elements are ranked in an order from large to small in value (Gross: pars 0109), and selecting the first number of image feature elements from the image feature element set comprises: selecting the first number of image feature elements that are ranked highly in the image feature element set (Gross: pars 0007, 0032, 0058, 0101).
As to claim 3, Lorsakul as modified discloses the method of claim 1, wherein the plurality of dimensions comprises a channel dimension (Lorsakul: pars 0009, 0084) and two spatial dimensions (Lorsakul: pars 0007, 0009-0010, 0012, 0014, two dimensional model/channel with two spatial dimensions (e.g. height and width)), and the predetermined dimension comprises the channel dimension (Lorsakul: pars 0009, 0014, 0017, predetermined maximum channel dimension).
As to claim 4, Lorsakul as modified discloses the method of claim 1, wherein determining the aggregated image feature element comprises: determining the aggregated image feature element by averaging the first number of image feature elements (Gross: pars 0025, 0051).
As to claim 5, Lorsakul as modified discloses the method of claim 1, wherein the first number remains the same during a training procedure and an application procedure of the image encoder (Gross: pars 0007, 0032, 0058, 0101, the number remaining the same when the ranking criterion remains the same).
As to claim 11, it is a device claim encompassing the subject matter of claim 1. The rejection of claim 1 is therefore incorporated herein.
As to claims 12-15, they are rejected for the same reasons as set forth for claims 2-5, respectively.
As to claim 20, it recites a non-transitory computer-readable medium (CRM) storing a computer program executed to perform the functions and features recited in claim 1. The rejection of claim 1 is therefore incorporated herein.
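By way of illustration only, the selecting and aggregating operations discussed above for claims 1-4 can be sketched as follows. This is a minimal sketch under stated assumptions, not a characterization of the application or of the cited references: the function name topk_mean_aggregate, the nested-list layout (one row per image patch, one column per channel, with the two spatial dimensions flattened into the patch index), and the sample values are all hypothetical.

```python
from typing import List


def topk_mean_aggregate(features: List[List[float]], k: int) -> List[float]:
    """Aggregate a patch-by-channel feature table into one value per channel.

    features[p][c] is the (hypothetical) feature element for patch p and
    channel c. For each channel, the k largest elements are selected and
    averaged.
    """
    if not features:
        raise ValueError("features must contain at least one patch")
    num_channels = len(features[0])
    if any(len(row) != num_channels for row in features):
        raise ValueError("every patch must have the same number of channels")
    if not 1 <= k <= len(features):
        raise ValueError("k must be between 1 and the number of patches")

    aggregated = []
    for c in range(num_channels):
        # One feature element set per channel (division along the channel dimension).
        element_set = [row[c] for row in features]
        # Rank from large to small in value and keep the k highest-ranked elements.
        selected = sorted(element_set, reverse=True)[:k]
        # Aggregate the selected elements by averaging.
        aggregated.append(sum(selected) / k)
    return aggregated


# Three patches, two channels (illustrative values only).
patches = [[0.1, 2.0], [0.9, -1.0], [0.5, 4.0]]
result = topk_mean_aggregate(patches, k=2)
```

In this sketch, k plays the role of the "first number"; calling the function with the same k at training and at inference would correspond to the arrangement discussed for claim 5.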
Claims 6-8, 16-18 are rejected under 35 U.S.C. 103 as being unpatentable over Lorsakul in view of Gross and further in view of US 2024/0153258 A1, Mangla et al. (hereinafter Mangla).
As to claim 6, Lorsakul as modified discloses the method of claim 1, comprising: extracting, with a text encoder, a text feature representation of an input text, the text feature representation corresponding to a plurality of text units of the input text, and text feature elements of the text feature representation being logically organized in the plurality of dimensions; for each text feature element set of a plurality of text feature element sets divided along the predetermined dimension of the plurality of dimensions in the text feature representation, selecting a second number of text feature elements from the text feature element set based on a ranking of corresponding text feature elements in the text feature element set, and determining an aggregated text feature element by aggregating the selected second number of text feature elements; and determining an aggregated text feature representation of the input text based on a plurality of aggregated text feature elements determined for the plurality of text feature element sets, respectively (see citations and rejection in claim 1).
Lorsakul as modified does not expressly teach the encoder being a text encoder, however, similar or equivalent operations have been applied to image data and features.
Mangla, in the same or similar field of endeavor, further discloses a text encoder working with an image encoder as a pair to provide a multimodal image/video and text system (Figs 1, 2A, 7; pars 0035-0036, 0038, 0050-0053).
Therefore, considering the teachings of Lorsakul as modified and Mangla as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Mangla’s teachings on a text encoder and image encoder pair into the method of Lorsakul as modified to enable a multimodal system including both image and text feature elements.
As to claim 7, Lorsakul as modified discloses the method of claim 6, wherein for each text feature element set in the plurality of text feature element sets, the text feature elements are ranked in an order from large to small in value, and selecting the second number of text feature elements from the text feature element set comprises:
selecting the second number of text feature elements that are ranked highly in the text feature element set (see rejection in claim 2 to apply the method for the image feature elements to the text feature elements).
As to claim 8, Lorsakul as modified discloses the method of claim 6, wherein the second number is set to a first value during a training procedure of the text encoder, and is set to a second value during an application procedure of the text encoder, the second value is less than the first value (see rejection in claim 3 to apply the method for the image feature elements to the text feature elements).
As to claims 16-18, they are rejected for the same reasons as set forth for claims 6-8, respectively.
Allowable Subject Matter
Claims 9-10 and 19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Reasons for Allowance
The prior art of record (Lorsakul, Mangla, and Gross) neither discloses alone nor teaches in combination the functions and features recited in claim 9. Claim 19 recites limitations similar to those of claim 9. Claim 10 depends from claim 9.
Examiner’s Note
The Examiner has cited particular columns, line numbers, paragraphs, and/or figures in the references as applied to the claims for the convenience of the Applicant. Although the specified citations are representative of the teachings of the art and are applied to the specific limitations within the individual claims, other passages and figures may apply as well. The Applicant is respectfully requested, in preparing responses, to fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of each passage as taught by the prior art or disclosed by the Examiner.
Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to QUN SHEN whose telephone number is (571)270-7927. The examiner can normally be reached on Mon-Fri 8:30-5:50 PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Amandeep Saini, can be reached at 571-272-3382. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/QUN SHEN/
Primary Examiner, Art Unit 2662