Prosecution Insights
Last updated: April 19, 2026
Application No. 18/406,910

METHOD, APPARATUS, ELECTRONIC DEVICE AND STORAGE MEDIUM FOR MULTI-MODAL DATA PROCESSING

Non-Final OA (§102, §103)
Filed
Jan 08, 2024
Examiner
NAKHJAVAN, SHERVIN K
Art Unit
2672
Tech Center
2600 — Communications
Assignee
BEIJING ZITIAO NETWORK TECHNOLOGY CO., LTD.
OA Round
1 (Non-Final)
Grant Probability: 88% (Favorable)
OA Rounds: 1-2
To Grant: 2y 7m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 88% (544 granted / 616 resolved; +26.3% vs TC avg), above average
Interview Lift: +10.9% (moderate) among resolved cases with interview
Avg Prosecution: 2y 7m typical timeline; 23 currently pending
Total Applications: 639 across all art units
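The headline figures above are simple ratios. The sketch below assumes the 99% "with interview" number is the career allow rate plus the additive interview lift; that additive combination is an assumption about how the dashboard computes it, not documented here:

```python
# Figures from the Examiner Intelligence card above.
granted, resolved = 544, 616

career_allow_rate = granted / resolved  # career allow rate as a fraction
interview_lift = 0.109                  # reported lift for cases with an interview

# Assumed additive combination, capped at 100%.
with_interview = min(career_allow_rate + interview_lift, 1.0)

print(f"Career allow rate: {career_allow_rate:.1%}")  # 88.3%
print(f"With interview:    {with_interview:.1%}")     # 99.2%
```

Rounded to whole percentages, these reproduce the 88% and 99% shown on the card.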

Statute-Specific Performance

§101: 12.3% (-27.7% vs TC avg)
§103: 36.4% (-3.6% vs TC avg)
§102: 25.3% (-14.7% vs TC avg)
§112: 14.6% (-25.4% vs TC avg)
Deltas are measured against the Tech Center average estimate • Based on career data from 616 resolved cases
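The four statute deltas are internally consistent with a single baseline: adding each delta back to the examiner's rate recovers the same 40.0% Tech Center average estimate in every row. A quick arithmetic check, using only the numbers from the table above:

```python
# (examiner rate %, delta vs TC avg %) per statute, from the table above.
stats = {
    "§101": (12.3, -27.7),
    "§103": (36.4, -3.6),
    "§102": (25.3, -14.7),
    "§112": (14.6, -25.4),
}

# Implied TC average = examiner rate - delta; all four rows should agree.
implied = {s: round(rate - delta, 1) for s, (rate, delta) in stats.items()}
print(implied)  # every statute implies the same 40.0% baseline
assert all(v == 40.0 for v in implied.values())
```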

Office Action

§102 §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 9, 10, 11, 17, 19 and 20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by US 20240233334 A1 to Xia et al. (hereinafter 'Xia').
Regarding claim 1, Xia discloses a method for multi-modal data processing (Para [0005], wherein a multi-modal data retrieval method), comprising:

acquiring data of original modality (Para [0028], wherein in step 101, target retrieval data is inputted into a first feature extraction network corresponding to a modality of the target retrieval data to acquire a data feature of the target retrieval data); and

processing the data of the original modality by a target processing model to determine data of target modality corresponding to the data of the original modality (Para [0030] and [0031], wherein in step 103, retrieval is performed based on the target retrieval feature; the modality of the target retrieval data may be any modality, and to-be-retrieved data may also be any modality; the modality may include, for example, a text modality, an image modality, and a video modality, etc.);

wherein the target processing model comprises a multi-modal pre-trained sub-model and a multi-modal feature correction sub-model (Para [0043], FIG. 3 shows a multi-modal retrieval network model including the first feature extraction network and the second feature extraction network); and

a training process of the target processing model comprises training the multi-modal feature correction sub-model with parameters of the multi-modal pre-training sub-model fixed (Para [0042], wherein in step 203, a first loss value is determined based on a difference between the obtained retrieval features corresponding to the two or more pieces of first sample data having different modalities, and the first feature extraction networks and the second feature extraction networks corresponding to the modalities are adjusted, inherently as corrected, based on the first loss value).
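The training scheme at issue in claim 1, updating only a correction sub-model while the pre-trained sub-model's parameters stay fixed, is the familiar "frozen backbone" pattern. Below is a toy sketch of that pattern in plain Python; the model, loss, and numbers are invented for illustration and are not taken from Xia or the application:

```python
# Frozen pre-trained stage: a fixed transform whose parameter never updates.
PRETRAINED_W = 2.0  # frozen weight (hypothetical value)

def pretrained(x):
    return PRETRAINED_W * x

def model(x, corr_w):
    # Trainable correction stage applied on top of the frozen stage.
    return corr_w * pretrained(x)

# Toy data: we want model(x) == 3*x, so the ideal correction weight is 1.5.
data = [(x, 3.0 * x) for x in (1.0, 2.0, 3.0)]

corr_w = 0.5  # initial correction weight (hypothetical)
lr = 0.01
for _ in range(500):
    # Gradient of squared error w.r.t. corr_w only; PRETRAINED_W is untouched.
    grad = sum(2 * (model(x, corr_w) - y) * pretrained(x) for x, y in data)
    corr_w -= lr * grad

print(round(corr_w, 3))  # 1.5 (correction learned)
print(PRETRAINED_W)      # 2.0 (pre-trained weight unchanged)
```

The point of the pattern is visible in the loop: the gradient step touches only `corr_w`, so the pre-trained parameter is identical before and after training.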
Regarding claim 9, Xia discloses wherein the data of original modality comprises any one of the following types of data: voice type, video type, text type, or image type (Para [0031], wherein the modality of the target retrieval data may be any modality, and to-be-retrieved data may also be any modality; the modality may include, for example, a text modality, an image modality, and a video modality, etc.).

Regarding claim 10, Xia discloses wherein the multi-modal pre-trained sub-model comprises a Transformer model (Para [0065], wherein in one possible implementation, the second feature extraction network is a Transformer model network).

Regarding claim 11, Xia discloses an electronic device, comprising: one or more processors; a storage device for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement acts (Para [0076], wherein as shown in FIG. 9, the electronic device 900 may include a processing unit (for example, a central processing unit and a graphics processing unit) 901 that may perform various appropriate actions and processing based on programs stored in a read-only memory (ROM) 902 or programs loaded from a storage unit 908 into a random access memory (RAM) 903) comprising: Please refer to the corresponding method claim 1 above for further teachings.

Regarding claims 17 and 19, please refer to the corresponding method claims 9 and 10 above for further teachings.

Regarding claim 20, Xia discloses a non-transitory storage medium comprising computer-executable instructions which, when executed by a computer processor, are configured to perform acts (Para [0078], wherein the process described above with reference to the flow diagrams may be implemented as a computer software program; for example, the embodiments of the present disclosure include a computer program product, and the computer program product includes a computer program carried on a non-transitory computer-readable medium) comprising: Please refer to the corresponding method claim 1 above for further teachings.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Xia in view of US 11,023,523 B2 to Hauptmann et al. (hereinafter 'Hauptmann').

Regarding claim 7, Xia does not specifically disclose wherein the target processing model is applied to at least one of the following tasks: a video-based text indexing task, a text-based video indexing task, a video-based text generation task, a text-based video generation task, or a video question answering task.
Hauptmann discloses at least a video-based text indexing task (column 1, line 63 through column 2, line 2, wherein the system includes an indexing engine for automatically indexing data representing the audio-visual recording, with the data being indexed in association with the one or more adjusted weights for the one or more semantic features, respectively . . . the semantic features comprise one or more of a visual feature, a textual feature, or an audio feature).

Xia and Hauptmann are combinable because they both disclose image feature extraction. Therefore, before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the video-based indexing of Hauptmann's method with Xia's in order to enable quick retrieval of an audio-visual recording 52 (column 6, lines 7-8).

Allowable Subject Matter

Claims 2-6, 8, 12-16 and 18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

The following is a statement of reasons for the indication of allowable subject matter. The prior art or the prior art of record, specifically Xia and CN 114398505 A to Huang, does not disclose: . . . . training the video feature correction branch and the text feature correction branch based on the data of the target modality and the label data corresponding to the sample data, of claims 2 and 12 combined with other features and elements of the claims. Claims 3, 4, 8, 13, 14 and 18 depend from an allowable base claim and are thus allowable themselves.

Xia and CN 115238130 A to Wang et al., does not disclose: . . . . wherein the multi-modal feature correction sub-model further comprises a cross-modal interaction branch; wherein an inter-modal shared parameter, which is acquired by the cross-modal interaction branch during a training process of the multi-modal feature correction sub-model, is used for aligning cross features for data of different modalities, of claims 5 and 15 combined with other features and elements of the claim. Claims 6 and 16 depend from an allowable base claim and are thus allowable themselves.

Contact Information

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHERVIN K NAKHJAVAN, whose telephone number is (571) 272-5731. The examiner can normally be reached Monday-Friday, 9:00-12:00 PST.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Sue Lefkowitz, can be reached at (571) 272-3638. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SHERVIN K NAKHJAVAN/ Primary Examiner, Art Unit 2672

Prosecution Timeline

Jan 08, 2024
Application Filed
Dec 27, 2025
Non-Final Rejection — §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602766
METHOD, APPARATUS, DEVICE, MEDIUM AND PRODUCT FOR DETECTING ALIGNMENT OF BATTERY ELECTRODE PLATES
Granted Apr 14, 2026 (2y 5m to grant)
Patent 12597159
SYSTEM, INFORMATION PROCESSING APPARATUS, METHOD, AND COMPUTER-READABLE MEDIUM
Granted Apr 07, 2026 (2y 5m to grant)
Patent 12592313
ANALYZING SURGICAL VIDEOS TO IDENTIFY A BILLING CODING MISMATCH
Granted Mar 31, 2026 (2y 5m to grant)
Patent 12579671
MINIATURIZED PHASE CALIBRATION APPARATUS FOR TIME-OF-FLIGHT DEPTH CAMERA
Granted Mar 17, 2026 (2y 5m to grant)
Patent 12561791
METHOD TO CALIBRATE, PREDICT, AND CONTROL STOCHASTIC DEFECTS IN EUV LITHOGRAPHY
Granted Feb 24, 2026 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 88%
With Interview: 99% (+10.9%)
Median Time to Grant: 2y 7m
PTA Risk: Low
Based on 616 resolved cases by this examiner. Grant probability derived from career allow rate.
