Prosecution Insights
Last updated: April 19, 2026
Application No. 18/385,141

PORTABLE LARGE LANGUAGE MODELS

Final Rejection — §103, §DP
Filed: Oct 30, 2023
Examiner: MONIKANG, GEORGE C
Art Unit: 2692
Tech Center: 2600 — Communications
Assignee: Zoom Video Communications, Inc.
OA Round: 2 (Final)

Grant Probability: 74% (Favorable)
OA Rounds: 3-4
To Grant: 3y 0m
With Interview: 82%

Examiner Intelligence

Career Allow Rate: 74%, above average (701 granted / 941 resolved; +12.5% vs TC avg)
Interview Lift: +7.2% (moderate), based on resolved cases with vs. without an interview
Avg Prosecution: 3y 0m typical timeline; 48 applications currently pending
Total Applications: 989 across all art units (career history)

Statute-Specific Performance

§101: 3.9% (-36.1% vs TC avg)
§103: 58.6% (+18.6% vs TC avg)
§102: 22.5% (-17.5% vs TC avg)
§112: 4.0% (-36.0% vs TC avg)
Tech Center averages are estimates. Based on career data from 941 resolved cases.
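To make the arithmetic behind these dashboard figures explicit, here is a minimal sketch. Treating the interview lift as additive and recovering each Tech Center baseline as the examiner's rate minus the displayed delta are assumptions about how the tool derives its numbers, not a documented methodology.

```python
# Minimal sketch of the arithmetic behind the dashboard figures above.
# Assumptions (not documented by the tool): the interview lift is additive,
# and each Tech Center baseline is recovered as (examiner rate - displayed delta).

granted, resolved = 701, 941
career_allow_rate = granted / resolved        # ~74.5%, shown as 74%
with_interview = career_allow_rate + 0.072    # +7.2% lift, ~81.7%, shown as 82%

# Statute-specific rates and their displayed deltas vs. the TC average.
statutes = {
    "§101": (0.039, -0.361),
    "§103": (0.586, +0.186),
    "§102": (0.225, -0.175),
    "§112": (0.040, -0.360),
}

print(f"Career allow rate: {career_allow_rate:.1%}")
print(f"With interview:    {with_interview:.1%}")
for statute, (rate, delta) in statutes.items():
    tc_avg = rate - delta                     # baseline implied by the delta
    print(f"{statute}: {rate:.1%} ({delta:+.1%} vs TC avg ~{tc_avg:.0%})")
```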

Office Action

§103, §DP
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments

Applicant's arguments filed 10/15/2025 have been fully considered but they are not persuasive. With regard to applicant's argument that the Gerard et al reference fails to disclose the amended limitation calling for transmitting the trained reduced LLM to the remote client device, the examiner maintains the rejection. Gerard et al discloses a bi-directional wireless communication network where LLM processing for a client device is carried out at a remote location and wirelessly transmitted back to the client device after remote LLM processing (Gerard et al, fig. 1: client device 110, large language model 120; para 0020: LLM processing is carried out remotely, where a request for remote LLM processing is wirelessly sent to the LLM processor from the client device via bi-directional wireless communication 199 and the processed LLM is wirelessly transmitted back to the client device; wherein bi-directional wireless communication implies bi-directional wireless transmission between the client device and the LLM processor).

Double Patenting

The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).

A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).

The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional, the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to a final Office action, see 37 CFR 1.113(c).
A request for reconsideration, while not provided for in 37 CFR 1.113(c), may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13. The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer.

Claim 1 of 18/385,141: A method comprising: accessing a trained large language model (“LLM”), the trained LLM comprising a first set of parameters; generating a reduced LLM, the reduced LLM comprising a second set of parameters, the second set of parameters smaller than the first set of parameters; accessing a training data set and the trained LLM; training the reduced LLM based on the trained LLM and the training data set; receiving a request for the trained reduced LLM from a remote client device; and transmitting the trained reduced LLM to the remote client device.

Claim 1 of 18/385,158: A method comprising: transmitting, by a client device, a request for a reduced large language model (“LLM”) to a remote server; receiving, by the client device from the remote server, and storing the reduced LLM, the reduced LLM based on a trained general LLM; receiving, by the client device, a request to generate content using the reduced LLM; providing the request to the reduced LLM; and receiving generated content from the reduced LLM based on the request.

Claim 5 of 18/385,158: The method of claim 1, wherein the trained general LLM has a first set of parameters and the reduced LLM has a second set of parameters, the second set of parameters having fewer parameters than the first set of parameters.

Claim 8 of 18/385,141: A system comprising: a non-transitory computer-readable medium; and one or more processors configured to execute processor-executable instructions stored in the non-transitory computer-readable medium to: access a trained large language model (“LLM”), the trained LLM comprising a first set of parameters; generate a reduced LLM, the reduced LLM comprising a second set of parameters, the second set of parameters smaller than the first set of parameters; access a training data set and the trained LLM; train the reduced LLM based on the trained LLM and the training data set; receive a request for the trained reduced LLM from a remote client device; and transmit the trained reduced LLM to the remote client device.

Claim 8 of 18/385,158: A system comprising: a non-transitory computer-readable medium; and one or more processors configured to execute processor-executable instructions stored in the non-transitory computer-readable medium to: transmit a request for a reduced large language model (“LLM”) to a remote server; receive, from the remote server, and store a reduced LLM, the reduced LLM based on a trained general LLM; receive a request to generate content using the reduced LLM; provide the request to the reduced LLM; and receive generated content from the reduced LLM based on the request.
Claim 12 of 18/385,158: The system of claim 8, wherein the trained general LLM has a first set of parameters and the reduced LLM has a second set of parameters, the second set of parameters having fewer parameters than the first set of parameters.

Claim 15 of 18/385,141: A non-transitory computer-readable medium comprising processor-executable instructions configured to cause one or more processors to: access a trained large language model (“LLM”), the trained LLM comprising a first set of parameters; generate a reduced LLM, the reduced LLM comprising a second set of parameters, the second set of parameters smaller than the first set of parameters; access a training data set and the trained LLM; train the reduced LLM based on the trained LLM and the training data set; receive a request for the trained reduced LLM from a remote client device; and transmit the trained reduced LLM to the remote client device.

Claim 15 of 18/385,158: A non-transitory computer-readable medium comprising processor-executable instructions configured to cause one or more processors to: transmit a request for a reduced large language model (“LLM”) to a remote server; receive, from the remote server, and store a reduced LLM, the reduced LLM based on a trained general LLM; receive a request to generate content using the reduced LLM; provide the request to the reduced LLM; and receive generated content from the reduced LLM based on the request.

Claim 19 of 18/385,158: The non-transitory computer-readable medium of claim 15, wherein the trained general LLM has a first set of parameters and the reduced LLM has a second set of parameters, the second set of parameters having fewer parameters than the first set of parameters.

Claims 1, 8, and 15 of application number 18/385,141 (hereinafter referred to as ‘141) are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1, 5, 8, 12, 15, and 19 of copending Application No. 18/385,158 (hereinafter referred to as ‘158). Although the claims at issue are not identical, they are not patentably distinct from each other because ‘141 claims 1, 8, and 15 are obvious variants in wording of ‘158 claims 1, 5, 8, 12, 15, and 19. This is a provisional nonstatutory double patenting rejection because the patentably indistinct claims have not in fact been patented.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Zheng et al, ‘DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization’, in view of Gerard et al, US Patent Pub. 20240411798 A1. (The Zheng et al reference is cited in the IDS filed 01/22/2025.)

Re Claim 1, Zheng et al discloses a method comprising: accessing a trained large language model (“LLM”), the trained LLM comprising a first set of parameters (abstract: BART/BERT is a large language model, where BART/BERT is trained via quantization in section 2.1 to produce a first set of parameters that are smaller than the initial BART/BERT large language model); generating a reduced LLM, the reduced LLM comprising a second set of parameters, the second set of parameters smaller than the first set of parameters (abstract: BART/BERT is a large language model, where BART/BERT is trained via quantization in section 2.1 to produce a first set of parameters that are smaller than the initial BART/BERT large language model, and thereafter trained via distillation in section 2.2 to produce smaller parameters than even the first set of quantization parameters, since the distillation training is carried out sequentially after the quantization training); accessing a training data set and the trained LLM (section 2.3: distillation-aware quantization further trains the quantization- and distillation-trained BART/BERT large language model to produce fine-tuned and further reduced parameters, where the distillation-aware quantization utilizes the trained quantization and distillation BART/BERT large language model along with the inherent training data sets that are utilized when training models); training the reduced LLM based on the trained LLM and the training data set (section 2.3: distillation-aware quantization further trains the quantization- and distillation-trained BART/BERT large language model to produce fine-tuned and further reduced parameters, where the distillation-aware quantization utilizes the trained quantization and distillation BART/BERT large language model along with the inherent training data sets that are utilized when training models); but fails to disclose receiving a request for the trained reduced LLM from a remote client device; and transmitting the trained reduced LLM to the remote client device.

However, Gerard et al discloses a system that teaches the concept of a client device requesting large language model training at a remote server location and then transmitting said trained large language model back to the client device (Gerard et al, fig. 1: client device 110, large language model 120; para 0020: LLM processing is carried out remotely, where a request for remote LLM processing is wirelessly sent to the LLM processor from the client device via bi-directional wireless communication 199 and the processed LLM is wirelessly transmitted back to the client device; wherein bi-directional wireless communication implies bi-directional wireless transmission between the client device and the LLM processor). It would have been obvious to modify the Zheng et al system such that its large language model training can be carried out remotely and transmitted back to the requesting client device, as taught in Gerard et al, for the purpose of reducing the processing burden on the client device.
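For context on the distillation step mapped above (training a reduced LLM from a trained LLM and a training data set), the sketch below shows one generic knowledge-distillation training step. It is not Zheng et al.'s DQ-BART procedure and not the applicant's disclosed method; the teacher, student, loader, and optimizer objects are hypothetical placeholders.

```python
# Generic knowledge-distillation step (illustrative only; not DQ-BART and not
# the applicant's implementation). `teacher`, `student`, `loader`, and
# `optimizer` are hypothetical placeholders for a trained LLM, a smaller
# reduced LLM, a training data set, and its optimizer.
import torch
import torch.nn.functional as F

def distill_step(teacher, student, batch, optimizer, temperature=2.0):
    teacher.eval()
    with torch.no_grad():
        teacher_logits = teacher(batch["input_ids"]).logits

    student_logits = student(batch["input_ids"]).logits

    # Soft-target loss: the student mimics the teacher's output distribution.
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Usage sketch (placeholders): iterate the training data set and distill.
# for batch in loader:
#     distill_step(teacher, student, batch, optimizer)
```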
Re Claim 2, the combined teachings of Zheng et al and Gerard et al disclose the method of claim 1, wherein generating the reduced LLM comprises reducing a precision of one or more parameters of the first set of parameters (Zheng et al, abstract: BART/BERT is a large language model, where BART/BERT is trained via quantization in section 2.1 to produce a first set of parameters that are smaller than the initial BART/BERT large language model, and thereafter trained via distillation in section 2.2 to produce smaller parameters than even the first set of quantization parameters, since the distillation training is carried out sequentially after the quantization training).

Re Claim 3, the combined teachings of Zheng et al and Gerard et al disclose the method of claim 2, wherein reducing the precision of the one or more parameters of the first set of parameters comprises converting at least one parameter to an integer value (Zheng et al, abstract: BART/BERT is a large language model, where BART/BERT is trained via quantization in section 2.1 to produce a first set of parameters that are smaller than the initial BART/BERT large language model, and thereafter trained via distillation in section 2.2 to produce smaller parameters than even the first set of quantization parameters, since the distillation training is carried out sequentially after the quantization training; whereby quantization training aims to reduce the numerical precision of model parameters from floating point numbers to lower bit-width integers).

Re Claim 4, the combined teachings of Zheng et al and Gerard et al disclose the method of claim 3, but fail to explicitly disclose wherein the converting the at least one parameter comprises comparing a difference from the integer value to a predetermined threshold, and determining the difference satisfies the predetermined threshold. However, Zheng et al includes quantization training (Zheng et al, abstract: BART/BERT is a large language model, where BART/BERT is trained via quantization in section 2.1 to produce a first set of parameters that are smaller than the initial BART/BERT large language model, and thereafter trained via distillation in section 2.2 to produce smaller parameters than even the first set of quantization parameters, since the distillation training is carried out sequentially after the quantization training) and, as such, it would have been obvious to modify the quantization training such that it can reduce from higher floating-point numbers to lower precision data types that include lower floating-point numbers, along with lower bit integers, since reducing floating-point numbers to other lower floating-point numbers is routine processing in quantization training; whereby the desired lower floating-point number and/or integer is regarded as the threshold, and the lowered floating-point number and/or integer has to be at least the desired threshold or lower, for the purpose of reducing memory footprint and computational costs.

Re Claim 5, the combined teachings of Zheng et al and Gerard et al disclose the method of claim 2, but fail to explicitly disclose wherein reducing the precision of the one or more parameters of the first set of parameters comprises converting a first parameter from a first floating-point representation to a second floating-point representation, the first floating-point representation having more bits than the second floating-point representation.
However, Zheng et al includes quantization training (Zheng et al, abstract: BART/BERT is a large language model, where BART/BERT is trained via quantization in section 2.1 to produce a first set of parameters that are smaller than the initial BART/BERT large language model, and thereafter trained via distillation in section 2.2 to produce smaller parameters than even the first set of quantization parameters, since the distillation training is carried out sequentially after the quantization training) and, as such, it would have been obvious to modify the quantization training such that it can reduce from higher floating-point numbers to lower precision data types that include lower floating-point numbers, along with lower bit integers, since reducing floating-point numbers to other lower floating-point numbers is routine processing in quantization training, for the purpose of reducing memory footprint and computational costs.

Re Claim 6, the combined teachings of Zheng et al and Gerard et al disclose the method of claim 2, wherein reducing the precision of the one or more parameters of the first set of parameters comprises converting a first parameter from a floating-point representation to an integer representation, the floating-point representation having more bits than the integer representation (Zheng et al, abstract: BART/BERT is a large language model, where BART/BERT is trained via quantization in section 2.1 to produce a first set of parameters that are smaller than the initial BART/BERT large language model, and thereafter trained via distillation in section 2.2 to produce smaller parameters than even the first set of quantization parameters, since the distillation training is carried out sequentially after the quantization training; whereby quantization training aims to reduce the numerical precision of model parameters from floating point numbers to lower bit-width integers).

Re Claim 7, the combined teachings of Zheng et al and Gerard et al disclose the method of claim 1, wherein generating the reduced LLM comprises selecting a second LLM having fewer parameters than the trained LLM (Zheng et al, section 2.3: distillation-aware quantization further trains the quantization- and distillation-trained BART/BERT large language model to produce fine-tuned and further reduced parameters, where the distillation-aware quantization utilizes the trained quantization and distillation BART/BERT large language model along with the inherent training data sets that are utilized when training models).

Claim 8 has been analyzed and rejected according to claim 1. Claim 9 has been analyzed and rejected according to claim 2. Claim 10 has been analyzed and rejected according to claim 3. Claim 11 has been analyzed and rejected according to claim 4. Claim 12 has been analyzed and rejected according to claim 5. Claim 13 has been analyzed and rejected according to claim 6. Claim 14 has been analyzed and rejected according to claim 7. Claim 15 has been analyzed and rejected according to claim 1. Claim 16 has been analyzed and rejected according to claim 2. Claim 17 has been analyzed and rejected according to claim 3. Claim 18 has been analyzed and rejected according to claim 4. Claim 19 has been analyzed and rejected according to claim 5. Claim 20 has been analyzed and rejected according to claim 6.
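Claims 2-6 above turn on reducing parameter precision: float-to-integer conversion (claims 3 and 6), a round-off threshold (claim 4), and float-to-lower-bit-float conversion (claim 5). The NumPy sketch below illustrates generic post-training int8 quantization with a tolerance check and a float16 cast; it is an editorial illustration under assumed conventions, not the scheme of Zheng et al. or the claimed implementation.

```python
# Generic per-tensor int8 quantization with a round-off tolerance check
# (illustrative only; not Zheng et al.'s DQ-BART scheme or the claimed method).
import numpy as np

def quantize_int8(weights: np.ndarray, threshold: float = 0.05):
    """Map float32 weights to int8 plus a scale factor.

    `threshold` is a stand-in for a predetermined bound on the per-parameter
    difference introduced by rounding (cf. the claim 4 limitation).
    """
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -128, 127).astype(np.int8)
    dequantized = q.astype(np.float32) * scale
    max_error = np.abs(weights - dequantized).max()
    return q, scale, bool(max_error <= threshold)

w = np.random.randn(4, 4).astype(np.float32)   # stand-in for one weight matrix
q, scale, within_tolerance = quantize_int8(w)
print(q.dtype, scale, within_tolerance)        # int8, float scale, True/False

# A lower-bit floating-point conversion (cf. claim 5) is analogous,
# e.g. float32 -> float16:
w16 = w.astype(np.float16)
```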
Conclusion

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to GEORGE C MONIKANG whose telephone number is (571) 270-1190. The examiner can normally be reached Mon.-Fri., 9 AM-5 PM, alternate Fridays off. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Carolyn R Edwards, can be reached at 571-270-7136. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/GEORGE C MONIKANG/
Primary Examiner, Art Unit 2692
1/29/2026
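The Gerard-based combination also rests on a client device requesting a reduced LLM from a remote server and receiving the model back. Purely as an illustrative sketch of such an exchange (the endpoint path, port, and file names are hypothetical and not drawn from the application or the cited references):

```python
# Minimal client/server exchange for the "request and transmit a reduced LLM"
# limitations discussed above (illustrative only; endpoint, port, and file
# names are hypothetical, not from the application or the cited references).
from http.server import BaseHTTPRequestHandler, HTTPServer
import urllib.request

MODEL_PATH = "reduced_llm.bin"   # hypothetical serialized reduced-LLM weights

class ReducedLLMHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Server side: receive a request for the trained reduced LLM ...
        if self.path == "/reduced-llm":
            with open(MODEL_PATH, "rb") as f:
                payload = f.read()
            self.send_response(200)
            self.send_header("Content-Type", "application/octet-stream")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            # ... and transmit it to the remote client device.
            self.wfile.write(payload)
        else:
            self.send_error(404)

def fetch_reduced_llm(server_url: str, out_path: str = "local_reduced_llm.bin"):
    # Client side: request the reduced LLM, then receive and store it locally.
    with urllib.request.urlopen(f"{server_url}/reduced-llm") as resp:
        data = resp.read()
    with open(out_path, "wb") as f:
        f.write(data)
    return out_path

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8000), ReducedLLMHandler).serve_forever()
```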

Prosecution Timeline

Oct 30, 2023
Application Filed
Jul 11, 2025
Non-Final Rejection — §103, §DP
Oct 15, 2025
Response Filed
Jan 29, 2026
Final Rejection — §103, §DP (current)

Precedent Cases

Applications with similar technology granted by this same examiner

Patent 12604126
VEHICULAR MICROPHONE AND VEHICLE
2y 5m to grant Granted Apr 14, 2026
Patent 12596518
MICROPHONE INTERFACE, VEHICLE, CONNECTION METHOD, AND PRODUCTION METHOD
2y 5m to grant Granted Apr 07, 2026
Patent 12596888
CONTEXTUALIZATION OF GENERATIVE LANGUAGE MODELS BASED ON ENTITY RESOURCE IDENTIFIERS
2y 5m to grant Granted Apr 07, 2026
Patent 12598428
TRANSDUCER AND ELECTRONIC DEVICE
2y 5m to grant Granted Apr 07, 2026
Patent 12591749
MACHINE LEARNING SYSTEM FOR MULTI-DOMAIN LONG DOCUMENT CLUSTERING
2y 5m to grant Granted Mar 31, 2026
Study what changed to get past this examiner, based on the 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 74%
With Interview: 82% (+7.2%)
Median Time to Grant: 3y 0m
PTA Risk: Moderate
Based on 941 resolved cases by this examiner. Grant probability derived from career allow rate.
