Last updated: May 29, 2026

Application No. 18/417,335

METHOD AND SYSTEM FOR TRAINING RETRIEVERS AND RERANKERS USING ADAPTERS

Non-Final OA §103

Filed

Jan 19, 2024

Priority

Dec 22, 2023 — provisional 63/614,116

Examiner

MONIKANG, GEORGE C

Art Unit

2692

Tech Center

2600 — Communications

Assignee

Naver Corporation

OA Round

1 (Non-Final)

Interview Optional

— +7.7% interview lift. Interview lift (+7.7%) is below the 15.0% threshold. A written response is recommended.

Based on 952 resolved cases, 2023–2026

Examiner Intelligence

MONIKANG, GEORGE C View full profile →

Grants 75% — above average

Career Allowance Rate

712 granted / 952 resolved

+12.8% vs TC avg

Moderate +8% lift

Without

With

+7.7%

Interview Lift

resolved cases with interview

Typical timeline

3y 0m

Avg Prosecution

28 currently pending

Career history

989

Total Applications

across all art units

Statute-Specific Performance

§101

1.3%

-38.7% vs TC avg

§103

84.5%

+44.5% vs TC avg

§102

6.7%

-33.3% vs TC avg

§112

1.1%

-38.9% vs TC avg

Black line = Tech Center average estimate • Based on career data from 952 resolved cases

Office Action

§103

DETAILED ACTION

Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 02/21/2025 has been considered by the examiner.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-2, 9, 11 & 15-21 are rejected under 35 U.S.C. 103 as being unpatentable over Han et al, ‘Robust Transfer Learning with Pretrained Language Models through Adapter’, in view of Formal et al, ‘From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective’, and further in view of Yu et al, CN 116882477 A. (The Han et al and Formal et al references are cited in IDS filed 02/21/2025)
Re Claim 1, Han et al disclose a computer-implemented method for training a first-stage neural retriever (pg. 4, left column, Fine-tuning Iterations: early, mid and late stages of pretraining; wherein early stages implies first-stage), the method comprising: inserting adapter layers into one or more transformer layers of a pretrained language model (PLM) in an encoder of the first-stage retriever (abstract; Introduction section, pg. 1, left column through right column: insertion of adapter layers within the pretrained layers to improve stability and adversarial robustness in transfer learning to various downstream tasks), training the first-stage retriever on a downstream task (abstract; Introduction section, pg. 1, left column through right column: insertion of adapter layers within the pretrained layers to improve stability and adversarial robustness in transfer learning to various downstream tasks), wherein said training updates one or more parameters of the inserted adapter layers (abstract; Introduction section, pg. 1, left column through right column: insertion of adapter layers within the pretrained layers to improve stability and adversarial robustness in transfer learning to various downstream tasks); but fails to disclose the encoder being configured to receive one or more documents and generate a sparse representation for each of the documents predicting term importance of the document over a vocabulary. However, Formal et al teaches the concept of a sparse retrieval model which predicts term importance (Formal et al, pg. 2354, left column: Sparse Representation Learning; section 3.1: SPLADE). It would have been obvious to modify Han et al such that one of the inserted layers includes a layer where sparse retrieval model predicts term importance as taught in Formal et al for the purpose of directly learning high dimensional sparse representations that are able to jointly perform expansion and re-weighting through the Masked Language Modeling head of the PLM and the help of sparse regularization.
The combined teachings of Han et al and Formal et al fail to explicitly disclose storing the updated one or more parameters of the inserted adapter layers. Yu et al teaches the concept of an adapter technology in the field of deep learning where pre-training model is fine-tuned by inserting adapter module layers (Yu et al, Background technology section: 3rd paragraph beginning with “Adapter technology…”), where multimode pre-training adapter parameter/layers can be stored (Yu et al, claims 1 & 5). It would have been obvious to store the inserted adapter layers of Han et al, as taught in Yu et al for the purpose of making the system more efficient.
Re Claim 2, the combined teachings of Han et al, Formal et al and Yu et al disclose the method of claim 1, wherein the encoder is one of a document encoder and a query encoder (Formal et al, pg. 2354, bottom left column through right column: section 3.1: SPLADE: “For a query or document…”; where query or document can be selected from the Markush language).
Re Claim 9, the combined teachings of Han et al, Formal et al and Yu et al disclose the method of claim 1, wherein the PLM is pretrained to determine a prediction of an importance for an input sequence over the vocabulary with respect to tokens of the input sequence (Formal et al, pg. 2354, section 3.1: top right column: sentence that starts with “Text representations are obtained by pooling such importance predictors over the input sequence,…”).
Re Claim 11, the combined teachings of Han et al, Formal et al and Yu et al disclose the method of claim 1, wherein the downstream task comprises one of information retrieval, domain adaptation, generalization, reranking, and transfer learning (Han et al, pg. 3: Contributions: “(2) We demonstrate that our approach improves the stability of the adaptation training and adversarial robustness in downstream tasks.”; wherein adaptation is selected from the Markush language).
Re Claim 15, the combined teachings of Han et al, Formal et al and Yu et al disclose the method of claim 1, wherein each of the adapter layers comprises a bottleneck layer having trainable parameters for downprojecting an input of d-dimension into a bottleneck dimension (Han et al, pg. 1: Abstract: bottleneck layers).
Re Claim 16, the combined teachings of Han et al, Formal et al and Yu et al disclose the method of claim 1, wherein each of the adapter layers comprises: a down-projection layer having trainable parameters for downprojecting an input of d-dimension into a bottleneck dimension (Han et al, pg. 1: Abstract: bottleneck layers); and an up-projection layer having trainable parameters for up-projecting the downprojected input into the d-dimension (Han et al, pg. 1: Abstract: bottleneck layers).
Re Claim 17, the combined teachings of Han et al, Formal et al and Yu et al disclose the method of claim 16, wherein each of the adapter layers further comprises a nonlinearity (Han et al, abstract; Introduction section, pg. 1, left column through right column: insertion of adapter layers within the pretrained layers to improve stability and adversarial robustness in transfer learning to various downstream tasks; whereby the adaptive nature of the layer implies nonlinearity).
Re Claim 18, the combined teachings of Han et al, Formal et al and Yu et al disclose the method of claim 1, wherein said training includes one or more of L1 regularization and FLOPS regularization (Formal et al, pg. 2354: top half of right column: Training: FLOPS regularization; wherein FLOPS regularization is selected from the Markush language).
Re Claim 19, the combined teachings of Han et al, Formal et al and Yu et al disclose the method of claim 1, wherein said training includes distillation (Formal et al, pg. 2354: right column: Section 3.2: Distillation).
Re Claim 20, the combined teachings of Han et al, Formal et al and Yu et al disclose the ranker of claim 1, wherein said training uses in-batch negative sampling (IBN) (Formal et al, pg. 2354: top half of right column: Training: in-batch negatives).
Re Claim 21, the combined teachings of Han et al, Formal et al and Yu et al disclose the method of claim 1, wherein the PLM is pretrained using masked language modeling (MLM) (Formal et al, pg. 2354: left column: Sparse Representation Learning: Masked Language Modeling).

Claims 8 & 10 are rejected under 35 U.S.C. 103 as being unpatentable over Han et al, ‘Robust Transfer Learning with Pretrained Language Models through Adapter’, Formal et al, ‘From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective’, and Yu et al, CN 116882477 A, as applied to claim 1, and further in view of Agarwal et al, US Patent Pub. 20240289551 A1.
Re Claim 8, the combined teachings of Han et al, Formal et al and Yu et al disclose the method of claim 1, but fail to disclose wherein the training updates the parameters of the one or more of the adapter layers while one or more layers of the PLM remain frozen. However, Agarwal et al teaches the concept of freezing weights of a pretraining model while updating other weight of the pretraining model (Agarwal et al, para 078). It would have been obvious to modify Han et al, Formal et al and Yu et al such that the weights of other layers are frozen while the weights of the inserted adapter layers are updated as taught in Agarwal et al for the purpose of updating the desired layers without impacting other layers.
Re Claim 10, the combined teachings of Han et al, Formal et al and Yu et al disclose the method of claim 1, but fail to disclose wherein the training updates a number of parameters in the adapter layers that is a fraction of trainable parameters in the PLM. However, Agarwal et al teaches the concept of freezing weights of a pretraining model while updating other weight of the pretraining model (Agarwal et al, para 078). It would have been obvious to modify Han et al, Formal et al and Yu et al such that the weights of other layers are frozen while the weights of the inserted adapter layers are updated as taught in Agarwal et al thus updating a fraction of the pretraining model parameters for the purpose of updating the desired layers without impacting other layers.

Allowable Subject Matter
Claims 3-7, 12-14 & 22-25 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter for claim 3: The prior art does not teach or moderately suggest the following limitations:
Wherein: (i) the downstream task is an information retrieval task; (ii) said inserting inserts adapter layers into one or more transformer layers of the PLM in an encoder of a first-stage retriever that is trained on an information retrieval task using a first, in-domain dataset, (iii) said training uses a second-out-of-domain dataset to train the first-stage retriever on an information retrieval task, and (iv) said training updates one or more parameters of the inserted adapter layers while parameters of the PLM are frozen.
Limitations such as these may be useful in combination with other limitations of claim 1.

The following is a statement of reasons for the indication of allowable subject matter for claims 4-7: The prior art does not teach or moderately suggest the following limitations:
Wherein the first stage retriever comprises: a document encoder comprising the pretrained language model layer including one or more transformer layers; a query encoder configured to receive a query and generate a representation of the query; and a comparator configured to compare the generated representation of the query to the generated representations of the one or more documents to generate a set of respective document scores and rank the one or more documents based on the generated set of document scores.
Limitations such as these may be useful in combination with other limitations of claim 1.

The following is a statement of reasons for the indication of allowable subject matter for claim 12: The prior art does not teach or moderately suggest the following limitations:
Wherein the PLM is pretrained using an in-domain dataset, and wherein said training the first-stage retriever uses an out-of-domain dataset.
Limitations such as these may be useful in combination with other limitations of claim 1.

The following is a statement of reasons for the indication of allowable subject matter for claims 13-14: The prior art does not teach or moderately suggest the following limitations:
Wherein the transformer layers comprises N transformer layers, each of the N transformer layers comprising: a fully-connected feedforward layer; and an attention layer having trained parameters; wherein, in between 1 and N of the transformer layers an adapter among the one or more adapters is disposed downstream of the feedforward layer.
Limitations such as these may be useful in combination with other limitations of claim 1.

The following is a statement of reasons for the indication of allowable subject matter for claims 22-25: The prior art does not teach or moderately suggest the following limitations:
Wherein said inserting inserts adapter layers with a first set of parameters into one or more layers of the PLM with a second set of parameters in an encoder of the first-stage retriever; wherein said training updates one or more of the first set of parameters of the inserted adapter layers; and wherein the second set of parameters is larger than the first set of parameters and the stored updated first set of parameters of the inserted adapter layers represent the larger second set of parameters in the pretrained language model.
Limitations such as these may be useful in combination with other limitations of claim 1.

Claims 26-38 are allowed.
The following are examiner’s statement of reasons for allowable subject matter: 
Referring to claim 26, the Han et al reference (‘Robust Transfer Learning with Pretrained Language Models through Adapter’) discloses a first-stage retriever for a neural information retrieval model. The Han et al reference taken alone or in combination with another, do not disclose, teach or fairly suggest the first-stage retriever for a neural information retrieval model as a whole comprising: a document encoder including a processor comprising a pretrained language model (PLM) layer including at least N transformer layers, the document encoder being configured to receive one or more documents and generate a sparse representation for each of the documents predicting term importance of the document over a vocabulary; a query encoder including a processor configured to receive a query and generate a representation of the query; and a comparator including a processor configured to compare the generated representation of the query to the generated representations of the one or more documents to generate a set of respective document scores and rank the one or more documents based on the generated set of document scores; wherein an adapter layer is inserted into each of 1 to N of the N transformer layers; and wherein the first-stage retriever is trained on an information retrieval task to update one or more parameters of the inserted adapter layers as recited in claim 26.
Claims 27-34 depend on claim 26. 

Referring to claim 35, the Han et al reference (‘Robust Transfer Learning with Pretrained Language Models through Adapter’) discloses a computer-implemented method for information retrieval. The Han et al reference taken alone or in combination with another, do not disclose, teach or fairly suggest the computer-implemented method for information retrieval as a whole comprising: generating, by a document encoder comprising a pretrained language model (PLM) layer including one or more transformer layers having inserted adapter layers, a sparse representation for each of one or more received documents predicting term importance of the document over a vocabulary; generating, by a query encoder, a representation of a received query over the vocabulary; comparing the generated representation of the query to the generated representations of the one or more documents to generate a set of respective document scores; and ranking the one or more documents based on the generated set of document scores; wherein the document encoder is trained on a downstream task by updating parameters of the inserted adapters while the PLM remains frozen as recited in claim 35.
Claims 36-38 depend on claim 35.

Contact
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GEORGE C MONIKANG whose telephone number is (571)270-1190. The examiner can normally be reached Mon. - Fri., 9AM-5PM, ALT. Fridays off.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Carolyn R Edwards can be reached at 571-270-7136. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/GEORGE C MONIKANG/Primary Examiner, Art Unit 2692                       				2/4/2026                                                                                                                                                                      
/CAROLYN R EDWARDS/Supervisory Patent Examiner, Art Unit 2692

Read full office action

Prosecution Timeline

Jan 19, 2024

Application Filed

Feb 10, 2026

Non-Final Rejection mailed — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

18/680,676

Patent 12641371

Electrodynamic actuator with vibration compensation and method of tuning a sound system with such an actuator

1y 12m to grant Granted May 26, 2026

18/600,300

Patent 12627932

LOUDSPEAKER AND BASKET THEREOF

2y 2m to grant Granted May 12, 2026

19/206,972

Patent 12621593

WOODEN OR OTHER DIELECTRIC CAPACITIVE TOUCH INTERFACE AND LOUDSPEAKER HAVING SAME

11m to grant Granted May 05, 2026

19/192,992

Patent 12610200

METHOD, APPARATUS AND SYSTEM FOR NEURAL NETWORK HEARING AID

11m to grant Granted Apr 21, 2026

18/436,924

Patent 12604126

VEHICULAR MICROPHONE AND VEHICLE

2y 2m to grant Granted Apr 14, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.

Strategy Recommendation AI-generated — please review before filing

Get a prosecution strategy drawn from examiner precedents, rejection analysis, and claim mapping.

Typically takes 5-10 seconds — AI-generated, attorney review required before filing

Prosecution Projections

1-2

Expected OA Rounds

75%

Grant Probability

82%

With Interview (+7.7%)

3y 0m (~8m remaining)

Median Time to Grant

Low

PTA Risk

Based on 952 resolved cases by this examiner. Grant probability derived from career allowance rate.