Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
1. This action is in response to the application filed on 05/09/2024. Claims 1 and 3-21 are pending.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
Claim 4 recites the limitation "the average quality" in line 4. There is insufficient antecedent basis for this limitation in the claim.
Claim 7 recites the limitation "the values" in line 4. There is insufficient antecedent basis for this limitation in the claim.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 8-9, 11-14 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Marin (EP 4231589 A1) in view of Gupta et al. (US 20250335795) and further in view of Nahamoo et al. (KR 20230025714 A)
Regarding claim 1:
A method for use in protecting a first machine learning model that is queryable over an application programming interface, API, against an adversary querying the first machine learning model through the API in order to build up a database of query-response pairs, the method comprising:
identifying a user of the API as a potential adversary: (identifying adversaries, and the tracing requests from the adversaries, and rerouting the tracing requests to other nodes which handle and respond to the tracing request: Marin, page 4, lines 9-15).
in response to a query from the potential adversary, through the API, providing a response from a second machine learning model instead of the first machine learning model: (skip real nodes and reroute tracing requests to other nodes which handle and respond to the tracing requests: Marin, page 4, lines 9-15).
However, Marin does not teach the first machine learning model having been trained on a first dataset and wherein the second machine learning model having been trained on a second dataset that is different to the first dataset.
In similar art, Gupta teaches that a first LLM, which includes a table of contents, is instructed to generate query-response pairs for potential user queries, and that a second LLM is instructed to respond to the user query based on context provided by similar query-response pairs (see Gupta, [0016]).
Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Gupta’s teachings into Marin’s system in order to provide an efficient query-response system (see Gupta, [0014]-[0015]).
However, Marin-Gupta does not teach the second dataset comprising cached query-response pairs from previous queries requested by the potential adversary through the API, as stored in a cache.
In similar art, Nahamoo teaches that the query cache of the query processing module stores answers/content of frequently asked questions, and the like. Such answers/content may include content previously retrieved from the DOM document in response to a previously submitted query (Nahamoo, page 9; page 10).
Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Nahamoo’s teachings into Marin-Gupta’s system in order to provide an efficient query-response system (see Nahamoo, abstract).
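For illustration only, the limitations of claim 1 discussed above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application or from Marin, Gupta, or Nahamoo: the class name `ProtectedAPI`, its methods, and the representation of a model as a plain function are all hypothetical, and training of the second model on the cached pairs is omitted.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical model interface: query text in, response text out.
Model = Callable[[str], str]


class ProtectedAPI:
    """Illustrative router; every name here is an assumption."""

    def __init__(self, first_model: Model, second_model: Model) -> None:
        self.first_model = first_model    # the model being protected
        self.second_model = second_model  # the substitute model
        self.flagged: Set[str] = set()    # users identified as potential adversaries
        # Per-user cache of (query, response) pairs from previous queries.
        self.cache: Dict[str, List[Tuple[str, str]]] = {}

    def flag(self, user_id: str) -> None:
        """Identify a user of the API as a potential adversary."""
        self.flagged.add(user_id)

    def query(self, user_id: str, text: str) -> str:
        """Answer from the second model if the user is flagged, else the first."""
        model = self.second_model if user_id in self.flagged else self.first_model
        response = model(text)
        self.cache.setdefault(user_id, []).append((text, response))
        return response

    def second_dataset(self, user_id: str) -> List[Tuple[str, str]]:
        """Cached query-response pairs that could form the second dataset."""
        return list(self.cache.get(user_id, []))
```

In this sketch, a call to `flag("u1")` causes later calls to `query("u1", ...)` to be answered by the second model, while other users continue to receive responses from the first model.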
Regarding claim 8:
In addition to the rejection of claim 1, Marin-Gupta-Nahamoo further teaches the second machine learning model has a different architecture to the first machine learning model; or is a different type of model to the first machine learning model: (Gupta teaches that a first LLM, which includes a table of contents, is instructed to generate query-response pairs for potential user queries, and that a second LLM is instructed to respond to the user query based on context provided by similar query-response pairs: Gupta, [0016]).
Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Gupta’s teachings into Marin-Nahamoo’s system in order to provide an efficient query-response system (see Gupta, [0014]-[0015]).
Regarding claim 9:
In addition to the rejection of claim 1, Marin-Gupta-Nahamoo further teaches responsive to identifying the user of the API as a potential adversary, training the second machine learning model on the second dataset: (skip real nodes and reroute tracing requests to other nodes which handle and respond to the tracing requests: Marin, page 4, lines 9-15).
Regarding claim 12:
In addition to the rejection of claim 9, Marin-Gupta-Nahamoo further teaches wherein the second dataset comprises query-response pairs from previous queries requested by the potential adversary through the API, as stored in a cache and wherein the second machine learning model is trained in an incremental manner on the previous queries as they are cached: (Nahamoo teaches that the query cache of the query processing module stores answers/content of frequently asked questions, and the like. Such answers/content may include content previously retrieved from the DOM document in response to a previously submitted query: Nahamoo, page 9; page 10).
Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Nahamoo’s teachings into Marin-Gupta’s system in order to provide an efficient query-response system (see Nahamoo, abstract).
Regarding claim 11:
In addition to the rejection of claim 1, Marin-Gupta-Nahamoo further teaches in response to a query from the potential adversary through the API, providing a response from a third machine learning model instead of the first machine learning model, whilst the second machine learning model is being trained: (skip real nodes and reroute tracing requests to other nodes which handle and respond to the tracing requests: Marin, page 4, lines 9-15).
Regarding claim 13:
In addition to the rejection of claim 1, Marin-Gupta-Nahamoo further teaches the step of providing a response from a second machine learning model instead of the first machine learning model is further performed responsive to: an estimation of a second data extraction level being above a second threshold data extraction level; or a second estimation of likelihood that the user of the API is actually an adversary being above a second likelihood threshold: (identifying adversaries, and the tracing requests from the adversaries, and rerouting the tracing requests to other nodes which handle and respond to the tracing request: Marin, page 4, lines 9-15).
Regarding claim 14:
In addition to the rejection of claim 1, Marin-Gupta-Nahamoo further teaches the second machine learning model is deployed with the first machine learning model: (Gupta teaches that a first LLM, which includes a table of contents, is instructed to generate query-response pairs for potential user queries, and that a second LLM is instructed to respond to the user query based on context provided by similar query-response pairs: Gupta, [0016]).
Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Gupta’s teachings into Marin-Nahamoo’s system in order to provide an efficient query-response system (see Gupta, [0014]-[0015]).
Regarding claim 19:
In addition to the rejection of claim 1, Marin-Gupta-Nahamoo further teaches managing a suspected extraction attack by the potential adversary: (identifying adversaries, and the tracing requests from the adversaries, and rerouting the tracing requests to other nodes which handle and respond to the tracing request: Marin, page 4, lines 9-15).
Regarding claim 20:
An apparatus for use in protecting a first machine learning model that is queryable over an application programming interface, API, against an adversary querying the first machine learning model through the API in order to build up a database of query-response pairs, the apparatus comprising: a memory comprising instruction data representing a set of instructions; and a processor configured to communicate with the memory and to execute the set of instructions, wherein the set of instructions, when executed by the processor, causing the apparatus to:
identify a user of the API as a potential adversary: (identifying adversaries, and the tracing requests from the adversaries, and rerouting the tracing requests to other nodes which handle and respond to the tracing request: Marin, page 4, lines 9-15).
in response to a query from the potential adversary, through the API, providing a response from a second machine learning model instead of the first machine learning model: (skip real nodes and reroute tracing requests to other nodes which handle and respond to the tracing requests: Marin, page 4, lines 9-15).
However, Marin does not teach the first machine learning model having been trained on a first dataset and wherein the second machine learning model having been trained on a second dataset that is different to the first dataset.
In similar art, Gupta teaches that a first LLM, which includes a table of contents, is instructed to generate query-response pairs for potential user queries, and that a second LLM is instructed to respond to the user query based on context provided by similar query-response pairs (see Gupta, [0016]).
Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Gupta’s teachings into Marin’s system in order to provide an efficient query-response system (see Gupta, [0014]-[0015]).
However, Marin-Gupta does not teach the second dataset comprising cached query-response pairs from previous queries requested by the potential adversary through the API, as stored in a cache.
In similar art, Nahamoo teaches that the query cache of the query processing module stores answers/content of frequently asked questions, and the like. Such answers/content may include content previously retrieved from the DOM document in response to a previously submitted query (Nahamoo, page 9; page 10).
Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Nahamoo’s teachings into Marin-Gupta’s system in order to provide an efficient query-response system (see Nahamoo, abstract).
Claims 3 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Marin-Gupta-Nahamoo in view of Clever et al. (US 20250061583)
Regarding claim 3:
Marin-Gupta-Nahamoo discloses the invention substantially as disclosed in claim 1, but does not explicitly teach using a data augmentation process to generate synthetic training data from the cached query-response pairs, and supplementing the second dataset with the synthetic training data.
In similar art, Clever teaches generating synthetic images with one or more augmentations realistically added to objects in the images. A synthetic augmentation system can thus be used to increase the volume and variety of training data, offering a larger and more varied dataset for model training (see Clever, abstract).
Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Clever’s teachings into Marin-Gupta-Nahamoo’s system in order to save resources and development time.
Regarding claim 21:
Marin-Gupta-Nahamoo discloses the invention substantially as disclosed in claim 20, but does not explicitly teach using a data augmentation process to generate synthetic training data from the cached query-response pairs, and supplementing the second dataset with the synthetic training data.
In similar art, Clever teaches generating synthetic images with one or more augmentations realistically added to objects in the images. A synthetic augmentation system can thus be used to increase the volume and variety of training data, offering a larger and more varied dataset for model training (see Clever, abstract).
Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Clever’s teachings into Marin-Gupta-Nahamoo’s system in order to save resources and development time.
Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Marin-Gupta-Nahamoo in view of Ahmed et al. (US 20220351072)
Regarding claim 4:
Marin-Gupta-Nahamoo discloses the invention substantially as disclosed in claim 1, but does not explicitly teach that the dataset is supplemented with one or more of: training data from the first dataset that is on average of lower quality compared to the average quality of the first dataset as a whole; training data from the first dataset that is not confidential; and training data that is publicly available.
In similar art, Ahmed teaches that, for sets having an average data quality score less than the desired data quality score, the method determines the median data quality score for the set and forms the training data set by merging all segments having an individual data quality score greater than the median data quality score for the set of segments having the most frequent seasonality period.
Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Ahmed’s teachings into Marin-Gupta-Nahamoo’s system in order to save resources and development time.
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Marin-Gupta-Nahamoo in view of Zhao et al. (US 20210248376)
Regarding claim 5:
Marin-Gupta-Nahamoo discloses the invention substantially as disclosed in claim 1, but does not explicitly teach a subset of data in the dataset.
In similar art, Zhao teaches a subset of data in the dataset (see Zhao, [0115]).
Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Zhao’s teachings into Marin-Gupta-Nahamoo’s system in order to save resources and development time.
Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Marin-Gupta-Nahamoo-Zhao in view of Ahmed et al. (US 20220351072)
Regarding claim 6:
Marin-Gupta-Nahamoo-Zhao discloses the invention substantially as disclosed in claim 5, but does not explicitly teach the subset of the data in the first dataset comprises one or both of: training data from the first dataset that is on average of lower quality compared to the average quality of the first dataset as a whole; and training data from the first dataset that is on average less confidential compared to the average confidentiality level of the first dataset as a whole.
In similar art, Ahmed teaches sets having an average data quality score less than the desired data quality score (Ahmed [0037]).
Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Ahmed’s teachings into Marin-Gupta-Nahamoo-Zhao’s system in order to save resources and development time.
Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Marin-Gupta-Nahamoo-Zhao in view of Clever et al. (US 20250061583)
Regarding claim 7:
Marin-Gupta-Nahamoo-Zhao discloses the invention substantially as disclosed in claim 5, but does not explicitly teach that the dataset comprises one or both of: synthetic training data; and training data from the first dataset, the values of which have been offset with random offset values.
In similar art, Clever teaches that a synthetic augmentation system can be used to increase the volume and variety of training data, offering a larger and more varied dataset for model training (see Clever, abstract).
Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Clever’s teachings into Marin-Gupta-Nahamoo-Zhao’s system in order to save resources and development time.
Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Marin-Gupta-Nahamoo in view of Zoldi et al. (US 12,323,440)
Regarding claim 10:
Marin-Gupta-Nahamoo discloses the invention substantially as disclosed in claim 9, but does not explicitly teach that the machine learning model is trained responsive to: an estimation of a first data extraction level being above a first threshold data extraction level; or a first estimation of likelihood that the user of the API is actually an adversary being above a first likelihood threshold.
In similar art, Zoldi teaches training an adversary detection model based on attributes of the transactions in the corpus, wherein a plurality of attributes and adversarial latent features of the adversary detection model are aggregated using a plurality of moving average features across various transactions, a moving average feature from among the moving average features going through a quantile estimation process, wherein the moving average feature goes through the quantile estimation process using an equation (claim 18).
Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Zoldi’s teachings into Marin-Gupta-Nahamoo’s system in order to save resources and development time.
Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Marin-Gupta-Nahamoo in view of Anderson et al. (US 9,928,365)
Regarding claim 15:
Marin-Gupta-Nahamoo discloses the invention substantially as disclosed in claim 1, but does not explicitly teach that identifying the user of the API as a potential adversary comprises one or more of: comparing a query pattern of the user to query patterns of other users to determine whether the user is performing an abnormal query pattern compared to the other users; comparing a query pattern of the user to previous query patterns of the user to determine whether the user is currently performing an abnormal query pattern compared to the previous query patterns; comparing a query pattern of the user to query patterns associated with extraction attacks to determine whether the user is performing a query pattern consistent with an extraction attack; and identifying the user of the API as a potential adversary if an estimation of a third data extraction level is above a third threshold data extraction level.
In similar art, Anderson teaches obtaining call information from a runtime stack from the second application, comparing the call information with the set of call information, determining the request for access is an abnormal request based on the comparing, and taking an action based on the determination (Anderson abstract).
Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Anderson’s teachings into Marin-Gupta-Nahamoo’s system in order to save resources and development time.
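For illustration only, the first alternative recited in claim 15 (comparing a query pattern of the user to query patterns of other users) can be sketched as a simple outlier test on query counts. This is a sketch under stated assumptions, not the method of the application or of Anderson: the function name `is_abnormal`, the use of a per-user query count as the "query pattern", and the z-score threshold of 3.0 are all hypothetical choices.

```python
from statistics import mean, pstdev
from typing import Dict


def is_abnormal(user_id: str, query_counts: Dict[str, int],
                z_threshold: float = 3.0) -> bool:
    """Flag a user whose query count is far above that of the other users.

    query_counts maps each user to a number of queries in some time window
    (an assumed, simplified stand-in for a "query pattern").
    """
    others = [count for user, count in query_counts.items() if user != user_id]
    if len(others) < 2:
        return False  # not enough other users to compare against
    own = query_counts.get(user_id, 0)
    mu = mean(others)
    sigma = pstdev(others)
    if sigma == 0:
        # All other users behave identically; any excess counts as abnormal.
        return own > mu
    return (own - mu) / sigma > z_threshold
```

A production detector would likely compare richer features than a raw count; the count is used here only to keep the comparison step visible.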
Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Marin-Gupta-Nahamoo-Zoldi in view of Zhang (CN 116090982 A)
Regarding claim 16:
Marin-Gupta-Nahamoo-Zoldi discloses the invention substantially as disclosed in claim 10, but does not explicitly teach the data extraction level is a measure of feature space coverage of previous queries submitted by the user to the API.
In similar art, Zhang teaches that a user enters a requested extraction level (Zhang, page 7).
Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Zhang’s teachings into Marin-Gupta-Nahamoo-Zoldi’s system in order to save resources and development time.
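For illustration only, one possible reading of "a measure of feature space coverage of previous queries" in claim 16 is the fraction of cells in a uniform grid over the feature space that the user's previous queries have touched. This is a sketch under stated assumptions, not a definition drawn from the application or from Zhang: the function name `feature_space_coverage`, the grid-based measure, and the parameter `bins_per_feature` are hypothetical, and each feature is assumed to have known bounds with the upper bound greater than the lower bound.

```python
from typing import Iterable, Sequence, Set, Tuple


def feature_space_coverage(queries: Iterable[Sequence[float]],
                           bounds: Sequence[Tuple[float, float]],
                           bins_per_feature: int = 10) -> float:
    """Return the fraction of grid cells visited by the queries (0.0 to 1.0)."""
    visited: Set[Tuple[int, ...]] = set()
    for query in queries:
        cell = []
        for value, (low, high) in zip(query, bounds):
            # Map the value to a bin index, clamping out-of-range values.
            fraction = (value - low) / (high - low)
            index = int(fraction * bins_per_feature)
            cell.append(min(bins_per_feature - 1, max(0, index)))
        visited.add(tuple(cell))
    total_cells = bins_per_feature ** len(bounds)
    return len(visited) / total_cells
```

Under this reading, a data extraction level could be compared against a threshold, for example by treating a coverage value above some chosen fraction as an indication of a possible extraction attack.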
Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Marin-Gupta-Nahamoo in view of Yokoyama et al. (US 12,555,037)
Regarding claim 17:
Marin-Gupta-Nahamoo discloses the invention substantially as disclosed in claim 1, but does not explicitly teach the second machine learning model produces lower accuracy outputs than the first machine learning model.
In similar art, Yokoyama teaches that, in one machine learning model, the accuracy may be lowered due to a change in the external environment or the like, and in another machine learning model, the accuracy may be increased due to continuous training (Yokoyama, column 1, lines 44-58).
Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Yokoyama’s teachings into Marin-Gupta-Nahamoo’s system in order to save resources and development time.
Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Marin-Gupta-Nahamoo in view of Luitjens (EP 4446923 A1)
Regarding claim 18:
Marin-Gupta-Nahamoo discloses the invention substantially as disclosed in claim 1, but does not explicitly teach
In similar art, Luitjens teaches incorporating sensitive or confidential information into the training dataset (Luitjens page 12).
Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Luitjens’s teachings into Marin-Gupta-Nahamoo’s system in order to save resources and development time.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LAN DAI T TRUONG whose telephone number is (571)272-7959. The examiner can normally be reached Monday-Friday, 7:00 AM to 3:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, John A. Follansbee, can be reached at 571-272-3964. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/LAN DAI T TRUONG/Primary Examiner, Art Unit 2444