Prosecution Insights
Last updated: April 19, 2026
Application No. 19/021,028

LOAD BALANCING METHOD AND SYSTEM FOR PROVIDING ARTIFICIAL INTELLIGENCE SERVICE

Final Rejection §103

Filed: Jan 14, 2025
Examiner: JAKOVAC, RYAN J
Art Unit: 2445
Tech Center: 2400 — Computer Networks
Assignee: Rebellions Inc.
OA Round: 4 (Final)

Grant Probability: 66% (Favorable)
OA Rounds: 5-6
To Grant: 3y 9m
With Interview: 83%

Examiner Intelligence

Career Allow Rate: 66% — above average (402 granted / 613 resolved; +7.6% vs TC avg)
Interview Lift: +17.4% for resolved cases with interview
Avg Prosecution: 3y 9m typical timeline; 32 currently pending
Total Applications: 645 across all art units (career history)

Statute-Specific Performance

§101: 7.5% (-32.5% vs TC avg)
§103: 50.5% (+10.5% vs TC avg)
§102: 20.7% (-19.3% vs TC avg)
§112: 17.6% (-22.4% vs TC avg)
Tech Center average is an estimate • Based on career data from 613 resolved cases
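A quick consistency check on the statute panel above: each displayed delta implies the same Tech Center baseline. Illustrative Python over the displayed figures only:

```python
# Per-statute rates for this examiner and the displayed deltas versus the
# Tech Center average, copied from the panel above (all values in percent).
examiner_rate = {"101": 7.5, "103": 50.5, "102": 20.7, "112": 17.6}
delta_vs_tc = {"101": -32.5, "103": 10.5, "102": -19.3, "112": -22.4}

# Since delta = examiner_rate - tc_average, the implied baseline is:
implied_tc_avg = {s: round(examiner_rate[s] - delta_vs_tc[s], 1)
                  for s in examiner_rate}
print(implied_tc_avg)  # every statute implies the same 40.0% TC average
```

That all four statutes back out to one 40.0% figure suggests a single Tech Center average was used as the comparison line.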

Office Action

§103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments

Applicant's arguments filed 1/20/2026 have been fully considered and are moot in view of the new grounds of rejection presented herein.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:

1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims, the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.
Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1, 3, 5-6, 11, 13, and 15-16 are rejected under 35 U.S.C. 103 as being unpatentable over US 20250106306 to Hart in view of US 20220103523 to Starr in view of US 20250392523 to Estevez.

Regarding claim 1, Hart teaches a load balancing method in an Artificial Intelligence (AI) service providing system, comprising: receiving load balancing information (¶ 13-15, 18, 32-33, 54-66, obtaining load balancing information of servers); wherein the load balancing information comprises connection information of the respective one server, information of one or more AI models supported by the respective one server, and information of hardware supporting the one or more AI models in the respective one server (¶ 54-66, ¶ 17-23, 28-30, 42, 45, 50, 63, received load balancing information including connection information, AI model supported, and hardware); generating a load balancing data structure based on the load balancing information of the plurality of servers (¶ 54-66, load balancing data structure), wherein the load balancing data structure is used for load balancing between the plurality of servers and comprises connection information of the plurality of servers, information of AI models supported by the plurality of servers, and information of hardware supporting the AI models supported by the plurality of servers (¶ 54-66, see also ¶ 17-23, 28-30, 42, 45, 50, 63, load balancing data structure used for load balancing and comprising connection information, information on AI models, hardware); obtaining an inference task request message for an AI service from a user device (¶ 5, 13, 32-33, 52, 93, 109, obtaining request); deriving at least one target server among the plurality of servers based on the inference task request message for the AI service and the load balancing table (¶ 15-20, 26, 28-30, 52, 54, 64, deriving target server based on inference message and AI service); and performing load balancing for an inference task of the AI service on the derived target server based on a preset load balancing algorithm (¶ 30, 40-41, 52-55, 93, load balanced based on network connection, AI model, and hardware information), and wherein the deriving at least one target server among the plurality of servers based on the inference task request message for the AI service and the load balancing table comprises: deriving target load balancing information including same information as information included in the inference task request message from the load balancing table (¶ 32-35, 40-41, 54, 84-88, 93-94, derived target load balancing information includes same connection information of target server as in inference request message); and deriving a server indicated by connection information of the derived target load balancing information as the target server (¶ 30, 41-41, 52, 54, 76, 93, deriving server information and load information).

As described above, Hart discloses a load balancing data structure instead of a load balancing table. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use a table as the particular type of data structure because tables are rudimentary types of data structure useful for organizing and retrieving data.

Hart fails to explicitly teach that the load balancing information is received from a plurality of servers. However, Hart strongly suggests that the load balancing information is received from the plurality of servers, as metrics from the plurality of servers are obtained, including the processing capability at the servers, what specific AI models and hardware the servers are utilizing, and available computing resources at the servers such as CPU cycles, GPU cycles, available memory, etc. The metrics are used to route inference requests according to the conditions at the servers (see Hart in at least ¶ 13, 17-20, 23, 32, 32-30, 42, 52-55, 63-33). Nevertheless, although Hart does not explicitly state that the load balancing information is received from a plurality of servers, Starr discloses receiving load balancing information from a plurality of servers (¶ 32). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the teachings of Starr. The motivation to do so is that the teachings of Starr would have been advantageous in terms of facilitating performance monitoring and load balancing (Starr, ¶ 30-32).

Hart fails to teach, but Estevez teaches: receiving updated load balancing information from the respective one server, the updated load balancing information indicating a change to the one or more AI models supported by the respective one server or a change to the hardware supporting the one or more AI models; and updating the load balancing table based on the updated load balancing information (¶ 158-171, receiving updated information for load balancing including updated AI models). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the teachings of Estevez. The motivation to do so is that the teachings of Estevez would have been advantageous in terms of facilitating load balancing optimization (Estevez, ¶ 167-170).
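The claim language mapped above amounts to a keyed lookup: each table entry pairs a server's connection information with its supported AI models and hardware, and the target server is whichever entry contains the same information as the inference task request message. A minimal sketch, with all data and names hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Entry:
    connection: str   # server endpoint, e.g. "host:port"
    models: set       # identifiers of AI models the server supports
    hardware: set     # hardware backing those models, e.g. accelerator types

# Load balancing table: one entry per registered server (hypothetical data).
table = [
    Entry("10.0.0.1:8080", {"llm-7b"}, {"npu-v1"}),
    Entry("10.0.0.2:8080", {"llm-7b", "llm-70b"}, {"npu-v2"}),
]

def derive_targets(table, model, hardware=None):
    """Derive target load balancing information: entries whose model (and,
    if given, hardware) information matches the request message's fields."""
    return [e for e in table
            if model in e.models and (hardware is None or hardware in e.hardware)]

# The server indicated by a matched entry's connection info is the target;
# with multiple matches, the preset algorithm (e.g. round robin) picks one.
targets = derive_targets(table, model="llm-70b")
```

Under these assumptions only the second server matches a request for "llm-70b", while both match "llm-7b" and the preset algorithm would break the tie.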
Regarding claims 3 and 13, Hart teaches: wherein when the inference task message includes AI model information representing a specific AI model name (¶ 93, request includes target AI model), target load balancing information including the AI model information representing the specific AI model name is derived from the load balancing table (¶ 30, 41-41, 52, 54, 72, 76, 93-96), and a server indicated by connection information of the derived target load balancing information is derived as the target server (¶ 30, 41-41, 52, 54, 76, 93-96).

Regarding claims 5 and 15, Hart teaches: wherein when the inference task message includes AI model information representing a specific AI model name and a specific AI model version and supported hardware information representing specific supported hardware (¶ 30, 40-41, 52-55, 69-74, 93, inference task message including AI model, hardware), target load balancing information including the AI model information representing the specific AI model name and the specific AI model version and the supported hardware information representing the specific supported hardware is derived from the load balancing data structure (¶ 30, 41-41, 49, 52, 54, 76, 93, AI model and version information, supported hardware), and a server indicated by connection information of the derived target load balancing information is derived as the target server (¶ 30, 41-41, 52, 54, 76, 93, deriving server information). As described above, Hart discloses a load balancing data structure instead of a load balancing table. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use a table as the particular type of data structure because tables are rudimentary types of data structure useful for organizing and retrieving data.

Regarding claims 6 and 16, Hart teaches: wherein when the inference task message includes connection information representing a specific endpoint (¶ 89-93), target load balancing information including the connection information representing the specific endpoint is derived from the load balancing table, and a server indicated by the connection information of the derived target load balancing information is derived as the target server (¶ 30, 41-41, 52, 54, 76, 93-96).

Claim 11 is addressed by a rationale similar to that applied to claim 1.

Claims 4 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Hart, Starr, and Estevez in view of US 12067482 to Perumalla in view of US 20220092043 to Davison.

Regarding claims 4 and 14, Hart teaches: wherein when the inference task message includes AI model information representing a specific AI model version (¶ 30, 40-41, 52-55, 69-74, 93), target load balancing information including the AI model information representing the specific AI model name and the specific AI model version is derived from the load balancing table (¶ 30, 40-41, 52-55, 93, target load balancing info derived), and a server indicated by connection information of the derived target load balancing information is derived as the target server (¶ 30, 41-41, 52, 54, 76, 93, deriving server information and load information). Hart fails to teach inference task messages specifying an AI model name. However, Perumalla teaches an inference task message specifying an AI model name (col. 3:35-45, col. 4:1-15, inference models; col. 8:1-10, request including model names). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the teachings of Perumalla. The motivation to do so is that the teachings of Perumalla would have been advantageous in terms of facilitating necessary input data characteristics of learning models (Perumalla, col. 7:49-67).
Hart fails to teach load balancing information including specific AI model names. However, Davison teaches load balancing information including specific AI model names (¶ 16). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the teachings of Davison. The motivation to do so is that the teachings of Davison would have been advantageous in terms of facilitating machine learning model registry and update (Davison, ¶ 16).

Claims 8-10 and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Hart, Starr, and Estevez in view of US 8,812,727 to Sorenson.

Regarding claims 8 and 18, Hart fails to teach, but Sorenson teaches: wherein the load balancing information is transmitted from the plurality of servers at specific time intervals, and the load balancing table is updated based on the load balancing information transmitted at the specific time intervals (col. 10:39-63, col. 11:60-67, col. 12:1-41). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the teachings of Sorenson. The motivation to do so is that the teachings of Sorenson would have been advantageous in terms of facilitating load reporting and distribution (Sorenson, col. 10:39-63, col. 11:60-67, col. 12:1-41).
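The interval-reporting feature of claims 8 and 18 (servers transmit load balancing information at specific time intervals; the table is updated from those transmissions) can be sketched as follows; all names are hypothetical and not drawn from Sorenson:

```python
import time

# In-memory load balancing table keyed by server connection info (hypothetical).
table = {}

def report(connection, models, hardware):
    """Handle one periodic transmission from a server: refresh its table
    entry and stamp when the information was last updated."""
    table[connection] = {
        "models": set(models),
        "hardware": set(hardware),
        "updated": time.monotonic(),
    }

def evict_stale(max_age_s):
    """Drop servers that have missed their reporting interval, so lookups
    only ever see information from the most recent transmissions."""
    now = time.monotonic()
    for conn in [c for c, e in table.items() if now - e["updated"] > max_age_s]:
        del table[conn]

report("10.0.0.1:8080", ["llm-7b"], ["npu-v1"])  # one reporting cycle
```

Subsequent inference requests would then be routed against the refreshed table, which is the combination the rejection reads onto Hart plus Sorenson.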
Hart in view of Sorenson renders obvious: "wherein when a specific inference task request message for a specific AI service is obtained after the load balancing table is updated, at least one target server among the plurality of servers is derived based on the updated load balancing table, and load balancing for an inference task of the specific AI service is performed on the derived target server," as Hart discloses that when a specific inference task request message for a specific AI service is obtained, at least one target server among the plurality of servers is derived, and load balancing for an inference task of the specific AI service is performed on the derived target server (Hart, ¶ 30, 41-41, 52, 54, 76, 93, deriving target server for inference task), while Sorenson teaches the update features as above. It would have been obvious to perform Hart's server selection after the updated load balancing of Sorenson in order to efficiently select target servers.

Regarding claims 10 and 20, Hart fails to teach, but Sorenson teaches: wherein the preset load balancing algorithm is a round robin algorithm, a sticky round robin algorithm, a weighted round robin algorithm, an IP/URL hash algorithm, a least connection algorithm, or a least time algorithm (col. 9:50-67, col. 10:45-60, round robin). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the teachings of Sorenson. The motivation to do so is that the teachings of Sorenson would have been advantageous in terms of facilitating load balancing (Sorenson, col. 10:45-60).

Regarding claims 9 and 19, Hart teaches load balancing information including AI model information supported by a server and supported hardware information of a server (¶ 18, 26, 40).
Hart fails to disclose encapsulating the information in a table and wherein load balancing information included in the load balancing table includes service information and connection information supported by a server. Nevertheless, Sorenson teaches load balancing information in a table including service information and connection information supported by a server (col. 9:45-67, col. 10:1-60). The motivation to include Sorenson is the same as presented above.

CONCLUSION

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to RYAN J JAKOVAC, whose telephone number is (571) 270-5003. The examiner can normally be reached 8 AM to 4 PM EST. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Oscar A. Louie, can be reached on 572-270-1684. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.
Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/RYAN J JAKOVAC/
Primary Examiner, Art Unit 2445

Prosecution Timeline

Jan 14, 2025: Application Filed
Feb 26, 2025: Response after Non-Final Action
Apr 29, 2025: Non-Final Rejection — §103
Jun 17, 2025: Response Filed
Jul 17, 2025: Final Rejection — §103
Sep 16, 2025: Request for Continued Examination
Oct 05, 2025: Response after Non-Final Action
Oct 17, 2025: Non-Final Rejection — §103
Jan 06, 2026: Interview Requested
Jan 13, 2026: Applicant Interview (Telephonic)
Jan 14, 2026: Examiner Interview Summary
Jan 20, 2026: Response Filed
Feb 21, 2026: Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12603906: ALERT MONITORING OF DATA BASED ON RECOMMENDED ATTRIBUTE VALUES (granted Apr 14, 2026; 2y 5m to grant)
Patent 12572634: ELECTRONIC DEVICE AND ENCRYPTION METHOD FOR ELECTRONIC DEVICE (granted Mar 10, 2026; 2y 5m to grant)
Patent 12549627: INTELLIGENT CLOUD-EDGE RESOURCE MANAGEMENT (granted Feb 10, 2026; 2y 5m to grant)
Patent 12526298: System and Method for Fraud Identification (granted Jan 13, 2026; 2y 5m to grant)
Patent 12500926: Executing Real-Time Message Monitoring to Identify Potentially Malicious Messages and Generate Instream Alerts (granted Dec 16, 2025; 2y 5m to grant)
Study what changed in these cases to get past this examiner. Based on the 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 66%
With Interview: 83% (+17.4%)
Median Time to Grant: 3y 9m
PTA Risk: High
Based on 613 resolved cases by this examiner. Grant probability derived from career allow rate.
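The headline projections above are simple arithmetic on the examiner's career record; reproducing it in Python (the 17.4-point interview lift is taken from the panel as displayed):

```python
granted, resolved = 402, 613
base = granted / resolved * 100      # career allow rate, in percent
interview_lift = 17.4                # percentage-point lift with an interview

print(round(base))                   # 66: the displayed grant probability
print(round(base + interview_lift))  # 83: the displayed with-interview figure
```

So the 66% and 83% figures are the rounded career allow rate with and without the interview lift added.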
