Prosecution Insights
Last updated: April 19, 2026
Application No. 18/475,683

METHOD AND APPARATUS WITH CHECKPOINT ADJUSTMENT

Non-Final OA — §102, §103, §112
Filed: Sep 27, 2023
Examiner: KIM, SISLEY NAHYUN
Art Unit: 2196
Tech Center: 2100 — Computer Architecture & Software
Assignee: Samsung Electronics Co., Ltd.
OA Round: 1 (Non-Final)
Grant Probability: 89% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 2y 9m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 89% (590 granted / 665 resolved) — above average, +33.7% vs TC avg
Interview Lift: +16.9% among resolved cases with an interview
Avg Prosecution: 2y 9m
Total Applications: 707 across all art units (42 currently pending)

Statute-Specific Performance

§101: 9.1% (-30.9% vs TC avg)
§103: 49.6% (+9.6% vs TC avg)
§102: 26.1% (-13.9% vs TC avg)
§112: 7.2% (-32.8% vs TC avg)
Tech Center averages are estimates. Based on career data from 665 resolved cases.
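The headline figures above are straightforward arithmetic over the examiner's career totals. A quick sanity check (assuming the "+33.7%" delta is stated in percentage points):

```python
# Sanity-check the dashboard's headline figures from the stated raw counts.
granted = 590
resolved = 665

allow_rate = granted / resolved
print(f"Career allow rate: {allow_rate:.1%}")  # 88.7%, shown rounded as 89%

# "+33.7% vs TC avg" then implies a Tech Center average allow rate of roughly:
tc_avg = allow_rate - 0.337
print(f"Implied TC 2100 average: {tc_avg:.1%}")  # 55.0%
```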

Office Action

Rejections: §102, §103, §112
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(B) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of pre-AIA 35 U.S.C. 112, second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 3, 8, 9, 13, 18, and 19 are rejected under 35 U.S.C. 112(b) or pre-AIA 35 U.S.C. 112, second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant, regards as the invention.

Claims 3 and 13 recite "high-load API" but do not provide objective boundaries for the term "high-load." As used, "high-load" is a relative term that lacks a clear baseline, threshold, metric, or measurement methodology, and the specification does not supply definitions, examples, or teachings that would permit a person of ordinary skill in the art to determine, with reasonable certainty, whether a given API qualifies as "high-load." Accordingly, the scope of the limitation is ambiguous and the metes and bounds of the claims are not clear. See MPEP § 2173.05(b) (relative terminology). To further examine the claims on the merits, the examiner interprets "high-load API" as "API."

Claims 8-9 and 18-19 are rejected based on their dependency on a rejected parent claim.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless - (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-4, 8, 10-14, and 18 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Zhao et al. (US 10,275,851, hereinafter Zhao).

Regarding claim 1, Zhao discloses A method of adjusting a checkpoint, the method comprising (fig. 1-6):

monitoring calls of an application program interface (API) that are called when an accelerator device executes an application (col. 11, line 66-col. 12, line 3: server frontend 222 stores all API messages that are transmitted between the server frontend 222 and the GPU API 314, wherein the API messages are utilized in a subsequent "replay mode" to allow the client system 110 to recover checkpointed images; col. 15, lines 24-28: the server frontend 222 will record all API conversations (e.g., API messages) between the server frontend 222 and the GPU API 314 into the API conversation memory 342), and

by the monitoring, checking an API execution logic and a current API execution cycle of the application with respect to the accelerator device (col. 15, lines 35-38: The checkpoint event monitor 344 communicates with the server backend GPU workers 228 to determine when a "critical point" is reached in the execution of a GPU processing task for the client system 310; col. 15, lines 50-63: if the client system 310 frequently copies data from the GPU server node 200, the "data copy" API messages would be considered "critical points" in the execution of the GPU processing tasks for the client system 310); and

determining a next checkpoint according to a checkpoint adjustment strategy that determines the next checkpoint (col. 2, lines 1-4, 50-57: a GPU checkpointing operation during execution of the first GPU processing task to generate a checkpoint image of a current state of the first GPU processing task … Embodiments of the invention provide multiple solutions for implementing GPU checkpointing services as part of a GPUaaS system which provides GPU processing services for high performance computing (HPC) applications such as Deep Learning analytics, etc., wherein the checkpointing solutions are designed to minimize or avoid any impact on GPU computing performance in a cloud environment; Note: the passages describe selecting checkpoint timing to minimize impact; col. 2, line 63-col. 3, line 8: checkpointing systems according to embodiments of the invention include: (i) a conversation-based checkpointing service …, (ii) a client-driven GPU checkpointing service …; and (iii) a multi-layer queue-based checkpointing system)

based on the API execution logic (col. 11, line 66-col. 12, line 1: server frontend 222 stores all API messages that are transmitted between the server frontend 222 and the GPU API 314; col. 15, lines 24-28: the server frontend 222 will record all API conversations (e.g., API messages) between the server frontend 222 and the GPU API 314 into the API conversation memory 342; col. 15, lines 35-38: The checkpoint event monitor 344 communicates with the server backend GPU workers 228 to determine when a "critical point" is reached in the execution of a GPU processing task for the client system 310; col. 15, lines 50-63: if the client system 310 frequently copies data from the GPU server node 200, the "data copy" API messages would be considered "critical points" in the execution of the GPU processing tasks for the client system 310; Note: the passages monitor API calls/messages, identify particular API events (data copy operations, etc.), and treat those as checkpoint-triggering conditions; that is an explicit teaching of using API execution behavior/logic to decide checkpoint events)

and based on the current API execution cycle of the application (col. 11, lines 44-48: FIG. 3 further illustrates a run-time implementation of a GPU checkpointing service 340, which comprises an API conversation memory 342, a checkpoint event monitor 344, and a GPU memory checkpoint image generator 346; col. 15, lines 35-38, 64-67: The checkpoint event monitor 344 communicates with the server backend GPU workers 228 to determine when a "critical point" is reached in the execution of a GPU processing task for the client system 310 … if the client system 310 infrequently copies data from the GPU server node 200, the "critical points" of execution can be selected at a predetermined time interval; Note: the passages describe detection of critical points while the application/GPU task is actively executing (i.e., the current execution cycle) and use that runtime information to initiate checkpointing; they also contemplate time-interval-based triggers as an alternative when API events are sparse),

wherein the checkpoint adjustment strategy corresponds to at least one API execution logic among plural API execution logics (col. 2, line 63-col. 3, line 8: checkpointing systems according to embodiments of the invention include: (i) a conversation-based checkpointing service …, (ii) a client-driven GPU checkpointing service …; and (iii) a multi-layer queue-based checkpointing system; col. 15, lines 56-60: the selection of "critical points" will depend on the policy of the GPU service platform, the GPU server node allocated to handle the GPU processing tasks of the client system 310, the workload of the client system 310, etc.; Note: the passages explicitly provide multiple possible checkpoint strategies (these are the plural API execution logics) and teach that the chosen strategy/trigger may vary depending on the API behavior (e.g., frequent data-copying clients) and platform policy).

Regarding claim 11, referring to claim 1, Zhao discloses An apparatus for adjusting a checkpoint, the apparatus comprising: one or more processors; memory storing instructions configured to be executed by the one or more processors to cause the one or more processors to: … (FIG. 2; memory coupled to processing unit comprising one or more processors…).

Regarding claims 2 and 12, Zhao discloses wherein the checkpoint adjustment strategy is determined (col. 2, lines 1-4, 50-57: a GPU checkpointing operation during execution of the first GPU processing task to generate a checkpoint image of a current state of the first GPU processing task … Embodiments of the invention provide multiple solutions for implementing GPU checkpointing services as part of a GPUaaS system which provides GPU processing services for high performance computing (HPC) applications such as Deep Learning analytics, etc., wherein the checkpointing solutions are designed to minimize or avoid any impact on GPU computing performance in a cloud environment; Note: the passages describe selecting checkpoint timing to minimize impact; col. 2, line 63-col. 3, line 8: checkpointing systems according to embodiments of the invention include: (i) a conversation-based checkpointing service …, (ii) a client-driven GPU checkpointing service …; and (iii) a multi-layer queue-based checkpointing system) on at least one API execution cycle (col. 11, lines 44-48: FIG. 3 further illustrates a run-time implementation of a GPU checkpointing service 340, which comprises an API conversation memory 342, a checkpoint event monitor 344, and a GPU memory checkpoint image generator 346; col. 15, lines 35-38, 64-67: The checkpoint event monitor 344 communicates with the server backend GPU workers 228 to determine when a "critical point" is reached in the execution of a GPU processing task for the client system 310 … if the client system 310 infrequently copies data from the GPU server node 200, the "critical points" of execution can be selected at a predetermined time interval; Note: the passages describe detection of critical points while the application/GPU task is actively executing (i.e., the current execution cycle) and use that runtime information to initiate checkpointing; they also contemplate time-interval-based triggers as an alternative when API events are sparse) based on the API execution logic of the application (col. 11, line 66-col. 12, line 1: server frontend 222 stores all API messages that are transmitted between the server frontend 222 and the GPU API 314; col. 15, lines 24-28: the server frontend 222 will record all API conversations (e.g., API messages) between the server frontend 222 and the GPU API 314 into the API conversation memory 342; col. 15, lines 35-38: The checkpoint event monitor 344 communicates with the server backend GPU workers 228 to determine when a "critical point" is reached in the execution of a GPU processing task for the client system 310; col. 15, lines 50-63: if the client system 310 frequently copies data from the GPU server node 200, the "data copy" API messages would be considered "critical points" in the execution of the GPU processing tasks for the client system 310; Note: the passages monitor API calls/messages, identify particular API events (data copy operations, etc.), and treat those as checkpoint-triggering conditions; that is an explicit teaching of using API execution behavior/logic to decide checkpoint events) executed by the accelerator device and an initial checkpoint interval (col. 15, lines 60-67: if the client system 310 frequently copies data from the GPU server node 200, the "data copy" API messages would be considered "critical points" in the execution of the GPU processing tasks for the client system 310. In another embodiment, if the client system 310 infrequently copies data from the GPU server node 200, the "critical points" of execution can be selected at a predetermined time interval; col. 16, line 67-col. 17, line 6: For each incoming request with the flag try_replay, the GPU server node 200 will attempt to match the incoming request with a previous request that is contained in the API conversation history. If there is a match, the GPU server node 200 will directly reply with the previous checkpointed GPU processing results; Note: the initial checkpoint interval is inherently included in the predetermined intervals of past API conversation history).

Regarding claims 3 and 13, Zhao discloses wherein the at least one API execution cycle comprises a high-load API for copying data to or from the accelerator device (col. 15, lines 38-42: At a critical point of the execution (e.g., when copying GPU memory from the GPU server node 200 to the processor 316 (e.g., CPU) of the client system 310), the checkpoint event monitor 344 will send a control message …; col. 15, lines 60-67: if the client system 310 frequently copies data from the GPU server node 200, the "data copy" API messages would be considered "critical points" in the execution of the GPU processing tasks for the client system 310).

Regarding claims 4 and 14, Zhao discloses wherein the API execution logic of the application comprises an order of API calls when the accelerator device executes the application and a time required to execute each of the API calls (col. 13, lines 52-54: The server frontend 222 tags the GPU service request with a timestamp which indicates a time that the GPU service request was submitted to the GPU server node 200; col. 15, lines 24-28: After the communication session is established, the server frontend 222 will record all API conversations (e.g., API messages) between the server frontend 222 and the GPU API 314 into the API conversation memory 342; Note: the passages describe recording API conversation history with timestamps, which reflects timing/order, and the use of that history in replay).

Regarding claims 8 and 18, Zhao discloses wherein the high-load API comprises an API function for copying data into the accelerator device and an API function for copying data from the accelerator device (col. 15, lines 35-41: The checkpoint event monitor 344 communicates with the server backend GPU workers 228 to determine when a "critical point" is reached in the execution of a GPU processing task for the client system 310. At a critical point of the execution (e.g., when copying GPU memory from the GPU server node 200 to the processor 316 (e.g., CPU) of the client system 310), …; col. 15, lines 60-67: if the client system 310 frequently copies data from the GPU server node 200, the "data copy" API messages would be considered "critical points" in the execution of the GPU processing tasks for the client system 310).

Regarding claim 10, Zhao discloses wherein the accelerator device comprises a CUDA GPU, and wherein a checkpoint of the CUDA GPU is performed according to the next checkpoint (col. 6, lines 2-5: the GPU API 114 may comprise shim layers that are utilized to extend the functionality of an existing API (e.g., CUDA) to implement the functionalities of the GPU API 114; col. 15, lines 38-47: At a critical point of the execution (e.g., when copying GPU memory from the GPU server node 200 to the processor 316 (e.g., CPU) of the client system 310), the checkpoint event monitor 344 will send a control message to the GPU memory checkpoint image generator 346 commanding the GPU memory checkpoint image generator 346 to generate a checkpoint image of the current state of the GPU memory of the one or more GPU devices that are executing the GPU processing tasks for the client system 310).

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 5 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Zhao et al. (US 10,275,851, hereinafter Zhao) in view of El-Sayed et al., "Checkpoint/Restart in Practice: When 'Simple is Better'," listed in the IDS filed on September 27, 2023, hereinafter El-Sayed.

Regarding claims 5 and 15, Zhao discloses wherein the initial checkpoint interval is determined (col. 15, lines 60-67: if the client system 310 frequently copies data from the GPU server node 200, the "data copy" API messages would be considered "critical points" in the execution of the GPU processing tasks for the client system 310. In another embodiment, if the client system 310 infrequently copies data from the GPU server node 200, the "critical points" of execution can be selected at a predetermined time interval; col. 16, line 67-col. 17, line 6: For each incoming request with the flag try_replay, the GPU server node 200 will attempt to match the incoming request with a previous request that is contained in the API conversation history. If there is a match, the GPU server node 200 will directly reply with the previous checkpointed GPU processing results; col. 17, lines 55-62: the server frontend 222 receives the checkpoint request API message, and forwards the received message to the checkpointing request handler 442 of the GPU checkpointing service module 440. The checkpointing request handler 442 will communicate with the task queue service 224 and the task scheduler/dispatcher 226 so that the current checkpointing request is scheduled and added to a global request queue; Note: the passages describe (a) an initial/predetermined checkpoint interval used for scheduling, (b) pre-execution/queued scheduling of checkpoint tasks, and (c) making checkpointing decisions based on API execution behavior and/or a time interval).

Zhao does not disclose wherein the … checkpoint interval is determined based on a mean time to failure (MTTF) and a checkpoint cost.

El-Sayed discloses wherein the … checkpoint interval is determined based on a mean time to failure (MTTF) and a checkpoint cost (Sec. II, Eqn. 1: Young's formula determines the checkpoint interval ΔYoung based on only two quantities, the system's mean time to failure (MTTF) and the checkpoint cost C: ΔYoung = √(2·C·MTTF); Sec. III: El-Sayed further describes simple, practical MTTF estimation techniques (SMA/WMA/EMA) and shows robustness of Young's formula to estimation error).

Zhao and El-Sayed address the same technical problem of when to place checkpoints to balance checkpoint overhead against lost work after failures in long-running, failure-prone computations (GPU-accelerated tasks in Zhao; HPC jobs in El-Sayed). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to determine Zhao's initial (predetermined) checkpoint interval using Young's method, i.e., to compute the interval from the formula based on MTTF and checkpoint cost. The motivation would have been to identify methods for optimizing the checkpointing process that are easy to use in practice and at the same time achieve high-quality solutions (El-Sayed, abstract).

Claims 9 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Zhao et al. (US 10,275,851, hereinafter Zhao) in view of Chen et al., "Design of an adaptive GPU sharing and scheduling scheme in container-based cluster," 2019, hereinafter Chen.

Regarding claims 9 and 19, Zhao does not disclose wherein, when the accelerator device is a graphics processing unit (GPU) and the application is a GPU application, the API function for copying data into the accelerator device is cuMemcpyHtoD, and the API function for copying data from the accelerator device is cuMemcpyDtoH.

Chen discloses wherein, when the accelerator device is a graphics processing unit (GPU) and the application is a GPU application, the API function for copying data into the accelerator device is cuMemcpyHtoD, and the API function for copying data from the accelerator device is cuMemcpyDtoH (Sec. 3.3: the TensorFlow platform uses the CUDA API to communicate with and control NVIDIA GPUs. Among these APIs, cuMemAlloc is used to allocate device memory, and cuMemcpyDtoH and cuMemcpyHtoD are used to transfer data between CPU and GPU. To manage and schedule containers and handle the GPU memory over-commit problem, capturing these APIs is needed).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Zhao's checkpoint module, as taught by Chen, to capture and use APIs including cuMemcpyDtoH and cuMemcpyHtoD to transfer data between CPU and GPU. The motivation would have been to manage and schedule containers and handle the GPU memory over-commit problem (Chen, Sec. 3.3).

Allowable Subject Matter

Claims 6, 7, 16, and 17 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.

Zhao et al. (US 2020/0174840) discloses "These data feed operations are coordinated using an API call such as a CUDA host-to-device memory copy operation (e.g., cuMemcpyHtoD(devPtr, hostPtr, size, stream)) which is invoked at the client node" (paragraph [0069]).

Zhao et al. (US 2019/0324856) discloses "The checkpoint image scheduling control functions implement a bandwidth-aware scheduling protocol to schedule a device-to-host memory copy operation for transferring a copy of the compressed checkpoint image of the intermediate DL model from the GPU memory 254 to the system memory 216 over the bus-communication network 220 at an optimal time when bandwidth-usage of the communication link(s) between the selected GPU device and the CPU 212 is deemed to be relatively low (as per one or more predetermined criteria) and would minimize adverse impact on the DL training process" (paragraph [0045]).

Chung et al. (US 11,016,861) discloses "An application may issue this API call manually or set a frequency of checkpointing using a provided "set_checkpointing_frequency" API call where a dedicated CPU thread checkpoints using the create_checkpoint API at selected time periods based on the set frequency" (col. 3, line 67-col. 4, line 5).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SISLEY N. KIM, whose telephone number is (571) 270-7832. The examiner can normally be reached M-F, 11:30 AM-7:30 PM.

Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, April Y. Blair, can be reached at (571) 270-1014. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/SISLEY N KIM/
Primary Examiner, Art Unit 2196
02/14/2026
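The monitoring-and-adjustment loop of claim 1, as the Office Action maps it onto Zhao's "critical point" approach, can be illustrated with a minimal sketch. Everything here is hypothetical illustration, not the applicant's or Zhao's implementation: the names (`ApiEvent`, `next_checkpoint`, `HIGH_LOAD_APIS`) and the fallback interval are invented for this example; only the two CUDA copy calls come from the record (they are named in claims 9 and 19).

```python
# Hypothetical sketch of the claimed checkpoint-adjustment loop: monitor API
# calls during accelerator execution, treat data-copy calls (Zhao's "critical
# points") as checkpoint triggers, and fall back to a predetermined interval
# when copy events are sparse. Names and thresholds are illustrative only.
from dataclasses import dataclass

# CUDA copy APIs named in claims 9/19 (host-to-device / device-to-host).
HIGH_LOAD_APIS = {"cuMemcpyHtoD", "cuMemcpyDtoH"}
DEFAULT_INTERVAL = 100  # fallback checkpoint spacing, in API calls (assumed unit)

@dataclass
class ApiEvent:
    name: str   # API function that was called
    cycle: int  # API execution cycle at which the call was observed

def next_checkpoint(events: list[ApiEvent]) -> int:
    """Pick the next checkpoint cycle from the monitored API-call history."""
    copies = [e for e in events if e.name in HIGH_LOAD_APIS]
    last_cycle = events[-1].cycle if events else 0
    if copies:
        # Data-copy traffic present: checkpoint at the latest copy (critical point).
        return copies[-1].cycle
    # Sparse copy traffic: fall back to a predetermined interval.
    return last_cycle + DEFAULT_INTERVAL

trace = [ApiEvent("cuLaunchKernel", 10), ApiEvent("cuMemcpyDtoH", 42)]
print(next_checkpoint(trace))  # -> 42 (checkpoint at the data-copy critical point)
```

The design choice mirrors the examiner's reading of Zhao col. 15: API-event-driven triggers when copies are frequent, time-interval triggers otherwise.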
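The §103 combination against claims 5 and 15 turns on Young's formula, which the Office Action quotes from El-Sayed: ΔYoung = √(2·C·MTTF). A short worked example, pairing the formula with an EMA-style MTTF estimate of the kind El-Sayed's Sec. III describes (all numbers below are illustrative, not from the record):

```python
# Young's formula (quoted in the OA from El-Sayed, Sec. II, Eqn. 1): the
# checkpoint interval depends on only the checkpoint cost C and the MTTF.
import math

def young_interval(checkpoint_cost: float, mttf: float) -> float:
    """Delta_Young = sqrt(2 * C * MTTF); all times in the same unit."""
    return math.sqrt(2.0 * checkpoint_cost * mttf)

def ema_mttf(failure_gaps: list[float], alpha: float = 0.3) -> float:
    """Exponential moving average over observed times between failures,
    one of the simple MTTF estimators El-Sayed describes (Sec. III)."""
    est = failure_gaps[0]
    for gap in failure_gaps[1:]:
        est = alpha * gap + (1 - alpha) * est
    return est

# Illustrative numbers: 60 s to write a checkpoint, failures roughly 5-7 h apart.
gaps_s = [18000.0, 25200.0, 21600.0]
mttf = ema_mttf(gaps_s)
print(f"estimated MTTF: {mttf:.0f} s")
print(f"Young interval: {young_interval(60.0, mttf):.0f} s")
```

El-Sayed's point, which the examiner leans on for motivation, is that this two-parameter formula is robust to MTTF estimation error, so even a crude estimator suffices in practice.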

Prosecution Timeline

Sep 27, 2023 — Application Filed
Feb 17, 2026 — Non-Final Rejection — §102, §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602254 — JOB NEGOTIATION FOR WORKFLOW AUTOMATION TASKS
Granted Apr 14, 2026 (2y 5m to grant)

Patent 12602260 — COMPUTER-BASED PROVISIONING OF CLOUD RESOURCES
Granted Apr 14, 2026 (2y 5m to grant)

Patent 12591474 — BATCH SCHEDULING FUNCTION CALLS OF A TRANSACTIONAL APPLICATION PROGRAMMING INTERFACE (API) PROTOCOL
Granted Mar 31, 2026 (2y 5m to grant)

Patent 12585507 — LOAD TESTING AND PERFORMANCE BENCHMARKING FOR LARGE LANGUAGE MODELS USING A CLOUD COMPUTING PLATFORM
Granted Mar 24, 2026 (2y 5m to grant)

Patent 12578994 — SYSTEMS AND METHODS FOR TRANSITIONING COMPUTING DEVICES BETWEEN OPERATING STATES
Granted Mar 17, 2026 (2y 5m to grant)
Based on this examiner's 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 89%
With Interview: 99% (+16.9% lift)
Median Time to Grant: 2y 9m
PTA Risk: Low

Based on 665 resolved cases by this examiner. Grant probability derived from career allow rate.
