Last updated: May 29, 2026
Application No. 18/605,677
PROCESSING SYSTEM, PROCESSING METHOD, AND STORAGE MEDIUM

Non-Final OA: §101, §103, double patenting
Filed: Mar 14, 2024
Priority: Aug 23, 2023 (JP 2023-135835)
Examiner: MAHROUKA, WASSIM
Art Unit: 2665
Tech Center: 2600 (Communications)
Assignee: Kabushiki Kaisha Toshiba
OA Round: 1 (Non-Final)
Interview Optional

Interview lift (+6.8%) is below the 15.0% threshold. A written response is recommended.
Based on 253 resolved cases, 2023–2026
Examiner Intelligence

MAHROUKA, WASSIM
Career Allowance Rate: 86% (above average)
218 granted / 253 resolved
+24.2% vs TC avg
Interview Lift: +6.8% (moderate), comparing resolved cases with and without an interview
Avg Prosecution: 2y 3m (typical timeline)
14 currently pending
Total Applications: 277 across all art units (career history)
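The headline figures above can be reproduced with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the rule that an interview is recommended only when the lift meets or exceeds the 15.0% threshold is an assumption inferred from the recommendation text, not a documented formula.

```python
def allowance_rate(granted: int, resolved: int) -> float:
    """Share of resolved cases that ended in a grant, as a percentage."""
    if resolved <= 0:
        raise ValueError("resolved must be positive")
    return 100.0 * granted / resolved


def recommend_response(interview_lift_pct: float, threshold_pct: float = 15.0) -> str:
    """Assumed rule: suggest an interview only at or above the threshold."""
    if interview_lift_pct >= threshold_pct:
        return "interview"
    return "written response"


rate = allowance_rate(218, 253)    # granted / resolved, as listed in the profile
advice = recommend_response(6.8)   # +6.8% lift against the 15.0% threshold
print(f"allowance rate: {rate:.1f}%")
print(f"recommendation: {advice}")
```

The computed rate rounds to the 86% shown in the profile, and the assumed rule yields the same written-response recommendation.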
Statute-Specific Performance

§101: 6.6% (-33.4% vs TC avg)
§103: 69.4% (+29.4% vs TC avg)
§102: 6.8% (-33.2% vs TC avg)
§112: 7.5% (-32.5% vs TC avg)
Tech Center average is an estimate. Based on career data from 253 resolved cases.
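As a quick consistency check on the statute-specific figures, the sketch below backs out the Tech Center baseline implied by each row. It assumes the "vs TC avg" deltas are percentage-point differences (rate minus baseline); the page does not state this, and the variable and function names are hypothetical.

```python
# (examiner rate %, delta vs TC average) exactly as listed in the profile
statute_stats = {
    "§101": (6.6, -33.4),
    "§103": (69.4, 29.4),
    "§102": (6.8, -33.2),
    "§112": (7.5, -32.5),
}


def implied_baseline(rate_pct: float, delta_pct: float) -> float:
    """Baseline such that rate - baseline == delta (assumed percentage points)."""
    return round(rate_pct - delta_pct, 1)


baselines = {s: implied_baseline(r, d) for s, (r, d) in statute_stats.items()}
for statute, base in baselines.items():
    print(f"{statute}: implied TC average {base}%")
```

Under that assumption every row implies the same baseline, which fits the note that the Tech Center average is a single estimate rather than a per-statute measurement.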
Office Action

DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
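The three prongs above combine as a conjunction: 35 U.S.C. 112(f) treatment applies only when all three are met. The following is a minimal sketch of that logic, not an official USPTO tool; the `Limitation` class and its field names are hypothetical, and the example inputs simply encode how this Office action characterizes "processing device" in claim 12.

```python
from dataclasses import dataclass


@dataclass
class Limitation:
    text: str
    uses_means_or_placeholder: bool     # prong (A)
    has_functional_language: bool       # prong (B)
    recites_sufficient_structure: bool  # prong (C) fails when this is True


def invokes_112f(lim: Limitation) -> bool:
    """True only when prongs (A), (B), and (C) are all satisfied."""
    return (
        lim.uses_means_or_placeholder
        and lim.has_functional_language
        and not lim.recites_sufficient_structure
    )


# Encodes the examiner's stated view of the claim 12 limitation.
device = Limitation("processing device configured to ...", True, True, False)
# Hypothetical amended limitation that adds structure to rebut the presumption.
amended = Limitation("processor and memory storing instructions ...", True, True, True)

print(invokes_112f(device))
print(invokes_112f(amended))
```

The second example mirrors option (1) offered to the applicant in this action: reciting sufficient structure so that prong (C) is no longer met.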
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: processing system in claim 1, and processing device in claim 12. 
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claim(s) 1-13 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The limitations, under their broadest reasonable interpretation, cover mental processes (concepts performed in the human mind, including observation, evaluation, judgment, and opinion, as well as organizing human activity and mathematical concepts and calculations). The claim(s) recite(s) a system, a method, and a CRM for estimating a task being performed by a worker. This judicial exception is not integrated into a practical application because the steps do not add meaningful limitations that would specifically apply the exception to a particular technological problem to be solved. The claim(s) does/do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the steps of the claimed invention can be done mentally and no additional features in the claims would preclude them from being performed as such, except for the generic computer elements recited at a high level of generality (i.e., processor, memory, and a generic neural network).
According to the USPTO guidelines, a claim is directed to non-statutory subject matter if: 
STEP 1: the claim does not fall within one of the four statutory categories of invention (process, machine, manufacture or composition of matter), or 
STEP 2: the claim recites a judicial exception, e.g. an abstract idea, without reciting additional elements that amount to significantly more than the judicial exception, as determined using the following analysis:
STEP 2A (PRONG 1): Does the claim recite an abstract idea, law of nature, or natural phenomenon?
STEP 2A (PRONG 2): Does the claim recite additional elements that integrate the judicial exception into a practical application?
STEP 2B: Does the claim recite additional elements that amount to significantly more than the judicial exception?
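The step sequence above can be read as a short decision flow. The sketch below is an illustrative simplification with hypothetical names, not the full guidance; the example call encodes the answers the examiner gives for claims 1, 12, and 13 in the analysis that follows (YES, YES, NO, NO).

```python
def eligibility(
    statutory_category: bool,      # STEP 1
    recites_exception: bool,       # STEP 2A, PRONG 1
    practical_application: bool,   # STEP 2A, PRONG 2
    significantly_more: bool,      # STEP 2B
) -> str:
    """Walk the steps in order and report where the analysis ends."""
    if not statutory_category:
        return "ineligible at Step 1"
    if not recites_exception:
        return "eligible at Step 2A, Prong 1"
    if practical_application:
        return "eligible at Step 2A, Prong 2"
    if significantly_more:
        return "eligible at Step 2B"
    return "ineligible at Step 2B"


# Answers as stated by the examiner for the independent claims.
print(eligibility(True, True, False, False))
```

Because the flow exits at the first favorable answer, a response that prevails at Prong 2 (for example, by showing a technological improvement) never needs to reach Step 2B.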
Using the two-step inquiry, it is clear that claims 1 and 12-13 are directed to an abstract idea as shown below:
STEP 1: Do the claims fall within one of the statutory categories?  
YES
STEP 2A (PRONG 1): Is the claim directed to a law of nature, a natural phenomenon or an abstract idea? 
YES
The claims are directed toward a mental process (i.e. abstract idea).
	With regard to STEP 2A (PRONG 1), the guidelines provide three groupings of subject matter that are considered abstract ideas:
Mathematical concepts – mathematical relationships, mathematical formulas or equations, mathematical calculations;
Certain methods of organizing human activity – fundamental economic principles or practices (including hedging, insurance, mitigating risk); commercial or legal interactions (including agreements in the form of contracts; legal obligations; advertising, marketing or sales activities or behaviors; business relations); managing personal behavior or relationships or interactions between people (including social activities, teaching, and following rules or instructions); and
Mental processes – concepts that are practicably performed in the human mind (including an observation, evaluation, judgment, opinion).
The claims comprise a mental process that can be practicably performed in the human mind (or generic computers or components configured to perform the method) and, therefore, an abstract idea.
Regarding Claim(s) 1, 12, and 13: the claims recite the steps (functions) of: 
estimate a pose of a worker based on a first image, the worker and an article being visible in the first image (mental process including observation and evaluation, and can be done mentally in the human mind);
estimate at least one selected from a state of the article and a work location of the worker on the article based on the first image (mental process including observation and evaluation, and can be done mentally in the human mind);
generate first graph data based on the pose and the at least one selected from the state and the work location, the first graph data including a plurality of nodes and a plurality of edges (mental process including observation and evaluation, and can be done mentally in the human mind or using a pen and paper);
estimate a task being performed by the worker (mental process including observation and evaluation, and can be done mentally in the human mind);
These limitations, as drafted, describe a simple process that, under its broadest reasonable interpretation, covers performance of the limitations in the mind or by a human. The Examiner notes that under MPEP 2106.04(a)(2)(III), the courts consider a mental process (thinking) that “can be performed in the human mind, or by a human using a pen and paper” to be an abstract idea. CyberSource Corp. v. Retail Decisions, Inc., 654 F.3d 1366, 1372, 99 USPQ2d 1690, 1695 (Fed. Cir. 2011). As the Federal Circuit explained, "methods which can be performed mentally, or which are the equivalent of human mental work, are unpatentable abstract ideas the ‘basic tools of scientific and technological work’ that are open to all.’" 654 F.3d at 1371, 99 USPQ2d at 1694 (citing Gottschalk v. Benson, 409 U.S. 63, 175 USPQ 673 (1972)). See also Mayo Collaborative Servs. v. Prometheus Labs. Inc., 566 U.S. 66, 71, 101 USPQ2d 1961, 1965 ("‘[M]ental processes[] and abstract intellectual concepts are not patentable, as they are the basic tools of scientific and technological work’" (quoting Benson, 409 U.S. at 67, 175 USPQ at 675)); Parker v. Flook, 437 U.S. 584, 589, 198 USPQ 193, 197 (1978) (same).

As such, a person could analyze images to estimate a pose of a worker, a state of an item, and work locations, generate a graph based on the analysis, and estimate the task being performed by the worker either mentally or using a pen and paper.  The mere nominal recitation that the various steps are being executed by a device/in a device (e.g. processing unit) does not take the limitations out of the mental process grouping.  Thus, the claims recite a mental process.   
	STEP 2A (PRONG 2): Does the claim recite additional elements that integrate the judicial exception into a practical application? 
NO.
 	The claims do not recite additional elements that integrate the judicial exception into a practical application.
With regard to STEP 2A (prong 2), whether the claim recites additional elements that integrate the judicial exception into a practical application, the guidelines provide the following exemplary considerations that are indicative that an additional element (or combination of elements) may have integrated the judicial exception into a practical application:
an additional element reflects an improvement in the functioning of a computer, or an improvement to other technology or technical field;
an additional element that applies or uses a judicial exception to affect a particular treatment or prophylaxis for a disease or medical condition; 
an additional element implements a judicial exception with, or uses a judicial exception in conjunction with, a particular machine or manufacture that is integral to the claim;
an additional element effects a transformation or reduction of a particular article to a different state or thing; and
an additional element applies or uses the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, such that the claim as a whole is more than a drafting effort designed to monopolize the exception.
While the guidelines further state that the exemplary considerations are not an exhaustive list and that there may be other examples of integrating the exception into a practical application, the guidelines also list examples in which a judicial exception has not been integrated into a practical application:
an additional element merely recites the words “apply it” (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea; 
an additional element adds insignificant extra-solution activity to the judicial exception; and 
an additional element does no more than generally link the use of a judicial exception to a particular technological environment or field of use.
Claim(s) 1, 12, and 13 does/do not recite any of the exemplary considerations that are indicative of an abstract idea having been integrated into a practical application. Claim(s) 1 and 12 recite(s) the further limitations of:
inputting the first graph data to a neural network including a graph neural network (GNN) and by using a result output from the neural network (insignificant pre/post-solution extra activity of generating data using a generic computer component)
These limitations are recited at a high level of generality (i.e. as a general action or change being taken based on the results of the acquiring step) and amount to mere post-solution actions, which are a form of insignificant extra-solution activity. Further, the additional elements are claimed generically and operate in their ordinary capacity, such that they do not use the judicial exception in a manner that imposes a meaningful limit on the judicial exception. Accordingly, even in combination, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
STEP 2B: Does the claim recite additional elements that amount to significantly more than the judicial exception? 
NO.
 	The claims do not recite additional elements that amount to significantly more than the judicial exception.
With regard to STEP 2B, whether the claims recite additional elements that provide significantly more than the recited judicial exception, the guidelines specify that the pre-guideline procedure is still in effect.  Specifically, that examiners should continue to consider whether an additional element or combination of elements:
adds a specific limitation or combination of limitations that are not well-understood, routine, conventional activity in the field, which is indicative that an inventive concept may be present; or  
simply appends well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception, which is indicative that an inventive concept may not be present.
Thus, since Claim(s) 1, 12, and 13: (a) are directed toward an abstract idea, (b) do not recite additional elements that integrate the judicial exception into a practical application, and (c) do not recite additional elements that amount to significantly more than the judicial exception, it is clear that Claim(s) 1, 12, and 13 are not eligible subject matter under 35 U.S.C. 101.

Regarding claims 2-11: the additional limitations do not integrate the mental process into a practical application or add significantly more to the mental process. The additional limitations fall under (mental processes including observation and evaluation, which can be done mentally in the human mind) OR (mathematical concepts, mathematical relationships, mathematical formulas or equations, mathematical calculations) OR (organizing of human activity) OR (generic computers or components configured to perform the steps).

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA  to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-3 and 12-13 are rejected under 35 U.S.C. 103 as being unpatentable over Terada (JP 2022176819) in view of Yamazaki (US 20250174024). 
Regarding claim 1: 
Terada discloses: a processing system (¶ [0017] “…FIG. 1 is a functional configuration diagram of a work recognition device according to Embodiment 1; FIG. 1 is a hardware/software configuration diagram of a work recognition device”; Terada discloses a work recognition device for recognizing work motion based on image data in a manufacturing site (¶¶ [0001], [0020], [0024], [0026]-[0027])); configured to:
While Terada discloses that “The work recognition device 100 is a device that performs processing for recognizing the work motion of the worker 2 or the work motion of the surrounding object 5 based on the image data acquired by the image acquisition device 6” (¶ [0026]), Terada does not specifically teach: estimate a pose of a worker; estimate at least one selected from a state of the article and a work location of the worker on the article based on the first image.
However, in a related field, Yamazaki teaches:  estimate at least one selected from a state of the article (Yamazaki teaches that the master appearance information includes information indicating an article used for the work (¶ [0080]); the object detection function detects the article and obtains its position (¶ [0100]); the pose analysis function can estimate a pose for an article (¶ [0105]); the color feature analysis function detects an article and classifies it into classes such as conveyor, worktable, part, semifinished product, product, baggage, and tools/equipment (¶ [0111]); and appearance information includes article information / image feature values (¶¶ [0113], [0115]-[0116], [0167]-[0168]));
and a work location of the worker on the article based on the first image (work position and work pose (¶¶ [0084]-[0087]), where a work position is the place where each step is performed and the range in which the worker is present during the step (¶ [0085]). Yamazaki further teaches behavior analysis that determines a person’s position in the image and movement in position (¶ [0107]), and step execution determined from frame images associated with a work position or work pose, including transitions among work positions (¶¶ [0178]-[0185]))
generate first graph data based on the pose and the at least one selected from the state and the work location, the first graph data including a plurality of nodes and a plurality of edges (Yamazaki supplies the specific image derived inputs as applied above: work pose (¶¶ [0104]-[0105]), article state (¶¶ [0080], [0100], [0105], [0111], [0115]-[0116], [0167]-[0168]), and work position (¶¶ [0084]-[0087], [0107], [0178]-[0185]). Terada further teaches that the feature extraction unit extracts feature amounts such as color features, motion features, CNN features, and positional information (¶¶ [0030]-[0031]); the relevance estimation unit calculates object relevance indicating the connection between objects (¶ [0032]); “ The graph generation unit 105 is a functional unit that generates graphs used in GCN. The graph generation unit 105 generates graph nodes for the number of regions acquired by the analysis region detection unit 102, assigns each node a feature amount obtained by the feature extraction processing of the feature extraction unit 103, and assigns a node identification label 202 to each node. label. Labels of the node specific label 202 are names of objects in the manufacturing site, such as "robot", "worker", and "processing machine", for example. Assuming that there is a connection between each node, an edge is provided between those nodes, and the edge of the graph is weighted using the object relevance obtained from the relevance estimation unit 104 .” (¶ [0033])); and
by inputting the first graph data to a neural network including a graph neural network (GNN) (Terada teaches that the graph generation unit generates “graphs used in GCN” (¶ [0033]) and that the task learning unit performs machine learning “using GCN” based on the generated graph (¶ [0034]));
and by using a result output from the neural network, estimate a task being performed by the worker (Terada teaches that the task learning unit generates an inference model from the generated graph, and that node work labels are correct labels for recognizing work such as “welding” and “screw tightening” (¶ [0034]). Terada further teaches that the work inference unit uses the inference model to infer work recognition and obtain an inference result (¶ [0035]). Yamazaki also shows the same result by teaching that the work estimation unit estimates a kind of work conducted by the worker based on appearance information from the video and the master information (¶¶ [0169]-[0172])).
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date, to have modified Terada with Yamazaki because both references are directed to image recognition of worker activities in manufacturing environments. Terada teaches that manufacturing task recognition should use worker and object relationships in a graph using a GCN framework. Yamazaki teaches explicit extraction of the worker’s pose, article state information, and work position from the video for estimating the kind of work performed. A person having ordinary skill in the art would have been motivated to incorporate Yamazaki’s worker and article analysis features into Terada’s graph and GCN recognition framework to improve the contextual specificity and accuracy of manufacturing task recognition, which is a predictable use of known image-derived worker and article features in a known graph using GCN.
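For readers weighing the combination, the graph generation quoted from Terada ¶ [0033] (one node per detected region, a feature vector and label per node, edges weighted by object relevance) can be sketched as follows. This is a toy illustration, not Terada's implementation: the inverse-distance relevance function, the two-dimensional features, and the symmetric-normalized propagation step (a common graph-convolution formulation) are all assumptions, and every function name is hypothetical.

```python
import math
from itertools import combinations


def build_graph(regions, relevance):
    """regions: list of (label, feature_vector); relevance(a, b) -> edge weight."""
    nodes = [{"id": i, "label": lab, "feat": list(f)} for i, (lab, f) in enumerate(regions)]
    edges = {}
    for i, j in combinations(range(len(nodes)), 2):
        w = relevance(nodes[i], nodes[j])
        if w > 0:
            edges[(i, j)] = w  # undirected edge weighted by relevance
    return nodes, edges


def gcn_layer(nodes, edges, weight):
    """One propagation step: ReLU(D^-1/2 (A + I) D^-1/2 X W), in pure Python."""
    n = len(nodes)
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # A + I
    for (i, j), w in edges.items():
        a[i][j] = a[j][i] = w
    deg = [sum(row) for row in a]
    dim_in, dim_out = len(weight), len(weight[0])
    out = []
    for i in range(n):
        agg = [0.0] * dim_in
        for j in range(n):
            if a[i][j]:
                c = a[i][j] / math.sqrt(deg[i] * deg[j])  # symmetric normalization
                for k in range(dim_in):
                    agg[k] += c * nodes[j]["feat"][k]
        out.append([max(0.0, sum(agg[k] * weight[k][m] for k in range(dim_in)))
                    for m in range(dim_out)])
    return out


# Toy regions using the object names quoted from Terada; features are positions.
regions = [("worker", [0.0, 0.0]), ("robot", [3.0, 4.0]), ("processing machine", [6.0, 8.0])]
inverse_distance = lambda a, b: 1.0 / (1.0 + math.dist(a["feat"], b["feat"]))
nodes, edges = build_graph(regions, inverse_distance)
hidden = gcn_layer(nodes, edges, [[1.0, 0.0], [0.0, 1.0]])  # identity weights
print(edges)
print(hidden)
```

In the proposed combination, Yamazaki's pose, article-state, and work-position outputs would enter such a structure as additional node features or nodes, which is the point an applicant would probe when arguing whether the references actually teach the claimed graph content.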

Regarding claim 2:
Terada in view of Yamazaki teaches the limitations of claim 1 as applied above.
Yamazaki further teaches: wherein the state is estimated based on the first image (the master appearance information includes information indicating an article used for the work (¶ [0080]); the object detection function detects the article and obtains its position in the image (¶ [0100]); the pose analysis function can estimate a pose for an article (¶ [0105]); the color feature analysis function detects an article and classifies it into classes such as conveyor, worktable, part, semifinished product, product, baggage, and tools/equipment (¶ [0111]);), and 
the first graph data includes: a plurality of first nodes (Terada teaches “the graph generation unit 105 is a functional unit that generates graphs used in GCN. The graph generation unit 105 generates graph nodes for the number of regions acquired by the analysis region detection unit 102, assigns each node a feature amount obtained by the feature extraction processing of the feature extraction unit 103, and assigns a node identification label 202 to each node. label. Labels of the node specific label 202 are names of objects in the manufacturing site, such as "robot", "worker", and "processing machine", for example. Assuming that there is a connection between each node, an edge is provided between those nodes, and the edge of the graph is weighted using the object relevance obtained from the relevance estimation unit 104 .” (¶ [0033])); 
Yamazaki further teaches: corresponding respectively to a plurality of joints of the worker (“(4) The pose analysis function detects a joint point of a person from an image, and creates a stick figure model connecting the joint point.” (¶ [0104]); and further teaches that appearance information includes pose information (¶¶ [0115]-[0116], [0167]-[0168]));
a plurality of first edges and a plurality of second nodes (Terada teaches “the graph generation unit 105 is a functional unit that generates graphs used in GCN. The graph generation unit 105 generates graph nodes for the number of regions acquired by the analysis region detection unit 102, assigns each node a feature amount obtained by the feature extraction processing of the feature extraction unit 103, and assigns a node identification label 202 to each node. label. Labels of the node specific label 202 are names of objects in the manufacturing site, such as "robot", "worker", and "processing machine", for example. Assuming that there is a connection between each node, an edge is provided between those nodes, and the edge of the graph is weighted using the object relevance obtained from the relevance estimation unit 104 .” (¶ [0033]));
corresponding respectively to a plurality of skeletal parts of the worker; and corresponding respectively to a plurality of the states that the article may be in (Yamazaki teaches article position (¶ [0100]); article pose (¶ [0105]); article class or type (¶ [0111]); article information (¶¶ [0050], [0115]-[0116], [0167]-[0168])).

Regarding claim 3:
Terada in view of Yamazaki teaches the limitations of claim 2 as applied above.
Terada further teaches: wherein in the first graph data, each of the plurality of second nodes is connected with one of the plurality of first nodes by edges (“the graph generation unit 105 is a functional unit that generates graphs used in GCN. The graph generation unit 105 generates graph nodes for the number of regions acquired by the analysis region detection unit 102, assigns each node a feature amount obtained by the feature extraction processing of the feature extraction unit 103, and assigns a node identification label 202 to each node. label. Labels of the node specific label 202 are names of objects in the manufacturing site, such as "robot", "worker", and "processing machine", for example. Assuming that there is a connection between each node, an edge is provided between those nodes, and the edge of the graph is weighted using the object relevance obtained from the relevance estimation unit 104 .” (¶ [0033])).
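The graph structure addressed in claims 2 and 3 (first nodes for joints, first edges for skeletal parts, second nodes for article states, and each second node joined to one first node) can be pictured with the sketch below. It is a hypothetical illustration only: the joint names, state names, and the choice of the wrist as the attachment joint are assumptions and appear in neither the claims as quoted nor the cited references.

```python
# Hypothetical joint set, skeleton, and article states for illustration.
JOINTS = ["head", "neck", "r_shoulder", "r_elbow", "r_wrist"]
BONES = [("head", "neck"), ("neck", "r_shoulder"),
         ("r_shoulder", "r_elbow"), ("r_elbow", "r_wrist")]
STATES = ["unassembled", "partially assembled", "assembled"]


def build_first_graph(joint_xy, state_scores, anchor_joint="r_wrist"):
    """Return (nodes, edges) for a pose-plus-article-state graph."""
    nodes = {}
    for name in JOINTS:                       # first nodes: one per joint
        nodes[("joint", name)] = {"xy": joint_xy[name]}
    for state in STATES:                      # second nodes: one per article state
        nodes[("state", state)] = {"score": state_scores.get(state, 0.0)}
    edges = [(("joint", a), ("joint", b)) for a, b in BONES]  # first edges: skeletal parts
    # Each second node is connected to one first node (assumed: the wrist).
    edges += [(("state", s), ("joint", anchor_joint)) for s in STATES]
    return nodes, edges


pose = {"head": (0.5, 0.1), "neck": (0.5, 0.2), "r_shoulder": (0.6, 0.25),
        "r_elbow": (0.7, 0.4), "r_wrist": (0.75, 0.55)}
nodes, edges = build_first_graph(pose, {"partially assembled": 0.9})
print(len(nodes), "nodes,", len(edges), "edges")
```

Note the contrast with the quoted Terada passage, where nodes stand for detected regions such as "robot" or "worker" rather than individual joints or article states; that difference is a natural focus for a response to the rejection of claims 2 and 3.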

Regarding claims 12-13: the claim limitations are similar to those of claim 1 and are therefore rejected in the same manner as applied above.

Claim(s) 6-7 are rejected under 35 U.S.C. 103 as being unpatentable over Terada (JP 2022176819) in view of Yamazaki (US 20250174024) and Datar (US 20210124944). 
Regarding claim 6:
Terada in view of Yamazaki teaches the limitations of claim 1 as applied above.
wherein the work location is estimated based on the first image (Yamazaki teaches work position and work pose (¶¶ [0084]-[0087]), where a work position is the place where each step is performed and the range in which the worker is present during the step (¶ [0085]). Yamazaki further teaches behavior analysis that determines a person’s position in the image and movement in position (¶ [0107]), and step execution determined from frame images associated with a work position or work pose, including transitions among work positions (¶¶ [0178]-[0185])); and
the first graph data includes: a plurality of first nodes (Terada teaches “the graph generation unit 105 is a functional unit that generates graphs used in GCN. The graph generation unit 105 generates graph nodes for the number of regions acquired by the analysis region detection unit 102, assigns each node a feature amount obtained by the feature extraction processing of the feature extraction unit 103, and assigns a node identification label 202 to each node. label. Labels of the node specific label 202 are names of objects in the manufacturing site, such as "robot", "worker", and "processing machine", for example. Assuming that there is a connection between each node, an edge is provided between those nodes, and the edge of the graph is weighted using the object relevance obtained from the relevance estimation unit 104 .” (¶ [0033])); 
Yamazaki further teaches: corresponding respectively to a plurality of joints of the worker (“(4) The pose analysis function detects a joint point of a person from an image, and creates a stick figure model connecting the joint point.” (¶ [0104]); and further teaches that appearance information includes pose information (¶¶ [0115]-[0116], [0167]-[0168]));
a plurality of first edges (Terada teaches “the graph generation unit 105 is a functional unit that generates graphs used in GCN. The graph generation unit 105 generates graph nodes for the number of regions acquired by the analysis region detection unit 102, assigns each node a feature amount obtained by the feature extraction processing of the feature extraction unit 103, and assigns a node identification label 202 to each node. label. Labels of the node specific label 202 are names of objects in the manufacturing site, such as "robot", "worker", and "processing machine", for example. Assuming that there is a connection between each node, an edge is provided between those nodes, and the edge of the graph is weighted using the object relevance obtained from the relevance estimation unit 104 .” (¶ [0033]));
corresponding respectively to a plurality of skeletal parts of the worker; and corresponding respectively to a plurality of the states that the article may be in (Yamazaki teaches article position (¶ [0100]); article pose (¶ [0105]); article class or type (¶ [0111]); article information (¶¶ [0050], [0115]-[0116], [0167]-[0168]));
a plurality of third nodes (Terada teaches “the graph generation unit 105 is a functional unit that generates graphs used in GCN. The graph generation unit 105 generates graph nodes for the number of regions acquired by the analysis region detection unit 102, assigns each node a feature amount obtained by the feature extraction processing of the feature extraction unit 103, and assigns a node identification label 202 to each node. label. Labels of the node specific label 202 are names of objects in the manufacturing site, such as "robot", "worker", and "processing machine", for example. Assuming that there is a connection between each node, an edge is provided between those nodes, and the edge of the graph is weighted using the object relevance obtained from the relevance estimation unit 104 .” (¶ [0033]));
Terada in view of Yamazaki does not specifically teach: the plurality of third nodes corresponding respectively to a plurality of locations on the article.
However, in a related field, Datar teaches: a plurality of locations on the article (“determine whether the aggregated wrist position 4106 corresponds to a position on a shelf 3920a-c of the rack 112. This may be achieved by comparing the aggregated wrist position 4106 to a set of one or more predefined shelf positions (e.g., determined based at least in part on the shelf markers 3922a-c, described above). Based on this comparison, the tracking subsystem 3910 may determine whether the aggregated wrist position 4106 is within a threshold distance of at least one of the shelves 3920a-c of the rack 112 or to a predefined location of the item 3924a-i on the shelf 3920a-c. If the aggregated wrist position 4106 is within a threshold distance of a shelf 3920a-c (e.g., or of a predefined position of an item 3924a-i stored on a shelf 3920a-c), the event trigger 4006 of FIG. 40 may be initiated (e.g., provided for data handling and integration, as illustrated in FIG. 40).” (¶ [0486])).
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date, to have modified Terada in view of Yamazaki because Terada already teaches using graph structure and GCN for manufacturing-site task recognition, Yamazaki teaches extracting worker pose information in manufacturing video via joints and a stick figure model, and Datar teaches that pose-derived wrist position relative to multiple predefined article locations is useful for identifying the interaction location and for action recognition. The motivation to combine lies in the fact that the modified system improves the contextual specificity of task recognition and the combination yields a predictable result.
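For illustration only, the claimed first graph data discussed above can be pictured as a small data structure: joint nodes connected by skeletal-part edges, plus one node per possible article state and one node per location on the article. The following is a minimal sketch under stated assumptions, not the applicant's or any cited reference's implementation. The names `WorkGraph` and `build_work_graph`, the example joints, states, and locations, and the choice to mark the estimated state and location with a feature value of 1.0 are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class WorkGraph:
    """Toy container (hypothetical): node id -> (kind, feature), plus edges."""
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, node_id: str, kind: str, feature: float = 0.0) -> None:
        self.nodes[node_id] = (kind, feature)

    def add_edge(self, a: str, b: str) -> None:
        # Refuse edges whose endpoints were never added as nodes.
        if a not in self.nodes or b not in self.nodes:
            raise KeyError(f"unknown endpoint in edge ({a}, {b})")
        self.edges.append((a, b))

    def count(self, kind: str) -> int:
        return sum(1 for k, _ in self.nodes.values() if k == kind)


def build_work_graph(joints, bones, states, locations, est_state, est_location):
    """Assemble joint, state, and location nodes into one graph."""
    g = WorkGraph()
    for j in joints:                      # first nodes: one per joint
        g.add_node(f"joint:{j}", "joint")
    for a, b in bones:                    # first edges: one per skeletal part
        g.add_edge(f"joint:{a}", f"joint:{b}")
    for s in states:                      # second nodes: possible article states
        g.add_node(f"state:{s}", "state", 1.0 if s == est_state else 0.0)
    for loc in locations:                 # third nodes: locations on the article
        g.add_node(f"loc:{loc}", "location", 1.0 if loc == est_location else 0.0)
    return g


# Example values are invented for illustration.
graph = build_work_graph(
    joints=["shoulder", "elbow", "wrist"],
    bones=[("shoulder", "elbow"), ("elbow", "wrist")],
    states=["unassembled", "assembled"],
    locations=["top", "side", "bottom"],
    est_state="unassembled",
    est_location="side",
)
```

In this sketch the pose, state, and location estimates all land in one graph, which is the structural point at issue in the combination of Terada, Yamazaki, and Datar; how node features would actually be encoded is left open.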

Regarding claim 7:
Terada in view of Yamazaki teaches the limitations of claim 1 as applied above.
Yamazaki further teaches: wherein the state is estimated based on the first image (the master appearance information includes information indicating an article used for the work (¶ [0080]); the object detection function detects the article and obtains its position in the image (¶ [0100]); the pose analysis function can estimate a pose for an article (¶ [0105]); the color feature analysis function detects an article and classifies it into classes such as conveyor, worktable, part, semifinished product, product, baggage, and tools/equipment (¶ [0111]));
wherein the work location is estimated based on the first image (Yamazaki teaches work position and work pose (¶¶ [0084]-[0087]), where a work position is the place where each step is performed and the range in which the worker is present during the step (¶ [0085]). Yamazaki further teaches behavior analysis that determines a person’s position in the image and movement in position (¶ [0107]), and step execution determined from frame images associated with a work position or work pose, including transitions among work positions (¶¶ [0178]-[0185])); and
the first graph data includes: a plurality of first nodes (Terada teaches “the graph generation unit 105 is a functional unit that generates graphs used in GCN. The graph generation unit 105 generates graph nodes for the number of regions acquired by the analysis region detection unit 102, assigns each node a feature amount obtained by the feature extraction processing of the feature extraction unit 103, and assigns a node identification label 202 to each node. label. Labels of the node specific label 202 are names of objects in the manufacturing site, such as "robot", "worker", and "processing machine", for example. Assuming that there is a connection between each node, an edge is provided between those nodes, and the edge of the graph is weighted using the object relevance obtained from the relevance estimation unit 104 .” (¶ [0033])); 
Yamazaki further teaches: corresponding respectively to a plurality of joints of the worker (“(4) The pose analysis function detects a joint point of a person from an image, and creates a stick figure model connecting the joint point.” (¶ [0104]); Yamazaki further teaches that appearance information includes pose information (¶¶ [0115]-[0116], [0167]-[0168]));
a plurality of first edges and a plurality of second nodes (Terada teaches “the graph generation unit 105 is a functional unit that generates graphs used in GCN. The graph generation unit 105 generates graph nodes for the number of regions acquired by the analysis region detection unit 102, assigns each node a feature amount obtained by the feature extraction processing of the feature extraction unit 103, and assigns a node identification label 202 to each node. label. Labels of the node specific label 202 are names of objects in the manufacturing site, such as "robot", "worker", and "processing machine", for example. Assuming that there is a connection between each node, an edge is provided between those nodes, and the edge of the graph is weighted using the object relevance obtained from the relevance estimation unit 104 .” (¶ [0033]));
corresponding respectively to a plurality of skeletal parts of the worker; and corresponding respectively to a plurality of the states that the article may be in (Yamazaki teaches article position (¶ [0100]); article pose (¶ [0105]); article class or type (¶ [0111]); article information (¶¶ [0050], [0115]-[0116], [0167]-[0168])).
a plurality of third nodes (Terada teaches “the graph generation unit 105 is a functional unit that generates graphs used in GCN. The graph generation unit 105 generates graph nodes for the number of regions acquired by the analysis region detection unit 102, assigns each node a feature amount obtained by the feature extraction processing of the feature extraction unit 103, and assigns a node identification label 202 to each node. label. Labels of the node specific label 202 are names of objects in the manufacturing site, such as "robot", "worker", and "processing machine", for example. Assuming that there is a connection between each node, an edge is provided between those nodes, and the edge of the graph is weighted using the object relevance obtained from the relevance estimation unit 104 .” (¶ [0033]));
Terada in view of Yamazaki does not specifically teach: the plurality of third nodes corresponding respectively to a plurality of locations on the article.
However, in a related field, Datar teaches: a plurality of locations on the article (“determine whether the aggregated wrist position 4106 corresponds to a position on a shelf 3920a-c of the rack 112. This may be achieved by comparing the aggregated wrist position 4106 to a set of one or more predefined shelf positions (e.g., determined based at least in part on the shelf markers 3922a-c, described above). Based on this comparison, the tracking subsystem 3910 may determine whether the aggregated wrist position 4106 is within a threshold distance of at least one of the shelves 3920a-c of the rack 112 or to a predefined location of the item 3924a-i on the shelf 3920a-c. If the aggregated wrist position 4106 is within a threshold distance of a shelf 3920a-c (e.g., or of a predefined position of an item 3924a-i stored on a shelf 3920a-c), the event trigger 4006 of FIG. 40 may be initiated (e.g., provided for data handling and integration, as illustrated in FIG. 40).” (¶ [0486])).
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date, to have modified Terada in view of Yamazaki because Terada already teaches using graph structure and GCN for manufacturing-site task recognition, Yamazaki teaches extracting worker pose information in manufacturing video via joints and a stick figure model, and Datar teaches that pose-derived wrist position relative to multiple predefined article locations is useful for identifying the interaction location and for action recognition. The motivation to combine lies in the fact that the modified system improves the contextual specificity of task recognition and the combination yields a predictable result.
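The Datar passage quoted above (¶ [0486]) describes comparing a wrist position against predefined positions and checking a threshold distance. A minimal sketch of that kind of comparison follows, for illustration only. The function name `nearest_location`, the 2-D coordinates, the threshold value, and the location names are assumptions and are not taken from Datar.

```python
import math


def nearest_location(wrist_xy, locations, threshold):
    """Return the name of the closest predefined location if it lies within
    the threshold distance of the wrist position, otherwise None."""
    best_name, best_dist = None, math.inf
    for name, xy in locations.items():
        d = math.dist(wrist_xy, xy)  # Euclidean distance (Python 3.8+)
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name if best_dist <= threshold else None


# Hypothetical predefined positions, in arbitrary image units.
shelf_positions = {"shelf_a": (0.0, 0.0), "shelf_b": (10.0, 0.0)}

hit = nearest_location((9.0, 1.0), shelf_positions, threshold=2.0)   # "shelf_b"
miss = nearest_location((5.0, 5.0), shelf_positions, threshold=2.0)  # None
```

A result such as `hit` is the kind of per-location signal that could select one of the claimed third nodes, while `miss` corresponds to a wrist position that is not close enough to any predefined location.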


Claim(s) 8 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Terada (JP 2022176819) in view of Yamazaki (US 20250174024) and Rezazadeh (US 20240185052). 
Regarding claim 8:
Terada in view of Yamazaki teaches the limitations of claim 1 as applied above.
Yamazaki further teaches: estimate the pose of the worker based on a second image, the worker and the article being visible in the second image (Yamazaki teaches that the video is formed of a plurality of frame images (¶ [0074]), that the analysis is performed on images that differ on a time-series basis (¶ [0112]), and that the analysis unit generates time-series appearance information that includes pose information (¶ [0115], ¶¶ [0167]-[0168]). Yamazaki also teaches object detection of worker/article (¶ [0100]) and pose analysis by stick figure (¶ [0104]));
estimate at least one selected from the state of the article and the work location of the worker on the article based on the second image (Yamazaki teaches that the analysis unit generates time-series appearance information from the video (¶ [0115]) and repeatedly analyzes the frame images (¶¶ [0167]-[0168]). Yamazaki also teaches article-related state information from the image, including article detection and position (¶ [0100]), article pose (¶ [0105]), and article class/type (¶ [0111]));
generate second graph data by using a result estimated based on the second image, the second graph data including a plurality of nodes and a plurality of edges (Terada teaches extracting feature amounts between predetermined frames (¶ [0050]), securing chronological data such as previous and subsequent frames (¶ [0051]), estimating object relevance (¶ [0052]), and generating a graph used in GCN based on that relevance (¶ [0053]). Terada further teaches that for time-series data such as video data, the same processing is performed for each clock or time range as video data is obtained (¶ [0087]). Thus, Terada teaches generating graph data repeatedly from different time positions in the video, i.e., a first graph and a second graph based on a later image/time.);
While Terada teaches that generated graph data are input to GCN and used to produce an inference result (¶¶ [0053]-[0054]), Terada does not expressly teach that second graph data, in addition to first graph data, are both used in the neural network path for task estimation.
However, in a related field, Rezazadeh teaches:  estimate the task by using a result output from the neural network when the second graph data, in addition to the first graph data, is input to the neural network (constructing a first graph representation (¶ [0031]) and a second graph representation (¶ [0032]), performing message passing between the two graph representations (¶ [0037]), and executing the task based on readouts from the updated first and second graph representations (¶¶ [0043]-[0045]). Rezazadeh also teaches that the readouts from each graph representation may be concatenated and passed to another neural network such as a classifier network (¶ [0038])).
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date, to have modified Terada in view of Yamazaki to repeat Terada’s graph generation on a second frame image in Yamazaki’s time-series manufacturing video, thereby producing second graph data, and then use that second graph data in addition to the first graph data in the neural network task estimation path, as expressly taught by Rezazadeh’s first- and second-graph architecture. Doing so would have predictably improved task estimation by incorporating temporally distinct graph information rather than relying on a single graph instance alone.
Regarding claim 11:
Terada in view of Yamazaki and Rezazadeh teaches the limitations of claim 8 as applied above.
Terada teaches that graph data generated from video data are input to GCN, and that the graph based machine learning produces an inference result (¶¶ [0053]-[0054]). Thus, Terada teaches the claimed result output from the GNN.
However, Terada does not expressly teach that the first GNN result and the second GNN result are both received by a downstream fully connected layer.
in the neural network, a fully connected layer receives input of: a result output from the GNN when the first graph data is input to the GNN; and a result output from the GNN when the second graph data is input to the GNN (Rezazadeh teaches taking readouts from each graph representation, concatenating those readouts, and passing them to another neural network, such as a classifier network, for downstream task execution (¶ [0038]). Rezazadeh also teaches executing the task based on readouts from the updated first graph representation and the updated second graph representation (¶¶ [0043]-[0045]). Thus, Rezazadeh teaches the precise architectural idea of using the readouts of the first and second graph representations together as the input to a downstream classifier stage. As for the fully connected layer aspect: Rezazadeh does not use the exact phrase “fully connected layer” in the cited passages. However, Rezazadeh expressly teaches a downstream classifier network receiving concatenated fixed-length graph readouts. A PHOSITA would have understood a fully connected classifier head to be a standard and routine implementation for taking concatenated graph readouts and producing a task classification).
the task is estimated by using a result output from the fully connected layer (Terada teaches that the graph-based machine-learning output is used for task inference (¶¶ [0053]-[0054]). Rezazadeh teaches that the concatenated readouts from the first and second graph representations are passed to another neural network, such as a classifier network, and that the task is executed based on readouts from the updated first and second graph representations (¶ [0038], ¶¶ [0043]-[0045]). Thus, Rezazadeh teaches the downstream classifier and task-estimation stage based on the combined graph outputs).
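The architecture at issue in claim 11 (two GNN outputs, one per graph, fed together into a fully connected layer that yields the task estimate) can be sketched in a few lines, for illustration only. This is a simplified stand-in under stated assumptions: the GNN is reduced to one round of mean-neighbour aggregation with a mean readout, no message passing between the two graphs is modelled, and the function names, weights, and task labels are all hypothetical rather than drawn from Terada, Rezazadeh, or the application.

```python
def gnn_readout(features, edges):
    """One round of mean aggregation over each node's neighbours (with a
    self loop), followed by a mean readout over all nodes."""
    n = len(features)
    dim = len(features[0])
    neighbours = [[i] for i in range(n)]          # self loops
    for i, j in edges:                            # undirected edges
        neighbours[i].append(j)
        neighbours[j].append(i)
    updated = [
        [sum(features[k][d] for k in nbrs) / len(nbrs) for d in range(dim)]
        for nbrs in neighbours
    ]
    return [sum(row[d] for row in updated) / n for d in range(dim)]


def fully_connected(x, weights, bias):
    """Plain dense layer: one output score per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def estimate_task(graph1, graph2, weights, bias, labels):
    # Concatenate the two fixed-length readouts, then classify.
    combined = gnn_readout(*graph1) + gnn_readout(*graph2)
    scores = fully_connected(combined, weights, bias)
    return labels[scores.index(max(scores))]


# Invented toy inputs: two 2-node graphs with 2-dimensional node features.
first_graph = ([[1.0, 0.0], [1.0, 0.0]], [(0, 1)])
second_graph = ([[0.0, 1.0], [0.0, 1.0]], [(0, 1)])
weights = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
bias = [0.0, 0.0]
task = estimate_task(first_graph, second_graph, weights, bias, ["fasten", "inspect"])
```

Because each readout has a fixed length regardless of graph size, the concatenated vector has a fixed length as well, which is what lets a dense layer consume both graph results at once.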

No prior art has been applied to claims 4-5 and 9-10.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA  as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA . A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b). 
The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to final Office action, see 37 CFR 1.113(c). A request for reconsideration while not provided for in 37 CFR 1.113(c) may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13.
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA /25, or PTO/AIA /26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer.
Claims 1-13 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-10 of copending Application No. 18605592 in view of the cited prior art above, including:
Terada (JP 2022176819) 
Yamazaki (US 20250174024)
Datar (US 20210124944), and 
Rezazadeh (US 20240185052).
The motivations to combine are similar to those found throughout the Office action.
For example, instant claim 1 is broader than and not patentably distinct from claim 5 and/or claim 6 of the reference application.
This is a provisional nonstatutory double patenting rejection.

Relevant art not relied upon 
Walker (US 20230022356) teaches training models that use human pose data to predict human activities or specific task(s) that the workers are engaged in efficiently.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WASSIM MAHROUKA whose telephone number is (571)272-2945. The examiner can normally be reached Monday-Thursday 8:00-5:00 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Stephen Koziol, can be reached at (408) 918-7630. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/WASSIM MAHROUKA/
Primary Examiner, Art Unit 2665
Prosecution Timeline

Mar 14, 2024
Application Filed
May 08, 2026
Non-Final Rejection mailed — §101, §103, §DOUBLEPATENT (current)
Precedent Cases

Applications granted by this same examiner with similar technology

18/477,634
Patent 12633134
RECOGNITION APPARATUS, RECOGNITION METHOD, AND STORAGE MEDIUM
2y 7m to grant Granted May 19, 2026
18/415,568
Patent 12626356
CORESET BASED MASK INSPECTION FOR SEMICONDUCTOR SPECIMEN FABRICATION
2y 3m to grant Granted May 12, 2026
18/303,078
Patent 12620264
ACTUATING A SYSTEM VIA LIVE PERSON IMAGE RECOGNITION
3y 0m to grant Granted May 05, 2026
18/646,283
Patent 12608774
Image Processing Apparatus, Image Processing Method and Storage Medium
1y 12m to grant Granted Apr 21, 2026
18/398,193
Patent 12602739
JOINT DENOISING AND DEMOSAICKING METHOD FOR COLOR RAW IMAGES GUIDED BY MONOCHROME IMAGES
2y 3m to grant Granted Apr 14, 2026
Based on 5 most recent grants.
Prosecution Projections

1-2
Expected OA Rounds
86%
Grant Probability
93%
With Interview (+6.8%)
2y 3m (~1m remaining)
Median Time to Grant
Low
PTA Risk
Based on 253 resolved cases by this examiner. Grant probability derived from career allowance rate.