Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of the Claims
Claims 1, 13, and 18 have been amended. Claim 21 is new. Claims 1-21 are currently pending and have been considered by the Examiner.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 18-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 18 is rendered indefinite because the amendment spanning line 12 through “sensor types” in line 14 is identical to the amendment in lines 19-21. The features of claim 18 parallel those of claim 1. It is therefore unclear whether the amendment in claim 18, lines 19-21 is a typographical error that was intended to recite features similar to those of claim 1, lines 20-23. For purposes of examination, Examiner treats the amendment in claim 18, lines 19-21 as if it had recited the same features as claim 1, lines 20-23.
Claims 19-20 are rejected for failing to cure the deficiencies of claim 18.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-21 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: Claims 1-12 and 21 recite a method, claims 13-17 recite a system comprising processors, and claims 18-20 recite a non-transitory computer-readable storage medium (a product). A method, a system, and a product each falls under one of the four statutory categories of patent eligible subject matter.
CLAIM 1
Step 2A Prong 1: Detecting trigger conditions associated with a predefined mission of the computer system is an observation mental process which can reasonably be performed in the human mind with the aid of pencil and paper.
Generating training information items based on a behavior pattern determined by data from two or more of the at least two distinct sensor types is a judgement and evaluation mental process which can reasonably be performed in the human mind with the aid of pencil and paper. A person can reasonably create training information items (training input data) by evaluating data from two distinct sensor types.
While streaming the sensor data: detecting one or more signature events in the sensor data is an observation mental process which can reasonably be performed in the human mind with the aid of pencil and paper. For example, this limitation amounts to a human watching and listening to a football game, and using his or her mind to detect/recognize events in the data such as a player scoring points in the game.
While streaming the sensor data: generating one or more information items characterizing the one or more signature events detected in the sensor data, independently of the sensor types, is a judgement and evaluation mental process which can reasonably be performed in the human mind with the aid of pencil and paper.
Step 2A Prong 2: A computer system having one or more processors and memory amounts to generic computer components for applying the abstract ideas on a generic computer under MPEP 2106.05(f).
Streaming the sensor data from a plurality of sensor devices during a time duration amounts to mere data-gathering, an insignificant extra-solution activity under MPEP 2106.05(g).
The plurality of sensor devices including at least two distinct sensor types and disposed in a physical environment amounts to a field of use and technological environment under MPEP 2106.05(h).
Generating a large behavior model including context information of multiple sensor domains by obtaining a predefined large language model (LLM) trained with predefined training data, obtaining training data associated with the plurality of sensor devices, wherein the training data includes information associated with a signature event that includes one or more training information items, and re-training the predefined LLM with the training data, the LLM having a self-attention based transformer structure, amounts to mere data-gathering, an insignificant extra-solution activity under MPEP 2106.05(g), and mere instructions to apply the abstract ideas on a generic computer under MPEP 2106.05(f).
While streaming the sensor data: applying the large behavior model to process the one or more information items associated with the sensor data and generate a multimodal output associated with the sensor data in real time while the sensor data are being streamed, the multimodal output describing the one or more signature events associated with the sensor data in one of a plurality of predefined output modalities amounts to insignificant extra-solution activity under MPEP 2106.05(g).
While streaming the sensor data: presenting the multimodal output according to the one of the plurality of predefined output modalities amounts to insignificant extra-solution activity under MPEP 2106.05(g).
The additional elements as disclosed above, alone or in combination, do not integrate the abstract ideas into a practical application because they are insignificant extra-solution activities, generic computer functions, and a field of use implemented to perform the abstract ideas disclosed above. The claim is directed to an abstract idea.
Step 2B: A computer system having one or more processors and memory amounts to generic computer components for applying the abstract ideas on a generic computer under MPEP 2106.05(f).
Streaming the sensor data from a plurality of sensor devices during a time duration is analogous to receiving data over a network, which is a well-understood, routine, conventional activity recognized by the courts under MPEP 2106.05(d)(II).
The plurality of sensor devices including at least two distinct sensor types and disposed in a physical environment amounts to a field of use and technological environment under MPEP 2106.05(h).
Generating a large behavior model including context information of multiple sensor domains by obtaining a predefined large language model (LLM) trained with predefined training data, obtaining training data associated with the plurality of sensor devices, wherein the training data includes information associated with a signature event that includes one or more training information items, and re-training the predefined LLM with the training data, the LLM having a self-attention based transformer structure, is analogous to receiving data over a network and amounts to well-understood, routine, conventional activity under MPEP 2106.05(d).
While streaming the sensor data: applying the large behavior model to process the one or more information items associated with the sensor data and generate a multimodal output associated with the sensor data in real time while the sensor data are being streamed, the multimodal output describing the one or more signature events associated with the sensor data in one of a plurality of predefined output modalities amounts to well-understood, routine, conventional activity under MPEP 2106.05(d)(I). De Barros et al. (US 20250036695 A1, cited in PTO-892 issued 05/13/2025) at paragraph [0003] provides Berkheimer evidence for large language models configured to generate outputs based upon text inputs set forth by a user and in near real-time.
While streaming the sensor data: presenting the multimodal output according to the one of the plurality of predefined output modalities is analogous to presenting offers and gathering statistics, which is a well-understood, routine, conventional activity recognized by the courts under MPEP 2106.05(d)(II).
The additional elements as disclosed above, in combination with the abstract ideas, are not sufficient to amount to significantly more than the abstract ideas because they are well-understood, routine, and conventional activities, generic computer functions, and a field of use implemented to perform the abstract ideas disclosed above. The claim is not patent eligible.
CLAIM 2 incorporates the rejection of claim 1.
Step 2A Prong 1: The abstract ideas of claim 1 are incorporated. A first information item is generated based on the subset of sensor data to characterize the first signature event is a judgement and evaluation mental process which can reasonably be performed in the human mind with the aid of pencil and paper.
Step 2A Prong 2 and Step 2B: A subset of sensor data corresponds to a first signature event, and includes a first temporal sequence of sensor samples obtained from a first sensor device and a second temporal sequence of sensor samples obtained from a second sensor device amounts to a description of sensor data, which is a field of use and technological environment under MPEP 2106.05(h).
A first sensor type of the first sensor device is different from a second sensor type of the second sensor device amounts to a field of use and technological environment under MPEP 2106.05(h). The claim is not patent eligible.
CLAIM 3 incorporates the rejection of claim 2.
Step 2A Prong 1: The abstract ideas of claim 2 are incorporated.
Step 2A Prong 2: The first temporal sequence of sensor samples and the second temporal sequence of sensor samples are concurrently measured amounts to mere data-gathering, an insignificant extra-solution activity under MPEP 2106.05(g) and a field of use and technological environment under MPEP 2106.05(h).
The first temporal sequence of sensor samples has a first sampling rate, and the second temporal sequence of sensor samples has a second sampling rate that is different from the first sampling rate amounts to a field of use and technological environment under MPEP 2106.05(h).
Step 2B: The first temporal sequence of sensor samples and the second temporal sequence of sensor samples are concurrently measured is analogous to receiving data over a network, which the courts have recognized as a well-understood, routine, conventional activity under MPEP 2106.05(d)(II).
The first temporal sequence of sensor samples has a first sampling rate, and the second temporal sequence of sensor samples has a second sampling rate that is different from the first sampling rate amounts to a field of use and technological environment under MPEP 2106.05(h). The claim is not patent eligible.
CLAIM 4 incorporates the rejection of claim 2.
Step 2A Prong 1: The abstract ideas of claim 2 are incorporated.
Step 2A Prong 2 and Step 2B: Applying at least a universal event projection model to process the first temporal sequence of sensor samples and the second temporal sequence of sensor samples jointly to generate the first information item amounts to mere instructions to apply the abstract ideas on a generic computer under MPEP 2106.05(f). The claim is not patent eligible.
CLAIM 5 incorporates the rejection of claim 2.
Step 2A Prong 1: The abstract ideas of claim 2 are incorporated.
Step 2A Prong 2 and Step 2B: Applying at least a first event projection model to process the first temporal sequence of sensor samples to generate the first information item amounts to mere instructions to apply the abstract ideas on a generic computer under MPEP 2106.05(f).
Applying at least a second event projection model to process the second temporal sequence of sensor samples to generate the first information item, the first event projection model distinct from the second event projection model amounts to mere instructions to apply the abstract ideas on a generic computer under MPEP 2106.05(f). The claim is not patent eligible.
CLAIM 6 incorporates the rejection of claim 5.
Step 2A Prong 1: The abstract ideas of claim 5 are incorporated. Selecting each of the first event projection model and the second event projection model based on a respective device type of the first sensor device and the second sensor device is a judgement and evaluation mental process which can reasonably be performed in the human mind with the aid of pencil and paper.
Step 2A Prong 2 and Step 2B: The claim does not recite any additional elements which, alone or in combination, would integrate the abstract ideas into a practical application. The claim does not recite any additional elements which, in combination with the abstract ideas, would be sufficient to amount to significantly more than the abstract ideas. The claim is not patent eligible.
CLAIM 7 incorporates the rejection of claim 1.
Step 2A Prong 1: The abstract ideas of claim 1 are incorporated. Generating an ordered sequence of respective sensor data features defining a respective parametric representation of the temporal sequence of respective sensor samples, independently of a sensor type of the respective sensor device, is a judgement and evaluation mental process which can reasonably be performed in the human mind with the aid of pencil and paper.
Step 2A Prong 2 and Step 2B: Each sensor device corresponds to a temporal sequence of respective sensor samples amounts to a field of use and technological environment under MPEP 2106.05(h).
Providing the ordered sequence of respective sensor data features to an event projection model amounts to mere instructions to apply the abstract ideas on a generic computer under MPEP 2106.05(f). The claim is not patent eligible.
CLAIM 8 incorporates the rejection of claim 1.
Step 2A Prong 1: The abstract ideas of claim 1 are incorporated. Associating each sensor data item of the temporal sequence of sensor data with a respective timestamp and a subset of respective sensor samples that are grouped based on the temporal window is a judgement and evaluation mental process which can reasonably be performed in the human mind with the aid of pencil and paper.
Step 2A Prong 2: The sensor data includes a temporal sequence of sensor data amounts to a description of sensor data, which is a field of use and technological environment under MPEP 2106.05(h).
Obtaining the sensor data further comprises: obtaining a stream of context data measured continuously by the plurality of sensor devices amounts to mere data-gathering, an insignificant extra-solution activity under MPEP 2106.05(g).
The stream of context data including the temporal sequence of respective sensor samples that are grouped for each sensor device based on a temporal window, the temporal window configured to move with a time axis amounts to a field of use and technological environment under MPEP 2106.05(h).
Step 2B: The sensor data includes a temporal sequence of sensor data amounts to a description of sensor data, which is a field of use and technological environment under MPEP 2106.05(h).
Obtaining the sensor data further comprises: obtaining a stream of context data measured continuously by the plurality of sensor devices is analogous to receiving data over a network, which the courts have recognized as a well-understood, routine, conventional activity under MPEP 2106.05(d)(II).
The stream of context data including the temporal sequence of respective sensor samples that are grouped for each sensor device based on a temporal window, the temporal window configured to move with a time axis amounts to a field of use and technological environment under MPEP 2106.05(h). The claim is not patent eligible.
CLAIM 9 incorporates the rejection of claim 1.
Step 2A Prong 1: The abstract ideas of claim 1 are incorporated.
Step 2A Prong 2: Storing the one or more information items associated with the one or more signature events amounts to insignificant extra-solution activity under MPEP 2106.05(g).
The one or more information items including a timestamp and a location of each of the one or more signature events amounts to a field of use and technological environment under MPEP 2106.05(h).
Step 2B: Storing the one or more information items associated with the one or more signature events is analogous to storing information in memory, which the courts have recognized as a well-understood, routine, conventional activity under MPEP 2106.05(d)(II).
The one or more information items including a timestamp and a location of each of the one or more signature events amounts to a field of use and technological environment under MPEP 2106.05(h). The claim is not patent eligible.
CLAIM 10 incorporates the rejection of claim 1.
Step 2A Prong 1: The abstract ideas of claim 1 are incorporated. Determining a behavior pattern based on the one or more signature events for the time duration is a judgement and evaluation mental process which can reasonably be performed in the human mind with the aid of pencil and paper.
Generating a subset of the one or more information items describing the behavior pattern is a judgement and evaluation mental process which can reasonably be performed in the human mind with the aid of pencil and paper.
Step 2A Prong 2: Examiner treats “providing the subset of the one or more information items of the behavior pattern associated with the sensor data” as outputting the subset, which amounts to mere instructions to apply the abstract ideas on a generic computer under MPEP 2106.05(f) and an insignificant extra-solution activity under MPEP 2106.05(g).
Step 2B: Providing the subset of the one or more information items of the behavior pattern associated with the sensor data amounts to mere instructions to apply the abstract ideas on a generic computer under MPEP 2106.05(f). It is analogous to presenting offers and gathering statistics, which the courts have recognized as a well-understood, routine, conventional activity under MPEP 2106.05(d)(II). The claim is not patent eligible.
CLAIM 11 incorporates the rejection of claim 1.
Step 2A Prong 1: The abstract ideas of claim 1 are incorporated. Based on a predefined loss function, training the large behavior model using the plurality of training inputs and associated ground truths is a mathematical calculation.
Step 2A Prong 2: Obtaining a plurality of training inputs, each training input including a training text prompt and an information item associated with a training signature event amounts to mere data-gathering, an insignificant extra-solution activity under MPEP 2106.05(g).
Obtaining a ground truth corresponding to each training input, the ground truth including a sample multimodal output preferred for the training input amounts to mere data-gathering, an insignificant extra-solution activity under MPEP 2106.05(g).
Step 2B: Obtaining a plurality of training inputs, each training input including a training text prompt and an information item associated with a training signature event is analogous to receiving data over a network, which the courts have recognized as a well-understood, routine, conventional activity under MPEP 2106.05(d)(II).
Obtaining a ground truth corresponding to each training input, the ground truth including a sample multimodal output preferred for the training input is analogous to receiving data over a network, which the courts have recognized as a well-understood, routine, conventional activity under MPEP 2106.05(d)(II). The claim is not patent eligible.
CLAIM 12 incorporates the rejection of claim 1.
Step 2A Prong 1: The abstract ideas of claim 1 are incorporated.
Step 2A Prong 2: Obtaining a plurality of training inputs amounts to mere data-gathering, an insignificant extra-solution activity under MPEP 2106.05(g).
Each training input including one or more test tags of a sequence of signature events, the one or more test tags having a predefined description format in which the one or more information items and an associated timestamp of each signature event is organized amounts to a field of use and technological environment under MPEP 2106.05(h).
Step 2B: Obtaining a plurality of training inputs is analogous to receiving data over a network, which the courts have recognized as a well-understood, routine, conventional activity under MPEP 2106.05(d)(II).
Each training input including one or more test tags of a sequence of signature events, the one or more test tags having a predefined description format in which the one or more information items and an associated timestamp of each signature event is organized amounts to a field of use and technological environment under MPEP 2106.05(h). The claim is not patent eligible.
Claim 13 recites a system which implements the same features as the method of claim 1 and is therefore rejected for at least the same reasons.
In Step 2A Prong 2 and in Step 2B, one or more processors and memory having instructions stored thereon, which when executed by the one or more processors cause the processors to perform operations amount to generic computer components for applying the abstract ideas on a generic computer under MPEP 2106.05(f). The claim is not patent eligible.
CLAIM 14 incorporates the rejection of claim 13.
Step 2A Prong 1: The abstract ideas of claim 13 are incorporated.
Step 2A Prong 2 and Step 2B: For a temporal window corresponding to a subset of sensor data, the memory further has instructions for: applying at least a universal event projection model to process the subset of sensor data within the respective temporal window and detect the one or more signature events amounts to mere instructions to apply the abstract ideas on a generic computer under MPEP 2106.05(f). The claim is not patent eligible.
CLAIM 15 incorporates the rejection of claim 13.
Step 2A Prong 1: The abstract ideas of claim 13 are incorporated.
Step 2A Prong 2 and Step 2B: The plurality of sensor devices include one or more of: a presence sensor, a proximity sensor, a microphone, a motion sensor, a gyroscope, an accelerometer, a Radar, a Lidar scanner, a camera, a temperature sensor, a heartbeat sensor, and a respiration sensor amounts to a field of use and technological environment under MPEP 2106.05(h). The claim is not patent eligible.
CLAIM 16 incorporates the rejection of claim 13.
Step 2A Prong 1: The abstract ideas of claim 13 are incorporated.
Step 2A Prong 2: Processing the sensor data to generate one or more sets of intermediate items successively and iteratively, until generating the one or more information items amounts to mere instructions to apply the abstract ideas on a generic computer under MPEP 2106.05(f).
Storing the one or more information items or the multimodal output in a database amounts to insignificant extra-solution activity under MPEP 2106.05(g).
Step 2B: Processing the sensor data to generate one or more sets of intermediate items successively and iteratively, until generating the one or more information items amounts to mere instructions to apply the abstract ideas on a generic computer under MPEP 2106.05(f).
Storing the one or more information items or the multimodal output in a database is analogous to storing information in memory and electronic recordkeeping, which the courts have recognized as well-understood, routine, conventional activities under MPEP 2106.05(d)(II). The claim is not patent eligible.
CLAIM 17 incorporates the rejection of claim 16.
Step 2A Prong 1: The abstract ideas of claim 16 are incorporated.
Step 2A Prong 2: The instructions for processing the sensor data further comprise instructions for: processing the sensor data to generate a first set of intermediate items at a first time amounts to mere instructions to apply the abstract ideas on a generic computer under MPEP 2106.05(f).
Storing the first set of intermediate items in the database amounts to insignificant extra-solution activity under MPEP 2106.05(g).
Processing the first set of intermediate items to generate one or more second sets of intermediate items successively at one or more successive second times following the first time amounts to mere instructions to apply the abstract ideas on a generic computer under MPEP 2106.05(f).
Successively storing the one or more second sets of intermediate items in the database, and deleting the first set of intermediate items from the database amounts to insignificant extra-solution activity under MPEP 2106.05(g).
Processing a most recent intermediate set of the one or more second sets of intermediate items to generate the one or more information items at a third time following the one or more successive second times amounts to mere instructions to apply the abstract ideas on a generic computer under MPEP 2106.05(f).
Step 2B: The instructions for processing the sensor data further comprise instructions for: processing the sensor data to generate a first set of intermediate items at a first time amounts to mere instructions to apply the abstract ideas on a generic computer under MPEP 2106.05(f).
Storing the first set of intermediate items in the database is analogous to storing information in memory and electronic recordkeeping, which the courts have recognized as well-understood, routine, conventional activities under MPEP 2106.05(d)(II).
Processing the first set of intermediate items to generate one or more second sets of intermediate items successively at one or more successive second times following the first time amounts to mere instructions to apply the abstract ideas on a generic computer under MPEP 2106.05(f).
Successively storing the one or more second sets of intermediate items in the database, and deleting the first set of intermediate items from the database is analogous to storing information in memory and electronic recordkeeping, which the courts have recognized as well-understood, routine, conventional activities under MPEP 2106.05(d)(II).
Processing a most recent intermediate set of the one or more second sets of intermediate items to generate the one or more information items at a third time following the one or more successive second times amounts to mere instructions to apply the abstract ideas on a generic computer under MPEP 2106.05(f). The claim is not patent eligible.
Claim 18 recites a product which implements the same features as the method of claim 1 and is therefore rejected for at least the same reasons. Examiner treats the amendment in claim 18, lines 19-21 as if it had recited the same features as claim 1, lines 20-23.
In Step 2A Prong 2 and Step 2B, a non-transitory computer-readable storage medium, having instructions stored thereon, which when executed by one or more processors cause the one or more processors to perform operations amount to generic computer components for applying the abstract ideas on a generic computer under MPEP 2106.05(f). The claim is not patent eligible.
CLAIM 19 incorporates the rejection of claim 18.
Step 2A Prong 1: The abstract ideas of claim 18 are incorporated.
Step 2A Prong 2 and Step 2B: The multimodal output includes one or more of: description, timestamp, numeral information, statistic summary, warning message, and recommended action associated with the one or more signature events amounts to a description of output data, which is a field of use and technological environment under MPEP 2106.05(h). The claim is not patent eligible.
CLAIM 20 incorporates the rejection of claim 18.
Step 2A Prong 1: The abstract ideas of claim 18 are incorporated. Identifying the plurality of predefined output modalities including two or more distinct modalities is a judgement and evaluation mental process which can reasonably be performed in the human mind with the aid of pencil and paper.
Selecting the one of the plurality of predefined output modalities is a judgement and evaluation mental process which can reasonably be performed in the human mind with the aid of pencil and paper.
Step 2A Prong 2 and Step 2B: Textual statements, software code, an image or video, an information dashboard having a predefined format, a user interface, and a heatmap amount to a field of use and technological environment under MPEP 2106.05(h). The claim is not patent eligible.
CLAIM 21 incorporates the rejection of claim 1.
Step 2A Prong 1: The abstract ideas of claim 1 are incorporated. In response to the user query, extracting the one or more information items characterizing the one or more signature events for one or more temporal windows that are included in a time duration defined by the user query is a judgement and evaluation mental process which can reasonably be performed in the human mind with the aid of pencil and paper.
Step 2A Prong 2: Displaying, on a user interface of an application executed on a client device, a conversation panel amounts to insignificant extra-solution activity under MPEP 2106.05(g). A client device amounts to a generic computer component for applying the abstract ideas on a generic computer under MPEP 2106.05(f).
Receiving, via the conversation panel, a user query including a plurality of natural language words, the user query received in real time while or after the sensor data are being streamed amounts to mere data-gathering, an insignificant extra-solution activity under MPEP 2106.05(g).
Providing the user query, the one or more information items in the time duration, and respective timestamps to the large behavior model amounts to mere instructions to apply the abstract ideas on a generic computer under MPEP 2106.05(f).
Displaying, in the conversation panel, the multimodal output provided by the large behavior model in response to the user query, the multimodal output characterizing the sensor data amounts to insignificant extra-solution activity under MPEP 2106.05(g).
Step 2B: Displaying, on a user interface of an application executed on a client device, a conversation panel is analogous to presenting offers and gathering statistics, which is a well-understood, routine, conventional activity recognized by the courts under MPEP 2106.05(d)(II). A client device amounts to a generic computer component for applying the abstract ideas on a generic computer under MPEP 2106.05(f).
Receiving, via the conversation panel, a user query including a plurality of natural language words, the user query received in real time while or after the sensor data are being streamed is analogous to receiving data over a network, which is a well-understood, routine, conventional activity recognized by the courts under MPEP 2106.05(d)(II).
Providing the user query, the one or more information items in the time duration, and respective timestamps to the large behavior model amounts to mere instructions to apply the abstract ideas on a generic computer under MPEP 2106.05(f).
Displaying, in the conversation panel, the multimodal output provided by the large behavior model in response to the user query, the multimodal output characterizing the sensor data is analogous to presenting offers and gathering statistics, which is a well-understood, routine, conventional activity recognized by the courts under MPEP 2106.05(d)(II). The claim is not patent eligible.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-4, 7, 9-10, 13-15, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Pandya et al. (US 11803955 B1, cited in PTO-892 issued 05/13/2025) in view of Vörös et al. (US 20240372830 A1, cited in PTO-892 issued 09/15/2025), and Newcomb (US 20220031105 A1, cited in PTO-892 issued 05/13/2025).
Regarding claim 1, Pandya teaches: A method for processing sensor data, comprising: at a computer system having one or more processors and memory: (C. 26, L. 29-35; C. 27, L. 65 to C. 28, L. 2 and L. 13-15.)
streaming the sensor data from a plurality of sensor devices during a time duration, the plurality of sensor devices including at least two distinct sensor types and disposed in a physical environment, wherein at least one respective sensor device of the plurality of sensor devices is configured to detect trigger conditions associated with a predefined mission of the computer system; (C. 8, L. 23 to “site” in L. 34; C. 8, L. 39-41 disclose a hazardous worksite, which is a physical environment. C. 17, L. 12-18, 48-55, and 59 to C. 18, L. 2 disclose inputting sensor data to safety engine 210 and outputting a warning signal and interventions at output 215. A sensor data stream contains sensor data during a time duration. A predefined mission is keeping workers safe at a construction site, and trigger conditions include any sensor data (e.g., 202 or 204) which triggers model 213 to generate a warning signal or intervention.)
generating a large behavior model including context information of multiple sensor domains by obtaining a predefined large language model (LLM) trained with predefined training data, obtaining training data associated with the plurality of sensor devices, wherein the training data includes information associated with a signature event that includes one or more training information items;
while streaming the sensor data: (C. 7, L. 37-53 discloses issuing a “real-time alert or warning to workers” by utilizing “real-time multimodal sensor data”. This indicates that processing sensor data streams and delivering an alert to workers happens continuously while streaming the sensor data.)
…
generating one or more information items characterizing the one or more signature events detected in the sensor data, independently of the sensor types of sensor devices, wherein the one or more information items are generated based on a subset of the sensor data corresponding to the one or more signature events, the subset of the sensor data including sensor samples obtained from two or more of the at least two distinct sensor types; (Fig. 2, C. 17, from L. 59 to “level” in L. 66, and C. 18, L. 12-17. A “signature event” is an incident, a behavior/action not in compliance with safety protocol, or fatigue level. “Information items” are input features in the input feature dataset. The module 211 generates features from input data regardless (“independently”) of the sensor types, and a subset of sensor data includes samples captured by the computer vision and the LIDAR systems at the same time.)
applying the large behavior model to process the one or more information items associated with the sensor data and generate a multimodal output associated with the sensor data in real time while the sensor data are being streamed, the multimodal output describing the one or more signature events associated with the sensor data in one of a plurality of predefined output modalities; and (C. 10, L. 3-15; Fig. 2, C. 17, L. 12-18, 48-55, and 59 to C. 18, L. 2. Predictive model 213 is a “large behavior model” because it processes behaviors of construction workers. An alert is a first predefined output modality, and an intervention is a second predefined output modality. Alerts and interventions at the output 215 describe an incident, a behavior/action not in compliance with safety protocol, or fatigue level.)
presenting the multimodal output according to the one of the plurality of predefined output modalities. (C. 10, L. 3-15 teaches delivering an alert as a vibration and delivering an intervention as a rhythmic cue. Delivering the alert or intervention to the worker corresponds to “presenting the multimodal output.”)
However, Pandya does not explicitly teach: generating a large behavior model by obtaining a predefined large language model (LLM) and re-training the predefined LLM with the training data, the LLM having a self-attention based transformer structure;
while streaming the sensor data: detecting one or more signature events in the sensor data;
But Vörös teaches: generating a large behavior model by obtaining a predefined large language model (LLM) and re-training the predefined LLM with the training data;
the LLM having a self-attention based transformer structure; ([0097], lines 1-11 discloses GPT uses a transformer decoder made up of a stack of layers that perform multi-head attention. In Applicant’s specification, paragraph [0065], lines 7-9 discloses GPT-3 has self-attention layers.)
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have incorporated Vörös’ GPT-3 model into Pandya’s safety inference engine 210 and fine-tuned the GPT-3 model on Pandya’s camera images and analogously on Pandya’s LIDAR point cloud data. A motivation for the combination is that GPT is suitable for language modeling tasks, such as text generation, summarization, and question answering. (Vörös, [0097])
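For illustration of the technology at issue only, and not as part of the grounds of rejection, the following is a minimal sketch of the scaled dot-product self-attention operation underlying the transformer structure discussed above; the single-head simplification, array names, and dimensions are illustrative assumptions and are not drawn from Vörös or from Applicant’s specification.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a token sequence.

    x: (seq_len, d_model) token embeddings; w_q, w_k, w_v: (d_model, d_head)
    projection matrices. All names and sizes are illustrative.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v              # queries, keys, values
    scores = q @ k.T / np.sqrt(k.shape[-1])          # pairwise attention logits
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over key positions
    return weights @ v                               # attention-weighted values

rng = np.random.default_rng(0)
d_model, d_head, seq_len = 16, 8, 4
x = rng.normal(size=(seq_len, d_model))
out = self_attention(x,
                     rng.normal(size=(d_model, d_head)),
                     rng.normal(size=(d_model, d_head)),
                     rng.normal(size=(d_model, d_head)))
print(out.shape)  # (4, 8)
```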
However, Pandya and Vörös do not explicitly teach: while streaming the sensor data: detecting one or more signature events in the sensor data;
But Newcomb teaches: while streaming the sensor data: detecting one or more signature events in the sensor data; ([0108] starting on page 9 in left column, lines 33-40 teaches detecting a motion event, which is a “signature event in the sensor data” as claimed.)
generating one or more information items characterizing the one or more signature events detected in the sensor data, ([0108] starting on page 9 in left column, line 33 to the end of the paragraph discloses classifying a moving object characterizing the detected motion event. Features of a moving object are “information items” as claimed.)
Pandya’s engine 210 continues to process vision data whether or not any movement has been detected. It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have incorporated Newcomb’s computer vision software into Pandya’s computer vision system. A motivation for the combination is to execute Pandya’s safety inference engine 210 only when a significant change in the environment happens (Newcomb, [0108] on page 9, left column, lines 33-40). Furthermore, it would have been obvious to have incorporated similar software into Pandya’s LIDAR system 203 that analyzes sequential frames of LIDAR data for differences and registers a motion event when a large enough change is detected.
Regarding claim 2, the combination of Pandya, Vörös, and Newcomb teaches: The method of claim 1, wherein:
Pandya teaches: a subset of sensor data corresponds to a first signature event, and includes a first temporal sequence of sensor samples obtained from a first sensor device and a second temporal sequence of sensor samples obtained from a second sensor device; (C. 10, L. 3 to “fall” in L. 6 and C. 17, L. 12-18 disclose a first sensor is a computer vision (CV) system 201, a second sensor is a LIDAR system 203, a first signature event is the incident, and a subset of sensor data includes sequences of CV and LIDAR data samples corresponding to the incident. C. 18, L. 21 to “time” in L. 29 discloses that data captured by camera and LIDAR may be aligned with respect to time, which indicates there is a temporal sequence of camera/CV data samples and a temporal sequence of LIDAR data samples.)
a first sensor type of the first sensor device is different from a second sensor type of the second sensor device; and (C. 17, L. 12-18)
a first information item is generated based on the subset of sensor data to characterize the first signature event. (Each of Fig. 2, C. 17, L. 59-62, and C. 18, L. 12-17 discloses a module 211 for generating an input feature dataset. A first information item includes an input feature within the dataset that characterizes the incident.)
Regarding claim 3, the combination of Pandya, Vörös, and Newcomb teaches: The method of claim 2,
Pandya teaches: wherein the first temporal sequence of sensor samples and the second temporal sequence of sensor samples are concurrently measured, and (C. 2, L. 22-37 discloses fusing CV and LIDAR sensor data, which indicates they had been concurrently measured.)
wherein the first temporal sequence of sensor samples has a first sampling rate, and the second temporal sequence of sensor samples has a second sampling rate that is different from the first sampling rate. (C. 18, L. 21 to “frequency” in L. 25)
Regarding claim 4, the combination of Pandya, Vörös, and Newcomb teaches: The method of claim 2, further comprising:
Pandya teaches: applying at least a universal event projection model to process the first temporal sequence of sensor samples and the second temporal sequence of sensor samples jointly to generate the first information item. (Each of Fig. 2, C. 17, L. 59-62, and C. 18, L. 12-17 discloses a module 211 for generating an input feature dataset to be processed by the trained predictive model 213. A universal event projection model is module 211, and the first information item includes an input feature within the dataset that characterizes the tripping incident.)
Regarding claim 7, the combination of Pandya, Vörös, and Newcomb teaches: The method of claim 1,
Pandya teaches: wherein each sensor device corresponds to a temporal sequence of respective sensor samples, (C. 18, L. 21 to “time” in L. 29 discloses that data captured by camera and LIDAR may be aligned with respect to time, which indicates the system captures a temporal sequence of camera/CV data samples and a temporal sequence of LIDAR data samples.)
the method further comprising, for each sensor device: generating an ordered sequence of respective sensor data features defining a respective parametric representation of the temporal sequence of respective sensor samples, independently of a sensor type of the respective sensor device; and (C. 18, L. 21 to “time” in L. 29 teaches pre-processing raw sensor data by aligning the data with respect to time. Each “temporal sequence of respective sensor samples” corresponds to raw sensor data from a particular sensor (e.g., camera or LIDAR). Each “ordered sequence of respective sensor data features defining a respective parametric representation” corresponds to time-aligned sensor data from a particular sensor. The time-aligned sensor data contains sensor features, and it would have been aligned based on parameters of time. The module 211 performs data alignment regardless (“independently”) of the sensor types.)
providing the ordered sequence of respective sensor data features to an event projection model. (C. 18, L. 14-17 and L. 21 to “time” in L. 29. An event projection model is the portion of module 211 which extracts/generates features from time-aligned sensor data.)
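For illustration of the technology at issue only, and not as part of the grounds of rejection, aligning two sensor streams of different sampling rates onto a common time base, as in the time alignment discussed above, can be sketched as follows; the example rates, the nearest-timestamp pairing rule, and the names are illustrative assumptions.

```python
import numpy as np

def time_align(ts_a, vals_a, ts_b, vals_b):
    """Pair each sample of stream A with the nearest-in-time sample of stream B.

    ts_a, ts_b: sorted 1-D timestamp arrays; vals_a, vals_b: matching values.
    The nearest-timestamp rule and the example rates are illustrative.
    """
    idx = np.searchsorted(ts_b, ts_a)                      # insertion points in B
    idx = np.clip(idx, 1, len(ts_b) - 1)
    left_closer = (ts_a - ts_b[idx - 1]) < (ts_b[idx] - ts_a)
    nearest = np.where(left_closer, idx - 1, idx)
    return np.column_stack([vals_a, vals_b[nearest]])      # time-aligned rows

ts_cam = np.arange(0.0, 1.0, 1 / 30)      # e.g., a 30 Hz camera-derived signal
ts_lidar = np.arange(0.0, 1.0, 1 / 10)    # e.g., a 10 Hz LIDAR-derived signal
aligned = time_align(ts_cam, np.sin(ts_cam), ts_lidar, np.cos(ts_lidar))
print(aligned.shape)  # (30, 2): one aligned row per camera sample
```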
Regarding claim 9, the combination of Pandya, Vörös, and Newcomb teaches: The method of claim 1, further comprising:
Pandya teaches: storing the one or more information items associated with the one or more signature events, the one or more information items including a timestamp and a location of each of the one or more signature events. (C. 8, L. 31 to “site” in L. 34 and C. 18, L. 12 to “time” in L. 29. A location of every incident is the construction site, information items are input features in the input feature dataset, and aligning sensor data with respect to time indicates the data includes timestamps.)
Regarding claim 10, the combination of Pandya, Vörös, and Newcomb teaches: The method of claim 1, further comprising:
Pandya teaches: generating a subset of the one or more information items describing the behavior pattern; and (Based on C. 17, L. 62 to “level” in L. 66 and C. 18, L. 12-17, the claimed signature event (e.g., an incident) is a behavior pattern that occurs during a time duration, and any information item itself is a subset as claimed.)
providing the subset of the one or more information items of the behavior pattern associated with the sensor data. (C. 18, L. 12-14)
However, Pandya and Vörös do not explicitly teach: determining a behavior pattern based on the one or more signature events for the time duration;
But Newcomb teaches: determining a behavior pattern based on the one or more signature events for the time duration; ([0108] on page 9 in left column, line 33 to the end of the paragraph teaches classifying the moving object. Determining a behavior pattern is a classification.)
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have incorporated Newcomb’s computer vision software into Pandya’s computer vision system and analogously Pandya’s LIDAR system. A motivation for the combination is to execute Pandya’s safety inference engine 210 when a classified motion event happens. (Newcomb, [0108] on page 9, left column, lines 33-end)
Claim 13 recites a system which implements the same features as the method of claim 1 and is therefore rejected for at least the same reasons.
Pandya teaches: one or more processors; and memory having instructions stored thereon, which when executed by the one or more processors cause the processors to perform: (C. 28, L. 10-13)
Regarding claim 14, the combination of Pandya, Vörös, and Newcomb teaches: The computer system of claim 13,
Pandya teaches: the memory further has instructions. (C. 28, L. 10-13)
However, Pandya and Vörös do not explicitly teach: wherein for a temporal window corresponding to a subset of sensor data, the memory further has instructions for: applying at least a universal event projection model to process the subset of sensor data within the respective temporal window and detect the one or more signature events.
But Newcomb teaches: wherein for a temporal window corresponding to a subset of sensor data, the memory further has instructions for: applying at least a universal event projection model to process the subset of sensor data within the respective temporal window and detect the one or more signature events. ([0108] on page 9 in left column teaches analyzing sequential frames, which correspond to a temporal window of sensor data, to detect a motion event.)
A motivation for the combination is the same as the motivation provided in claim 1.
Regarding claim 15, the combination of Pandya, Vörös, and Newcomb teaches: The computer system of claim 13,
Pandya teaches: wherein the plurality of sensor devices include one or more of: a presence sensor, a proximity sensor, a microphone, a motion sensor, a gyroscope, an accelerometer, a Radar, a Lidar scanner, a camera, a temperature sensor, a heartbeat sensor, and a respiration sensor. (C. 10, L. 29-32 teaches at least a Lidar scanner and a camera.)
Claim 18 recites a product which implements the same features as the method of claim 1 and is therefore rejected for at least the same reasons. Examiner treats the amendment in claim 18, lines 19-21 as if it had recited the same features as claim 1, lines 20-23.
Pandya teaches: A non-transitory computer-readable storage medium, having instructions stored thereon, which when executed by one or more processors cause the one or more processors to perform: (C. 28, L. 10-13)
Regarding claim 19, the combination of Pandya, Vörös, and Newcomb teaches: The non-transitory computer-readable storage medium of claim 18,
Pandya teaches: wherein the multimodal output includes one or more of: description, timestamp, numeral information, statistic summary, warning message, and recommended action associated with the one or more signature events. (C. 10, L. 3-10 discloses an alert (“warning message”) and an intervention (“recommended action”).)
Regarding claim 20, the combination of Pandya, Vörös, and Newcomb teaches: The non-transitory computer-readable storage medium of claim 18, further comprising instructions for:
Pandya teaches: identifying the plurality of predefined output modalities including two or more distinct modalities of: textual statements, software code, an image or video, an information dashboard having a predefined format, a user interface, and a heatmap; and (C. 10, L. 3 to “incident” in line 6; C. 10, L. 13-15 states, “intervention such as rhythmic cue, audio, visual, or tactile stimulus may be delivered to the worker via the wearable device”. C. 17, L. 33-34 states, “In some cases, the output 215 may include feedback information such as an alert” and L. 49-51 states, “For example, the output 215 may further include interventions delivered to the associated individual”. An audio device for delivering an audio alarm alert is a first distinct modality of a user interface. A visual device for delivering a visual intervention stimulus is a second distinct modality of a user interface.)
selecting the one of the plurality of predefined output modalities. (Based on C. 10, L. 3-7 and 13-16, the audio device would be selected when the predictive model outputs an audio alarm alert and the visual device would be selected when the predictive model outputs a visual intervention stimulus.)
Claims 5-6 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Pandya et al. (US 11803955 B1, cited in PTO-892 issued 05/13/2025) in view of Vörös et al. (US 20240372830 A1, cited in PTO-892 issued 09/15/2025), Newcomb (US 20220031105 A1, cited in PTO-892 issued 05/13/2025), and Lee et al. (US 20220067479 A1, cited in PTO-892 issued 01/23/2025).
Regarding claim 5, the combination of Pandya, Vörös, and Newcomb teaches: The method of claim 2, further comprising:
Pandya teaches: applying at least a first event projection model to process the first temporal sequence of sensor samples to generate the first information item; and (C. 18, L. 12-17. A first event projection model is module 211.)
Pandya discloses applying the same feature generation module 211 to extract features from all the sensor data, but Pandya, Vörös, and Newcomb do not explicitly teach: applying at least a second event projection model to process the second temporal sequence of sensor samples to generate the first information item, the first event projection model distinct from the second event projection model.
But Lee teaches: applying at least a second event projection model to process the second temporal sequence of sensor samples to generate the first information item, the first event projection model distinct from the second event projection model. ([0058], lines 1-2 and [0064], from line 12 to “2)” in line 16. Each continuous signal stream would contain a temporal sequence of sensor samples. First and second event projection models correspond to feature extraction modules 1 and 2, respectively. The limitation of “first information item” corresponds to all extracted features collectively.)
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have incorporated Lee’s feature extraction modules into Pandya’s input feature generation module 211. A motivation for the combination is that different types of sensor data may require different algorithms to perform feature extraction. For example, Lee at [0066] discloses that a mel-frequency spectral feature may be extracted from an audio signal and other features may be extracted from a three-channel motion signal.
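For illustration of the technology at issue only, and not as part of the grounds of rejection, selecting distinct feature-extraction modules per sensor type can be sketched as follows; the specific feature computations and the dictionary-based dispatch are illustrative assumptions, not Lee’s disclosed modules.

```python
import numpy as np

def audio_spectral_features(window, n_bins=8):
    """Coarse log-spectral summary of a 1-D audio window (a stand-in for
    mel-frequency spectral features; the binning is an assumption)."""
    spectrum = np.abs(np.fft.rfft(window))
    bins = np.array_split(spectrum, n_bins)
    return np.log1p(np.array([b.mean() for b in bins]))

def motion_features(window):
    """Per-channel mean and standard deviation of a (samples, 3) motion window."""
    return np.concatenate([window.mean(axis=0), window.std(axis=0)])

# One extraction module per sensor device type, selected by type key.
EXTRACTORS = {"audio": audio_spectral_features, "motion": motion_features}

def extract_features(sensor_type, window):
    return EXTRACTORS[sensor_type](window)   # module chosen by device type

rng = np.random.default_rng(0)
print(extract_features("audio", rng.normal(size=256)).shape)        # (8,)
print(extract_features("motion", rng.normal(size=(100, 3))).shape)  # (6,)
```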
Regarding claim 6, the combination of Pandya, Vörös, Newcomb, and Lee teaches: The method of claim 5,
However, Pandya, Vörös, and Newcomb do not explicitly teach: further comprising: selecting each of the first event projection model and the second event projection model based on a respective device type of the first sensor device and the second sensor device.
But Lee teaches: selecting each of the first event projection model and the second event projection model based on a respective device type of the first sensor device and the second sensor device. ([0060], lines 1-4; [0064], from line 12 to “2)” in line 16; and all of [0066] teach that feature extraction module 1 extracts mel-frequency spectral features when the sensor device type is an audio sensor, and feature extraction module 2 extracts features from motion signals when the sensor device type is a motion signal sensor. Thus, the respective feature extraction modules have been selected based on respective device types.)
A motivation for the combination is the same as the motivation for claim 5.
Regarding claim 12, the combination of Pandya, Vörös, and Newcomb teaches: The method of claim 1, further comprising:
Pandya teaches: obtaining a plurality of training inputs, each training input including one or more test tags, the one or more test tags having a predefined description format in which the one or more information items and an associated timestamp of each signature event is organized. (Examiner treats one “training input” as being a “test tag” and also as comprising an incident plus an associated timestamp of the incident. C. 19, L. 40-42 discloses “the different types [of] data may be combined with respect to time (e.g., time stamps)” and C. 20, L. 26-41 discloses continually training the predictive model. A signature event is an incident (see C. 10, L. 3-6), an information item of each signature event is an input feature dataset associated with the incident (see C. 18, L. 14-17), and an associated timestamp is a timestamp associated with the incident (see C. 19, L. 40-42). Timestamps describe the incident and thus constitute a predefined description format.)
However, Pandya, Vörös, and Newcomb do not explicitly teach: a sequence of signature events
But Lee teaches: a sequence of signature events ([0067], lines 13-end. Signature events are sub-event states, and a sequence of signature events is a sequence of the sub-event states.)
Lee’s different sub-event states are analogous to Pandya’s different types of incidents such as a worker tripping and falling. It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have incorporated Lee’s sequence of sub-event states into the combination of Pandya, Vörös, and Newcomb as a sequence of incidents such as a worker tripping and then falling. A motivation for the combination is that training a predictive model to predict sequences of events/incidents would allow for more specific types of classifications when compared to predicting single events/incidents. (Lee, [0067])
Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Pandya et al. (US 11803955 B1, cited in PTO-892 issued 05/13/2025) in view of Vörös et al. (US 20240372830 A1, cited in PTO-892 issued 09/15/2025), Newcomb (US 20220031105 A1, cited in PTO-892 issued 05/13/2025), and Manotas Gutierrez et al. (US 20230259117 A1, cited in PTO-892 issued 09/15/2025).
Regarding claim 8, the combination of Pandya, Vörös, and Newcomb teaches: The method of claim 1,
Pandya teaches: wherein the sensor data includes a temporal sequence of sensor data, and (C. 18, L. 21 to “time” in L. 29 discloses that data captured by camera and LIDAR may be aligned with respect to time, which indicates the system captures a temporal sequence of camera/CV data samples and a temporal sequence of LIDAR data samples.)
obtaining the sensor data further comprises: obtaining a stream of context data measured continuously by the plurality of sensor devices, the stream of context data including the temporal sequence of respective sensor samples that are grouped for each sensor device; and
associating each sensor data item of the temporal sequence of sensor data with a respective timestamp and a subset of respective sensor samples. (A subset of respective sensor samples includes CV data samples and LIDAR data samples corresponding to the incident, and the temporal window is the window of the incident.)
However, Pandya, Vörös, and Newcomb do not explicitly teach: sensor samples that are grouped for each sensor device based on a temporal window, the temporal window configured to move with a time axis;
But Manotas Gutierrez teaches: sensor samples that are grouped for each sensor device based on a temporal window, the temporal window configured to move with a time axis; ([0049])
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have incorporated Manotas Gutierrez’s sliding window of time series sensor data into the combination of Pandya, Vörös, and Newcomb. A motivation for the combination is to limit an amount of training data to a predetermined length of time. (Manotas Gutierrez, [0049])
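For illustration of the technology at issue only, and not as part of the grounds of rejection, grouping time-series sensor samples with a temporal window that moves along the time axis can be sketched as follows; the window length, step size, and names are illustrative assumptions.

```python
import numpy as np

def sliding_windows(samples, window_len, step):
    """Group time-series samples into windows moving along the time axis.

    samples: (n, d) array of sensor samples; window_len and step are in
    samples. All values here are illustrative.
    """
    return [samples[start:start + window_len]
            for start in range(0, len(samples) - window_len + 1, step)]

data = np.arange(20).reshape(10, 2)            # 10 timesteps, 2 features
windows = sliding_windows(data, window_len=4, step=2)
print(len(windows), windows[0].shape)          # 4 windows of shape (4, 2)
```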
Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Pandya et al. (US 11803955 B1, cited in PTO-892 issued 05/13/2025) in view of Vörös et al. (US 20240372830 A1, cited in PTO-892 issued 09/15/2025), Newcomb (US 20220031105 A1, cited in PTO-892 issued 05/13/2025), and Chambers et al. (US 20240428316 A1, cited in PTO-892 issued 09/15/2025).
Regarding claim 11, the combination of Pandya, Vörös, and Newcomb teaches: The method of claim 1, further comprising:
Pandya teaches: obtaining a plurality of training inputs, each training input including
obtaining a ground truth corresponding to each training input, the ground truth including a sample multimodal output preferred for the training input; and (C. 20, L. 36-41. When the model outputs an alert or an intervention, the validator would assign a correct alert or intervention based on each safety-related event.)
Vörös at [0089], lines 6-12 appears to teach that an input content item may include text or images. However, Pandya, Vörös, and Newcomb do not explicitly teach: each training input including a training text prompt and an information item
based on a predefined loss function, training the large behavior model
But Chambers teaches: each training input including a training text prompt and an information item ([0032] and [0042], lines 22-end disclose training inputs including a text input paired with a selected image (“information item”).)
based on a predefined loss function, training the large behavior model using the plurality of training inputs and associated ground truths. ([0032], [0034])
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have incorporated Chambers’ training inputs and actual outputs into the combination of Pandya, Vörös, and Newcomb. A motivation for the combination is to train a model that recommends an action for a user based on images and text prompts provided by the user. (Chambers, [0042])
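To illustrate the claimed training arrangement in the abstract (a toy sketch under stated assumptions, not Chambers’ model): each training input pairs a text prompt with an information item, each ground truth is the preferred output for that input, and a predefined loss function (squared error is assumed here purely for illustration) drives the parameter update.

    def train(training_pairs, lr=0.1, epochs=200):
        """Each pair is ((text_prompt, feature), ground_truth); the single
        scalar weight below is a stand-in for the large behavior model."""
        w = 0.0
        for _ in range(epochs):
            for (prompt, feature), target in training_pairs:
                pred = w * feature                    # stand-in model output
                grad = 2 * (pred - target) * feature  # gradient of squared-error loss
                w -= lr * grad                        # minimize the predefined loss
        return w

    pairs = [(("describe the image", 1.0), 2.0), (("describe the image", 2.0), 4.0)]
    print(round(train(pairs), 2))  # converges to 2.0 on this toy data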
Claims 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over Pandya et al. (US 11803955 B1, cited in PTO-892 issued 05/13/2025) in view of Vörös et al. (US 20240372830 A1, cited in PTO-892 issued 09/15/2025), Newcomb (US 20220031105 A1, cited in PTO-892 issued 05/13/2025), and Aimone et al. (US 20220027712 A1, cited in PTO-892 issued 01/23/2025).
Regarding claim 16, the combination of Pandya, Vörös, and Newcomb teaches: The computer system of claim 13, further comprising instructions for:
Pandya teaches: processing the sensor data
storing the one or more information items or the multimodal output in a database. (C. 13, L. 51-67 teaches that a local database 141 may store data about a predictive model and data generated by a predictive model, including an output of the model. The input feature dataset is the model input and thus comprises data about the model.)
However, Pandya, Vörös, and Newcomb do not explicitly teach: processing the sensor data to generate one or more sets of intermediate items successively and iteratively,
But Aimone teaches: processing the sensor data to generate one or more sets of intermediate items successively and iteratively
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have incorporated Aimone’s spiking neural network and temporal buffer circuit into Pandya’s input feature generation module. A motivation for the combination is to perform efficient analog vector matrix operations that underpin many of the relevant computations in neural computation. (Aimone, [0019], lines 1-5)
Regarding claim 17, the combination of Pandya, Vörös, Newcomb, and Aimone teaches: The computer system of claim 16, further comprising instructions for:
Pandya at C. 18, L. 12-17 teaches processing sensor data to generate an input feature dataset. However, Pandya, Vörös, and Newcomb do not explicitly teach: processing the sensor data to generate a first set of intermediate items at a first time; storing the first set of intermediate items in the database; processing the first set of intermediate items to generate one or more second sets of intermediate items successively at one or more successive second times following the first time; successively storing the one or more second sets of intermediate items in the database, and deleting the first set of intermediate items from the database; and processing a most recent intermediate set of the one or more second sets of intermediate items to generate the one or more information items at a third time following the one or more successive second times.
But Aimone teaches: processing the sensor data to generate a first set of intermediate items at a first time;
storing the first set of intermediate items in the database; ([0025], lines 1-3 discloses a temporal buffer circuit that holds spiking activation signals for a delay time. The database corresponds to the temporal buffer circuit, and the spiking activation signal generated at the first time step corresponds to the first set of intermediate items.)
processing the first set of intermediate items to generate one or more second sets of intermediate items successively at one or more successive second times following the first time; ([0025] and [0031] disclose that each mosaic can be sequentially computed using the temporal buffer. [0054], lines 1-6 and [0055], lines 1-3 disclose inputting spiking activation signals (“the first set of intermediate items”) back into the crossbar stack as second input data for a second time step. At the second time step, the second input data is processed to generate an output spiking activation signal (“one or more second sets of intermediate items”) according to [0051]-[0052].)
successively storing the one or more second sets of intermediate items in the database, and deleting the first set of intermediate items from the database; and ([0052], lines 3-5 together with [0055], lines 1-3 disclose storing the output spiking activation signal generated at the second time step. The output spiking activation signal at the first time step would be an intermediate result for an intermediate layer according to [0045]-[0046]. Therefore, the temporal buffer circuit would be overwritten with a new output spiking activation signal at each time step.)
processing a most recent intermediate set of the one or more second sets of intermediate items to generate the one or more information items at a third time following the one or more successive second times. ([0025] and [0031] disclose that each mosaic can be sequentially computed using the temporal buffer. [0054], lines 1-6 and [0055], lines 1-3 disclose inputting spiking activation signals (“a most recent intermediate set of the one or more sets of intermediate items”) back into the crossbar stack as third input data for a third time step. At the third time step, the third input data is processed to generate an output spiking activation signal (“the one or more information items”) according to [0051]-[0052].)
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have incorporated Aimone’s spiking neural network and temporal buffer circuit into Pandya’s input feature generation module. A motivation for the combination is to perform efficient analog vector matrix operations that underpin many of the relevant computations in neural computation. (Aimone, [0019], lines 1-5)
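The store-process-overwrite cycle mapped above can be sketched as follows (an assumption-laden toy, with simple arithmetic standing in for Aimone’s spiking computations): a single-slot “database” holds only the most recent intermediate set, so storing each new set and deleting the prior one occur in a single overwrite.

    def iterative_processing(sensor_data, steps=3):
        """Single-slot store modeled on a temporal buffer: each step consumes
        the stored intermediate set and overwrites it with the next one."""
        database = [x * 2 for x in sensor_data]   # first time: store first set
        for _ in range(steps - 1):
            # successive second times: new set stored, prior set deleted, in one step
            database = [x + 1 for x in database]
        # third time: the most recent intermediate set yields the information items
        return [x / 10 for x in database]

    print(iterative_processing([1, 2, 3]))  # [0.4, 0.6, 0.8]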
Claim 21 is rejected under 35 U.S.C. 103 as being unpatentable over Pandya et al. (US 11803955 B1, cited in PTO-892 issued 05/13/2025) in view of Vörös et al. (US 20240372830 A1, cited in PTO-892 issued 09/15/2025), Newcomb (US 20220031105 A1, cited in PTO-892 issued 05/13/2025), and Zhu et al. (“Describing Unseen Videos via Multi-Modal Cooperative Dialog Agents”).
Regarding claim 21, the combination of Pandya, Vörös, and Newcomb teaches: The method of claim 1, further comprising:
Pandya teaches: displaying, on a user interface of an application executed on a client device, a
…
providing
displaying, in the
However, Pandya, Vörös, and Newcomb do not explicitly teach: a conversation panel;
receiving, via the conversation panel, a user query including a plurality of natural language words, the user query received in real time while or after the sensor data are being streamed;
in response to the user query, extracting the one or more information items
a time duration defined by the user query;
providing the user query to the large behavior model; and
displaying, in the conversation panel, the multimodal output provided by the large behavior model in response to the user query,
But Zhu teaches: a conversation panel; (On page 2, Fig. 1 and its description in the bottom paragraph to page 3, line 12 discloses a conversation panel between the Q-BOT and A-BOT. On page 6, § 3.1 discloses that both Q-BOT and A-BOT are machine learning models.)
receiving, via the conversation panel, a user query including a plurality of natural language words, the user query received in real time while or after the sensor data are being [captured]
in response to the user query, extracting (On page 6, § 3.1 discloses that A-BOT is a machine learning model which analyzes the video (extracts features) to answer questions from Q-BOT. Therefore, it performs feature extraction in response to the user query.)
a time duration defined by the user query; (A time duration is the beginning of the video, which is defined by “first in the video” on page 2, Fig. 1, Q1.)
providing the user query to the large behavior model; and (On page 7, Fig. 3 discloses providing the question qi from Q-BOT to A-BOT. The “large behavior model” as claimed is A-BOT.)
displaying, in the conversation panel, the multimodal output provided by the large behavior model in response to the user query, (On page 2, Fig. 1 discloses displaying “A man starts laughing out loud” in the conversation panel.)
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have displayed Zhu’s conversation panel on Pandya’s dashboard for a human user to interact with Pandya’s model in the combination of Pandya, Vörös, and Newcomb. The human user would perform the same role as Q-BOT. A motivation for the combination is to protect sensitive personal information such as identifiable face images and voices (Zhu, page 1, first paragraph of § 1). Pandya’s sensors record personal information of construction workers.
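A minimal, hypothetical sketch of the claimed conversation-panel flow (the keyword filter below merely stands in for Zhu’s Q-BOT/A-BOT models): a natural-language user query is appended to the panel, passed to a stand-in “large behavior model” that selects stored information items within the queried time duration, and the model’s response is displayed back in the panel.

    def large_behavior_model(query, information_items):
        """Stand-in model: answer a query about early events by returning
        stored information items whose timestamps fall in the duration."""
        if "first" in query.lower():
            hits = [item for t, item in information_items if t < 5.0]
            return hits or ["no matching events"]
        return ["no matching events"]

    conversation_panel = []

    def submit_query(query, information_items):
        conversation_panel.append(("user", query))
        answer = large_behavior_model(query, information_items)
        conversation_panel.append(("model", "; ".join(answer)))

    items = [(3.0, "worker trips"), (3.4, "worker falls"), (9.0, "all clear")]
    submit_query("What happens first in the video?", items)
    for speaker, text in conversation_panel:
        print(f"{speaker}: {text}")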
Response to Arguments
The following is the Examiner’s response to the Applicant’s arguments filed 02/17/2026.
Applicant’s First Arguments Under 35 U.S.C. 101: On page 11 of the remarks, Applicant argues claim 1 recites specific technical features that cannot reasonably be performed mentally. Applicant submits that it would be impossible for a human with a pen and paper to simultaneously obtain data that would be analogous to the streaming of sensor data from a plurality of sensor devices including “at least two distinct sensor types.”
Examiner’s Response: Applicant's arguments have been fully considered but they are not persuasive. Claim 1 as a whole is directed to streaming sensor data and processing information items associated with the sensor data using a large language model (LLM). Claim 21 as a whole is directed to submitting a user query to the LLM via a user interface and receiving a response.
With respect to applicant’s first and second paragraphs in the remarks, page 11, section III, the step of collecting sensor data by streaming it from a plurality of sensor devices was not considered a mental process in the previous Office Action. Analyzing the sensor data is the mental process. Detecting trigger conditions associated with a predefined mission of the computer system is merely an observation mental process which can reasonably be performed in the human mind with the aid of pencil and paper. The feature of the sensor devices configured to detect trigger conditions is recited at a high level of generality. Specification paragraph [0089] discloses a scenario of monitoring a condition of a patient, and the trigger condition includes a first health condition. A medical professional can reasonably monitor a condition of a patient and mentally observe the onset of a first health condition. A sensor is merely a generic computer component for applying the abstract ideas on a generic computer. The claim does not indicate the sensor device detects trigger conditions in a way that is any different from a person detecting the same conditions mentally. The claim does not recite that the sensor comprises specific sensor hardware, as submitted in the remarks. Furthermore, the claim does not positively recite the sensor device detecting trigger conditions. The claim instead recites that the “sensor device of the plurality of sensor devices is configured to detect trigger conditions”. The sensor device may or may not detect trigger conditions in the method of claim 1.
With respect to applicant’s third paragraph in the remarks on page 11 under section III, it is noted that the features upon which applicant relies (i.e., “digital sensor data” and “digital sensor samples”) are not recited in the rejected claim(s). Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993). In claim 1, line 21 recites “sensor data” but not “digital sensor data” as argued in the remarks.
The limitation “while streaming the sensor data: detecting one or more signature events in the sensor data” is an observation mental process which can reasonably be performed in the human mind with the aid of pencil and paper. For example, this limitation amounts to a human watching and listening to a football game by streaming visual data through his or her eyes and streaming audio data through his or her ears, and using his or her mind to detect/recognize events in the data such as a player scoring points in the game.
The limitation “while streaming the sensor data: generating one or more information items characterizing the one or more signature events detected in the sensor data, independently of the sensor types… , wherein the one or more information items are generated based on a subset of the sensor data corresponding to the one or more signature events, the subset of the sensor data including sensor samples obtained from two or more of the at least two distinct sensor types” is a judgement and evaluation mental process which can reasonably be performed in the human mind with the aid of pencil and paper. This limitation amounts to a human writing down information items about the detected events while continuing to watch and listen to the game. The subset of sensor data could include a visual feed of the game and an audio feed from announcers at the game. The written information item is independent of the sensor types because the visual feed and the audio feed contain the same information of the player scoring points. The claim recites an abstract idea.
Applicant’s Second Arguments Under 35 U.S.C. 101: On pages 11-12, Applicant argues that amended claim 1 includes several limitations that constitute technical improvements like Desjardins. On page 12, Applicant argues the amended limitations specify that sensor devices are configured to detect trigger conditions associated with a predefined mission, that training data is generated based on behavior patterns determined from multiple sensor types, and that information items are generated from digital sensor samples obtained from multiple distinct sensor types. These limitations represent a specific technical solution for processing multi-modal sensor data through a re-trained LLM to generate real-time outputs, which constitutes a practical application of any underlying concepts.
Examiner’s Response: Applicant's arguments have been fully considered but they are not persuasive. Examiner respectfully disagrees that claim 1 constitutes technical improvements to training machine learning models like Desjardins. The feature of generating training information items based on a behavior pattern determined by data from two or more of the at least two distinct sensor types is a judgement and evaluation mental process which can reasonably be performed in the human mind with the aid of pencil and paper. A person can reasonably create training information items (training input data) by evaluating data from two distinct sensor types. Any improvements which may be recited in the claim are merely improvements to a mental process, which alone cannot provide the technical improvement. MPEP 2106.05(a) states, “It is important to note, the judicial exception alone cannot provide the improvement.” MPEP 2106.05(a), II. states, “it is important to keep in mind that an improvement in the abstract idea itself (e.g. a recited fundamental economic concept) is not an improvement in technology.”
The limitation of “generating a large behavior model including context information of multiple sensor domains by obtaining a predefined large language model (LLM) trained with predefined training data, obtaining training data associated with the plurality of sensor devices, wherein the training data includes information associated with a signature event that includes one or more training information items… , and re-training the predefined LLM with the training data, the LLM having a self-attention based transformer structure” amounts to mere instructions to apply the abstract ideas on a generic computer under MPEP 2106.05(f). All the features of the training data are recited at a high level of generality and do not appear to recite specific steps for processing the multimodal sensor data that would integrate the abstract ideas into a practical application. With respect to the explanation in remarks page 12, lines 13-16, the computer system processing multimodal sensor data by using a smaller amount of data than is collected by the sensor devices amounts to mere instructions to apply the abstract ideas on a generic computer under MPEP 2106.05(f), and transforming the data into information items is a judgement and evaluation mental process, because these features lack details that would integrate the abstract ideas into a practical application. Unlike Desjardins, pending claim 1 does not recite a technical improvement to training a machine learning model.
With respect to the second paragraph on page 12 of the remarks, detecting trigger conditions associated with a predefined mission is an observation mental process. Generating training data based on behavior patterns determined from multiple sensor types is a judgement and evaluation mental process. Generating information items from sensor samples obtained from multiple distinct sensor types is a judgement and evaluation mental process. A judicial exception alone cannot provide a technical improvement.
The additional elements identified above, alone or in combination, do not integrate the abstract ideas into a practical application, as they are mere insignificant extra-solution activities recited in combination with generic computer functions and a field of use that are implemented to perform the abstract ideas identified above. The claim is directed to an abstract idea.
Applicant’s Third Arguments Under 35 U.S.C. 101: On page 12, Applicant submits that new claim 21 further integrates any recited judicial exception into a practical application.
Examiner’s Response: Applicant's arguments have been fully considered but they are not persuasive. In Step 2A Prong 2, displaying, on a user interface of an application executed on a client device, a conversation panel amounts to insignificant extra-solution activity under MPEP 2106.05(g). A client device amounts to a generic computer component for applying the abstract ideas on a generic computer under MPEP 2106.05(f). Receiving, via the conversation panel, a user query including a plurality of natural language words, the user query received in real time while or after the sensor data are being streamed amounts to mere data-gathering, an insignificant extra-solution activity under MPEP 2106.05(g). Displaying, in the conversation panel, the multimodal output provided by the large behavior model in response to the user query, the multimodal output characterizing the sensor data amounts to insignificant extra-solution activity under MPEP 2106.05(g).
In Step 2B, displaying, on a user interface of an application executed on a client device, a conversation panel is analogous to presenting offers and gathering statistics, which is a well-understood, routine, conventional activity recognized by the courts under MPEP 2106.05(d)(II). A client device amounts to a generic computer component for applying the abstract ideas on a generic computer under MPEP 2106.05(f).
Receiving, via the conversation panel, a user query including a plurality of natural language words, the user query received in real time while or after the sensor data are being streamed is analogous to receiving data over a network, which is a well-understood, routine, conventional activity recognized by the courts under MPEP 2106.05(d)(II).
Displaying, in the conversation panel, the multimodal output provided by the large behavior model in response to the user query, the multimodal output characterizing the sensor data is analogous to presenting offers and gathering statistics, which is a well-understood, routine, conventional activity recognized by the courts under MPEP 2106.05(d)(II). The claim is not patent eligible.
Applicant’s Arguments Under 35 U.S.C. 103: On page 14 of the remarks, Applicant argues none of the cited references teach or suggest training data that includes the limitation of pending claim 1, lines 8-15.
Examiner’s Response: Applicant's arguments have been fully considered but they are not persuasive. Pandya teaches: generating a large behavior model including context information of multiple sensor domains by obtaining a predefined … (A “signature event” is an incident, a behavior/action not in compliance with safety protocol, or a fatigue level. “Information items” are input features in the input feature dataset. A “behavior pattern” is a behavior of construction workers. Sensor data is captured by the computer vision and the LIDAR systems at the same time.)
A signature event, information items, and a behavior pattern are recited at a high level of generality. These terms were previously mapped to features of Pandya’s system in the non-final action, and the current remarks do not appear to refute the previous mapping.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Asher H. Jablon whose telephone number is (571)270-7648. The examiner can normally be reached Monday - Friday, 9:00 am - 6:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Al Kawsar can be reached at (571)270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/A.H.J./Examiner, Art Unit 2127
/ABDULLAH AL KAWSAR/Supervisory Patent Examiner, Art Unit 2127