Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-13 are pending.
Drawings
The drawings are objected to because Fig. 2B text is illegible. Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-3, 5-9, and 11-13 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Ma et al. (US 20200206920 A1).
Regarding claim 1, Ma discloses, teaches or suggests a method of identifying a task sequence from an interaction stream, the method comprising:
receiving, by a computing system, an interaction stream related to one or more interactions of one or more users with the computing system, one or more events that occurred from the one or more interactions, wherein the processed interaction stream is transformed into n-grams;
(see at least:
[0022] An “event stream” as referenced in the present disclosure is a recorded sequence of UI actions (e.g. mouse clicks, keyboard strokes, interactions with various elements of a graphical user interface (GUI), auditory input, eye movements and/or blinks, pauses, gestures (including gestures received/input via a touchscreen device, as well as gestures performed in view of a camera, e.g. for VR applications), etc.) and/or associated device actions (e.g. OS actions, API calls, calls to data source(s), etc.) for a particular user over a particular time period.
[0217] In more approaches, events may be encoded using techniques from language modeling. Because noisy words (or impulsive events) may be embedded in any sequence, in order to increase the robustness and accuracy to model each word, the adjacent events and the event itself are used to determine the “sematic meaning” of the event. The context window size is determined by the detection accuracy when used in finding similar events. A context size of 5 is typically advantageous, which means there are two events on the left and right sides respectively. The event itself is then represented by the 5-gram, i.e., five contiguous events.)
identifying, by the computing system, a plurality of potential data candidates for each of the n-grams by interpreting corresponding start markers and end markers;
(see at least:
[0151] Accordingly, several implementations of segmentation in accordance with operation 306 involve a more complex classification or labeling process than described above. In essence, the classification portion includes marking different events according to the task to which the events belong. In one approach, this may be accomplished by identifying the events that delineate different tasks, and labeling events as “external” (i.e. identifying a task boundary, whether start or end) or “internal” (i.e. belonging to a task delimited by sequentially-labeled boundary events). Various exemplary approaches to such classification include known techniques such as binary classifiers, sequence classification, and/or phrase tagging (e.g. as typically employed in the context of natural language processing). However, such approaches require a ground truth from which to determine the event boundaries and create associated supervised models, and in most situations consistent with the presently described inventive efforts, no such ground truth(s) is/are available.);
transforming for each of the n-grams, by the computing system, each of the identified plurality of potential data candidates into a corresponding potential data candidate vector, wherein the potential data candidate vectors are numerical representation of the plurality of potential data candidates;
(see at least:
[0161] The concatenated event streams are parsed into subsequences using the sliding window length N and feature vectors are calculated for each subsequence starting at each position within the event stream. In an exemplary embodiment, each subsequence includes categorical and/or numerical features, where categorical features include a process or application ID; an event type (e.g. mouse click, keypress, gesture, button press, etc.; a series of UI widgets invoked during the subsequence; and/or a value (such as a particular character or mouse button press) for various events in the subsequence. Numerical features may include, for example, a coordinate location corresponding to an action, a numerical identifier corresponding to a particular widget within a UI, a time elapsed since a most recent previous event occurrence, etc. as would be appreciated by a person having ordinary skill in the art upon reading the present disclosure.);
determining, by the computing system, a similarity score for each pair of the plurality of potential data candidates based on comparison of each of the plurality of the potential data candidate vectors of the corresponding pair of the plurality of potential data candidates;
(see at least:
[0163] Regardless of the particular manner in which feature vectors are generated, a distance matrix is computed for all pairs of subsequences. The preferred metric for the distance given the calculation of the feature vectors as described above is the Euclidean distance; however, other distance metrics can also be of value, for instance the cosine similarity, or the Levenshtein distance if the feature vectors are understood to be directly word sequences in the event language. Clusters of non-overlapping subsequences may then be identified according to similarity, using various techniques and without departing from the scope of the inventive concepts described herein. For example, in one embodiment a predetermined set k of pairs of subsequences characterized by the smallest distances between the elements of the pairs among the overall distance matrix may be selected as initial clusters representing k task types.); and
grouping, by the computing system, the plurality of potential data candidates into one or more groups based on the similarity score of a corresponding plurality of potential data candidates, wherein each of the one or more groups indicates a unique task sequence of the processed interaction stream;
(see at least:
[0027] Preferably based on a critical, often minimal, set of interactions, “traces” are built, in accordance with the inventive concepts described herein. A “trace” is to be understood as a segment that accomplished a particular task in a particular instance. Visually, and as described in greater detail below with reference to FIG. 3, a “trace” is analogous to a path from one end of a directed graph to another end of the graph, where each point along the path represents an event;
[0065] In one general approach, a computer-implemented method of identifying one or more processes for robotic automation (RPA) includes: recording a plurality of event streams, each event stream corresponding to a human user interacting with a computing device to perform one or more tasks; concatenating the event streams; segmenting some or all of the concatenated event streams to generate one or more individual traces performed by the user interacting with the computing device, each trace corresponding to a particular task; clustering the traces according to a task type; identifying, from among some or all of the clustered traces, one or more candidate processes for robotic automation; prioritizing the candidate processes; and selecting at least one of the prioritized candidate processes for robotic automation).
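For illustration only, the windowing and distance-matrix steps quoted from Ma [0161] and [0163] can be sketched as follows. This is a minimal sketch and not Ma's implementation: the function names are hypothetical, and the feature vector is assumed to be a simple count of event types per window, which is a simplification of the categorical and numerical features Ma lists.

```python
import math
from itertools import combinations


def sliding_windows(events, n):
    """Return (start_index, window) for every contiguous window of length n."""
    return [(i, events[i:i + n]) for i in range(len(events) - n + 1)]


def feature_vector(window, vocabulary):
    """Assumed feature: how often each event type occurs in the window."""
    return [window.count(event_type) for event_type in vocabulary]


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest_non_overlapping_pairs(events, n, k):
    """Return the k non-overlapping window pairs with the smallest distance.

    Each result is (distance, start_of_first_window, start_of_second_window).
    """
    vocabulary = sorted(set(events))
    windows = sliding_windows(events, n)
    vectors = {start: feature_vector(w, vocabulary) for start, w in windows}
    scored = []
    for (s1, _), (s2, _) in combinations(windows, 2):
        if abs(s1 - s2) >= n:  # windows share no event positions
            scored.append((euclidean(vectors[s1], vectors[s2]), s1, s2))
    scored.sort()
    return scored[:k]


stream = ["open", "type", "save", "click", "open", "type", "save"]
print(closest_non_overlapping_pairs(stream, n=3, k=1))
```

In this toy stream the windows starting at positions 0 and 4 contain the same events, so they form the closest pair and would seed one initial cluster.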
Regarding claim 2, Ma further teaches the method of claim 1, wherein the identifying each of the plurality of potential data candidates comprises:
identifying a plurality of data candidates for each of the n-grams by defining corresponding start markers and end markers;
(see at least:
[0151] Accordingly, several implementations of segmentation in accordance with operation 306 involve a more complex classification or labeling process than described above. In essence, the classification portion includes marking different events according to the task to which the events belong. In one approach, this may be accomplished by identifying the events that delineate different tasks, and labeling events as “external” (i.e. identifying a task boundary, whether start or end) or “internal” (i.e. belonging to a task delimited by sequentially-labeled boundary events). Various exemplary approaches to such classification include known techniques such as binary classifiers, sequence classification, and/or phrase tagging (e.g. as typically employed in the context of natural language processing). However, such approaches require a ground truth from which to determine the event boundaries and create associated supervised models, and in most situations consistent with the presently described inventive efforts, no such ground truth(s) is/are available.);
determining a weighted score for each of the plurality of data candidates based on frequency of a type of interaction in each of the n-grams and length of each of the plurality of data candidates;
(see at least:
[0145] Thus, preferred implementations of the inventive concepts presented herein employ segmentation in operation 306 of method 300 by analyzing many traces to identify event sequences that are frequent (e.g. occurring at least as frequently as a minimum frequency threshold). If a number of identified event subsequences meets or exceeds the minimum frequency threshold, the subsequence is extended and the search performed again. In this way, segmentation iteratively grows the seed event sequences and searches for the new set of sequences until it is no longer plausible that the sequences still contain events pertaining to one task. At this point, task boundaries may be defined with high confidence.);
normalizing the weighted score of each of the plurality of data candidates based on a predefined function to obtain a prioritization score, wherein each of the plurality of data candidates are prioritized based on the prioritization score;
(see at least:
[0123] Normalization can also be achieved using regular expressions. Any number of conditional regular expressions can be expressed, matching consecutive events. Back-references, as understood by an artisan, can be used to indicate cohesion between events, for instance by enforcing that events pertain to the same application.);
eliminating one or more data candidates from the plurality of data candidates, wherein the elimination is based on the prioritization score of each of the plurality of data candidates;
(see at least:
[0110] Cleaning essentially seeks to reduce the data set by eliminating irrelevant and/or misleading information from the event streams. In preferred approaches, cleaning may involve analyzing the text of the event stream records, and identifying redundant irrelevant events, streams, etc.) and at least one of:
presence and position of one or more keywords in the plurality of data candidates;
(see at least:
[0235] In accordance yet another instance, method 300 includes generating a multi-dimensional feature vector for each of the individual traces, where each event is represented by a multi-dimensional feature describing one or more features of the corresponding event. The feature(s) may include any one or combination of: application ID, application name, hierarchical position of an element of a user interface (UI) interacted with during the event, an event type, an event value, a location corresponding to the event, and/or a time elapsed since a previous event occurrence, in various approaches.), or
predefined acceptable range of length when overlapping plurality of data candidates with same start markers and end markers are present, wherein the plurality of data candidates remaining post elimination are identified as the plurality of potential data candidates.
Note that the limitations following "at least one of" are recited in the alternative.
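For illustration only, the iterative growth of frequent event sequences quoted from Ma [0145] can be sketched as follows. This is a minimal sketch under stated assumptions: the function name and the `max_len` cap are hypothetical, occurrences are counted with overlap, and growth simply stops when no sequence of the next length meets the minimum frequency threshold, which is a simpler stopping rule than the plausibility test Ma describes.

```python
from collections import Counter


def frequent_subsequences(events, min_count, max_len=10):
    """Grow contiguous subsequences one event at a time.

    Returns {length: {subsequence: count}} for every length at which at least
    one subsequence occurs min_count times or more.
    """
    frequent = {}
    length = 1
    while length <= max_len:
        counts = Counter(
            tuple(events[i:i + length])
            for i in range(len(events) - length + 1)
        )
        kept = {seq: c for seq, c in counts.items() if c >= min_count}
        if not kept:
            break  # nothing frequent at this length, so stop extending
        frequent[length] = kept
        length += 1
    return frequent


events = list("abcxabcyabc")
result = frequent_subsequences(events, min_count=3)
print(max(result), result[max(result)])
```

Here the longest sequence that still occurs three times is `a, b, c`; extending it by one more event drops every count to one, which is where a task boundary would be placed in this simplified reading.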
Regarding claim 3, Ma further teaches the method of claim 1, wherein each of the identified plurality of potential data candidates are transformed into the corresponding potential data candidate vector based on at least one of a frequency of a type of interaction and presence or absence of the interaction, by comparing each interaction of each of the plurality of the data candidate with each corresponding interaction of each of rest of the plurality of data candidates;
(see at least:
[0329]…. Concept (j) The computer-implemented method of concept (a), wherein the clustering further comprises generating a multi-dimensional feature vector for each of the individual traces, wherein each event comprises a multi-dimensional feature describing one or more features of the corresponding event, the one or more features comprising: an application ID, an application name, a hierarchical position of an element of a user interface (UI) interacted with during the event, an event type, an event value, a location corresponding to the event, and/or an amount of time elapsed since a previous event occurrence).
Regarding claim 5, Ma further teaches the method of claim 1, wherein the n-grams are at least one of unigram, bigram, quad gram, pentagram, hexagram, and octagram;
(see at least:
[0217] In more approaches, events may be encoded using techniques from language modeling. Because noisy words (or impulsive events) may be embedded in any sequence, in order to increase the robustness and accuracy to model each word, the adjacent events and the event itself are used to determine the “sematic meaning” of the event. The context window size is determined by the detection accuracy when used in finding similar events. A context size of 5 is typically advantageous, which means there are two events on the left and right sides respectively. The event itself is then represented by the 5-gram, i.e., five contiguous events.).
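For illustration only, the context-window encoding quoted from Ma [0217] can be sketched as follows: each event is represented by itself together with two events on each side, giving a 5-gram. This is a minimal sketch; the function name is hypothetical, and the padding token used at the ends of the stream is an assumption, since the quoted passage does not say how edge events are handled.

```python
def context_ngrams(events, size=5, pad="<pad>"):
    """Represent each event by the window of `size` events centred on it."""
    if size % 2 == 0:
        raise ValueError("size must be odd so the event sits at the centre")
    half = size // 2
    # Assumed edge handling: pad both ends so every event has a full window.
    padded = [pad] * half + list(events) + [pad] * half
    return [tuple(padded[i:i + size]) for i in range(len(events))]


for gram in context_ngrams(["open", "type", "save"]):
    print(gram)
```

With the default size of 5, the event being encoded is always the third element of its tuple.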
Regarding claim 6, Ma further teaches the method of claim 1, further comprises:
performing, by the computing system, an inter-group and intra-group comparison of each of the one or more groups to determine presence of at least one overlapping interaction in the plurality of potential data candidates,
(see at least:
[0170] In particularly preferred approaches, multiple window lengths may be applied to each event stream and the clusters resulting therefrom overlapped to improve overall precision, as would be appreciated by persons having ordinary skill in the art upon reading the instant disclosure.);
eliminating, by the computing system, one or more of the plurality of potential data candidates from the one or more groups when the presence of at least one of the overlapping interactions is determined, wherein elimination is performed based on at least one of, length of the plurality of data candidates that are determined to have overlapping interactions and frequency of a type of interactions in the plurality of data candidates that are determined to have overlapping interactions;
(see at least:
[0163] Regardless of the particular manner in which feature vectors are generated, a distance matrix is computed for all pairs of subsequences. The preferred metric for the distance given the calculation of the feature vectors as described above is the Euclidean distance; however, other distance metrics can also be of value, for instance the cosine similarity, or the Levenshtein distance if the feature vectors are understood to be directly word sequences in the event language. Clusters of non-overlapping subsequences may then be identified according to similarity, using various techniques and without departing from the scope of the inventive concepts described herein. For example, in one embodiment a predetermined set k of pairs of subsequences characterized by the smallest distances between the elements of the pairs among the overall distance matrix may be selected as initial clusters representing k task types.).
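For illustration only, eliminating overlapping candidates on the basis of length, as recited in claim 6, can be sketched as follows. This is a minimal sketch under stated assumptions: candidates are modelled as half-open `(start, end)` index spans over the interaction stream, the function name is hypothetical, and the rule of keeping the longer candidate is an assumption, since neither the claim nor the cited passage fixes which candidate is retained.

```python
def drop_overlaps(spans):
    """Keep the longest spans first and drop any span overlapping a kept one.

    Spans are half-open (start, end) pairs; ties in length go to the span
    that starts earlier.
    """
    kept = []
    by_length = sorted(spans, key=lambda s: (-(s[1] - s[0]), s[0]))
    for start, end in by_length:
        # Two half-open spans are disjoint when one ends before the other starts.
        if all(end <= k_start or start >= k_end for k_start, k_end in kept):
            kept.append((start, end))
    return sorted(kept)


print(drop_overlaps([(0, 5), (3, 6), (6, 9), (8, 10)]))
```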
Claims 7-9, 11, and 12 essentially recite the limitations of claims 1-3, 5, and 6 in the form of a system, and thus are rejected for the same reasons discussed for claims 1-3, 5, and 6 above.
Claim 13 essentially recites the limitations of claim 1 in the form of a non-transitory computer program product, and thus is rejected for the same reasons discussed for claim 1 above.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 4 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Ma et al. (US 20200206920 A1), in view of Gao et al. (US 20170147538 A1), further in view of Faiman et al. (US 20120330993 A1).
Regarding claim 4, Ma further teaches the method of claim 1, wherein determining similarity score for each pair of the plurality of potential data candidates comprises:
for each pair of the plurality of potential data candidates,
comparing each interaction of each of the plurality of the potential data candidate vectors with each corresponding interaction of each of rest of the plurality of potential data candidate vectors;
(see at least:
[0140] Accordingly, in order to delineate between different tasks and/or traces, while retaining sufficient similarity between different variants such that different tasks/traces are identified, but retain sufficient similarity to be grouped together into a single cluster, the presently disclosed inventive concepts employ a segmentation approach 306 as shown in FIG. 3,
[0224] With continuing reference to FIG. 3 and method 300, operation 312 includes prioritizing the one or more identified processes for robotic automation. Prioritizing the processes includes determining the value of automating each identified segment/trace using RPA, and comparing the weight and/or frequency of performing a task by human operators to a weight and/or frequency of performing the corresponding task according to the automated robotic process);
The difference is that Ma does not specifically show:
assigning:
a predefined first score to each interaction of the plurality of potential data candidates, when a comparison results in an exact match of frequency of a type of interaction,
a predefined second score to each interaction of the plurality of potential data candidates when the comparison results in mismatch of frequency of the type of interaction,
a predefined third score to each interaction of the plurality of potential data candidates when the comparison results in absence of a common type of interaction;
determining a cumulative score based on the predefined first score, the predefined second score and the predefined third score assigned to each interaction of the corresponding plurality of potential data candidates; and
determining the similarity score based on the corresponding cumulative score and a predefined weightage of each of the n-grams.
However, Ma clearly teaches determining similarity scores for clustering traces according to a task type and assigning a weight to each trace;
(see at least:
Ma [0134] As described herein, and in accordance with method 300, identifying processes for RPA involves segmenting individual traces/tasks within the event streams, as well as identifying different task types and grouping (clustering) traces corresponding to the same task type. In a preferred implementation, segmentation according to the various implementations and aspects described herein is performed as operation 306 of method 300, and similarly clustering according to the various implementations and aspects described herein is performed as operation 308 of method 300. Each operation may respectively involve any combination of corresponding features, sub-operations, functions, etc. as described herein performed in any order, including but not limited to: clustering followed by segmentation, segmentation followed by clustering, combined segmentation and clustering, automatic and/or manual segmentation and clustering. Most preferably, identifying processes for robotic information per method 300 includes a combined, automated segmentation and clustering approach;
Ma [0263] Continuing now with the notion of building DAGs from RPA mining data obtained and analyzed in accordance with method 300, as described hereinabove this process involves identifying and assigning a weight to each trace in each cluster generated during the RPA mining phase. The weight may reflect the frequency of the trace within the cluster, the weight of the node relative to each trace including the node, etc. as would be understood by a person having ordinary skill in the art upon reading the present descriptions).
Ma does not specifically show the claimed first, second, third, and cumulative scores. However, it is customary in the art to match candidate data by different degrees of similarity, as shown by Gao;
(see at least:
Gao [0029]…. The comparison of the input sequence and the context contents directly for non-coded input and context content or under the same input coding scheme for coded input sequence may produce an exact match. An exact match occurs when the same sequence (e.g., non-coded sequence, Pinyin sequence, Wubi, or Stroke count sequence) of the input sequence is found in the sequence of the context contents (as a subsequence of the converted sequence for the context content, for example). The comparison may alternatively produce a close match even if there is no exact match. The threshold for close match may be predefined. For example, close match may be defined as finding a subsequence in the converted sequence of context contents that is more than 80% similar to the input sequence).
Furthermore, it is customary in the art, as shown by Faiman, to use a variety of predetermined criteria and scores to retain candidate data;
(see at least:
Faiman [0046] The ranking data may be based on a predetermined score. A predetermined score is a value that is assigned to a name in the name database 224 before the messaging application 222 orders the matched names. The messaging application 222 then uses the predetermined score to order matched names (e.g., the highest score appears at the top of an ordered list). The first user 102 may assign a predetermined score to each name or entry in the name database 224. The score may then increase or decrease based on further user input, for example, if the first user 102 later selects the associated name as a matched name.
Faiman [0047] The ranking data may be based on a frequency the name appears in the first user's 102 communications. For example, the messaging application 222 may increase a score assigned to an entry in the name database 224 each time the entry appears in a communication. Moreover, the messaging application 222 may increase the score by a greater value if the matched name is then selected by the user. Conversely, the score may decrease after a predetermined period of time if the name entry is not matched to an identified name. For example, the score associated with the name entry may decrease by a certain value each time the predetermined period of time expires without the name matching an identified name. In this manner, infrequently used names in the name database 224 may be given less weight. In one embodiment, the messaging application 222 may establish a threshold that if a score falls below, the name is erased from the name database 224);
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Gao and Faiman while implementing the method of Ma by assigning the claimed scores in order to remove candidate processes not meeting the required criteria and to retain candidate processes for robotic automation in the method of Ma (see Ma [0065]).
Claim 10 essentially recites the limitations of claim 4 in the form of a system, and thus is rejected for the same reasons discussed for claim 4 above.
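For illustration only, the three-tier scoring recited in claim 4 can be sketched as follows. This is a minimal sketch and is not taken from the application or from any cited reference: the score values, the function names, and the division of the cumulative score by the number of interaction types are all assumptions chosen to make the example concrete.

```python
from collections import Counter

# Hypothetical predefined scores; the claim does not give values.
FIRST_SCORE = 1.0   # same interaction type, same frequency
SECOND_SCORE = 0.5  # same interaction type, different frequency
THIRD_SCORE = 0.0   # interaction type absent from one candidate


def cumulative_score(candidate_a, candidate_b):
    """Sum the per-interaction-type scores for one pair of candidates."""
    freq_a, freq_b = Counter(candidate_a), Counter(candidate_b)
    total = 0.0
    for interaction in set(freq_a) | set(freq_b):
        if interaction not in freq_a or interaction not in freq_b:
            total += THIRD_SCORE
        elif freq_a[interaction] == freq_b[interaction]:
            total += FIRST_SCORE
        else:
            total += SECOND_SCORE
    return total


def similarity_score(candidate_a, candidate_b, ngram_weight=1.0):
    """Scale the cumulative score by an assumed per-n-gram weightage."""
    types = set(candidate_a) | set(candidate_b)
    if not types:
        return 0.0
    # Assumed normalisation: divide by the number of distinct interaction types.
    return ngram_weight * cumulative_score(candidate_a, candidate_b) / len(types)


a = ["click", "type", "type", "save"]
b = ["click", "type", "save", "print"]
print(cumulative_score(a, b), similarity_score(a, b))
```

In this pair, "click" and "save" match in frequency, "type" appears in both with different frequencies, and "print" is absent from the first candidate, so all three score tiers are exercised.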
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
VARSANYI et al (AU 2014205389 A1) teach systems, methods, and computer-readable media for detecting threats on a network. In an embodiment, target network traffic being transmitted between two or more hosts is captured. The target network traffic comprises a plurality of packets, which are assembled into one or more messages. The assembled message(s) may be parsed to generate a semantic model of the target network traffic. The semantic model may comprise representation(s) of operation(s) or event(s) represented by the message(s). Score(s) for the operation(s) or event(s) may be generated using a plurality of scoring algorithms, and potential threats among the operation(s) or event(s) may be identified using the score(s).
Garner et al (US 20180129755 A1) teach a Technological Emergence Scoring and Analysis Platform. The platform provides emergence scores for terms and a set of emergence indicators based on the scores. The platform can be used to quantitatively distinguish scientific and technological emergent topics within a target data category. The platform can take a set of data comprising a plurality of records and calculate an emergence score for terms and phrases representing the terms' or phrases' technological emergence. In further aspects, the platform can analyze the data to identify one or more sets of candidate terms. In further aspects, analyzing the data set to identify a set of candidate terms can comprise extracting all n-grams up to a predetermined length, for example, unigrams and multiword phrases. In still further aspects, the candidate term set can comprise various forms of keywords, or index terms, or the like.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to UYEN T LE whose telephone number is (571)272-4021. The examiner can normally be reached M-F 9-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ajay M Bhatia, can be reached at (571)272-3906. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/UYEN T LE/
Primary Examiner, Art Unit 2156
25 February 2026