Prosecution Insights
Last updated: April 19, 2026
Application No. 17/649,453

GENERATING AND IDENTIFYING TEXTUAL TRACKERS IN TEXTUAL DATA

Status: Non-Final OA (§103)
Filed: Jan 31, 2022
Examiner: YOUNG, CAMERON KENNETH
Art Unit: 2655
Tech Center: 2600 — Communications
Assignee: Gong Io Ltd.
OA Round: 5 (Non-Final)

Grant Probability: 70% (Favorable)
Predicted OA Rounds: 5-6
Predicted Time to Grant: 2y 11m
Grant Probability with Interview: 82%

Examiner Intelligence

Career Allow Rate: 70% (above average; 14 granted / 20 resolved; +8.0% vs TC avg)
Interview Lift: +12.5% (moderate lift, across resolved cases with interview)
Avg Prosecution: 2y 11m (typical timeline)
Career History: 43 total applications across all art units; 23 currently pending

Statute-Specific Performance

§101: 20.1% (-19.9% vs TC avg)
§103: 58.9% (+18.9% vs TC avg)
§102: 11.4% (-28.6% vs TC avg)
§112: 7.7% (-32.3% vs TC avg)

Tech Center averages are estimates; based on career data from 20 resolved cases.

Office Action (§103)
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 10/07/2025 has been entered.

Response to Amendment

Applicant’s amendment, filed 10/07/2025, has been entered. Applicant included a statement that a new claim, claim 22, had been added. However, no such claim was present within the amended claims submitted 10/07/2025. Claims 1 – 7, 9 – 18, and 20 – 21 remain pending within the application. Applicant’s amendments have resolved each and every claim objection laid out in the Office action dated 07/07/2025. As such, the claim objections of claims 1, 11, and 12 are hereby withdrawn.

Response to Arguments

Applicant's arguments filed 10/07/2025 have been fully considered but they are not persuasive. Applicant alleges, on pages 9 and 10 of Applicant’s Response, that Boteanu does not teach selecting the key phrase as a topic based on the input query, and particularly not based on the sentence embedding value of the input query, because Boteanu teaches the seller selecting the key phrase as a topic. Examiner disagrees. First, Examiner notes that it is not Boteanu alone that teaches the limitations in question. Indeed, it is the combination of Ho and Boteanu that teaches such limitations.
As such, arguments directed only towards Boteanu’s teachings do not take into account the context of the combination of Ho and Boteanu, because Boteanu’s teachings alone are not relied upon to teach the limitations of the claims. Specifically, Boteanu’s teachings of a user selecting a key phrase or keyword are, in some form, an input. Therefore, Boteanu’s teachings of selecting a key phrase or keyword and clustering similar key phrases or keywords based on the user’s selection are similar in scope and content to Ho’s teachings of receiving a user query and locating matching results from an index based upon the query. Thus, when viewing both Ho and Boteanu, a person of ordinary skill in the art would have found it obvious that Boteanu’s clustering of similar keywords or key phrases based upon word embeddings could indeed be performed based upon the input query of Ho to achieve similar results. Further, such an alteration would have been obvious because the similar processes of identifying matching utterances and similar keywords/key phrases based on clustering would have been an obvious application of known techniques to similar problems to achieve similar results. Therefore, the 35 U.S.C. § 103 rejections of claims 1, 11, and 12, and their respective dependent claims, are maintained for at least this reason. Further, Applicant alleges that columns 12 and 15 of Boteanu are unrelated portions of the specification and that, therefore, Examiner's rejection uses improper hindsight to reject the claims. Examiner disagrees. First, Examiner notes that, although certain portions of a specification may appear unrelated on their face, the original inventor, in this case Boteanu, would not be precluded from understanding separate portions of their specification within the context of each other merely because they are on different pages of the specification. Second, Examiner does not agree that the portions of the text are unrelated.
Just beyond the cited portion of column 12 (i.e., column 12 lines 34 – 63), Boteanu continues to discuss that the common text features are determined by a machine learning algorithm of the module which determines similarity or semantic relationships between the query results. Column 15 lines 26 – 39 expressly recites: "Information as to semantic relationships is provided in the form of a graph 602. The query words ACME and MODEL ONE are determined as related with respect to certain other commonly used words --- e.g., DRONE, NEW, RELEASED, and TOY, as demonstrated in the feature matrix 600 of Fig. 6A. While the graph is provided to visualize distances between words as defined in their semantic relationships, it is understood that a configured system may not graph the relationships, but merely provide the outputs. As a result, the closest determined terms, by semantic relationships, may have the least distance as calculated by a cosine distance measure or a Euclidean distance measure --- once normalized." This portion, in addition to the portions of columns 11 and 12 preceding and following the cited portion (i.e., lines 9 – 67 and 34 – 63, respectively), demonstrates that the key-phrases/keywords discussed in column 12 are indeed directly related to the concepts in column 15, which are applicable to the key-phrase and keyword selection/clustering. A person of ordinary skill in the art would have recognized this, as applying the concepts present in column 15 to the concepts present in column 12 would have been a predictable application of relevant concepts to increase the accuracy of query results. Therefore, the 35 U.S.C. § 103 rejections of claims 1, 11, and 12, and their respective dependents, are maintained for at least this reason.
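As a technical aside, purely for illustration and not part of the record: the "least distance" comparison the quoted passage describes, ranking candidate terms by cosine or Euclidean distance over embedding vectors, can be sketched in Python as follows. All function names and the toy vectors are hypothetical, not drawn from Boteanu.

```python
import math

def cosine_distance(u, v):
    # 1 - cosine similarity; smaller means more semantically related.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def closest_terms(query_vec, term_vecs, k=2, metric=cosine_distance):
    # Rank candidate terms by distance to the query embedding;
    # the closest determined terms have the least distance.
    ranked = sorted(term_vecs.items(), key=lambda item: metric(query_vec, item[1]))
    return [term for term, _ in ranked[:k]]
```

On toy two-dimensional vectors, terms whose vectors point in nearly the same direction as the query rank first under cosine distance, which is the behavior the passage attributes to the normalized distance measures.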
Further, Applicant alleges, on pages 12 and 13 of Applicant’s Response, that the combination of Boteanu and Ho would render Ho unsatisfactory for its intended purpose by introducing new words into Ho’s indexed corpus which, Applicant alleges, would prevent the propagation of labeling information throughout the corpus. Examiner disagrees. Particularly, Examiner notes that the introduction of new words or embeddings within a corpus does not preclude the propagation of label information throughout the corpus. Applicant argues that the introduction of new words outside the corpus does not allow propagation of labeled words to the remaining words in the corpus; however, no evidence is provided to support such an allegation. Examiner notes that the propagation within Ho is performed by propagating labels introduced by the user to words throughout a corpus. (Ho at ¶ [0057].) Further, Ho’s method propagates the labels to semantically similar words within the corpus; it does not require that propagated labels are always applied to identical words throughout the corpus. (Ho at ¶¶ [0063] – [0065].) Therefore, the propagated labels are not restricted from being applied to words or phrases from outside the corpus that are introduced within the corpus. Newly introduced words or phrases would also be labeled, provided they meet the similarity requirements. Further still, Boteanu’s expansion of the corpus with additional words or phrases could only allow for expanded search results which might not otherwise be found, which is, in essence, a benefit that any person of ordinary skill in the art would have recognized. As such, the 35 U.S.C. § 103 rejections of claims 1 – 7, 9 – 18, and 20 – 21 are maintained for at least the reasons laid out above.

Claim Rejections - 35 USC § 103

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action. Claims 1 – 2, 7, 9, 11 – 13, 18, and 20 are rejected under 35 U.S.C.
§ 103 as being unpatentable over U.S. Patent Application Publication No. 2020/0074984 A1 to Tin Kam Ho et al. (hereinafter Ho) in view of U.S. Patent No. 11,301,540 B1 to Adrian Boteanu et al. (hereinafter Boteanu) and in further view of U.S. Patent Application Publication No. 2022/0138388 A1 to Chaofan Wang et al. (hereinafter Wang).

Regarding claim 1, Ho teaches a method for generating a tracker model for identification of trackers in textual data, comprising: (Ho teaches training a machine learning model/classifier (i.e., a tracker model) for identifying intents in a live dialog server for live automated responses. Ho at Fig. 7 and ¶¶ [0061] - [0067].) receiving an input query including at least an input sentence exemplifying a tracker of interest, wherein the tracker is at least one word with a specific context; (Ho teaches receiving a search query for retrieving utterances (i.e., utterances are frequently sentences due to the nature of verbal communication in the context of human language.) wherein the utterances are related to a semantic scope of intent input by the user (i.e., the intent input by the user may be something akin to "teaching" which is at least one word with a specific context (e.g., teaching/learning) as defined by the word itself.) Ho at Fig. 7 and ¶¶ [0061] - [0067].) generating a base results set including a set of matching sentences matching the input sentence, wherein the sentences in the base results set are obtained from an index indexing textual data, wherein the index has sentence embedding values for the indexing textual data including the sentences in the base results set; (Ho teaches retrieving a set of 100 best matching utterances (i.e., a base results set including a set of sentences matching the input sentences.)
wherein the utterances are retrieved from conversational logs (i.e., an index indexing textual data) stored as text (see Ho at Fig. 8 where the retrieved utterances are displayed in text form then labeled, therefore the conversational logs are text data.) Ho at Figs. 7 - 8 and ¶¶ [0061] - [0067].) deriving a first labeling set from the base results set, wherein the first labeling set includes sample sentences that are at least a portion of the matching sentences in the base results set wherein deriving the first labeling set further comprises: … (Ho teaches determining a subset of matching utterances and presenting a user with the subset of the matching utterances to be labeled by the user (i.e., a labeling set including at least a portion of samples of sentences from the base results set.) Ho at Fig. 7 and ¶¶ [0061] - [0067].) receiving labels on each sentence in the first labeling set; (Ho teaches a user labeling the utterances from the subset of the utterances (i.e., receiving labels on each sentence). Ho at Figs. 7 - 8 and ¶¶ [0061] - [0067].) and feeding the first labeling set and the respective labels to a machine learning algorithm to train the tracker model. (Further, Ho teaches training a machine learning model / classifier (i.e., a tracker model) using the corpus of conversational logs of utterances labeled by the user and propagated from the subset to the remaining utterances (i.e., feeding the labeling set and the labels to a machine learning model for training.) Ho at Figs. 7 - 8 and ¶¶ [0061] - [0067].)
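As an illustrative aside (not part of the record), the label propagation the rejection attributes to Ho, spreading user-supplied labels to semantically similar entries in the corpus, including newly introduced words that meet the similarity requirement, might be sketched as follows. The similarity threshold and all names are hypothetical.

```python
import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def propagate_labels(seed_labels, corpus_vecs, threshold=0.8):
    """Spread seed labels to unlabeled corpus entries whose embeddings are
    sufficiently similar to a labeled entry; words newly introduced into the
    corpus are labeled the same way, provided they meet the similarity
    requirement."""
    labels = dict(seed_labels)
    for word, vec in corpus_vecs.items():
        if word in labels:
            continue
        # Adopt the label of the most similar labeled word, if close enough.
        nearest = max(seed_labels, key=lambda w: cosine_similarity(vec, corpus_vecs[w]))
        if cosine_similarity(vec, corpus_vecs[nearest]) >= threshold:
            labels[word] = seed_labels[nearest]
    return labels
```

The point of the sketch is only that propagation keys on similarity, not identity, so an entry added to the corpus after labeling can still receive a propagated label.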
Ho, however, does not alone teach “wherein the index has sentence embedding values for the indexing textual data including the matching sentences in the base results set”, “clustering the sentences in the base results set into a plurality of clusters, based on their respective sentence embedding values in the index;” and “selecting the sample sentences from among clusters that are within a predefined threshold distance from a sequence embedding value of the input sentence.” In a similar field of endeavor (e.g., processing natural language queries, indexing, and clustering query results), Boteanu teaches wherein the index has sentence embedding values for the indexing textual data including the matching sentences in the base results set (Boteanu teaches determining closely related entities to keywords/key phrases by clustering word embeddings and clustering the word embeddings using latent semantic indexing (i.e., the base results set is clustered and indexed.). Boteanu at 10:1 – 10:31.) Further, Boteanu teaches clustering the matching sentences in the base results set into a plurality of clusters, based on their respective sentence embedding values in the index; (Boteanu teaches clustering groups of words (i.e., a labeling set) using word embeddings. Boteanu at 9:50-10:18. Further, Boteanu teaches a seller wishing to index their catalog, and incorporating the seller's pre-existing catalog in the task of clustering the information within the catalog. (Boteanu at 10:32 - 11:7.) As such, Boteanu's index exists prior to the query runtime and performs clustering of both the present, pre-existing index, and the external references which are indexed at query run-time.)
Further, Boteanu teaches selecting the sample sentences from among clusters that are within a predefined threshold distance from a sequence embedding value of the input sentence; (Boteanu teaches selecting a key phrase (i.e., a sample sentence) to serve as a topic for a set to aid in narrowing search results. Boteanu at 12:21-33. Further, Boteanu teaches clustering the keywords and indicating which keywords are to be indexed (i.e., added to the search engine database as results to the query) using distance metrics such as Euclidean distance, Word Mover's Distance, and cosine similarity. Boteanu at 10:1 - 10:18. Further still, Boteanu teaches the cosine distance or Euclidean distance will have the "least distance" i.e., the closest value. Boteanu at 15:21 - 15:39. As such, a person of ordinary skill in the art would have understood that, in this situation, there is a predetermined moving threshold determining the closest embedding as the lowest distance value between the key phrase and the indexed content.) It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date to combine the teachings of Ho with the teachings of Boteanu to provide the amended limitations of claim 1. Doing so would improve the accuracy of queries/searches as recognized by Boteanu at 2:10-2:31. Further, Boteanu and Ho both perform similar methods of responding to queries using natural language. As such, a person of ordinary skill in the art would have both recognized the similarities between Ho’s response systems and Boteanu’s query response and search results system, and been motivated to combine the two because they occupy a similar field and perform similar tasks. 
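Purely for illustration (not part of the record): the combined operation the rejection attributes to Ho-Boteanu, clustering results by their sentence embeddings and then selecting sample sentences from clusters within a threshold distance of the input's embedding, could be sketched as below. The greedy clustering scheme, the radius, and the threshold are all hypothetical simplifications.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def greedy_cluster(embeddings, radius):
    """Group (sentence, vector) pairs: a sentence joins the first cluster
    whose seed vector is within `radius`, otherwise it seeds a new cluster."""
    clusters = []
    for sentence, vec in embeddings.items():
        for cluster in clusters:
            if euclidean(vec, cluster[0][1]) <= radius:
                cluster.append((sentence, vec))
                break
        else:
            clusters.append([(sentence, vec)])
    return clusters

def select_samples(clusters, query_vec, threshold):
    """Take one sample sentence from each cluster whose seed embedding lies
    within the predefined threshold distance of the query embedding."""
    return [cluster[0][0] for cluster in clusters
            if euclidean(cluster[0][1], query_vec) <= threshold]
```

With a tight threshold only clusters near the query contribute samples; widening the threshold admits samples from more distant clusters, which mirrors the role of the "predefined threshold distance" in the claim language.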
Ho in view of Boteanu (hereinafter Ho-Boteanu), however, do not alone teach “wherein the selected sample sentences include a sample sentence from each cluster of the plurality of clusters.” In a similar field of endeavor (e.g., building training data for neural networks using clustering encoded data.), Wang teaches the selected sample sentences include a sample sentence from each cluster of the plurality of clusters. (Wang teaches a method of selecting samples within a system of generating training data for neural network models wherein the samples are clustered, then a sample from each cluster is selected and used to train a model as part of an iterative process. Wang at ¶¶ [0054] - [0058]. Further, Wang's selected samples are selected from a subset of the original clusters (i.e., the number of samples selected is half the number of samples across two batches of samples. Therefore, the selected samples are selected from a subset of the larger set of samples.) Wang at ¶¶ [0054] - [0058]. Further, the samples selected by Wang are labeled (i.e., the selected samples form a labeling set.) Wang at ¶¶ [0054] - [0058].) It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date to combine the teachings of Ho-Boteanu with the teachings of Wang to provide selecting samples from each cluster of the plurality of clusters. Specifically, it would have been obvious to one of ordinary skill in the art to modify Ho-Boteanu with the clustering and sample selection system of Wang, which selects and labels samples from each cluster of a plurality of clusters, by simply applying the sample selection method of Wang to the clusters of Ho-Boteanu to ensure that a correct number of samples are selected during training of machine learning models. Wang teaches that the selected number of samples from each cluster should be balanced as much as possible.
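As a hypothetical sketch (not part of the record), the balanced per-cluster sampling attributed to Wang, drawing an equal number of samples from every cluster, could look like the following; the function name and parameters are assumptions for illustration only.

```python
import random

def balanced_sample(clusters, per_cluster=1, seed=0):
    """Draw the same number of samples from every cluster so the resulting
    labeling/training batch is balanced across clusters."""
    rng = random.Random(seed)
    batch = []
    for members in clusters:
        take = min(per_cluster, len(members))
        batch.extend(rng.sample(members, take))
    return batch
```

Taking at least one sample from each cluster is what guarantees every cluster of the plurality is represented in the labeling set.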
Applying this concept to Ho-Boteanu would be a predictable application of known techniques to improve consistency and balance of the training sets used by Ho-Boteanu for language models.

Regarding claim 2, Ho-Boteanu in view of Wang (hereinafter Ho-Boteanu-Wang) further teaches the method of claim 1, wherein when the tracker model is not ready further comprising: iteratively generating a second labeling set from the base results set; (Ho teaches presenting a user with alternate queries that result in additional to-be-labeled utterances from the conversational logs (i.e., the user is presented with additional sentences for labeling.) Ho at Figs. 7 - 8 and ¶¶ [0061] - [0067]. The additional queries presented to the user which yield additional results similar to the query would logically be labeled as the interface shown in Fig. 8 of Ho suggests the ability to create a new labeling set by using the "new intent" button or altering the text field and interacting with the "Search" button an additional time.) receiving labels on each sentence in the second labeling set; (Ho teaches presenting a user with additional sentences for labeling after processing additional queries. Ho at ¶¶ [0061] - [0067] and Figs. 7 - 8. As such, labeling the utterances by the user amounts to the system receiving labels for each sentence (utterance).) and feeding the second labeling set and the respective labels to the machine learning algorithm to further train the tracker model. (Ho teaches using the user-defined labels (i.e., the labeling sets) to train a machine learning model / classifier on the labeled data. Ho at Figs. 7 - 8, and ¶¶ [0061] - [0067].)

Regarding claim 7, Ho-Boteanu-Wang teaches the limitations of claim 1 as shown above. Further, Ho teaches including all the determined matching sentences in the base results set (Ho at ¶¶ [0061] – [0067].).
Furthermore, Boteanu teaches computing the sentence embedding value to the input sentence (Boteanu at 10:1-10:18) and determining, based on their respective sentence embedding values in the index, all the matching sentences in the index that are close to the sentence embedding value of the input sentence; and including all the determined matching sentences in the base results set (Boteanu at 10:1-10:31).

Regarding claim 9, Ho-Boteanu-Wang teaches all the limitations of claim 2, as laid out above. Further, Ho teaches the method of claim 2, further comprising: generating the second labeling set from the base results set and labels generated based on the first labeling set. (Ho teaches propagating labels from a subset of utterances (i.e., a first labeling set) throughout a corpus of utterances to yield a labeled corpus (i.e., a second labeling set generated based on the base results set and labels of the first labeling set.) Ho at ¶¶ [0061] - [0067].)

Regarding claim 11, Ho teaches a non-transitory computer readable medium having stored thereon instructions for causing a processing circuitry to execute a process for live migration of an index in a document store, the process comprising: (Ho teaches a method implemented by executing instructions stored on computer readable media. Ho at ¶ [0061]) receiving an input query including at least an input sentence exemplifying a tracker of interest, wherein the tracker is at least one word with a specific context; (Ho teaches receiving a search query for retrieving utterances (i.e., utterances are frequently sentences due to the nature of verbal communication in the context of human language.) wherein the utterances are related to a semantic scope of intent input by the user (i.e., the intent input by the user may be something akin to "teaching" which is at least one word with a specific context (e.g., teaching/learning) as defined by the word itself.) Ho at Fig. 7 and ¶¶ [0061] - [0067].)
generating a base results set including a set of sentences matching the input sentence, wherein the sentences in the base results set are obtained from an index indexing textual data, … (Ho teaches retrieving a set of 100 best matching utterances (i.e., a base results set including a set of sentences matching the input sentences.) wherein the utterances are retrieved from conversational logs (i.e., an index indexing textual data) stored as text (see Ho at Fig. 8 where the retrieved utterances are displayed in text form then labeled, therefore the conversational logs are text data.) Ho at Figs. 7 - 8 and ¶¶ [0061] - [0067].) deriving a first labeling set from the base results set, wherein the first labeling set includes at least a portion of sample sentences from the base results set wherein deriving the first labeling set further comprises: … (Ho teaches determining a subset of matching utterances and presenting a user with the subset of the matching utterances to be labeled by the user (i.e., a labeling set including at least a portion of samples of sentences from the base results set.) Ho at Fig. 7 and ¶¶ [0061] - [0067].) receiving labels on each sentence in the first labeling set; (Ho teaches a user labeling the utterances from the subset of the utterances (i.e., receiving labels on each sentence). Ho at Figs. 7 - 8 and ¶¶ [0061] - [0067].) and feeding the first labeling set and the respective labels to a machine learning algorithm to train the tracker model. (Further, Ho teaches training a machine learning model / classifier (i.e., a tracker model) using the corpus of conversational logs of utterances labeled by the user and propagated from the subset to the remaining utterances (i.e., feeding the labeling set and the labels to a machine learning model for training.) Ho at Figs. 7 - 8 and ¶¶ [0061] - [0067].)
Ho, however, does not alone teach “wherein the index has sentence embedding values for the indexing textual data including the matching sentences in the base results set”, “clustering the sentences in the base results set into a plurality of clusters, based on their respective sentence embedding values in the index;” and “selecting the sample sentences from among clusters that are within a predefined threshold distance from a sequence embedding value of the input sentence.” In a similar field of endeavor (e.g., processing natural language queries, indexing, and clustering query results), Boteanu teaches wherein the index has sentence embedding values for the indexing textual data including the matching sentences in the base results set (Boteanu teaches determining closely related entities to keywords/key phrases by clustering word embeddings and clustering the word embeddings using latent semantic indexing (i.e., the base results set is clustered and indexed.). Boteanu at 10:1 – 10:31.) Further, Boteanu teaches clustering the sentences in the base results set into a plurality of clusters, based on their respective sentence embedding values in the index; (Boteanu teaches clustering groups of words (i.e., a labeling set) using word embeddings. Boteanu at 9:50-10:18. Further, Boteanu teaches a seller wishing to index their catalog, and incorporating the seller's pre-existing catalog in the task of clustering the information within the catalog. (Boteanu at 10:32 - 11:7.) As such, Boteanu's index exists prior to the query runtime and performs clustering of both the present, pre-existing index, and the external references which are indexed at query run-time.) Further, Boteanu teaches selecting the sample sentences from among clusters that are within a predefined threshold distance from a sequence embedding value of the input sentence; (Boteanu teaches selecting a key phrase (i.e., a sample sentence) to serve as a topic for a set to aid in narrowing search results.
Boteanu at 12:21-33. Further, Boteanu teaches clustering the keywords and indicating which keywords are to be indexed (i.e., added to the search engine database as results to the query) using distance metrics such as Euclidean distance, Word Mover's Distance, and cosine similarity. Boteanu at 10:1 - 10:18. Further still, Boteanu teaches the cosine distance or Euclidean distance will have the "least distance" i.e., the closest value. Boteanu at 15:21 - 15:39. As such, a person of ordinary skill in the art would have understood that, in this situation, there is a predetermined moving threshold determining the closest embedding as the lowest distance value between the key phrase and the indexed content.) It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date to combine the teachings of Ho with the teachings of Boteanu to provide the amended limitations of claim 11. Doing so would improve the accuracy of queries/searches as recognized by Boteanu at 2:10-2:31. Further, Boteanu and Ho both perform similar methods of responding to queries using natural language. As such, a person of ordinary skill in the art would have both recognized the similarities between Ho’s response systems and Boteanu’s query response and search results system, and been motivated to combine the two because they occupy a similar field and perform similar tasks. Ho-Boteanu, however, do not alone teach “wherein the selected sample sentences include a sample sentence from each cluster of the plurality of clusters” In a similar field of endeavor (e.g., building training data for neural networks using clustering encoded data.), Wang teaches the selected sample sentences include a sample sentence from each cluster of the plurality of clusters. 
(Wang teaches a method of selecting samples within a system of generating training data for neural network models wherein the samples are clustered, then a sample from each cluster is selected and used to train a model as part of an iterative process. Wang at ¶¶ [0054] - [0058]. Further, Wang's selected samples are selected from a subset of the original clusters (i.e., the number of samples selected is half the number of samples across two batches of samples. Therefore, the selected samples are selected from a subset of the larger set of samples.) Wang at ¶¶ [0054] - [0058]. Further, the samples selected by Wang are labeled (i.e., the selected samples form a labeling set.) Wang at ¶¶ [0054] - [0058].) It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date to combine the teachings of Ho-Boteanu with the teachings of Wang to provide selecting samples from each cluster of the plurality of clusters. Specifically, it would have been obvious to one of ordinary skill in the art to modify Ho-Boteanu with the clustering and sample selection system of Wang, which selects and labels samples from each cluster of a plurality of clusters, by simply applying the sample selection method of Wang to the clusters of Ho-Boteanu to ensure that a correct number of samples are selected during training of machine learning models. Wang teaches that the selected number of samples from each cluster should be balanced as much as possible. Applying this concept to Ho-Boteanu would be a predictable application of known techniques to improve consistency and balance of the training sets used by Ho-Boteanu for language models.

Regarding claim 12, Ho teaches A system for generating a tracker model for identification of trackers in textual data, comprising: (Ho teaches a method for generating a machine learning model / classifier using labeled utterances from a user implemented on a machine storing computer executable instructions.
Ho at ¶¶ [0061] – [0067].) a processing circuitry; (Ho teaches the system comprising a processing unit 16. Ho at Fig. 1.) and a memory, the memory containing instructions that, when executed by the processing circuitry, configure the system to: (Ho teaches the system comprising memory 28 storing computer executable instructions that implement a method of generating a machine learning model / classifier. Ho at Fig. 1, 7, and 8, and ¶¶ [0061] – [0067].) receive an input query including at least an input sentence exemplifying a tracker of interest, wherein the tracker is at least one word with a specific context; (Ho teaches receiving a search query for retrieving utterances (i.e., utterances are frequently sentences due to the nature of verbal communication in the context of human language.) wherein the utterances are related to a semantic scope of intent input by the user (i.e., the intent input by the user may be something akin to "teaching" which is at least one word with a specific context (e.g., teaching/learning) as defined by the word itself.) Ho at Fig. 7 and ¶¶ [0061] - [0067].) generate a base results set including a set of sentences matching the input sentence, wherein the sentences in the base results set are obtained from an index indexing textual data, … (Ho teaches retrieving a set of 100 best matching utterances (i.e., a base results set including a set of sentences matching the input sentences.) wherein the utterances are retrieved from conversational logs (i.e., an index indexing textual data) stored as text (see Ho at Fig. 8 where the retrieved utterances are displayed in text form then labeled, therefore the conversational logs are text data.) Ho at Figs. 7 - 8 and ¶¶ [0061] - [0067].)
derive a first labeling set from the base results set, wherein the first labeling set includes at least a portion of sample sentences from the base results set wherein deriving the first labeling set further configure the system to: … (Ho teaches determining a subset of matching utterances and presenting a user with the subset of the matching utterances to be labeled by the user (i.e., a labeling set including at least a portion of samples of sentences from the base results set.) Ho at Fig. 7 and ¶¶ [0061] - [0067].) receive labels on each sentence in the first labeling set; (Ho teaches a user labeling the utterances from the subset of the utterances (i.e., receiving labels on each sentence). Ho at Figs. 7 - 8 and ¶¶ [0061] - [0067].) and feed the first labeling set and the respective labels to a machine learning algorithm to train the tracker model. (Further, Ho teaches training a machine learning model / classifier (i.e., a tracker model) using the corpus of conversational logs of utterances labeled by the user and propagated from the subset to the remaining utterances (i.e., feeding the labeling set and the labels to a machine learning model for training.) Ho at Figs. 7 - 8 and ¶¶ [0061] - [0067].)
Ho, however, does not alone teach “wherein the index has sentence embedding values for the indexing textual data including the matching sentences in the base results set”, “cluster the sentences in the base results set into a plurality of clusters, based on their respective sentence embedding values in the index;” and “select the sample sentences from among clusters that are within a predefined threshold distance from a sequence embedding value of the input sentence.” In a similar field of endeavor (e.g., processing natural language queries, indexing, and clustering query results), Boteanu teaches wherein the index has sentence embedding values for the indexing textual data including the matching sentences in the base results set (Boteanu teaches determining closely related entities to keywords/key phrases by clustering word embeddings and clustering the word embeddings using latent semantic indexing (i.e., the base results set is clustered and indexed.). Boteanu at 10:1 – 10:31.) Further, Boteanu teaches cluster the sentences in the base results set into a plurality of clusters, based on their respective sentence embedding values in the index; (Boteanu teaches clustering groups of words (i.e., a labeling set) using word embeddings. Boteanu at 9:50-10:18. Further, Boteanu teaches a seller wishing to index their catalog, and incorporating the seller's pre-existing catalog in the task of clustering the information within the catalog. (Boteanu at 10:32 - 11:7.) As such, Boteanu's index exists prior to the query runtime and performs clustering of both the present, pre-existing index, and the external references which are indexed at query run-time.) Further, Boteanu teaches select the sample sentences from among clusters that are within a predefined threshold distance from a sequence embedding value of the input sentence; (Boteanu teaches selecting a key phrase (i.e., a sample sentence) to serve as a topic for a set to aid in narrowing search results.
Boteanu at 12:21-33. Further, Boteanu teaches clustering the keywords and indicating which keywords are to be indexed (i.e., added to the search engine database as results to the query) using distance metrics such as Euclidean distance, Word Mover's Distance, and cosine similarity. Boteanu at 10:1-10:18. Further still, Boteanu teaches that the cosine distance or Euclidean distance will have the "least distance," i.e., the closest value. Boteanu at 15:21-15:39. As such, a person of ordinary skill in the art would have understood that, in this situation, there is a predetermined moving threshold determining the closest embedding as the lowest distance value between the key phrase and the indexed content.)

It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date to combine the teachings of Ho with the teachings of Boteanu to provide the amended limitations of claim 12. Doing so would improve the accuracy of queries/searches, as recognized by Boteanu at 2:10-2:31. Further, Boteanu and Ho both perform similar methods of responding to queries using natural language. As such, a person of ordinary skill in the art would have both recognized the similarities between Ho's response systems and Boteanu's query response and search results system, and been motivated to combine the two because they occupy a similar field and perform similar tasks.

Ho-Boteanu, however, does not alone teach "wherein the selected sample sentences include a sample sentence from each cluster of the plurality of clusters." In a similar field of endeavor (e.g., building training data for neural networks using clustering of encoded data), Wang teaches the selected sample sentences include a sample sentence from each cluster of the plurality of clusters.
(Wang teaches a method of selecting samples within a system for generating training data for neural network models, wherein the samples are clustered, then a sample from each cluster is selected and used to train a model as part of an iterative process. Wang at ¶¶ [0054]-[0058]. Further, Wang's selected samples are drawn from a subset of the original clusters (i.e., the number of samples selected is half the number of samples across two batches of samples; therefore, the selected samples are selected from a subset of the larger set of samples). Wang at ¶¶ [0054]-[0058]. Further, the samples selected by Wang are labeled (i.e., the selected samples form a labeling set). Wang at ¶¶ [0054]-[0058].)

It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date to combine the teachings of Ho-Boteanu with the teachings of Wang to provide selecting samples from each cluster of the plurality of clusters. It would have been obvious to one of ordinary skill in the art to modify Ho-Boteanu with the clustering and sample selection system of Wang, which selects and labels samples from each cluster of a plurality of clusters, by simply applying the sample selection method of Wang to the clusters of Ho-Boteanu to ensure that a correct number of samples is selected during training of machine learning models. Wang teaches that the number of samples selected from each cluster should be balanced as much as possible. Applying this concept to Ho-Boteanu would be a predictable application of known techniques to improve the consistency and balance of the training sets used by Ho-Boteanu for language models.
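Purely as an illustrative aid (not part of the prosecution record), the claim-12 pipeline mapped across Ho, Boteanu, and Wang above — cluster sentences by their embedding values, keep clusters within a predefined threshold distance of the input sentence's embedding, and take a sample sentence from each such cluster — can be sketched as follows. The toy two-dimensional embeddings, the cosine-distance metric, and the precomputed cluster assignments are assumptions made for the sketch, not disclosures of any cited reference.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.sqrt(sum(x * x for x in a)) *
                        math.sqrt(sum(x * x for x in b)))

def select_samples(indexed, cluster_of, query_emb, threshold):
    """Clusters whose centroid lies within `threshold` cosine distance of
    the query embedding survive; one sample sentence is taken from each
    surviving cluster."""
    # Group indexed sentences by their (pre-computed) cluster id.
    clusters = {}
    for sent, emb in indexed:
        clusters.setdefault(cluster_of[sent], []).append((sent, emb))
    samples = []
    for members in clusters.values():
        # Centroid = element-wise mean of the member embeddings.
        dim = len(members[0][1])
        centroid = [sum(e[i] for _, e in members) / len(members)
                    for i in range(dim)]
        if cosine_distance(centroid, query_emb) <= threshold:
            samples.append(members[0][0])  # one sample per cluster
    return samples

index = [("pricing was discussed", [1.0, 0.1]),
         ("they asked about cost", [0.9, 0.2]),
         ("the weather is nice",   [0.0, 1.0])]
cluster_of = {"pricing was discussed": 0,
              "they asked about cost": 0,
              "the weather is nice": 1}
samples = select_samples(index, cluster_of, [1.0, 0.0], threshold=0.2)
```

A labeling set built this way contains one representative per surviving cluster; each representative would then be labeled and fed to the training step for the tracker model.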
Regarding claim 13, Ho-Boteanu-Wang teaches the system of claim 12, wherein when the tracker model is not ready, the system is further configured to: iteratively generate a second labeling set from the base results set; (Ho teaches presenting a user with alternate queries that result in additional to-be-labeled utterances from the conversational logs (i.e., the user is presented with additional sentences for labeling). Ho at Figs. 7-8 and ¶¶ [0061]-[0067]. The additional queries presented to the user, which yield additional results similar to the query, would logically be labeled, as the interface shown in Fig. 8 of Ho suggests the ability to create a new labeling set by using the "new intent" button or by altering the text field and interacting with the "Search" button an additional time.) receive labels on each sentence in the second labeling set; (Ho teaches presenting a user with additional sentences for labeling after processing additional queries. Ho at ¶¶ [0061]-[0067] and Figs. 7-8. As such, labeling the utterances by the user amounts to the system receiving labels for each sentence (utterance).) and feed the second labeling set and the respective labels to the machine learning algorithm to further train the tracker model. (Ho teaches using the user-defined labels (i.e., the labeling sets) to train a machine learning model / classifier on the labeled data. Ho at Figs. 7-8 and ¶¶ [0061]-[0067].)

Regarding claim 18, Ho-Boteanu-Wang teaches the limitations of claim 12 as shown above. Further, Ho teaches the system of claim 12, wherein the system is configured to include all the determined matching sentences in the base results set (Ho at ¶¶ [0061]-[0067]).
Furthermore, Boteanu teaches the system further configured to compute a sentence word embedding value to the input sentence (Boteanu at 10:1-10:18); determine, based on their respective sentence embedding values in the index, all the matching sentences in the index that are close to the sentence word embedding value of the input sentence; and include all the determined matching sentences in the base results set (Boteanu at 10:1-10:31).

Regarding claim 20, Ho-Boteanu-Wang teaches all the limitations of claim 13 as laid out above. Further, Ho teaches the system of claim 13, wherein the system is further configured to: generate the second labeling set from the base results set and labels generated based on the first labeling set. (Ho teaches propagating labels from a subset of utterances (i.e., a first labeling set) throughout a corpus of utterances to yield a labeled corpus (i.e., a second labeling set generated based on the base results set and the labels of the first labeling set). Ho at ¶¶ [0061]-[0067].)

Claims 3-4 and 14-15 are rejected under 35 U.S.C. 103 as being unpatentable over Ho-Boteanu-Wang in view of U.S. Patent No. 6,675,159 B1 to Lin et al. (hereinafter Lin).

Regarding claim 3, Ho-Boteanu-Wang teaches all the limitations of claim 1 as laid out above. Ho, however, does not teach indexing textual data stored in a corpus to generate an index. In a similar field of endeavor (e.g., processing a corpus of textual data for query processing and training classifiers), Lin teaches the method of claim 1, further comprising: indexing textual data stored in a corpus to generate the index. (Lin teaches indexing collections of documents and storing the documents in a data repository (i.e., a corpus). Lin at 10:62-11:44.) It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date to combine the teachings of Ho-Boteanu-Wang with the teachings of Lin to provide the limitations of claim 3.
Doing so would have produced a better repository using a Bayes classifier with better documents, as recognized by Lin at 21:30-21:39. Further, doing so would have yielded improvements in the precision of information retrieval and improvements in the user interface, as recognized by Lin at 6:32-7:22. Further still, Lin and Ho both teach query processing systems operating on corpora. As such, a person of ordinary skill in the art would have looked to Lin to index the corpus found in Ho because they teach similar processes in similar fields.

Regarding claim 4, Ho-Boteanu-Wang in view of Lin (hereinafter Ho-Boteanu-Wang-Lin) teaches all the limitations of claim 3 as laid out above. Further, Lin teaches the method of claim 3, wherein indexing the textual data further comprises: splitting each record in the corpus into a plurality of sentences; (Lin teaches splitting a document/input into individual sentences. Lin at 14:22-14:32.) computing a vector representation to each of the plurality of sentences; (Lin teaches generating a vector representation of sentences. Lin at 18:17-18:34.) associating metadata fields with the vector representation, wherein the vector representation includes a sentence embedding value; (Lin teaches the vector representation and the sentence including an adaptive weight (i.e., an embedding value). Lin at 18:17-18:34. Further, Lin teaches calculating a probability describing the likelihood of a specific topic that is stored in the data repository, i.e., the probability is data that describes the data of the repository (metadata). Lin at 18:17-18:34 and 24:60-25:36.) and saving a sentence with its respective vector representation and metadata fields as a vector included as an entry in the index. (Lin teaches the vector and calculating a probability describing the likelihood of a specific topic that is stored in the data repository, i.e., the probability is data that describes the data of the repository (metadata).
Lin at 18:17-18:34 and 24:60-25:36. As such, storing the data in the repository with its probability and adaptive weight is saving a sentence with its respective vector representation and metadata fields as a vector included as an entry in the index.)

Regarding claim 14, Ho-Boteanu-Wang teaches all the limitations of claim 13 as laid out above. However, Ho-Boteanu-Wang does not teach indexing textual data stored in a corpus to generate an index. In a similar field of endeavor (e.g., processing a corpus of textual data for query processing and training classifiers), Lin teaches the system of claim 13, further configured to: index textual data stored in a corpus to generate the index. (Lin teaches indexing collections of documents and storing the documents in a data repository (i.e., a corpus). Lin at 10:62-11:44.) It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date to combine the teachings of Ho-Boteanu-Wang with the teachings of Lin to provide the limitations of claim 14. Doing so would have produced a better repository using a Bayes classifier with better documents, as recognized by Lin at 21:30-21:39. Further, doing so would have yielded improvements in the precision of information retrieval and improvements in the user interface, as recognized by Lin at 6:32-7:22. Further still, Lin and Ho both teach query processing systems operating on corpora. As such, a person of ordinary skill in the art would have looked to Lin to index the corpus found in Ho because they teach similar processes in similar fields.

Regarding claim 15, Ho-Boteanu-Wang-Lin teaches all the limitations of claim 14 as laid out above. Further, Lin teaches the system of claim 14, wherein the system is further configured to: split each record in the corpus into a plurality of sentences; (Lin teaches splitting a document/input into individual sentences. Lin at 14:22-14:32.)
compute a vector representation to each of the plurality of sentences; (Lin teaches generating a vector representation of sentences. Lin at 18:17-18:34.) associate metadata fields with the vector representation, wherein the vector representation includes a sentence embedding value; (Lin teaches the vector representation and the sentence including an adaptive weight (i.e., an embedding value). Lin at 18:17-18:34. Further, Lin teaches calculating a probability describing the likelihood of a specific topic that is stored in the data repository, i.e., the probability is data that describes the data of the repository (metadata). Lin at 18:17-18:34 and 24:60-25:36.) and save a sentence with its respective vector representation and metadata fields as a vector included as an entry in the index. (Lin teaches the vector and calculating a probability describing the likelihood of a specific topic that is stored in the data repository, i.e., the probability is data that describes the data of the repository (metadata). Lin at 18:17-18:34 and 24:60-25:36. As such, storing the data in the repository with its probability and adaptive weight is saving a sentence with its respective vector representation and metadata fields as a vector included as an entry in the index.)

Claims 5, 6, 16, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Ho-Boteanu-Wang-Lin in view of U.S. Patent Publication No. US 2020/0387819 A1 to Oleg Rogynskyy et al. (hereinafter Rogynskyy).

Regarding claim 5, Ho-Boteanu-Wang-Lin teaches the limitations of claim 4 as shown above. However, Ho-Lin does not teach that the records in the corpus include at least transcripts of calls and email messages related to sales in an organization. Rogynskyy teaches records in the corpus include at least transcripts of calls and email messages related to sales in an organization.
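As a purely illustrative sketch (again, not part of the record), the Lin-mapped indexing steps recited in claims 4 and 15 above — split each record into sentences, compute a vector representation, associate metadata fields, and save each sentence as an index entry — might look like the following. The naive period-based splitter, the stand-in length-based "embedding," and the record_id metadata field are assumptions made only for illustration, not Lin's actual implementation.

```python
def split_into_sentences(record):
    """Naive sentence splitter; a real system would use an NLP library."""
    return [s.strip() for s in record.split(".") if s.strip()]

def embed(sentence):
    """Stand-in embedding: crude length features, not a real model."""
    return [float(len(sentence)), float(sentence.count(" ") + 1)]

def index_corpus(corpus):
    """Build index entries: each sentence saved with its vector
    representation and associated metadata fields."""
    index = []
    for record_id, record in enumerate(corpus):
        for sent in split_into_sentences(record):
            index.append({
                "sentence": sent,
                "embedding": embed(sent),
                "metadata": {"record_id": record_id},
            })
    return index

idx = index_corpus(["Hello there. Pricing came up.", "Call ended."])
```

Each entry pairs a sentence with its vector and metadata, so a later query can match on embedding values while the metadata fields trace the sentence back to its source record.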
For example, Rogynskyy teaches extracting information related to sales representative performance to aid in improving their sales performance. Rogynskyy at ¶¶ [0053] and [0332]. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ho-Boteanu-Wang-Lin to incorporate the teachings of Rogynskyy to provide transcripts of calls and emails related to sales in an organization. Doing so would save users time and effort by reducing the amount of time required to generate reports and gather information automatically, as recognized by Rogynskyy at ¶ [0053].

Regarding claim 6, Ho-Boteanu-Wang-Lin teaches the limitations of claim 5 as shown above. However, Ho-Lin does not teach wherein the metadata fields are retrieved from a customer relationship management (CRM) of the organization. Rogynskyy, however, teaches retrieving information such as attributes or metrics, such as "yes" or "no," referring to other data (i.e., metadata). Rogynskyy at ¶ [0421]. Further, Rogynskyy teaches retrieving information from customer relationship management systems (i.e., metadata fields retrieved from a customer relationship management (CRM) system of the organization). Rogynskyy at ¶¶ [0046], [0214], and [0436]. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ho-Boteanu-Wang-Lin to incorporate the teachings of Rogynskyy to retrieve the metadata fields taught by Lin from the customer relationship management system taught by Rogynskyy. Doing so would save users time and effort by reducing the amount of time required to generate reports and gather information automatically, as recognized by Rogynskyy at ¶ [0053].

Regarding claim 16, Ho-Boteanu-Wang-Lin teaches the limitations of claim 15 as shown above.
However, Ho-Lin does not teach that the records in the corpus include at least transcripts of calls and email messages related to sales in an organization. Rogynskyy teaches records in the corpus include at least transcripts of calls and email messages related to sales in an organization. For example, Rogynskyy teaches extracting information related to sales representative performance to aid in improving their sales performance. Rogynskyy at ¶¶ [0053] and [0332]. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ho-Boteanu-Wang-Lin to incorporate the teachings of Rogynskyy to provide transcripts of calls and emails related to sales in an organization. Doing so would save users time and effort by reducing the amount of time required to generate reports and gather information automatically, as recognized by Rogynskyy at ¶ [0053].

Regarding claim 17, Ho-Boteanu-Wang-Lin teaches the limitations of claim 16 as shown above. However, Ho-Lin does not teach that the records in the corpus include at least transcripts of calls and email messages related to sales in an organization, or wherein the metadata fields are retrieved from a customer relationship management (CRM) of the organization. Rogynskyy, however, teaches retrieving information such as attributes or metrics, such as "yes" or "no," referring to other data (i.e., metadata). Rogynskyy at ¶ [0421]. Further, Rogynskyy teaches retrieving information from customer relationship management systems (i.e., metadata fields retrieved from a customer relationship management (CRM) system of the organization). Rogynskyy at ¶¶ [0046], [0214], and [0436].
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ho-Boteanu-Wang-Lin to incorporate the teachings of Rogynskyy to retrieve the metadata fields taught by Lin from the customer relationship management system taught by Rogynskyy. Doing so would save users time and effort by reducing the amount of time required to generate reports and gather information automatically, as recognized by Rogynskyy at ¶ [0053].

Claims 10 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Ho-Boteanu-Wang, as applied to claims 1 and 12 above, in view of Rogynskyy.

Regarding claim 10, Ho-Boteanu-Wang teaches the limitations of claim 1 as shown above. However, Ho-Boteanu-Wang does not teach receiving a transcript of a new sales call; and identifying, using the tracker model, a tracker in the transcript of a new sales call. Rogynskyy, however, teaches receiving a transcript of a new sales call and identifying, using the tracker model, a tracker in the transcript of a new sales call. Rogynskyy teaches sales representatives of an organization making calls and taking part in meetings. Rogynskyy at ¶ [0053]. Rogynskyy teaches tracking elements of sales made by sales representatives (Rogynskyy at ¶ [0053]) and using machine learning to identify trends and behaviors not tracked by the managers (i.e., identifying, using a tracker model, a tracker in the transcript). Rogynskyy teaches receiving information from various systems that may include telephone transcripts. Rogynskyy at ¶ [0332]. Rogynskyy further teaches ingesting new electronic activities (e.g., a new sales call, a new email chain, a new message, etc.). Rogynskyy at ¶ [0075]. Rogynskyy teaches that the system is used to tag electronic activities relating to sales, recruiting, or other business-related activities. Rogynskyy at ¶ [0143].
Further, Rogynskyy teaches that electronic activities are calls, meetings, and other activities relating to deals or opportunities sales representatives are working on. Rogynskyy at ¶ [0053]. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date to combine the teachings of Ho-Boteanu-Wang with the teachings of Rogynskyy to receive a transcript of a new sales call and identify, using the tracker model, a tracker in the transcript of a new sales call. Doing so would save users time and effort by reducing the amount of time required to generate reports and gather information automatically, as recognized by Rogynskyy at ¶ [0053].

Regarding claim 21, Ho-Boteanu-Wang teaches the limitations of claim 12 as shown above. However, Ho does not teach wherein the system is further configured to: receive a transcript of a new sales call; and identify, using the tracker model, a tracker in the transcript of a new sales call. Rogynskyy, however, teaches receiving a transcript of a new sales call and identifying, using the tracker model, a tracker in the transcript of a new sales call. Rogynskyy teaches sales representatives of an organization making calls and taking part in meetings. Rogynskyy at ¶ [0053]. Rogynskyy teaches tracking elements of sales made by sales representatives (Rogynskyy at ¶ [0053]) and using machine learning to identify trends and behaviors not tracked by the managers (i.e., identifying, using a tracker model, a tracker in the transcript). Rogynskyy teaches receiving information from various systems that may include telephone transcripts. Rogynskyy at ¶ [0332]. Rogynskyy further teaches ingesting new electronic activities (e.g., a new sales call, a new email chain, a new message, etc.). Rogynskyy at ¶ [0075]. Rogynskyy teaches that the system is used to tag electronic activities relating to sales, recruiting, or other business-related activities. Rogynskyy at ¶ [0143].
Further, Rogynskyy teaches that electronic activities are calls, meetings, and other activities relating to deals or opportunities sales representatives are working on. Rogynskyy at ¶ [0053]. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date to combine the teachings of Ho-Boteanu-Wang with the teachings of Rogynskyy to receive a transcript of a new sales call and identify, using the tracker model, a tracker in the transcript of a new sales call. Doing so would save users time and effort by reducing the amount of time required to generate reports and gather information automatically, as recognized by Rogynskyy at ¶ [0053].

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to CAMERON KENNETH YOUNG, whose telephone number is (703) 756-1527. The examiner can normally be reached Mon-Fri, 9:00 AM - 5:00 PM. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Andrew Flanders, can be reached at 571-272-7516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format.
For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/CAMERON KENNETH YOUNG/
Examiner, Art Unit 2655

/ANDREW C FLANDERS/
Supervisory Patent Examiner, Art Unit 2655

Prosecution Timeline

Jan 31, 2022
Application Filed
May 13, 2024
Non-Final Rejection — §103
Aug 14, 2024
Response Filed
Oct 17, 2024
Final Rejection — §103
Jan 28, 2025
Request for Continued Examination
Jan 30, 2025
Response after Non-Final Action
Mar 07, 2025
Non-Final Rejection — §103
Apr 24, 2025
Applicant Interview (Telephonic)
Apr 24, 2025
Examiner Interview Summary
May 01, 2025
Response Filed
Jun 30, 2025
Final Rejection — §103
Oct 07, 2025
Request for Continued Examination
Oct 10, 2025
Response after Non-Final Action
Jan 28, 2026
Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602409
INFORMATION SEARCH SYSTEM
2y 5m to grant Granted Apr 14, 2026
Patent 12592230
RECOGNITION OR SYNTHESIS OF HUMAN-UTTERED HARMONIC SOUNDS
2y 5m to grant Granted Mar 31, 2026
Patent 12567429
VOICE CALL CONTROL METHOD AND APPARATUS, COMPUTER-READABLE MEDIUM, AND ELECTRONIC DEVICE
2y 5m to grant Granted Mar 03, 2026
Patent 12525250
Cascade Architecture for Noise-Robust Keyword Spotting
2y 5m to grant Granted Jan 13, 2026
Patent 12493748
LARGE LANGUAGE MODEL UTTERANCE AUGMENTATION
2y 5m to grant Granted Dec 09, 2025
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

5-6
Expected OA Rounds
70%
Grant Probability
82%
With Interview (+12.5%)
2y 11m
Median Time to Grant
High
PTA Risk
Based on 20 resolved cases by this examiner. Grant probability derived from career allow rate.
