Prosecution Insights
Last updated: April 19, 2026
Application No. 19/082,244

INDICATOR EVALUATION METHOD AND INDICATOR EVALUATION SYSTEM OF CLUSTER STABILITY

Final Rejection: §101, §103
Filed: Mar 18, 2025
Examiner: CAIADO, ANTONIO J
Art Unit: 2164
Tech Center: 2100 — Computer Architecture & Software
Assignee: Profet AI Technology Co. Ltd.
OA Round: 2 (Final)
Grant Probability: 69% (Favorable)
Expected OA Rounds: 3-4
Time to Grant: 3y 4m
Grant Probability With Interview: 99%

Examiner Intelligence

Career Allow Rate: 69% (above average; 130 granted / 188 resolved; +14.1% vs TC avg)
Interview Lift: +49.9% (allowance rate of resolved cases with interview vs. without)
Typical timeline: Avg Prosecution 3y 4m; 23 currently pending
Career history: 211 total applications across all art units

Statute-Specific Performance

§101: 30.1% (-9.9% vs TC avg)
§103: 50.5% (+10.5% vs TC avg)
§102: 3.9% (-36.1% vs TC avg)
§112: 13.0% (-27.0% vs TC avg)
Tech Center averages are estimates • Based on career data from 188 resolved cases

Office Action

Grounds: §101, §103
DETAILED ACTION

1. Claims 1-14 are pending in this application.

Notice of Pre-AIA or AIA Status

2. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. In the event the determination of the status of the application as subject to AIA 35 U.S.C. §102 and §103 (or as subject to pre-AIA 35 U.S.C. §102 and §103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Response to Amendment

3. This office action is in response to applicant’s amendment filed on 02/10/2026 in response to the non-final action mailed on 12/02/2025. Claims 1 and 8-14 have been amended. Claims 2-7 remain as originally filed. The amendment has been entered.

Response to Arguments

4. It is noted that the applicant amended the specification and drawings to correct the deficiencies pointed out in the previous office action. Therefore, the objections to the specification and drawings are withdrawn. Applicant's arguments, filed on 02/10/2026, with respect to the rejection of claims 1-14 under 35 U.S.C. §101 as directed to an abstract idea (mental process) (Applicant’s arguments, pages 12-13), have been fully considered but are not persuasive. Respectfully, the examiner disagrees; see the clarification below. Applicant argues that “Since the applicant has amended the processing device to be the processor, and has amended the data storage device to be the memory in claims 8-14, claims 8-14 recite hardware elements. Therefore, the claimed invention is not directed to nonstatutory subject matter.” However, the specification lacks a description of the processor as a hardware processor. A processor can be a logic processor, known in computer science as a simple logical unit that processes data.
For example, CPUs (Intel Core/AMD Ryzen), GPUs, and specialized units like FPGAs and PLCs are all known as logic processors that execute instructions, perform calculations, and manage data flow. For this reason, the §101 (software per se) rejection of claims 8-14 is upheld.

Applicant argues that “Specifically, since the currently amended claims 1 and 8 have specific steps of "uniform down-sampling" and "statistical test" to screen the sub-data to be analyzed, the currently amended claims 1 and 8 can improve a reliability of computers in automatically assessing cluster stability, rather than simply performing the mental process on computers.” However, the claims fail to disclose in detail how such improvement is achieved. Simply arguing that a method can improve the reliability of computers in automatically assessing cluster stability, rather than simply performing a mental process on computers, does not show an improvement to any technical field or technology. See MPEP 2106.05(f)(1): “(1) Whether the claim recites only the idea of a solution or outcome i.e., the claim fails to recite details of how a solution to a problem is accomplished.”

With regard to the new amendment, which recites that “the higher the cluster stability indicator is, the higher a stability of a final cluster result is,” this newly added element is a mere instruction used to implement an abstract idea. The claims must include more than mere instructions to perform a method on generic machinery to qualify as an improvement to an existing technology. See MPEP § 2106.05(f).

Applicant argues that “claims 1 and 8 as currently amended as a whole are integrated into a practical application”. However, the Applicant fails to point out which limitations or elements integrate the abstract ideas into a practical application. The claims do not disclose in detail how to achieve the results that the applicant appears to claim.
It appears that the Applicant has an idea for a solution to a problem, but fails to describe the problem in detail within the claims and specification. See MPEP 2106.05(f)(1). For this reason, the §101 abstract idea (mental process) rejection of claims 1-14 is upheld.

Applicant's arguments, filed on 02/10/2026, with respect to the rejection of claims 1-14 under 35 U.S.C. §103 (Applicant’s arguments, pages 13-15), have been fully considered but are moot because the independent claims have been amended to introduce new limitations that were not previously presented, and newly found prior art has been applied.

Claim Rejections - 35 USC § 101

5. 35 U.S.C. §101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 8-14 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. As per independent claim 8, the claim recites a system comprising a processor. The specification does not have support for the processor to be a hardware processor. It is known in computer science that a processor can be a simple logical module implemented as software. Because a processor can be a simple software module that processes data, the examiner asserts that the processor of the claims is a simple software module under the broadest reasonable interpretation (BRI). The claims use various steps that would be reasonably understood by one of ordinary skill in the art to mean software per se steps, software-based implementations, or an abstract concept based on software per se.
Examples of steps used in the claim are “uniformly down-sampling a raw data to be clustered to generate a plurality of groups of sub-data thereof; wherein the processor calculates a plurality of similarities of the raw data to be clustered with the plurality of groups of sub-data according to at least one statistical test; wherein the processor keeps the plurality of groups of sub-data with the plurality of similarities greater than a similarity threshold as a plurality of groups of the sub-data to be analyzed and clusters the plurality of groups of the sub-data to be analyzed according to a cluster algorithm to generate a plurality of the sub-data cluster results; wherein the processor organizes a plurality of cluster label models of the sub-data cluster results, generates a plurality of organized sub-data cluster results according to the organized cluster label models, and calculates a cluster stability indicator according to the organized sub-data cluster results.” Those steps are concepts well understood in the art as software per se implementations. As mentioned above, the specification does not explicitly describe hardware components functioning as the hardware used to implement the above steps, and computer software is merely a set of instructions capable of being executed by a computer system. Accordingly, these claims are rejected as non-statutory for failing to disclose such hardware on which the software steps described above can be implemented. Please amend the claims to recite hardware elements, such as a processor and/or memory, and ensure that the specification provides support for the processor and/or memory as hardware. Dependent claims 9-14 are also rejected for inheriting the deficiencies of the base claims.

Claims 1-14 are rejected under 35 U.S.C. §101 because the claimed invention is directed to an abstract idea (Mental Process) without significantly more. The claims describe steps to evaluate cluster stability indicators.
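For orientation only, the screening steps quoted above (uniform down-sampling plus a statistical similarity test against a threshold) can be sketched as follows. This is not code from the application; the function names, the choice of a two-sample Kolmogorov-Smirnov statistic as the statistical test, the 1-D data, and the 0.7 threshold are all illustrative assumptions.

```python
# Minimal sketch of the claimed screening steps (illustrative assumptions:
# 1-D data, a two-sample Kolmogorov-Smirnov statistic as the statistical
# test, and similarity defined as 1 - KS statistic).
import random

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum gap
    between the two empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    grid = sorted(set(sa) | set(sb))
    cdf = lambda s, x: sum(v <= x for v in s) / len(s)
    return max(abs(cdf(sa, x) - cdf(sb, x)) for x in grid)

def screened_subsamples(raw, n_groups=5, frac=0.5, sim_threshold=0.7, seed=0):
    """Uniformly down-sample the raw data into sub-data groups, then keep
    only the groups whose similarity to the raw data exceeds the threshold."""
    rng = random.Random(seed)
    size = max(1, int(len(raw) * frac))
    groups = [rng.sample(raw, size) for _ in range(n_groups)]
    return [g for g in groups if 1.0 - ks_statistic(raw, g) > sim_threshold]

raw = [random.Random(1).gauss(0.0, 1.0) for _ in range(200)]
kept = screened_subsamples(raw)   # groups that passed the statistical screen
```

The kept groups would then feed the clustering and label-organization steps quoted above.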
The following is an analysis based on the 2019 Revised Patent Subject Matter Eligibility Guidance (2019 PEG).

Step 1, Statutory Category? Claims 1-7 are directed to a method. Claims 8-14 are directed to a system. Therefore, claims 1-7 fall into at least one of the four statutory categories. Claims 8-14 recite a system comprising a processor, without support in the specification that the processor is a hardware processor. Claims 8-14 are rejected as software per se and, for this reason, fail to fall within any of the four statutory categories.

Step 2A, Prong I: Judicial Exception Recited? The examiner submits that the foregoing claim limitations constitute a “Mental Process”, as the claims cover performance of the limitations in the human mind, given the broadest reasonable interpretation. As per independent claims 1 and 8, the claims similarly recite the limitations of:

“uniformly down-sampling a raw data to be clustered to generate a plurality of groups of sub-data thereof;” A human can observe and mentally judge data using criteria that would create a group of the data for each criterion. For example, imagine a human recruiter receives 10 resumes for a job opening. They need to quickly sort these into different groups for further review. The recruiter mentally applies several specific criteria to create these groups. There is nothing so complex in the limitation that could not be done in the human mind.

“calculating a plurality of similarities of the raw data to be clustered with the plurality of groups of sub-data according to at least one statistical test;” A human can observe data and judge the observed data using statistical criteria to identify similarity between them. Upon identifying the similarity, a human can mentally group the data in groups. The statistical test is merely an element used to implement the abstract idea herein. For example, imagine a real estate agent assesses house values on a street by mentally scanning properties.
The agent identifies houses fitting their "mental baseline" ("similar homes") versus those that do not ("outliers"). There is nothing so complex in the limitation that could not be done in the human mind.

“keeping the plurality of groups of sub-data with the plurality of similarities greater than a similarity threshold as a plurality of groups of the sub-data to be analyzed;” A human can observe a collection of data and apply a threshold to the observed data to identify data in the collection that is above the applied threshold. For example, imagine a person checking canned goods sets an expiration date threshold (e.g., November 1, 2025). They observe each can's date and apply the threshold, removing any identified as expired (before the threshold date). There is nothing so complex in the limitation that could not be done in the human mind.

“clustering the plurality of groups of the sub-data to be analyzed according to a cluster algorithm to generate a plurality of sub-data cluster results;” A human can observe a collection of data and make judgments according to criteria to generate a sub-collection of the observed data. For example, imagine a photographer has just finished a wedding shoot and needs to select the best photos to edit and deliver to the client. The raw data is a collection of 1,500 unedited digital images, and the selected best photos are the result of the judgments of the raw data, thereby creating the sub-collection of the observed data. The cluster algorithm is merely an element used to implement the abstract idea herein. There is nothing so complex in the limitation that could not be done in the human mind.

“organizing a plurality of cluster label models of the sub-data cluster results and generating a plurality of organized sub-data cluster results according to the organized cluster label models;” A human can observe data that represent models and mentally organize the observed models.
For example, imagine a city planner reviews blueprints (data representing models) for a development project. The planner applies mental criteria (like cost or sustainability) to mentally sort and organize these distinct architectural models into an order or groups for a meeting. There is nothing so complex in the limitation that could not be done in the human mind.

“calculating a cluster stability indicator according to the organized sub-data cluster results.” A human can mentally observe data that is organized in a cluster and define indicators of stability based on simple pre-established criteria. For example, a data analyst observes customer location clusters on a map. The analyst applies pre-established visual criteria (density, separation, size consistency) to mentally judge the stability of these distinct groupings. There is nothing so complex in the limitation that could not be done in the human mind.

As per dependent claim 2, the claim recites the limitation of: “preprocessing the raw data to generate the raw data to be clustered.” A human can observe and mentally make judgments about the observed data and organize it based on the result of the judgments into groups. There is nothing so complex in the limitation that could not be done in the human mind.

As per dependent claim 3, the claim recites the limitation of: “wherein the raw data comprises at least a numerical feature or a character feature;” The at least one numerical feature or character feature is merely an element used to implement abstract ideas. “determining whether the raw data comprises the character feature;” A human observes data and makes a judgment to define whether a certain character is in the observed data or not. There is nothing so complex in the limitation that could not be done in the human mind.
“when the raw data comprises the character feature, converting the character feature in the raw data to a transformation numerical feature to generate the raw data to be clustered;” A human can observe data that has a particular character and mentally transform the particular character into a numerical feature to organize the observed data. For example, a driver mentally assigns numerical values (e.g., Green=1, Red=3) to the observed text or color of a traffic signal to quickly organize their response and make a decision. There is nothing so complex in the limitation that could not be done in the human mind.

“when the raw data excludes the character feature, utilizing the raw data as the raw data to be clustered.” Utilizing the raw data as the raw data to be clustered when the raw data excludes the character feature is merely an instruction used to implement abstract ideas.

As per dependent claim 4, the claim recites the limitation of: “wherein the raw data comprises at least a numerical feature or a character feature;” The at least one numerical feature or character feature is merely an element used to implement abstract ideas. “determining whether the raw data comprises the character feature;” A human observes data and makes a judgment to define whether a certain character is in the observed data or not. There is nothing so complex in the limitation that could not be done in the human mind. “when the raw data comprises the character feature, converting the character feature in the raw data to a transformation numerical feature and standardizing the transformation numerical feature to generate the raw data to be clustered;” A human can observe data that has a particular character and mentally transform the particular character into a numerical feature to organize the observed data; the human can follow a predefined standard to do this transformation.
A simple example is a teacher using a predefined answer key to mentally transform letter grades (A, B, C, D) into numerical points (1 or 0) to organize and score a test. There is nothing so complex in the limitation that could not be done in the human mind.

“when the raw data excludes the character feature, standardizing the numerical feature of the raw data to generate the raw data to be clustered.” Standardizing the numerical feature of the raw data when the raw data excludes the character feature is merely an instruction used to implement abstract ideas.

As per dependent claim 5, the claim recites the limitation of: “according to a raw data cluster label model of the raw data cluster results, utilizing a decision tree classifier to overfit a model prediction for the plurality of cluster label models of the plurality of sub-data cluster results to generate the organized sub-data cluster results.” A human can observe data that is grouped, apply predefined criteria, and refine it into subgroups of data to organize the grouped data into smaller groups. A simple example is a person organizing a large folder of "Photos" into progressively smaller subgroups using predefined criteria such as year and event name (e.g., creating the path Photos/2025/Vacation_Mexico). There is nothing so complex in the limitation that could not be done in the human mind.
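Claim 5's organization step uses a decision tree classifier, overfit to the raw-data label model, to reconcile the sub-data cluster labels. As a rough stand-in for that idea (the tree itself is omitted for brevity; this swaps in a simpler majority-overlap mapping, purely as an assumed illustration, not the applicant's method), label alignment can be sketched as:

```python
# Align a sub-sample clustering's arbitrary label numbers to a reference
# label model by majority overlap -- a simplified stand-in for the claim's
# overfit decision-tree classifier (an assumption, not the claimed method).
from collections import Counter

def align_labels(reference, labels):
    """Map each cluster label to the reference label it most often
    co-occurs with, so equivalent clusterings use the same numbering."""
    mapping = {}
    for lab in set(labels):
        overlap = Counter(r for r, l in zip(reference, labels) if l == lab)
        mapping[lab] = overlap.most_common(1)[0][0]
    return [mapping[l] for l in labels]

ref = [0, 0, 1, 1, 2, 2]
sub = [2, 2, 0, 0, 1, 1]          # same partition, permuted label numbers
aligned = align_labels(ref, sub)  # -> [0, 0, 1, 1, 2, 2]
```

Once every sub-data result is mapped onto a common numbering, the per-point label agreement needed for the stability indicator becomes directly comparable.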
As per dependent claim 6, the claim recites the limitation of: “calculating a plurality of cluster probabilities of a plurality of cluster labels of a plurality of data points in the raw data to be clustered according to the organized sub-data cluster results;” A human can observe data and mentally calculate probabilities related to the data, define data limits based on the resulting probability, and then organize it into groups. A gambler mentally calculates the observed frequency of dice rolls, defining probability-based limits to organize outcomes into groups like "very low" (2, 3, 4) and "average" (5-9) to guide their betting strategy. There is nothing so complex in the limitation that could not be done in the human mind.

As per dependent claim 7, the claim recites the limitation of: “wherein the step for clustering the plurality of groups of the sub-data to be analyzed according to a cluster algorithm to generate a plurality of the sub-data cluster results further comprises the following sub-step:” A human can observe a collection of data and make judgments according to criteria to generate a sub-collection of the observed data. For example, imagine a photographer has just finished a wedding shoot and needs to select the best photos to edit and deliver to the client. The raw data is a collection of 1,500 unedited digital images, and the selected best photos are the result of the judgments of the raw data, thereby creating the sub-collection of the observed data. The cluster algorithm is merely an element used to implement the abstract idea herein. There is nothing so complex in the limitation that could not be done in the human mind.

“clustering the raw data to be clustered according to the cluster algorithm to generate a raw data cluster result;” A human can observe data, make judgments, and based on the judgments, organize data into groups. The cluster algorithm is merely an element used to implement the abstract idea herein.
There is nothing so complex in the limitation that could not be done in the human mind.

“wherein the indicator evaluation method of the cluster stability further comprises the following sub-step: generating a final cluster result according to the raw data cluster results and the organized sub-data cluster results;” A human can observe data, make judgments, and based on the judgments, organize data into groups; the organized data can be a result or final result judged mentally by the human. A simple example of this is a chef sorting ingredients: they observe various raw foods, judge them mentally (e.g., "this goes in the fridge, this is a vegetable"), and organize them into final groups like the "refrigerator" pile and the "pantry" pile. There is nothing so complex in the limitation that could not be done in the human mind.

“wherein the final cluster result corresponds to the cluster stability indicator.” The final cluster result corresponding to the cluster stability indicator is merely an instruction used to implement the abstract ideas.

As per dependent claim 9, the claim recites the limitation of: “wherein the processor is communicatively connected to the memory to access the raw data and preprocesses the raw data to generate the raw data to be clustered.” A human can observe and mentally make judgments about the observed data and organize it based on the result of the judgments into groups. There is nothing so complex in the limitation that could not be done in the human mind.

As per dependent claim 10, the claim recites the limitation of: “wherein the raw data comprises at least a numerical feature or a character feature;” The at least one numerical feature or character feature is merely an element used to implement abstract ideas.
“wherein the processor determines whether the raw data comprises the character feature when the processor preprocesses the raw data;” A human observes data and makes a judgment to define whether a certain character is in the observed data or not. There is nothing so complex in the limitation that could not be done in the human mind.

“wherein when the raw data comprises the character feature, the processor converts the character feature in the raw data to a transformation numerical feature to generate the raw data to be clustered;” A human can observe data that has a particular character and mentally transform the particular character into a numerical feature to organize the observed data. For example, a driver mentally assigns numerical values (e.g., Green=1, Red=3) to the observed text or color of a traffic signal to quickly organize their response and make a decision. There is nothing so complex in the limitation that could not be done in the human mind.

“wherein when the raw data excludes the character feature, the processor utilizes the raw data as the raw data to be clustered.” Utilizing the raw data as the raw data to be clustered when the raw data excludes the character feature is merely an instruction used to implement abstract ideas.

As per dependent claim 11, the claim recites the limitation of: “wherein the raw data comprises at least a numerical feature or a character feature;” The at least one numerical feature or character feature is merely an element used to implement abstract ideas. “wherein when the processor preprocesses the raw data, determining whether the raw data comprises the character feature;” A human observes data and makes a judgment to define whether a certain character is in the observed data or not. There is nothing so complex in the limitation that could not be done in the human mind.
“wherein when the raw data comprises the character feature, the processor converts the character feature in the raw data to a transformation numerical feature and standardizes the transformation numerical feature to generate the raw data to be clustered;” A human can observe data that has a particular character and mentally transform the particular character into a numerical feature to organize the observed data; the human can follow a predefined standard to do this transformation. A simple example is a teacher using a predefined answer key to mentally transform letter grades (A, B, C, D) into numerical points (1 or 0) to organize and score a test. There is nothing so complex in the limitation that could not be done in the human mind.

“wherein when the raw data excludes the character feature, the processor standardizes the numerical feature of the raw data to generate the raw data to be clustered.” Standardizing the numerical feature of the raw data when the raw data excludes the character feature is merely an instruction used to implement abstract ideas.
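The preprocessing branches analyzed above for claims 3-4 and 10-11 (convert a character feature to a transformation numerical feature, then optionally standardize it) can be sketched as follows. The encoding scheme and the function names are assumptions chosen for illustration, not taken from the specification.

```python
# Sketch of the claimed preprocessing branches: character features are
# mapped to numeric codes, and numerical features are standardized to
# zero mean and unit variance. The encoding choice is an assumption.
from statistics import mean, pstdev

def encode_feature(values):
    """Convert a character feature to a transformation numerical feature
    by assigning consecutive codes to the distinct values; numerical
    features pass through unchanged."""
    if all(isinstance(v, (int, float)) for v in values):
        return [float(v) for v in values]
    codes = {v: i for i, v in enumerate(sorted(set(values)))}
    return [float(codes[v]) for v in values]

def standardize(values):
    """Standardize a numerical feature to zero mean and unit variance."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]

encoded = encode_feature(["red", "green", "green", "blue"])  # blue=0, green=1, red=2
scaled = standardize(encoded)
```

Either branch ends with purely numerical features, which is what the down-sampling and clustering steps operate on.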
As per dependent claim 12, the claim recites the limitation of: “wherein when the processor organizes the cluster label models of the sub-data cluster results and generates the organized sub-data cluster results according to the organized cluster label models, the processor utilizes a decision tree classifier to overfit a model prediction for the cluster label models of the sub-data cluster results according to a raw data cluster label model of the raw data cluster results to generate the organized sub-data cluster results.” A human can observe data that is grouped, apply predefined criteria, and refine it into subgroups of data to organize the grouped data into smaller groups. The decision tree classifier, model prediction, and raw data cluster label model are merely elements used to implement abstract ideas. A simple example is a person organizing a large folder of "Photos" into progressively smaller subgroups using predefined criteria such as year and event name (e.g., creating the path Photos/2025/Vacation_Mexico). There is nothing so complex in the limitation that could not be done in the human mind.

As per dependent claim 13, the claim recites the limitation of: “wherein when the processor calculates the cluster stability indicator according to the organized sub-data cluster results, the processor calculates a plurality of cluster probabilities of a plurality of cluster labels of a plurality of data points in the raw data to be clustered according to the organized sub-data cluster results and averages a plurality of highest cluster probabilities of each of the plurality of data points in the raw data to be clustered to generate the cluster stability indicator.” A human can observe data and mentally calculate probabilities related to the data, define data limits based on the resulting probability, and then organize it into groups.
A gambler mentally calculates the observed frequency of dice rolls, defining probability-based limits to organize outcomes into groups like "very low" (2, 3, 4) and "average" (5-9) to guide their betting strategy. A human can also mentally calculate an average of observed data and define lower and higher averages. There is nothing so complex in the limitation that could not be done in the human mind.

As per dependent claim 14, the claim recites the limitation of: “wherein when the processor clusters the plurality of groups of the sub-data to be analyzed according to the cluster algorithm to generate a plurality of the sub-data cluster results,” A human can observe a collection of data and make judgments according to criteria to generate a sub-collection of the observed data. For example, imagine a photographer has just finished a wedding shoot and needs to select the best photos to edit and deliver to the client. The raw data is a collection of 1,500 unedited digital images, and the selected best photos are the result of the judgments of the raw data, thereby creating the sub-collection of the observed data. The cluster algorithm is merely an element used to implement the abstract idea herein. There is nothing so complex in the limitation that could not be done in the human mind.

“the processor clusters the raw data to be clustered to generate a raw data cluster result according to the cluster algorithm;” A human can observe data, make judgments, and based on the judgments, organize data into groups. The cluster algorithm is merely an element used to implement the abstract idea herein. There is nothing so complex in the limitation that could not be done in the human mind.
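On one assumed reading of claim 13 (a sketch, not the applicant's code), the indicator discussed above averages the highest cluster probability of each data point across the organized sub-data cluster results:

```python
# Sketch of an assumed reading of claim 13: per data point, the cluster
# probability of each label is its vote share across the organized
# sub-data cluster results; the indicator averages the per-point maxima.
from collections import Counter

def stability_indicator(runs):
    """runs: one label list per organized sub-data cluster result, all
    indexed by the same data points (None = point absent from that run)."""
    n_points = len(runs[0])
    tops = []
    for i in range(n_points):
        votes = Counter(r[i] for r in runs if r[i] is not None)
        total = sum(votes.values())
        tops.append(max(votes.values()) / total)  # highest cluster probability
    return sum(tops) / n_points                   # cluster stability indicator

runs = [[0, 0, 1, 1],
        [0, 0, 1, 0],
        [0, 0, 1, 1]]
score = stability_indicator(runs)  # points 0-2 fully stable; point 3 at 2/3
```

The higher this value, the more consistently points keep their labels across sub-samples, which tracks the amendment's statement that a higher indicator corresponds to a more stable final cluster result.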
“wherein the processor generates a final cluster result according to the raw data cluster results and the organized sub-data cluster results;” A human can observe data, make judgments, and based on the judgments, organize data into groups; the organized data can be a result or final result judged mentally by the human. A simple example of this is a chef sorting ingredients: they observe various raw foods, judge them mentally (e.g., "this goes in the fridge, this is a vegetable"), and organize them into final groups like the "refrigerator" pile and the "pantry" pile. There is nothing so complex in the limitation that could not be done in the human mind.

“wherein the final cluster result corresponds to the cluster stability indicator.” The final cluster result corresponding to the cluster stability indicator is merely an instruction used to implement the abstract ideas. Accordingly, claims 1-14 recite at least one abstract idea.

Step 2A, Prong II: Integrated into a Practical Application? The claims recite the following additional limitations/elements:

As per independent claims 1 and 8, the claims similarly recite the limitation of: “a cluster stability” The cluster stability is merely an element used in the implementation of abstract ideas. “wherein the higher the cluster stability indicator is, the higher final cluster result is.” The cluster stability indicator is merely an element used in the implementation of abstract ideas.

As per dependent claim 2, the claim recites the limitation of: “receiving a raw data;” This limitation is an example of adding insignificant extra-solution activity to the judicial exception (see MPEP § 2106.05(g)). Specifically, the additional limitation exemplifies mere data gathering, without any further processing or analysis.
As per independent claim 8, the claim recites the limitation of: “a processor.” This element is an example of mere instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (see MPEP § 2106.05(f)). Specifically, the additional elements of the limitations invoke computers or other machinery merely as a tool to perform an existing process. Use of a computer or other machinery in its ordinary capacity for economic or other tasks (e.g., to receive, store, or transmit data), or simply adding a general-purpose computer or computer components after the fact to an abstract idea (e.g., a fundamental economic practice or mathematical equation), does not provide improvements to the functioning of a computer or to any other technology or technical field, and does not integrate a judicial exception into a practical application.

As per dependent claim 2, the claim recites the limitations of: “storing a raw data;” This limitation is an example of adding insignificant extra-solution activity to the judicial exception (see MPEP § 2106.05(g)). Specifically, the additional limitation exemplifies mere data gathering, without any further processing or analysis. “a data storage device” This element is an example of mere instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (see MPEP § 2106.05(f)). Specifically, the additional elements of the limitations invoke computers or other machinery merely as a tool to perform an existing process.
Use of a computer or other machinery in its ordinary capacity for economic or other tasks (e.g., to receive, store, or transmit data), or simply adding a general-purpose computer or computer components after the fact to an abstract idea (e.g., a fundamental economic practice or mathematical equation), does not provide improvements to the functioning of a computer or to any other technology or technical field, and does not integrate a judicial exception into a practical application.

“wherein the processor is communicatively connected to the data storage device to access the raw data and preprocesses the raw data to generate the raw data to be clustered.” Considering the “communicatively connected to the data storage device to access the raw data” part of this limitation, this is a simple example of transmitting data. This limitation is an example of adding insignificant extra-solution activity to the judicial exception (see MPEP § 2106.05(g)). Specifically, the additional limitation exemplifies merely transmitting data, without any further processing or analysis. Therefore, claims 1-14 do not integrate the recited abstract ideas into a practical application.

Step 2B: Claim provides an Inventive Concept? With respect to the limitations identified as insignificant extra-solution activity above, the conclusions are carried over: the “receiving ….; store …; and communicatively connected to the data storage device to access the raw data” limitations are well-understood, routine, and conventional operations. For support that these operations are well-understood, routine, and conventional as noted by the courts, see MPEP 2106.05(d)(II) “i.
Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information); … buySAFE, Inc. v. Google, Inc., 765 F.3d 1350, 1355, 112 USPQ2d 1093, 1096 (Fed. Cir. 2014) (computer receives and sends information over a network);” and/or MPEP 2106.05(d)(II) “iv. Storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93;”, and/or MPEP 2106.05(d)(II) “iii. Ultramercial, 772 F.3d at 716, 112 USPQ2d at 1755 (updating an activity log);”. Considering the limitations in combination and the claim as a whole does not change this conclusion, and the claims are ineligible. Therefore, claims 1-14 are not patent eligible. Claim Rejections - 35 USC § 103 6. In the event the determination of the status of the application as subject to AIA 35 U.S.C. § 102 and § 103 (or as subject to pre-AIA 35 U.S.C. § 102 and § 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made. 
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under pre-AIA 35 U.S.C. § 103(a) are summarized as follows: 1. Determining the scope and contents of the prior art. 2. Ascertaining the differences between the prior art and the claims at issue. 3. Resolving the level of ordinary skill in the pertinent art. 4. Considering objective evidence present in the application indicating obviousness or nonobviousness. 7. Claims 1 and 8 are rejected under 35 U.S.C. § 103 as being unpatentable over Ben-Hur et al. (US 20080140592 A1) in view of Hummel et al. (US 20160217201 A1). As per claim 1, Ben-Hur teaches an indicator evaluation method of a cluster stability (i.e. “The inventive method, on the other hand, provides a description of the stability of a clustering method”; para. [0080]), comprising the following steps: uniformly down-sampling a raw data to be clustered to generate a plurality of groups of sub-data thereof (i.e. “For each level of granularity, the algorithm chooses a number of pairs of sub-samples or other perturbations, clusters each sub-sample with the chosen level of granularity, then computes a similarity between pairs of clustering solutions.”; para. [0010]. Further, i.e. “the algorithm is applied in a hierarchical manner by sub-sampling the data and clustering the sub-samples”; para. [0012]. Further, i.e. “The results of a test run on data uniformly distributed on the unit cube”; figs. 3a, 3b, para. [0059]; Examiner note: the uniformly down-sampling is interpreted as the uniformly distributed); calculating a plurality of similarities of the raw data to be clustered with the plurality of groups of sub-data according to at least one statistical test (i.e. 
“the algorithm chooses a number of pairs of sub-samples or other perturbations, clusters each sub-sample with the chosen level of granularity, then computes a similarity between pairs of clustering solutions.”; para. [0010]; Examiner note: the statistical test is interpreted as the algorithm); clustering the plurality of groups of the sub-data to be analyzed according to a cluster algorithm to generate a plurality of sub-data cluster results (i.e. “producing a set of clusterings of sub-samples of the data, and comparing the labels of the intersection of pairs of clusterings… by the clustering algorithm with that particular value of k, many sub-samples will produce similar clusterings”; para. [0050]; Examiner note: the clustering the plurality of groups of the sub-data is interpreted as the producing a set of clusterings of sub-samples of the data. The to be analyzed is interpreted as the comparing the labels); calculating a cluster stability indicator according to the organized sub-data cluster results (i.e. “characterize the stability of the clustering for each k. This is accomplished by producing a set of clusterings of sub-samples of the data, and comparing the labels of the intersection of pairs of clusterings using, for example, the correlation similarity measure.”; para. [0050]; Examiner note: the calculating a cluster stability indicator is interpreted as the characterizing of the stability of the clustering) wherein the higher the cluster stability indicator is, the higher final cluster result is (i.e. “High pairwise similarities show that the clustering represents a stable pattern in the data”; Abstract. Further, i.e. “k-means clustering according to the present invention produces solutions that are highly stable with respect to sub-sampling. This good result may be due to the global optimization criterion”; para. [0064]; Examiner note: the higher the cluster stability is interpreted as the stable pattern in the data. 
The higher final cluster result is interpreted as the high pairwise similarities in clustering). However, it is noted that the prior art of Ben-Hur does not explicitly teach “keeping the plurality of groups of sub-data with the plurality of similarities greater than a similarity threshold as a plurality of groups of the sub-data to be analyzed; organizing a plurality of cluster label models of the sub-data cluster results and generating a plurality of organized sub-data cluster results according to the organized cluster label models;” On the other hand, in the same field of endeavor, Hummel teaches keeping the plurality of groups of sub-data with the plurality of similarities greater than a similarity threshold as a plurality of groups of the sub-data to be analyzed (i.e. “the true label list determined by manual inspection is determined incrementally until the comparing produces a statistical confidence above a confidence threshold and a statistical power above a power threshold.”; fig. 4, para. [0016], [0070]; Examiner note: the plurality of similarities greater than a similarity threshold is interpreted as the comparing produces a statistical confidence above a confidence threshold); organizing a plurality of cluster label models of the sub-data cluster results (i.e. “Each list of top-k cluster labels 214, 215, and 216 is also re-ranked by the Label Ranking Module 104, according to a decisiveness value for each label as at 217, 218, and 219, respectively, producing a re-ranked list of labels for each algorithm as at 220. The top-k labels are the first k labels in each label list.”; fig. 4, para. [0030], [0049], [0055]-[0057], [0059]; Examiner note: the organizing a plurality of cluster label models is interpreted as the top-k cluster labels 214, 215, and 216 is also re-ranked) and generating a plurality of organized sub-data cluster results according to the organized cluster label models (i.e. 
“The re-ranked label lists are fused 221 by a Label Fusion Module 105 given a fusion algorithm 223, such as CombMNZ, CombSUM, CombMAX and the like, producing a fused label list 222.”; fig. 4, para. [0049]; Examiner note: the generating a plurality of organized sub-data cluster results is interpreted as the producing a fused label list 222); Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Hummel that teaches fusion of multiple labeling algorithms on a cluster of documents into Ben-Hur that teaches a method and system for unsupervised learning for determining an optimal number of data clusters into which data can be divided to best enable identification of relevant patterns. Additionally, this can improve performance of deep learning models by enforcing pre-training. The motivation for doing so would be to combine several cluster labeling algorithms for improving cluster labeling result because it can make the cluster process more robust and reliable (Hummel, para. [0059]). As per claim 8, Ben-Hur teaches an indicator evaluation system of a cluster stability (i.e. “a system”; para. [0009]), comprising: a processor (i.e. “a processor”; para. [0009]), uniformly down-sampling a raw data to be clustered to generate a plurality of groups of sub-data thereof (i.e. “For each level of granularity, the algorithm chooses a number of pairs of sub-samples or other perturbations, clusters each sub-sample with the chosen level of granularity, then computes a similarity between pairs of clustering solutions.”; para. [0010]. Further, i.e. “the algorithm is applied in a hierarchical manner by sub-sampling the data and clustering the sub-samples”; para. [0012]. Further, i.e. “The results of a test run on data uniformly distributed on the unit cube”; figs. 3a, 3b, para. 
[0059]; Examiner note: the uniformly down-sampling is interpreted as the uniformly distributed); wherein the processor calculates a plurality of similarities of the raw data to be clustered with the plurality of groups of sub-data according to at least one statistical test (i.e. “the algorithm chooses a number of pairs of sub-samples or other perturbations, clusters each sub-sample with the chosen level of granularity, then computes a similarity between pairs of clustering solutions.”; para. [0010]; Examiner note: the statistical test is interpreted as the algorithm); clusters the plurality of groups of the sub-data to be analyzed according to a cluster algorithm to generate a plurality of the sub-data cluster results (i.e. “producing a set of clusterings of sub-samples of the data, and comparing the labels of the intersection of pairs of clusterings… by the clustering algorithm with that particular value of k, many sub-samples will produce similar clusterings”; para. [0050]; Examiner note: the clustering the plurality of groups of the sub-data is interpreted as the producing a set of clusterings of sub-samples of the data. The to be analyzed is interpreted as the comparing the labels); calculates a cluster stability indicator according to the organized sub-data cluster results (i.e. “characterize the stability of the clustering for each k. This is accomplished by producing a set of clusterings of sub-samples of the data, and comparing the labels of the intersection of pairs of clusterings using, for example, the correlation similarity measure.”; para. [0050]; Examiner note: the calculating a cluster stability indicator is interpreted as the characterizing of the stability of the clustering); wherein the higher the cluster stability indicator is, the higher final cluster result is (i.e. “High pairwise similarities show that the clustering represents a stable pattern in the data”; Abstract. Further, i.e. 
“k-means clustering according to the present invention produces solutions that are highly stable with respect to sub-sampling. This good result may be due to the global optimization criterion”; para. [0064]; Examiner note: the higher the cluster stability is interpreted as the stable pattern in the data. The higher final cluster result is interpreted as the high pairwise similarities in clustering). However, it is noted that the prior art of Ben-Hur does not explicitly teach “wherein the processor keeps the plurality of groups of sub-data with the plurality of similarities greater than a similarity threshold as a plurality of groups of the sub-data to be analyzed; wherein the processor organizes a plurality of cluster label models of the sub-data cluster results, generates a plurality of organized sub-data cluster results according to the organized cluster label models;” On the other hand, in the same field of endeavor, Hummel teaches wherein the processor keeps the plurality of groups of sub-data with the plurality of similarities greater than a similarity threshold as a plurality of groups of the sub-data to be analyzed (i.e. “the true label list determined by manual inspection is determined incrementally until the comparing produces a statistical confidence above a confidence threshold and a statistical power above a power threshold.”; fig. 4, para. [0016], [0070]; Examiner note: the plurality of similarities greater than a similarity threshold is interpreted as the comparing produces a statistical confidence above a confidence threshold); wherein the processor organizes a plurality of cluster label models of the sub-data cluster results (i.e. “Each list of top-k cluster labels 214, 215, and 216 is also re-ranked by the Label Ranking Module 104, according to a decisiveness value for each label as at 217, 218, and 219, respectively, producing a re-ranked list of labels for each algorithm as at 220. The top-k labels are the first k labels in each label list.”; fig. 
4, para. [0030], [0049], [0055]-[0057], [0059]; Examiner note: the organizing a plurality of cluster label models is interpreted as the top-k cluster labels 214, 215, and 216 is also re-ranked), generates a plurality of organized sub-data cluster results according to the organized cluster label models (i.e. “The re-ranked label lists are fused 221 by a Label Fusion Module 105 given a fusion algorithm 223, such as CombMNZ, CombSUM, CombMAX and the like, producing a fused label list 222.”; fig. 4, para. [0049]; Examiner note: the generating a plurality of organized sub-data cluster results is interpreted as the producing a fused label list 222); Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Hummel that teaches fusion of multiple labeling algorithms on a cluster of documents into Ben-Hur that teaches a method and system for unsupervised learning for determining an optimal number of data clusters into which data can be divided to best enable identification of relevant patterns. Additionally, this can improve performance of deep learning models by enforcing pre-training. The motivation for doing so would be to combine several cluster labeling algorithms for improving cluster labeling result because it can make the cluster process more robust and reliable (Hummel, para. [0059]). 8. Claims 2, 6-7, 9 and 13-14 are rejected under 35 U.S.C. § 103 as being unpatentable over Ben-Hur et al. (US 20080140592 A1) in view of Hummel et al. (US 20160217201 A1) in further view of Yen et al. (US 20240338438 A1). As per claim 2, Ben-Hur and Hummel teach all the limitations as discussed in claim 1 above. 
However, it is noted that the prior art of Ben-Hur and Hummel do not explicitly teach “wherein before the step for uniformly down-sampling the raw data to be clustered to generate the plurality of groups of the sub-data, further comprising the following steps: receiving a raw data; and preprocessing the raw data to generate the raw data to be clustered.” On the other hand, in the same field of endeavor, Yen teaches wherein before the step for uniformly down-sampling the raw data to be clustered to generate the plurality of groups of the sub-data, further comprising the following steps: receiving a raw data (i.e. “receive unstructured, raw packet data (e.g., data collected from a network telescope)”; fig. 6, para. [0035]. Further, i.e. “Darknet data (i.e., Darknet event data) associated with scanning activities of multiple scanners can be received.”; fig. 6, para. [0076]; Examiner note: the raw data is interpreted as the Darknet data); and preprocessing the raw data to generate the raw data to be clustered (i.e. “At step 606, data may then be pre-processed, such as to group scanning events by scanner, to combine scanner data with additional data (e.g., DNS and geolocation), or to filter the events to include only top or most relevant scanners.”; fig. 6, para. [0077]; Examiner note: the to generate the raw data to be clustered is interpreted as the to filter the events to include only top or most relevant scanners where top or most relevant scanners are the raw data to be clustered). 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Yen that teaches a near-real-time approach for characterizing Internet Background Radiation to detect and characterize network scanner activity into the combination of the prior arts of Ben-Hur that teaches a method and system for unsupervised learning for determining an optimal number of data clusters into which data can be divided to best enable identification of relevant patterns, and Hummel that teaches fusion of multiple labeling algorithms on a cluster of documents. Additionally, this can improve performance of deep learning models by enforcing pre-training. The motivation for doing so would be that, by breaking deep networks into manageable shallow pieces, greedy layer-wise pre-training reduces the vanishing gradient problem and provides better initialization (Yen, para. [0060]). As per claim 6, Ben-Hur and Hummel teach all the limitations as discussed in claim 1 above. However, it is noted that the prior art of Ben-Hur and Hummel do not explicitly teach “wherein the step for calculating the cluster stability indicator according to the organized sub-data cluster results comprises the following sub-steps: calculating a plurality of cluster probabilities of a plurality of cluster labels of a plurality of data points in the raw data to be clustered according to the organized sub-data cluster results; and averaging a plurality of highest cluster probabilities of each of the plurality of data points in the raw data to be clustered to generate the cluster stability indicator.” On the other hand, in the same field of endeavor, Yen teaches wherein the step for calculating the cluster stability indicator according to the organized sub-data cluster results (i.e. “The cluster stability score is, then, the average of the pairwise distances between the clustering outcomes of two different subsamples.”; para. 
[0089]; Examiner note: the organized sub-data cluster results are interpreted as the clustering outcomes) comprises the following sub-steps: calculating a plurality of cluster probabilities of a plurality of cluster labels of a plurality of data points in the raw data to be clustered according to the organized sub-data cluster results (i.e. “a first probability density function capturing the first clustering assignment matrix can be generated, and a second probability density function capturing the second clustering assignment matrix can be generated.”; fig. 6, para. [0082]); and averaging a plurality of highest cluster probabilities of each of the plurality of data points in the raw data to be clustered to generate the cluster stability indicator (i.e. “The cluster stability score is, then, the average of the pairwise distances between the clustering outcomes of two different subsamples.”; para. [0089]; Examiner note: the cluster stability indicator is interpreted as the cluster stability score). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Yen that teaches a near-real-time approach for characterizing Internet Background Radiation to detect and characterize network scanner activity into the combination of the prior arts of Ben-Hur that teaches a method and system for unsupervised learning for determining an optimal number of data clusters into which data can be divided to best enable identification of relevant patterns, and Hummel that teaches fusion of multiple labeling algorithms on a cluster of documents. Additionally, this can improve performance of deep learning models by enforcing pre-training. The motivation for doing so would be that, by breaking deep networks into manageable shallow pieces, greedy layer-wise pre-training reduces the vanishing gradient problem and provides better initialization (Yen, para. [0060]). 
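The sub-steps recited in claim 6 (and mirrored in claim 13) describe a concrete computation: for each data point, estimate the probability of each cluster label from its frequency across the organized sub-data cluster results, then average each point's highest probability. The following is a minimal sketch of one plausible reading of that claim language in plain Python; the function name is illustrative, and the assumption that labels are already aligned across runs (the "organized" step) is the editor's simplification, not taken from the application or the cited references:

```python
from collections import Counter

def stability_indicator(runs):
    """Average, over all data points, of each point's highest cluster
    probability, where a label's probability is its relative frequency
    across the organized sub-data cluster results.

    runs: list of equal-length label sequences, one per organized
    sub-data cluster result, with labels assumed already organized so
    that the same label names the same cluster in every run.
    """
    n_points = len(runs[0])
    tops = []
    for i in range(n_points):
        votes = Counter(run[i] for run in runs)       # label frequencies for point i
        tops.append(max(votes.values()) / len(runs))  # highest cluster probability
    return sum(tops) / n_points
```

For example, three organized runs [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1]] give per-point maxima of 1, 2/3, 1, and 1, so the indicator is 11/12; a higher value means the sub-data cluster results agree more, matching the "higher indicator, more stable" relationship recited in the claims.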
As per claim 7, Ben-Hur and Hummel teach all the limitations as discussed in claim 1 above. However, it is noted that the prior art of Ben-Hur does not explicitly teach “wherein the step for clustering the plurality of groups of the sub-data to be analyzed according to a cluster algorithm to generate a plurality of the sub-data cluster results further comprises the following sub-step: clustering the raw data to be clustered according to the cluster algorithm to generate a raw data cluster result; wherein the indicator evaluation method of the cluster stability further comprises the following sub-step: generating a final cluster result according to the raw data cluster results and the organized sub-data cluster results; wherein the final cluster result corresponds to the cluster stability indicator.” On the other hand, in the same field of endeavor, Hummel teaches wherein the indicator evaluation method of the cluster stability further comprises the following sub-step: generating a final cluster result according to the raw data cluster results and the organized sub-data cluster results (i.e. “The re-ranked label lists are fused 221 by a Label Fusion Module 105 given a fusion algorithm 223, such as CombMNZ, CombSUM, CombMAX and the like, producing a fused label list 222”; para. [0049]; Examiner note: the final cluster result is interpreted as fused label list 222); wherein the final cluster result corresponds to the cluster stability indicator (i.e. “the labels from all lists are combined and ranked according to the decisiveness value.”; para. [0049]; Examiner note: the cluster stability indicator is interpreted as the decisiveness value). 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Hummel that teaches fusion of multiple labeling algorithms on a cluster of documents into Ben-Hur that teaches a method and system for unsupervised learning for determining an optimal number of data clusters into which data can be divided to best enable identification of relevant patterns. Additionally, this can improve performance of deep learning models by enforcing pre-training. The motivation for doing so would be to combine several cluster labeling algorithms for improving cluster labeling result because it can make the cluster process more robust and reliable (Hummel, para. [0059]). However, it is noted that the prior art of Ben-Hur and Hummel do not explicitly teach “wherein the step for clustering the plurality of groups of the sub-data to be analyzed according to a cluster algorithm to generate a plurality of the sub-data cluster results further comprises the following sub-step: clustering the raw data to be clustered according to the cluster algorithm to generate a raw data cluster result;” On the other hand, in the same field of endeavor, Yen teaches wherein the step for clustering the plurality of groups of the sub-data to be analyzed according to a cluster algorithm to generate a plurality of the sub-data cluster results (i.e. “In order to analyze the stability of the clusters, multiple subsampling versions of the data can be generated by using bootstrap resampling. These samples are clustered individually using the same clustering algorithm.”; para. [0089]; Examiner note: the clustering the plurality of groups of the sub-data to be analyzed is interpreted as the order to analyze the stability of the clusters, multiple subsampling versions of the data can be generated. 
The cluster algorithm to generate a plurality of sub-data cluster results is interpreted as the samples are clustered individually using the same clustering algorithm) further comprises the following sub-step: clustering the raw data to be clustered according to the cluster algorithm to generate a raw data cluster result (i.e. “A clustering result that is not sensitive to sub-sampling, hence more stable, is certainly more desirable. In other words, the cluster structure uncovered by the clustering algorithm should be similar across different samples from the same data distribution.”; para. [0089]); Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Yen that teaches a near-real-time approach for characterizing Internet Background Radiation to detect and characterize network scanner activity into the combination of the prior arts of Ben-Hur that teaches a method and system for unsupervised learning for determining an optimal number of data clusters into which data can be divided to best enable identification of relevant patterns, and Hummel that teaches fusion of multiple labeling algorithms on a cluster of documents. Additionally, this can improve performance of deep learning models by enforcing pre-training. The motivation for doing so would be that, by breaking deep networks into manageable shallow pieces, greedy layer-wise pre-training reduces the vanishing gradient problem and provides better initialization (Yen, para. [0060]). As per claim 9, Ben-Hur and Hummel teach all the limitations as discussed in claim 8 above. 
However, it is noted that the prior art of Ben-Hur and Hummel do not explicitly teach “further comprising: a memory, storing a raw data; wherein the processor is communicatively connected to the memory to access the raw data and preprocesses the raw data to generate the raw data to be clustered.” On the other hand, in the same field of endeavor, Yen teaches further comprising: a memory, storing a raw data (i.e. “The memory 120 stores machine-readable instructions which, when executed by the processing circuitry”; fig. 1, para. [0038]); wherein the processor is communicatively connected to the memory (i.e. “memory 120 coupled to the processing circuitry”; fig. 1, para. [0038]) to access the raw data (i.e. “Darknet event data is collected.”; para. [0076]) and preprocesses the raw data to generate the raw data to be clustered (i.e. “At step 606, data may then be pre-processed, such as to group scanning events by scanner, to combine scanner data with additional data”; fig. 6, para. [0077]). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Yen that teaches a near-real-time approach for characterizing Internet Background Radiation to detect and characterize network scanner activity into the combination of the prior arts of Ben-Hur that teaches a method and system for unsupervised learning for determining an optimal number of data clusters into which data can be divided to best enable identification of relevant patterns, and Hummel that teaches fusion of multiple labeling algorithms on a cluster of documents. Additionally, this can improve performance of deep learning models by enforcing pre-training. The motivation for doing so would be that, by breaking deep networks into manageable shallow pieces, greedy layer-wise pre-training reduces the vanishing gradient problem and provides better initialization (Yen, para. [0060]). 
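Ben-Hur (para. [0010]) and the Yen passage quoted for claims 7 and 14 (para. [0089]) describe the same underlying check: cluster repeated subsamples of the data with the same algorithm and measure how much the clustering outcomes agree. Below is a toy sketch of that idea in plain Python, assuming a minimal 1-D k-means and a Rand-style pairwise agreement computed on the intersection of two subsamples; all names, the agreement measure, and the 1-D simplification are the editor's illustrative assumptions, not the method of the application or of either reference:

```python
import random

def kmeans(points, k, iters=25, seed=0):
    """Minimal 1-D k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(p - centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels

def pairwise_agreement(common, la, lb):
    """Rand-style agreement: fraction of point pairs on which two
    clusterings agree about same-cluster vs. different-cluster
    (insensitive to label permutation between the two runs)."""
    agree = total = 0
    for x in range(len(common)):
        for y in range(x + 1, len(common)):
            i, j = common[x], common[y]
            agree += (la[i] == la[j]) == (lb[i] == lb[j])
            total += 1
    return agree / total if total else 1.0

def stability(points, k, n_pairs=10, frac=0.8, seed=0):
    """Average agreement between clusterings of pairs of uniform
    subsamples, compared on the subsamples' intersection."""
    rng = random.Random(seed)
    idx = list(range(len(points)))
    sims = []
    for t in range(n_pairs):
        sub_a = rng.sample(idx, int(frac * len(idx)))
        sub_b = rng.sample(idx, int(frac * len(idx)))
        la = dict(zip(sub_a, kmeans([points[i] for i in sub_a], k, seed=2 * t)))
        lb = dict(zip(sub_b, kmeans([points[i] for i in sub_b], k, seed=2 * t + 1)))
        common = [i for i in sub_a if i in lb]
        sims.append(pairwise_agreement(common, la, lb))
    return sum(sims) / len(sims)
```

On two well-separated groups of points with k matching the true structure, every subsample clusters the same way (up to label permutation, which the co-membership comparison ignores), so the score approaches 1; with a poorly chosen k the outcomes vary across subsamples and the score drops, which is the "higher indicator, more stable" relationship the claims and both references describe.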
As per claim 13, Ben-Hur and Hummel teach all the limitations as discussed in claim 8 above. However, it is noted that the prior art of Ben-Hur and Hummel do not explicitly teach “wherein when the processor calculates the cluster stability indicator according to the organized sub-data cluster results, the processor calculates a plurality of cluster probabilities of a plurality of cluster labels of a plurality of data points in the raw data to be clustered according to the organized sub-data cluster results and averages a plurality of highest cluster probabilities of each of the plurality of data points in the raw data to be clustered to generate the cluster stability indicator.” On the other hand, in the same field of endeavor, Yen teaches wherein when the processor calculates the cluster stability indicator according to the organized sub-data cluster results (i.e. “The cluster stability score is, then, the average of the pairwise distances between the clustering outcomes of two different subsamples.”; para. [0089]; Examiner note: the organized sub-data cluster results are interpreted as the clustering outcomes), the processor calculates a plurality of cluster probabilities of a plurality of cluster labels of a plurality of data points in the raw data to be clustered according to the organized sub-data cluster results (i.e. “a first probability density function capturing the first clustering assignment matrix can be generated, and a second probability density function capturing the second clustering assignment matrix can be generated.”; fig. 6, para. [0082]) and averages a plurality of highest cluster probabilities of each of the plurality of data points in the raw data to be clustered to generate the cluster stability indicator (i.e. “The cluster stability score is, then, the average of the pairwise distances between the clustering outcomes of two different subsamples.”; para. [0089]; Examiner Note: the cluster stability indicator is the cluster stability score). 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Yen that teaches a near-real-time approach for characterizing Internet Background Radiation to detect and characterize network scanner activity into the combination of the prior arts of Ben-Hur that teaches a method and system for unsupervised learning for determining an optimal number of data clusters into which data can be divided to best enable identification of relevant patterns, and Hummel that teaches fusion of multiple labeling algorithms on a cluster of documents. Additionally, this can improve performance of deep learning models by enforcing pre-training. The motivation for doing so would be that, by breaking deep networks into manageable shallow pieces, greedy layer-wise pre-training reduces the vanishing gradient problem and provides better initialization (Yen, para. [0060]). As per claim 14, Ben-Hur and Hummel teach all the limitations as discussed in claim 8 above. However, it is noted that the prior art of Ben-Hur does not explicitly teach “wherein the processor generates a final cluster result according to the raw data cluster results and the organized sub-data cluster results; wherein the final cluster result corresponds to the cluster stability indicator.” On the other hand, in the same field of endeavor, Hummel teaches wherein the processor generates a final cluster result according to the raw data cluster results and the organized sub-data cluster results (i.e. “The re-ranked label lists are fused 221 by a Label Fusion Module 105 given a fusion algorithm 223, such as CombMNZ, CombSUM, CombMAX and the like, producing a fused label list 222”; para. [0049]; Examiner note: the final cluster result is interpreted as fused label list 222); wherein the final cluster result corresponds to the cluster stability indicator (i.e. 
“the labels from all lists are combined and ranked according to the decisiveness value.”; para. [0049]; Examiner note: the cluster stability indicator is interpreted as the decisiveness value). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Hummel that teaches fusion of multiple labeling algorithms on a cluster of documents into Ben-Hur that teaches a method and system for unsupervised learning for determining an optimal number of data clusters into which data can be divided to best enable identification of relevant patterns. Additionally, this can improve performance of deep learning models by enforcing pre-training. The motivation for doing so would be to combine several cluster labeling algorithms for improving cluster labeling result because it can make the cluster process more robust and reliable (Hummel, para. [0059]). However, it is noted that the prior art of Ben-Hur and Hummel do not explicitly teach “wherein when the processor clusters the plurality of groups of the sub-data to be analyzed according to the cluster algorithm to generate a plurality of the sub-data cluster results, the processor clusters the raw data to be clustered to generate a raw data cluster result according to the cluster algorithm;” On the other hand, in the same field of endeavor, Yen teaches wherein when the processor clusters the plurality of groups of the sub-data to be analyzed according to the cluster algorithm to generate a plurality of the sub-data cluster results (i.e. “In order to analyze the stability of the clusters, multiple subsampling versions of the data can be generated by using bootstrap resampling. These samples are clustered individually using the same clustering algorithm.”; para. 
[0089]; Examiner note: the clusters the plurality of groups of the sub-data to be analyzed is interpreted as the order to analyze the stability of the clusters, multiple subsampling versions of the data can be generated. The cluster algorithm to generate a plurality of sub-data cluster results is interpreted as the samples are clustered individually using the same clustering algorithm), the processor clusters the raw data to be clustered to generate a raw data cluster result according to the cluster algorithm (i.e. “A clustering result that is not sensitive to sub-sampling, hence more stable, is certainly more desirable. In other words, the cluster structure uncovered by the clustering algorithm should be similar across different samples from the same data distribution.”; para. [0089]); Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Yen that teaches near-real-time approach for characterizing Internet Background Radiation to detect and characterize network scanner activity into the combination of the prior arts of Ben-Hur that teaches a method and system for unsupervised learning for determining an optimal number of data clusters into which data can be divided to best enable identification of relevant patterns, and Hummel that teaches fusion of multiple labeling algorithms on a cluster of documents. Additionally, this can improve performance of deep learning models by enforcing pre-training. The motivation for doing so would be to break deep networks into manageable shallow pieces, greedy layer-wise pre-training reduces the vanishing gradient problem and provides better initialization (Yen, para. [0060]). 9. Claims 3-5 and 10-12 are rejected under 35 U.S.C. § 103 as being unpatentable over Ben-Hur et al. (US 20080140592 A1) in view of Hummel et al. (US 20160217201 A1) in further view of Yen et al. 
(US 20240338438 A1) still in further view of Vaughan et al. (US 20190043610 A1). As per claim 3, Ben-Hur, Hummel and Yen teach all the limitations as discussed in claim 2 above. However, it is noted that the prior art of Ben-Hur and Hummel do not explicitly teach “wherein the raw data comprises at least a numerical feature or a character feature; wherein the step for preprocessing the raw data further comprises the following sub-steps: determining whether the raw data comprises the character feature; when the raw data comprises the character feature, converting the character feature in the raw data to a transformation numerical feature to generate the raw data to be clustered; and when the raw data excludes the character feature, utilizing the raw data as the raw data to be clustered.” On the other hand, in the same field of endeavor, Yen teaches wherein the raw data comprises at least a numerical feature or a character feature (i.e. “the multiple sets of features can include heterogeneous data containing at least one categorical dataset for a feature and at least one numerical dataset for the feature.”; fig. 5, para. [0079]); Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Yen that teaches near-real-time approach for characterizing Internet Background Radiation to detect and characterize network scanner activity into the combination of the prior arts of Ben-Hur that teaches a method and system for unsupervised learning for determining an optimal number of data clusters into which data can be divided to best enable identification of relevant patterns, and Hummel that teaches fusion of multiple labeling algorithms on a cluster of documents. Additionally, this can improve performance of deep learning models by enforcing pre-training. 
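As technical context for the cluster-stability rationale quoted from Yen (para. [0089]) in the claim 14 discussion above, the idea of clustering multiple bootstrap subsamples with the same algorithm and measuring their agreement can be sketched in Python. The toy clusterer, the simplified agreement measure, the sample data, and all names below are illustrative assumptions, not taken from any cited reference:

```python
import random
from statistics import mean

def cluster(points, threshold=5.0):
    # Toy stand-in for "the same clustering algorithm": split 1-D
    # points into two clusters around a fixed threshold.
    return [0 if p < threshold else 1 for p in points]

def pairwise_agreement(labels_a, labels_b):
    # Fraction of point pairs on which two labelings agree about
    # "same cluster vs. different cluster" (a simplified Rand index).
    n, pairs, agree = len(labels_a), 0, 0
    for i in range(n):
        for j in range(i + 1, n):
            pairs += 1
            agree += (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
    return agree / pairs

def stability_indicator(data, n_boot=20, seed=0):
    # Cluster multiple bootstrap subsamples with the same algorithm and
    # measure how consistently they reproduce the full-data clustering.
    rng = random.Random(seed)
    base = cluster(data)
    scores = []
    for _ in range(n_boot):
        idx = [rng.randrange(len(data)) for _ in range(len(data))]
        sub = cluster([data[i] for i in idx])
        scores.append(pairwise_agreement(sub, [base[i] for i in idx]))
    return mean(scores)

data = [1.0, 1.2, 0.9, 9.8, 10.1, 10.4]  # two well-separated groups
print(stability_indicator(data))  # → 1.0 (perfectly stable on this toy data)
```

A score near 1.0 indicates that the cluster structure is insensitive to subsampling, which is the stability property the quoted passage of Yen describes as desirable.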
The motivation for doing so would be to break deep networks into manageable shallow pieces, greedy layer-wise pre-training reduces the vanishing gradient problem and provides better initialization (Yen, para. [0060]). However, it is noted that the prior art of Ben-Hur, Hummel and Yen do not explicitly teach “wherein the step for preprocessing the raw data further comprises the following sub-steps: determining whether the raw data comprises the character feature; when the raw data comprises the character feature, converting the character feature in the raw data to a transformation numerical feature to generate the raw data to be clustered; and when the raw data excludes the character feature, utilizing the raw data as the raw data to be clustered.” On the other hand, in the same field of endeavor, Vaughan teaches wherein the step for preprocessing the raw data further comprises the following sub-steps: determining whether the raw data comprises the character feature (i.e. “The diagnostic or therapeutic module may further comprise a second diagnostic or therapeutic classifier that can assess a patient's behavior or performance.”; para. [0184]; Examiner note: the character feature is the patient's behavior or performance); when the raw data comprises the character feature, converting the character feature in the raw data to a transformation numerical feature to generate the raw data to be clustered (i.e. “The second classifier may assign a numerical score to the patient's behavior or performance. The second classifier may compare the numerical score of a patient, or a change in the numerical score, to scores obtained from other patients or other cohorts to determine relative values.”; para. 
[0184]; Examiner note: the converting the character feature in the raw data to a transformation numerical feature is interpreted as assign a numerical score to the patient's behavior or performance); and when the raw data excludes the character feature, utilizing the raw data as the raw data to be clustered (i.e. “The preprocessing module can apply one or more transformations to standardize the training data or new data for the training module or the prediction module. The preprocessed training data can be passed to the training module, which can construct an assessment model 660 based on the training data…. The preprocessed new data can be passed on to the prediction module”; fig. 6, para. [0114]; Examiner note: The raw data excludes the character feature is interpreted as the transformations. The to generate the raw data is interpreted as the preprocessed new data). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Vaughan that teaches digital personalized medicine system into the combination of the prior arts of Ben-Hur that teaches a method and system for unsupervised learning for determining an optimal number of data clusters into which data can be divided to best enable identification of relevant patterns, Hummel that teaches fusion of multiple labeling algorithms on a cluster of documents, and Yen that teaches near-real-time approach for characterizing Internet Background Radiation to detect and characterize network scanner activity. Additionally, this can improve performance of deep learning models by enforcing pre-training. The motivation for doing so would be to use machine learning to analyze digital data because it can improve the accuracy and consistency of analysis outcomes (Vaughan, para. [0003]-[0005]). As per claim 4, Ben-Hur, Hummel and Yen teach all the limitations as discussed in claim 2 above. 
However, it is noted that the prior art of Ben-Hur and Hummel do not explicitly teach “wherein the raw data comprises at least a numerical feature or a character feature; wherein the step for preprocessing the raw data further comprises the following sub-steps: determining whether the raw data comprises the character feature; when the raw data comprises the character feature, converting the character feature in the raw data to a transformation numerical feature and standardizing the transformation numerical feature to generate the raw data to be clustered; and when the raw data excludes the character feature, standardizing the numerical feature of the raw data to generate the raw data to be clustered.” On the other hand, in the same field of endeavor, Yen teaches wherein the raw data comprises at least a numerical feature or a character feature (i.e. “the multiple sets of features can include heterogeneous data containing at least one categorical dataset for a feature and at least one numerical dataset for the feature.”; fig. 5, para. [0079]); Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Yen that teaches near-real-time approach for characterizing Internet Background Radiation to detect and characterize network scanner activity into the combination of the prior arts of Ben-Hur that teaches a method and system for unsupervised learning for determining an optimal number of data clusters into which data can be divided to best enable identification of relevant patterns, and Hummel that teaches fusion of multiple labeling algorithms on a cluster of documents. Additionally, this can improve performance of deep learning models by enforcing pre-training. 
The motivation for doing so would be to break deep networks into manageable shallow pieces, greedy layer-wise pre-training reduces the vanishing gradient problem and provides better initialization (Yen, para. [0060]). However, it is noted that the prior art of Ben-Hur, Hummel and Yen do not explicitly teach “wherein the step for preprocessing the raw data further comprises the following sub-steps: determining whether the raw data comprises the character feature; when the raw data comprises the character feature, converting the character feature in the raw data to a transformation numerical feature and standardizing the transformation numerical feature to generate the raw data to be clustered; and when the raw data excludes the character feature, standardizing the numerical feature of the raw data to generate the raw data to be clustered.” On the other hand, in the same field of endeavor, Vaughan teaches wherein the step for preprocessing the raw data further comprises the following sub-steps: determining whether the raw data comprises the character feature (i.e. “The diagnostic or therapeutic module may further comprise a second diagnostic or therapeutic classifier that can assess a patient's behavior or performance.”; para. [0184]; Examiner note: the character feature is the patient's behavior or performance); when the raw data comprises the character feature, converting the character feature in the raw data to a transformation numerical feature (i.e. “The second classifier may assign a numerical score to the patient's behavior or performance. The second classifier may compare the numerical score of a patient, or a change in the numerical score, to scores obtained from other patients or other cohorts to determine relative values.”; para. 
[0184]; Examiner note: the converting the character feature in the raw data to a transformation numerical feature is interpreted as assign a numerical score to the patient's behavior or performance) and standardizing the transformation numerical feature to generate the raw data to be clustered (i.e. “The preprocessing module can apply one or more transformations to standardize the training data or new data for the training module”; fig.6, para. [0114]); and when the raw data excludes the character feature, standardizing the numerical feature of the raw data to generate the raw data to be clustered (i.e. “The preprocessing module can apply one or more transformations to standardize the training data or new data for the training module or the prediction module. The preprocessed training data can be passed to the training module, which can construct an assessment model 660 based on the training data…. The preprocessed new data can be passed on to the prediction module”; fig. 6, para. [0114]; Examiner note: The raw data excludes the character feature is interpreted as the transformations. The to generate the raw data is interpreted as the preprocessed new data). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Vaughan that teaches digital personalized medicine system into the combination of the prior arts of Ben-Hur that teaches a method and system for unsupervised learning for determining an optimal number of data clusters into which data can be divided to best enable identification of relevant patterns, Hummel that teaches fusion of multiple labeling algorithms on a cluster of documents, and Yen that teaches near-real-time approach for characterizing Internet Background Radiation to detect and characterize network scanner activity. Additionally, this can improve performance of deep learning models by enforcing pre-training. 
The motivation for doing so would be to use machine learning to analyze digital data because it can improve the accuracy and consistency of analysis outcomes (Vaughan, para. [0003]-[0005]). As per claim 5, Ben-Hur and Hummel teach all the limitations as discussed in claim 1 above. However, it is noted that the prior art of Ben-Hur does not explicitly teach “wherein the step for organizing the cluster label models of the sub-data cluster results, and generating the organized sub-data cluster results according to the organized cluster label models comprises the following sub-step: according to a raw data cluster label model of the raw data cluster results, utilizing a decision tree classifier to overfit a model prediction for the plurality of cluster label models of the plurality of sub-data cluster results to generate the organized sub-data cluster results.” On the other hand, in the same field of endeavor, Hummel teaches wherein the step for organizing the cluster label models of the sub-data cluster results (i.e. “Each list of top-k cluster labels 214, 215, and 216 is also re-ranked by the Label Ranking Module 104, according to a decisiveness value for each label as at 217, 218, and 219, respectively, producing a re-ranked list of labels for each algorithm as at 220. The top-k labels are the first k labels in each label list.”; fig. 4, para. [0049]; Examiner note: the organizing a plurality of cluster label models is interpreted as the top-k cluster labels 214, 215, and 216 is also re-ranked), and generating the organized sub-data cluster results according to the organized cluster label models (i.e. “The re-ranked label lists are fused 221 by a Label Fusion Module 105 given a fusion algorithm 223, such as CombMNZ, CombSUM, CombMAX and the like, producing a fused label list 222.”; fig. 4, para. 
[0049]; Examiner note: The generating a plurality of organized sub-data cluster results is interpreted as the producing a fused label list 222) comprises the following sub-step: according to a raw data cluster label model of the raw data cluster results (i.e. “re-rank each of the two or more label lists for each of the two or more cluster labeling algorithms according to the label value”; para. [0018]); Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Hummel that teaches fusion of multiple labeling algorithms on a cluster of documents into Ben-Hur that teaches a method and system for unsupervised learning for determining an optimal number of data clusters into which data can be divided to best enable identification of relevant patterns. Additionally, this can improve performance of deep learning models by enforcing pre-training. The motivation for doing so would be to combine several cluster labeling algorithms for improving cluster labeling result because it can make the cluster process more robust and reliable (Hummel, para. [0059]). However, it is noted that the prior art of Ben-Hur and Hummel do not explicitly teach “utilizing a decision tree classifier to overfit a model prediction for the plurality of cluster label models of the plurality of sub-data cluster results to generate the organized sub-data cluster results.” On the other hand, in the same field of endeavor, Vaughan teaches utilizing a decision tree classifier to overfit a model prediction for the plurality of cluster label models of the plurality of sub-data cluster results to generate the organized sub-data cluster results (i.e. “A Random Forest classifier, which generally comprises a plurality of decision trees wherein the output prediction is the mode of the predicted classifications of the individual trees, can be helpful in reducing overfitting to training data. 
An ensemble of decision trees can be constructed using a random subset of features at each split or decision node.”; fig. 7, para. [0121]). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Vaughan that teaches digital personalized medicine system into the combination of the prior arts of Ben-Hur that teaches a method and system for unsupervised learning for determining an optimal number of data clusters into which data can be divided to best enable identification of relevant patterns, and Hummel that teaches fusion of multiple labeling algorithms on a cluster of documents. Additionally, this can improve performance of deep learning models by enforcing pre-training. The motivation for doing so would be to use machine learning to analyze digital data because it can improve the accuracy and consistency of analysis outcomes (Vaughan, para. [0003]-[0005]). As per claim 10, Ben-Hur and Hummel teach all the limitations as discussed in claim 9 above. However, it is noted that the prior art of Ben-Hur and Hummel do not explicitly teach “ wherein the raw data comprises at least a numerical feature or a character feature; wherein the processor determines whether the raw data comprises the character feature when the processor preprocesses the raw data; wherein when the raw data comprises the character feature, the processor converts the character feature in the raw data to a transformation numerical feature to generate the raw data to be clustered; wherein when the raw data excludes the character feature, the processor utilizes the raw data as the raw data to be clustered.” On the other hand, in the same field of endeavor, Yen teaches wherein the raw data comprises at least a numerical feature or a character feature (i.e. 
“the multiple sets of features can include heterogeneous data containing at least one categorical dataset for a feature and at least one numerical dataset for the feature.”; fig. 5, para. [0079]); Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Yen that teaches near-real-time approach for characterizing Internet Background Radiation to detect and characterize network scanner activity into the combination of the prior arts of Ben-Hur that teaches a method and system for unsupervised learning for determining an optimal number of data clusters into which data can be divided to best enable identification of relevant patterns, and Hummel that teaches fusion of multiple labeling algorithms on a cluster of documents. Additionally, this can improve performance of deep learning models by enforcing pre-training. The motivation for doing so would be to break deep networks into manageable shallow pieces, greedy layer-wise pre-training reduces the vanishing gradient problem and provides better initialization (Yen, para. [0060]). However, it is noted that the prior art of Ben-Hur, Hummel and Yen do not explicitly teach “wherein the processor determines whether the raw data comprises the character feature when the processor preprocesses the raw data; wherein when the raw data comprises the character feature, the processor converts the character feature in the raw data to a transformation numerical feature to generate the raw data to be clustered; wherein when the raw data excludes the character feature, the processor utilizes the raw data as the raw data to be clustered.” On the other hand, in the same field of endeavor, Vaughan teaches wherein the processor determines whether the raw data comprises the character feature when the processor preprocesses the raw data (i.e. 
“The diagnostic or therapeutic module may further comprise a second diagnostic or therapeutic classifier that can assess a patient's behavior or performance.”; para. [0184]; Examiner note: the character feature is the patient's behavior or performance); wherein when the raw data comprises the character feature, the processor converts the character feature in the raw data to a transformation numerical feature to generate the raw data to be clustered (i.e. “The second classifier may assign a numerical score to the patient's behavior or performance. The second classifier may compare the numerical score of a patient, or a change in the numerical score, to scores obtained from other patients or other cohorts to determine relative values.”; para. [0184]; Examiner note: the converts the character feature in the raw data to a transformation numerical feature is interpreted as assign a numerical score to the patient's behavior or performance); wherein when the raw data excludes the character feature, the processor utilizes the raw data as the raw data to be clustered (i.e. “The preprocessing module can apply one or more transformations to standardize the training data or new data for the training module or the prediction module. The preprocessed training data can be passed to the training module, which can construct an assessment model 660 based on the training data…. The preprocessed new data can be passed on to the prediction module”; fig. 6, para. [0114]; Examiner note: The raw data excludes the character feature is interpreted as the transformations to standardize. The utilizes the raw data is interpreted as the preprocessed training data can be passed to the training module). 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Vaughan that teaches digital personalized medicine system into the combination of the prior arts of Ben-Hur that teaches a method and system for unsupervised learning for determining an optimal number of data clusters into which data can be divided to best enable identification of relevant patterns, Hummel that teaches fusion of multiple labeling algorithms on a cluster of documents, and Yen that teaches near-real-time approach for characterizing Internet Background Radiation to detect and characterize network scanner activity. Additionally, this can improve performance of deep learning models by enforcing pre-training. The motivation for doing so would be to use machine learning to analyze digital data because it can improve the accuracy and consistency of analysis outcomes (Vaughan, para. [0003]-[0005]). As per claim 11, Ben-Hur and Hummel teach all the limitations as discussed in claim 9 above. However, it is noted that the prior art of Ben-Hur and Hummel do not explicitly teach “wherein the raw data comprises at least a numerical feature or a character feature; wherein when the processor preprocesses the raw data, determining whether the raw data comprises the character feature; wherein when the raw data comprises the character feature, the processor converts the character feature in the raw data to a transformation numerical feature and standardizes the transformation numerical feature to generate the raw data to be clustered; wherein when the raw data excludes the character feature, the processor standardizes the numerical feature of the raw data to generate the raw data to be clustered.” On the other hand, in the same field of endeavor, Yen teaches wherein the raw data comprises at least a numerical feature or a character feature (i.e. 
“the multiple sets of features can include heterogeneous data containing at least one categorical dataset for a feature and at least one numerical dataset for the feature.”; fig. 5, para. [0079]); Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Yen that teaches near-real-time approach for characterizing Internet Background Radiation to detect and characterize network scanner activity into the combination of the prior arts of Ben-Hur that teaches a method and system for unsupervised learning for determining an optimal number of data clusters into which data can be divided to best enable identification of relevant patterns, and Hummel that teaches fusion of multiple labeling algorithms on a cluster of documents. Additionally, this can improve performance of deep learning models by enforcing pre-training. The motivation for doing so would be to break deep networks into manageable shallow pieces, greedy layer-wise pre-training reduces the vanishing gradient problem and provides better initialization (Yen, para. [0060]). However, it is noted that the prior art of Ben-Hur, Hummel and Yen do not explicitly teach “wherein when the processor preprocesses the raw data, determining whether the raw data comprises the character feature; wherein when the raw data comprises the character feature, the processor converts the character feature in the raw data to a transformation numerical feature and standardizes the transformation numerical feature to generate the raw data to be clustered; wherein when the raw data excludes the character feature, the processor standardizes the numerical feature of the raw data to generate the raw data to be clustered.” On the other hand, in the same field of endeavor, Vaughan teaches wherein when the processor preprocesses the raw data, determining whether the raw data comprises the character feature (i.e. 
“The diagnostic or therapeutic module may further comprise a second diagnostic or therapeutic classifier that can assess a patient's behavior or performance.”; para. [0184]; Examiner note: the character feature is the patient's behavior or performance); wherein when the raw data comprises the character feature, the processor converts the character feature in the raw data to a transformation numerical feature (i.e. “The second classifier may assign a numerical score to the patient's behavior or performance. The second classifier may compare the numerical score of a patient, or a change in the numerical score, to scores obtained from other patients or other cohorts to determine relative values.”; para. [0184]; Examiner note: the converts the character feature in the raw data to a transformation numerical feature is interpreted as assign a numerical score to the patient's behavior or performance) and standardizes the transformation numerical feature to generate the raw data to be clustered (i.e. “The preprocessing module can apply one or more transformations to standardize the training data or new data for the training module”; fig.6, para. [0114]); wherein when the raw data excludes the character feature, the processor standardizes the numerical feature of the raw data to generate the raw data to be clustered (i.e. “The preprocessing module can apply one or more transformations to standardize the training data or new data for the training module or the prediction module. The preprocessed training data can be passed to the training module, which can construct an assessment model 660 based on the training data…. The preprocessed new data can be passed on to the prediction module”; fig. 6, para. [0114]; Examiner note: The raw data excludes the character feature is interpreted as the transformations. The to generate the raw data is interpreted as the preprocessed new data). 
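The conditional preprocessing recited in claims 4 and 11 (convert any character feature to a numerical one, then standardize; standardize directly when the column is already numerical) can be illustrated with a minimal Python sketch. The integer-coding scheme, the function name, and the sample values are assumptions for illustration only, not taken from the claims or the cited references:

```python
from statistics import mean, pstdev

def preprocess_column(values):
    # Hypothetical sketch of the claimed preprocessing: if the column
    # holds character (categorical) values, map each distinct value to
    # an integer code; either way, finish by standardizing the column
    # to zero mean and unit variance (z-scores).
    if any(isinstance(v, str) for v in values):
        codes = {v: i for i, v in enumerate(sorted(set(values)))}
        values = [codes[v] for v in values]  # character -> numerical feature
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)  # constant column: nothing to scale
    return [(v - mu) / sigma for v in values]

print(preprocess_column(["low", "high", "low", "high"]))  # → [1.0, -1.0, 1.0, -1.0]
print(preprocess_column([2.0, 4.0, 6.0]))  # numerical column: standardized only
```

Categorical values receive integer codes before z-score standardization, while purely numerical columns go straight to standardization, mirroring the claim's two branches.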
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Vaughan that teaches digital personalized medicine system into the combination of the prior arts of Ben-Hur that teaches a method and system for unsupervised learning for determining an optimal number of data clusters into which data can be divided to best enable identification of relevant patterns, Hummel that teaches fusion of multiple labeling algorithms on a cluster of documents, and Yen that teaches near-real-time approach for characterizing Internet Background Radiation to detect and characterize network scanner activity. Additionally, this can improve performance of deep learning models by enforcing pre-training. The motivation for doing so would be to use machine learning to analyze digital data because it can improve the accuracy and consistency of analysis outcomes (Vaughan, para. [0003]-[0005]). As per claim 12, Ben-Hur and Hummel teach all the limitations as discussed in claim 8 above. However, it is noted that the prior art of Ben-Hur does not explicitly teach “wherein when the processor organizes the cluster label models of the sub-data cluster results and generates the organized sub-data cluster results according to the organized cluster label models, the processor utilizes a decision tree classifier to overfit a model prediction for the cluster label models of the sub-data cluster results according to a raw data cluster label model of the raw data cluster results to generate the organized sub-data cluster results.” On the other hand, in the same field of endeavor, Hummel teaches wherein when the processor organizes the cluster label models of the sub-data cluster results (i.e. 
“Each list of top-k cluster labels 214, 215, and 216 is also re-ranked by the Label Ranking Module 104, according to a decisiveness value for each label as at 217, 218, and 219, respectively, producing a re-ranked list of labels for each algorithm as at 220. The top-k labels are the first k labels in each label list.”; fig. 4, para. [0049]; Examiner note: the organizes a plurality of cluster label models is interpreted as the top-k cluster labels 214, 215, and 216 is also re-ranked) and generates the organized sub-data cluster results according to the organized cluster label models (i.e. “The re-ranked label lists are fused 221 by a Label Fusion Module 105 given a fusion algorithm 223, such as CombMNZ, CombSUM, CombMAX and the like, producing a fused label list 222.”; fig. 4, para. [0049]; Examiner note: The generates a plurality of organized sub-data cluster results is interpreted as the producing a fused label list 222); Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Hummel that teaches fusion of multiple labeling algorithms on a cluster of documents into Ben-Hur that teaches a method and system for unsupervised learning for determining an optimal number of data clusters into which data can be divided to best enable identification of relevant patterns. Additionally, this can improve performance of deep learning models by enforcing pre-training. The motivation for doing so would be to combine several cluster labeling algorithms for improving cluster labeling result because it can make the cluster process more robust and reliable (Hummel, para. [0059]). 
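For context on the label fusion Hummel describes (para. [0049]), a CombSUM-style fusion of scored label lists from two labeling algorithms can be sketched as follows; the label strings and scores are invented for illustration:

```python
from collections import defaultdict

def comb_sum(ranked_lists):
    # CombSUM-style fusion (one of the algorithms Hummel names): a
    # label's fused score is the sum of its scores across all of the
    # scored label lists produced by the individual labeling algorithms.
    fused = defaultdict(float)
    for labels in ranked_lists:
        for label, score in labels:
            fused[label] += score
    return sorted(fused.items(), key=lambda kv: -kv[1])  # highest score first

# Top-k label lists from two hypothetical cluster-labeling algorithms
alg_a = [("engine failure", 1.0), ("vibration", 0.5)]
alg_b = [("vibration", 0.75), ("overheating", 0.25)]
print(comb_sum([alg_a, alg_b]))
# → [('vibration', 1.25), ('engine failure', 1.0), ('overheating', 0.25)]
```

The other fusion rules Hummel mentions differ only in the combination step: CombMNZ multiplies the summed score by the number of lists containing the label, and CombMAX keeps the maximum score.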
However, it is noted that the prior art of Ben-Hur and Hummel do not explicitly teach “the processor utilizes a decision tree classifier to overfit a model prediction for the cluster label models of the sub-data cluster results according to a raw data cluster label model of the raw data cluster results to generate the organized sub-data cluster results.” On the other hand, in the same field of endeavor, Vaughan teaches the processor utilizes a decision tree classifier to overfit a model prediction for the cluster label models of the sub-data cluster results according to a raw data cluster label model of the raw data cluster results to generate the organized sub-data cluster results (i.e. “A Random Forest classifier, which generally comprises a plurality of decision trees wherein the output prediction is the mode of the predicted classifications of the individual trees, can be helpful in reducing overfitting to training data. An ensemble of decision trees can be constructed using a random subset of features at each split or decision node.”; fig. 7, para. [0121]). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Vaughan that teaches digital personalized medicine system into the combination of the prior arts of Ben-Hur that teaches a method and system for unsupervised learning for determining an optimal number of data clusters into which data can be divided to best enable identification of relevant patterns, and Hummel that teaches fusion of multiple labeling algorithms on a cluster of documents. Additionally, this can improve performance of deep learning models by enforcing pre-training. The motivation for doing so would be to use machine learning to analyze digital data because it can improve the accuracy and consistency of analysis outcomes (Vaughan, para. [0003]-[0005]). Prior Art of Record 10. 
Prior Art of Record

10. The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Bekkerman et al. (US 8463784 B1) teaches performing data clustering.

Conclusion

11. Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANTONIO CAIADO, whose telephone number is (469) 295-9251. The examiner can normally be reached Monday - Friday, 06:30 to 16:30. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Amy Ng, can be reached at (571) 270-1698. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ANTONIO J CAIADO/
Examiner, Art Unit 2164

/AMY NG/
Supervisory Patent Examiner, Art Unit 2164

Prosecution Timeline

Mar 18, 2025
Application Filed
Nov 29, 2025
Non-Final Rejection — §101, §103
Feb 10, 2026
Response Filed
Mar 10, 2026
Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12597055
IDENTIFYING ITEMS OFFERED BY AN ONLINE CONCIERGE SYSTEM FOR A RECEIVED QUERY BASED ON A GRAPH IDENTIFYING RELATIONSHIPS BETWEEN ITEMS AND ATTRIBUTES OF THE ITEMS
2y 5m to grant Granted Apr 07, 2026
Patent 12579121
MANAGEMENT OF A SECONDARY VERTEX INDEX FOR A GRAPH
2y 5m to grant Granted Mar 17, 2026
Patent 12579129
System and Method for Processing Hierarchical Data
2y 5m to grant Granted Mar 17, 2026
Patent 12579125
SYSTEMS AND METHODS FOR ADMISSION CONTROL INPUT/OUTPUT
2y 5m to grant Granted Mar 17, 2026
Patent 12578842
STRUCTURED SUGGESTIONS
2y 5m to grant Granted Mar 17, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 69%
With Interview: 99% (+49.9%)
Median Time to Grant: 3y 4m
PTA Risk: Moderate

Based on 188 resolved cases by this examiner. Grant probability derived from career allow rate.
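The 69% headline figure is consistent with the career data quoted earlier in this report (130 grants out of 188 resolved cases); a quick sanity check:

```python
# Career allow rate from the figures shown in this report.
granted, resolved = 130, 188
print(f"{granted / resolved:.1%}")  # prints: 69.1%
```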
