ips Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .
Claim Rejections – 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 USC 101 because the claimed invention is directed to an abstract idea without significantly more.
Claim 1 recites (emphasis added):
1. A computer-implemented method comprising:
mapping numerical data points from a dataset into a plurality of buckets organized using logarithmic ranges;
redistributing the numerical data points among the plurality of buckets based on median values of the data points within each bucket;
and partitioning the plurality of buckets into a set of evenly occupied buckets such that each bucket occupancy is less than a pre-defined threshold value.
Examiner finds that the emphasized portions of claim 1 recite an abstract idea—namely, mathematical concepts and mental processes. See MPEP 2106.04(a)(2)(I), (III).
When read as a whole, the recited limitations are directed to mathematical relationships, for example, “organizing information and manipulating information through mathematical correlations. . .” Id at 2106.04(a)(2)(I)(A):
iv. organizing information and manipulating information through mathematical correlations, Digitech Image Techs., LLC v. Electronics for Imaging, Inc., 758 F.3d 1344, 1350, 111 USPQ2d 1717, 1721 (Fed. Cir. 2014). The patentee in Digitech claimed methods of generating first and second data by taking existing information, manipulating the data using mathematical functions, and organizing this information into a new form. The court explained that such claims were directed to an abstract idea because they described a process of organizing information through mathematical correlations, like Flook's method of calculating using a mathematical formula. 758 F.3d at 1350, 111 USPQ2d at 1721
(emphasis added).
Examiner also finds the redistributing and the partitioning elements can be practically performed in the human mind with the aid of pen and paper.
Taking each element individually, the element “mapping numerical data points from a dataset into a plurality of buckets organized using logarithmic ranges” recites a mathematical relationship in the form of “organizing information and manipulating information through mathematical correlations. . .” Id.
The element “redistributing the numerical data points among the plurality of buckets based on median values of the data points within each bucket” also recites a mathematical relationship in the form of “organizing information and manipulating information through mathematical correlations. . .” Id. Examiner also finds this element merely requires observation and evaluation of the data and an evaluation/judgment as to how to redistribute the data. Examiner finds this can be practically performed in the human mind with the aid of pen and paper. As such, this element also recites a mental process.
The element “partitioning the plurality of buckets into a set of evenly occupied buckets such that each bucket occupancy is less than a pre-defined threshold value” also recites a mathematical relationship in the form of “organizing information and manipulating information through mathematical correlations. . .” Id. Examiner also finds this element merely requires observation and evaluation of the buckets and an evaluation and/or judgment as to how to partition the buckets. Examiner finds this can be practically performed in the human mind with the aid of pen and paper. As such, this element also recites a mental process.
Turning to the additional elements and whether they integrate the exception, the element “1. A computer-implemented method comprising:” is recited at a high level of generality and such that it amounts to no more than mere instructions to apply the exception using generic computer components. See MPEP 2106.05(f). Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea.
Turning to the additional elements and whether they recite an inventive concept, the element “1. A computer-implemented method comprising:” is recited at a high level of generality and such that it amounts to no more than mere instructions to apply the exception using generic computer components.” See MPEP 2106.05(f). Accordingly, this additional element does not recite an inventive concept.
There is nothing, when considered as whole, that causes these additional elements to integrate the exception or provide an inventive concept. That is, the additional elements “‘[a]dd nothing … that is not already present when the steps are considered separately’”. MPEP 2106.05 (I)(B)(quoting Alice).
Thus, for the reasons above, claim 1 is directed to an abstract idea without significantly more.
Claim 7 is rejected for the same reasons given above for claim 1. Additionally, the elements “7. A computer program product comprising a computer-readable storage medium having a set of instructions stored therein which, when executed by a processor, causes the processor to perform a method comprising” are recited at a high level of generality such that they amount to no more than mere instructions to apply the exception using generic computer components. See MPEP 2106.05(f). Accordingly, these additional elements do not integrate the exception and do not recite an inventive concept.
Claim 13 is rejected for the same reasons given above for claim 1. Additionally, the elements “13. A computer system comprising: a processor set; and a computer readable storage medium; wherein: the processor set is structured, located, connected, and/or programmed to run program instructions stored on the computer readable storage medium; and the program instructions which, when executed by the processor set, cause the processor set to perform a method comprising” are recited at a high level of generality such that they amount to no more than mere instructions to apply the exception using generic computer components. See MPEP 2106.05(f). Accordingly, these additional elements do not integrate the exception and do not recite an inventive concept.
Claim 2 recites “2. The computer-implemented method of claim 1, further comprising: calculating the median values of each bucket in the plurality of buckets.” This element recites mathematical calculation and thus recites and abstract idea.
Claim 3 recites “3. The computer-implemented method of claim 1, wherein the pre-defined threshold value is a percentage of the total data points stored in the evenly occupied buckets.” This element merely recites the mental process and/or the mathematical relationship/calculation performed.
Claim 6 recites “6. The computer-implemented method of claim 1, further comprising: receiving a query dataset including a first data point.” This element recites insignificant extra solution activity in the form of mere data gathering/selecting a particular data source to be manipulated. See MPEP 2106.05(g). As such, it does not integrate the exception. See id. This element recites a well-understood, routine, and conventional (WURC) computer function (receiving or transmitting data over a network and/or storing and retrieving information in memory) and thus does not recite an inventive concept. See MPEP 2106.05(d)(II).
The element “predicting the first data point to be within a first evenly occupied bucket” is a mathematical calculation. This element also merely requires observation and evaluation of the data point and an evaluation and/or judgment as to how to perform the prediction. As such, this element is directed to mathematical concept and/or a mental process.
The element “and responding to the query based on the predicted first evenly occupied bucket.” This element recites insignificant extra solution activity in the form of mere data gathering/selecting a particular data source to be manipulated. See MPEP 2106.05(g). As such, it does not integrate the exception. See id. This element recites WURC computer function (storing and retrieving information in memory) and thus does not recite an inventive concept. See MPEP 2106.05(d)(II).
Claims 8-9, 12, 14-15, and 18 are rejected for the same reasons given above for claims 2-3 and 6, respectively.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 1-3, 6-9, 12-15, and 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Milojevic, Power Law Distributions in Information Science: Making the Case for Logarithmic Binning, Year: 2010 in view of Brusco, A comparison of latent class, K-means, and K-median methods for clustering dichotomous data, Year: 2017 and further in view of Schierz US-20210390455-A1.
With respect to claim 1, Milojevic teaches “1. A computer-implemented method comprising: mapping numerical data points from a dataset into a plurality of buckets organized using logarithmic ranges” on p. 4 right column, second full paragraph:
Binning is a procedure of averaging the data that fall in specific bins (i.e., ranges of k). Averaging produces a more accurate answer as to what is the real expected value of the function. We take the bins to be of the same size when k is given in logarithms, which is why this is logarithmic binning.
It appears Milojevic fails to teach “redistributing the numerical data points among the plurality of buckets based on median values of the data points within each bucket.”
However, Brusco, A comparison of latent class, K-means, and K-median methods for clustering dichotomous data , 2018 teaches
“redistributing the numerical data points among the plurality of buckets based on median values of the data points within each bucket” on p. 8 (“The K-median partitioning problem requires the selection of exactly K objects (often called exemplars) to serve as cluster centers and the assignment of each object to its nearest exemplar, with the goal of minimizing the sum of the distances of the objects to the exemplars”); and pp. 8-9 (last paragraph on p. 8):
In this paper, we use the multistart fast interchange procedure (see Brusco & Köhn, 2008a, 2009; Köhn, Steinley, & Brusco, 2010) that seeks to obtain solutions that minimize SSKmed. This method is based on the pioneering work of Teitz and Bart (1968), Whitaker (1983), and Hansen and Mladenović (1997). The procedure consists of the following steps:
1. Randomly select K objects to serve as exemplars.
2. Assign each object to the cluster corresponding to its nearest exemplar.
3. Consider each exemplar in turn with respect to its replacement with one of the objects not selected as an exemplar. Any replacement that reduces the sum of the distances of the objects to their exemplars should be accepted.
4. Repeat Step 3 until no replacement of an exemplar will further reduce SSKmed.
Completion of Step 4 guarantees a solution that is locally-optimal with respect to all possible replacements of an exemplar with an object not currently selected as an exemplar.
(Examiner finds repeating step 3 is a redistribution).
Milojevic and Brusco are analogous art because they are from the same field of endeavor as the claimed invention.
It would have been obvious to one skilled in the art before the effective filing date of the invention to modify “mapping numerical data points from a dataset into a plurality of buckets organized using logarithmic ranges” as taught by Milojevic to include “redistributing the numerical data points among the plurality of buckets based on median values of the data points within each bucket” as taught by Brusco.
The motivation would have been to quickly process dichotomous data into clusters or categories. See Brusco p. 22 first full paragraph.
It appears Milojevic/Brusco fail to explicit teach “and partitioning the plurality of buckets into a set of evenly occupied buckets such that each bucket occupancy is less than a pre-defined threshold value”
However, Schierz US 20210390455 A1 teaches
“and partitioning the plurality of buckets into a set of evenly occupied buckets such that each bucket occupancy is less than a pre-defined threshold value” in paras. 98-99 (under (less than) a particular threshold) para. 143, Fig. 8, and Table 18 (bins (buckets) evenly occupied less than pre-defined threshold (i.e. under bins)).
Schierz and Milojevic/Brusco are analogous art because they are from the same field of endeavor as the as the claimed invention.
It would have been obvious to one skilled in the art before the effective filing date of the invention to modify 1. A computer-implemented method comprising: mapping numerical data points from a dataset into a plurality of buckets organized using logarithmic ranges redistributing the numerical data points among the plurality of buckets based on median values of the data points within each bucket” as taught by Milojevic/Brusco to include “and partitioning the plurality of buckets into a set of evenly occupied buckets such that each bucket occupancy is less than a pre-defined threshold value” as taught by Schierz.
The motivation would have been to easily identify data drift thereby indicating possible model accuracy deterioration. See Schierz abstract.
With respect to claim 2, Brusco teaches “2. The computer-implemented method of claim 1, further comprising: calculating the median values of each bucket in the plurality of buckets. p. 8 (“The K-median partitioning problem requires the selection of exactly K objects (often called exemplars) to serve as cluster centers and the assignment of each object to its nearest exemplar, with the goal of minimizing the sum of the distances of the objects to the exemplars”). The motivation to add this element to claim 1 would have been to process dichotomous data into clusters or categories. See Brusco p. 22 first full paragraph.
With respect to claim 3, Schierz teaches “3. The computer-implemented method of claim 1, wherein the pre-defined threshold value is a percentage of the total data points stored in the evenly occupied buckets”
[0008] In some implementations, identifying the two neighboring elements can include calculating a difference in centroid values between each set of adjacent elements in the centroid vector. Step (k) can include: repeating steps (b) through (j) until a specified time duration is reached; and storing the histogram for later reference. The method can include converting the histogram to a new histogram having a plurality of buckets, each bucket including a lower bound, an upper bound, and a count. The method can include calculating a cumulative count for each of the plurality of buckets. The method can include calculating at least one of a median or a percentile for the new histogram based on the cumulative counts.
[0098] In various examples, the data aggregation module 106 is configured to process a stream of data (e.g., of unknown size or duration) by aggregating (step 121) the data in a collection or series of histograms. The aggregated data can be stored in a data store for subsequent queries and/or can be used to calculate metrics of interest to users of the system 100 (e.g., MLOps service health engineers). Such metrics for a data set can include, for example, minimum, maximum, mean, median, any percentile (e.g., 10th percentile, 90th percentile, quartiles, etc.) and/or counts of values over or under a particular threshold.
The motivation to add this element to claim 1 would have been to easily identify data drift thereby indicating possible model accuracy deterioration. See Schierz abstract.
With respect to claim 6, Schierz teaches “6. The computer-implemented method of claim 1, further comprising: receiving a query dataset including a first data point” in Figs. 8, 9 and para. 87, 89; paras. 143-144 (Examiner finds the query dataset is the scoring dataset; scoring dataset contains at least one data point); para. 102;
“predicting the first data point to be within a first evenly occupied bucket” in Figs. 8, 9 para. 87, 89; and para. 143-144 (scoring data reflects points predicted to be placed in a particular bin (bucket);
“and responding to the query based on the predicted first evenly occupied bucket” in Figs. 8, 9; para. 87, 89; para. 143-144 (response is returned in that the datapoint is placed into a bin (bucket)).
The motivation to add this element to claim 1 would have been to easily identify data drift thereby indicating possible model accuracy deterioration. See Schierz abstract.
Claims 7 and 13 are rejected for the same reasons given above for claim 1.
Claims 8-9, 12; and 14-15; and 18 are rejected for the same reasons give above for claims 2-3 and 6 above.
Allowable Subject Matter
Claims 4-5, 10-11, and 16-17 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Reasons for Indicating Allowable Subject Matter
Examiner finds claims 4-5, 10-11, and 16-17 are patent eligible because they reflect an improvement disclosed in the specification at least in paragraphs 6, 38, 40, and 48.
Examiner finds claims the prior art of record fails to teach or suggest
“using intra-file parallelization by implementing a plurality of threads wherein each thread processes a part of the numerical data points and populates a respective set of buckets organized using logarithmic ranges; and merging the respective sets of buckets to generate the plurality of buckets” or “wherein the partitioning includes: using inter-file parallelization by assigning each file of dataset to a different set of threads to partition the dataset in parallel” as required by claims 4-5, 10-11, and 16-17.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALBERT M PHILLIPS, III whose telephone number is (571)270-3256. The examiner can normally be reached 10a-6:30pm EST M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann J Lo can be reached at (571) 272-9767. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ALBERT M PHILLIPS, III/ Primary Examiner, Art Unit 2159