DETAILED ACTION
This communication is a Non-Final Office Action rejection on the merits. Claims 1-20 are currently pending and have been addressed below.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.
Information Disclosure Statement (IDS)
The information disclosure statement filed on 11/26/2024 complies with the provisions of 37 CFR 1.97 and 1.98 and MPEP 609 and has been considered by the Examiner.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
The term “similar” in claims 6 and 14 is a relative term which renders the claims indefinite. The term “similar” is not defined by the claims, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. For examination purposes, the term “similar” has been construed as referring to an algorithm stored in a repository for a specific product and/or location (e.g., according to their stock keeping unit identification codes and/or retail stores).
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., an abstract idea) without reciting significantly more.
Independent Claim 1
Step One - First, pursuant to Step 1 of the January 2019 Revised Patent Subject Matter Eligibility Guidance (“2019 PEG”), 84 Fed. Reg. 50, claim 1 is directed to a method, which is a statutory category.
Step 2A, Prong One - Claim 1 recites: A method, comprising: receiving historical data associated with a retailer and a retail business problem, wherein the historical data comprises transaction data, and sale associated data of a predefined time period; processing the historical data based on the retail business problem to obtain a pre-processed historical data; generating one or more statistical features based on the pre-processed historical data using a feature generation technique; solving the retail business problem based on the one or more statistical features using one or more retail algorithms, wherein the one or more retail algorithms are accessed, and wherein each retail algorithm uses a set of parameters and hyperparameters; identifying a retail algorithm among the one or more retail algorithms that achieves highest accuracy, wherein the retail algorithm is identified by comparing accuracy measures of the one or more retail algorithms; determining whether an optimal result is obtained using the identified retail algorithm by comparing an output of the identified retail algorithm with a predefined output threshold, wherein the optimal result is obtained when the output of the identified retail algorithm is in a predefined range of the predefined output threshold; and fine-tuning a system to obtain an improved system upon determining that the optimal result is not obtained, wherein the improved system solves the retail business problem in an accurate manner. These claim elements are considered to be abstract ideas because they are directed to “mathematical concepts” which include “mathematical calculations.” In this case, determining an optimal result by comparing an output of the identified retail algorithm with a predefined output threshold is a mathematical calculation. 
If a claim limitation, under its broadest reasonable interpretation, covers managing interactions between people, then it falls within the “certain methods of organizing human activity” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A, Prong Two - The judicial exception is not integrated into a practical application. Claim 1 includes the following additional elements: one or more hardware processors; one or more processing techniques; one or more retail algorithms; an algorithm container; and an algorithm calibrator.
The processor is merely used to fetch and execute computer-readable instructions (Paragraph 0040). The processing technique is merely used to obtain pre-processed historical data (Paragraph 0045). The retail algorithm is merely used to solve the retail business problem based on the one or more statistical features (Paragraph 0006). The algorithm container is merely used to store one or more retail algorithms (Paragraph 0006). The algorithm calibrator is merely used to obtain an improved system upon determining that the optimal result is not obtained, wherein the improved system solves the retail business problem in an accurate manner (Paragraph 0006). Merely stating that a step is performed by a computer component amounts to “apply it” on a computer (MPEP 2106.05(f)). These elements of “processor,” “processing technique,” “retail algorithm,” “algorithm container,” and “algorithm calibrator” are recited at a high level of generality such that they amount to no more than mere instructions to apply the exception using generic computer elements. The processor is considered “field of use” since it is just used to receive historical data for analysis, but the technology is not improved (MPEP 2106.05(h)). Accordingly, alone and in combination, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. Therefore, the claim is directed to an abstract idea.
Step 2B - The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the claims describe how to generally “apply” the concept of identifying a retail algorithm among the one or more retail algorithms that achieves highest accuracy. The specification shows that the processor is merely used to fetch and execute computer-readable instructions (Paragraph 0040). The processing technique is merely used to obtain pre-processed historical data (Paragraph 0045). The retail algorithm is merely used to solve the retail business problem based on the one or more statistical features (Paragraph 0006). The algorithm container is merely used to store one or more retail algorithms (Paragraph 0006). The algorithm calibrator is merely used to obtain an improved system upon determining that the optimal result is not obtained, wherein the improved system solves the retail business problem in an accurate manner (Paragraph 0006). In this case, the fine-tuning/retraining of the machine learning model is recited at a high level of generality. For example, the plain meaning of the “retraining” step merely describes how the machine learning model receives continuous data to iteratively adjust its values/parameters to minimize a loss function (e.g., improve an accuracy score). See 2024 AI Guidance, Example 47, claim 2. Further, the step of “fine-tuning to obtain an improved system upon determining that the optimal result is not obtained” is considered a well-understood, routine, and conventional function since it is just “performing repetitive calculations” and “receiving or transmitting data over a network” (MPEP 2106.05(d)). Thus, nothing in the claim adds significantly more to the abstract idea. The claim is ineligible.
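To illustrate the high level of generality noted above, the recited fine-tuning reduces to a generic loop of repetitive calculations that adjusts a parameter until an error measure falls within a predefined threshold. The following minimal sketch is illustrative only; it is not of record, and all function names, the toy data, and the threshold value are hypothetical rather than drawn from the claims or the specification:

```python
# Illustrative sketch (not of record): threshold-gated fine-tuning as a
# generic loop of repetitive calculations on a one-parameter model.

def accuracy(weight, data):
    """Mean absolute error of a one-parameter linear model (lower is better)."""
    return sum(abs(y - weight * x) for x, y in data) / len(data)

def fine_tune(weight, data, threshold, lr=0.01, max_iter=1000):
    """Iteratively adjust `weight` until the error falls within the threshold."""
    for _ in range(max_iter):
        err = accuracy(weight, data)
        if err <= threshold:  # the "optimal result" is obtained; stop tuning
            break
        # Gradient step for mean |y - w*x|: d/dw = -x * sign(y - w*x), averaged.
        grad = sum(-x * (1 if y - weight * x > 0 else -1)
                   for x, y in data) / len(data)
        weight -= lr * grad
    return weight

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # toy data following y = 2x
w = fine_tune(0.0, data, threshold=0.05)
print(round(w, 1))  # prints: 2.0
```

The loop performs nothing beyond iterative arithmetic on generic data, which is the character of activity MPEP 2106.05(d) identifies as well-understood, routine, and conventional.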
Independent claim 9 is directed to a system at Step 1, which is a statutory category. Claim 9 recites similar limitations as claim 1 and is rejected for the same reasons at Step 2A, Prong One; Step 2A, Prong Two; and Step 2B. Claim 9 further recites a “memory” and a “communication interface,” which are treated as just an explicit “processor/computer” for storing and executing the operations and are treated under MPEP 2106.05(f) in the same manner as claim 1. Also, the communication interface is considered a well-understood, routine, and conventional function of “receiving or transmitting data over a network” (MPEP 2106.05(d)). Accordingly, these additional elements are viewed as “apply it on a computer” at Step 2A, Prong Two and Step 2B.
Independent claim 17 is directed to an article of manufacture at Step 1, which is a statutory category. Claim 17 recites similar limitations as claim 1 and is rejected for the same reasons at Step 2A, Prong One; Step 2A, Prong Two; and Step 2B. Claim 17 further recites a “non-transitory machine-readable information storage medium,” which is treated as just an explicit “processor/computer” for storing and executing the operations and is treated under MPEP 2106.05(f) in the same manner as claim 1. Accordingly, this additional element is viewed as “apply it on a computer” at Step 2A, Prong Two and Step 2B.
Dependent claims 2-4, 10-12, and 18 are not directed to any additional claim elements. Rather, these claims offer further descriptive limitations of elements found in the independent claims and addressed above - such as to: perform one or more data treatments on the one or more statistical features to obtain one or more transformed statistical features; and solve the retail business problem based on the one or more transformed statistical features using the one or more retail algorithms. In this case, the function of “transforming data” is merely used to remove outliers and/or fill missing values (Paragraph 0064). Transforming data in a data gathering step is not considered an eligible transformation (MPEP 2106.05(c)). Thus, nothing in the claim adds significantly more to the abstract idea. The claim is ineligible.
Dependent claims 5-8, 13-16, and 19-20 are not directed to any additional claim elements. Rather, these claims offer further descriptive limitations of elements found in the independent claims and addressed above - such as: wherein the optimal result comprises of the determined retail algorithm, the one or more statistical features, and the set of parameters and the hyperparameters used in the determined retail algorithm; wherein the algorithm calibrator repository uses an optimal result for validation of a similar business problem for a new retailer; wherein the algorithm calibrator repository includes a value suggested by at least one subject matter expert; and wherein the one or more retail algorithms comprise one or more of open-source retail algorithms, licensed retail algorithms, and customized retail algorithms. In this case, the algorithm calibrator is merely used to store the optimal results based on an optimal value suggested by at least one subject matter expert. However, using a database is considered “field of use” (MPEP 2106.05(h)) at Step 2A, Prong Two, since the database is not improved; the data is just placed there. At Step 2B, this remains the well-understood, routine, and conventional activity of storing information in a memory (MPEP 2106.05(d)). Thus, nothing in the claims adds significantly more to the abstract idea. The claims are ineligible.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Khanafer et al., in view of Joseph et al. (US 2020/0184494 A1).
Regarding claim 1, Khanafer et al. discloses a processor implemented method, comprising (Abstract, Systems and methods for dynamic demand sensing in a supply chain in which constantly-updated data is used to select a machine learning model or retrain a pre-selected machine learning model, for forecasting sales of a product at a specific location. The updated data includes product information and geographic information. Also disclosed are systems and methods relating to demand forecasting and readjusting forecasts based on forecast error; Figure 2 and related text in Paragraph 0062, item 204, Processor):
receiving, via one or more hardware processors, historical data associated with a retailer and a retail business problem, wherein the historical data comprises transaction data, and sale associated data of a predefined time period (Figure 2 and related text in Paragraph 0062, item 204, Processor; Paragraph 0049, Historical data may be collected from a variety of sources. For example, data may be collected from a client/user that includes historical plus forwarding looking data such as campaigns. In some embodiments, historical client data can include point-of-sales data that provides information on the amount of product sold at a particular day at a particular location; and inventory of a particular product at a particular location. Other types of data can be mined from the web and social media, such as weather data, financial markets, and the like. Calendar data that includes local holidays, along with local event data may also be collected. Promotion campaign details for a particular product at a particular location can also be included, and other relevant events. In summary, any information that relates to, or impacts upon, the sales of a particular product at a particular location, can be used as part of the input dataset; Paragraph 0150, At block 1402, Method 1400 includes collecting historical data during a first time interval. For example, the retailer transmits sales data of the first product on a daily basis corresponding to each store location from Client server 1512 to Data store 1510. The sales data is collected and stored daily in Data store 1510 over a first time interval for future use. In this example, the first time interval includes 1 year. In practice, however, sales data may be collected over any time interval, e.g., weeks, months, years, etc. 
In this example, daily weather data corresponding to each day of that 1 year period is also collected; As stated in Paragraph 0036 of Applicant’s specification, sale associated data may include promotion details and weather);
processing, via the one or more hardware processors, the historical data based on the retail business problem using one or more processing techniques to obtain a pre-processed historical data (Figure 2 and related text in Paragraph 0062, item 204, Processor; Paragraph 0082, External data module 110 fetches data (at block 602) from external data source(s) 108 which can include raw data about weather, market indices, trends, etc. The external data source(s) 108 provide data that complements client data source 102 (of FIG. 1). The raw data is cleaned (or validated) to remove outliers, and transformed (at block 604) for storage, at block 606, in the ML storage 106; Paragraph 0083, Pre-processing may include transformation, validation, remediation, or any combination thereof, of the data; As stated in Paragraph 0045 of Applicant’s specification, processing techniques may include data cleaning).
generating, via the one or more hardware processors, one or more statistical features based on the pre-processed historical data using a feature generation technique (Figure 2 and related text in Paragraph 0062, item 204, Processor; Paragraph 0075, FIG. 4 illustrates a transformation examples 400 in accordance with one embodiment. Examples of features 402 can include data related to: point of sales, weather, events/holidays, market index, web traffic and promotions. Features 402 may include additional categories of data, fewer, or different categories than those shown in FIG. 4.; Paragraph 0076, Example 1 404, shows how data related to a rare event, which is in binary form, is transformed to a form that includes integers, by specifying the number of days to the event. For example, the rare event can have the value ‘0’ to indicate the day a store is open (e.g. Mon-Sat) and ‘1’ to indicate the day a store is closed (e.g. Sunday). The series of ‘0’s and ‘1’s is transformed, instead, to a series of integers that indicate how many days away that a given day is to the rare event; Paragraph 0077, Example 2 406 shows an example of transforming consecutive dates to a tabular form that lists year (in one row); month (in a second row) and date (in the third row); Paragraph 0078, Example 3 408 shows an example of transforming temperature values on certain dates, to temperature values in relation to the lowest temperature reading (6° C.). The original 6° C. reading is transformed to ‘0’; 7° C. to ‘1’; 8° C. to ‘2’, and so forth. Graphical representations of transformations are discussed below; As stated in Paragraph 0046 of Applicant’s specification, statistical features may include weather. Also, the generation technique used for generating features includes transforming data);
solving, via the one or more hardware processors, the retail business problem based on the one or more statistical features using one or more retail algorithms, wherein the one or more retail algorithms are accessed from an algorithm container, and wherein each retail algorithm uses a set of parameters and hyperparameters (Figure 2 and related text in Paragraph 0062, item 204, Processor; Paragraph 0047, The demand sensing method can provide predicted daily sales for a single products (for example, according to their stock keeping unit (SKU) identification codes) for single locations (e.g. retail stores) over some horizon (e.g. 13 weeks ahead) for a variety of purposes, including: allowance by the user to use the predictions to drive replenishment orders at the defined locations; and gaining an analytical understanding of the factors driving the predicted sales in order to plan for the future; Paragraph 0048, The data processing services are composed of various components of a machine learning pipeline. Per user request, features may be generated from the raw user-specific and public datasets. Then one or more quantile regression models can be trained with these features. Selection of features and hyperparameters can be achieved through the evaluation of each model on the same validation set. The evaluation comprises managing a simulated inventory for the period of time equivalent to the validation set, where orders are given based on simple heuristics and key performance metrics are measured, such as excessive inventory over a period of time and number of stock out days. Once a model is chosen (for best performance for an item and store combination), the contribution of each feature (on the demand predictions) may be evaluated through model interpretation techniques (e.g. SHapley Additive exPlantions). 
In a last step, data related to predictions, prediction quality, and prediction contributions may be gathered and illustrated to the user by a number of interactive visualizations that are found in user-application interfaces mentioned above; Paragraph 0103, If this is not the first time a forecasting request for this particular product and location is made, then monitor module 112 checks the ML storage 106 to see if any new class of relevant signal data has been added since the last forecast request for the particular product and location, at block 1008. If the answer is yes, then monitor module 112 flags the request to undergo a full model selection process at block 1006, which is subsequently sent to forecasting module 114 (see FIG. 9); Paragraph 0110, If the time threshold is not surpassed, monitor module 112 proceeds to instruct forecasting module 114 to forecast using the current model at block 1018, without any retraining; Examiner interprets the ML storage as the algorithm container since the ML storage stores one or more retail algorithms for a specific product and location);
identifying, via the one or more hardware processors, a retail algorithm among the one or more retail algorithms that achieves highest accuracy, wherein the retail algorithm is identified by comparing accuracy measures of the one or more retail algorithms (Figure 2 and related text in Paragraph 0062, item 204, Processor; Paragraph 0107, If the answer at block 1010 is no, monitor module 112 proceeds to block 1012 to evaluate the performance of the machine learning model used in the previous forecast. With reference to FIG. 9, once the forecasting module 114 provides a forecast, the forecast is stored in the ML storage 106. Monitor module 112 evaluates the forecast on an ongoing basis by comparing the forecasted values with the actual values as the latter are uploaded to ML storage 106 on an ongoing basis. Evaluation methods known in the art may be used to evaluate the accuracy of the forecasted values, and a criterion may be selected to determine whether or not the forecast remains viable. In some embodiments, the evaluation method can be selected from mean absolute percentage error (MAPE); mean absolute scaled error (MASE), mean absolute error (MAE), and Weighted Mean Absolute Percentage Error (WMAPE). If the forecast is not deemed viable, then monitor module 112 flags the request to undergo a full model selection process at block 1006, which is subsequently sent to forecasting module 114 (see FIG. 9); Paragraph 0137, A number of ML models, such as gradient-boosted trees, ensemble of trees and support vector regression, were used during the initial training set. A gradient-boosted tree model, Light GBM, was selected during validation, and retrained on the dataset from September 2016 to Jan. 15, 2018. In this example, all the data, except for the last 20%, was used for training the selected model. In some embodiments, the testing dataset may be the smaller of the dataset of the period of the last 10-20 weeks and the last 20% of the entire dataset. 
In some embodiments, where the historical data set spans 1 year (52 weeks), the training/validation period can be 40-42 weeks, with remaining 10-12 weeks used for testing the selected model. In some embodiments, a nested validation scheme can be used. The best ML model may be selected according to a configuration set by the user, or any standard criteria such as MASE, MAE, WMAPE (Weighted Mean Absolute Percentage Error), etc.);
determining, via the one or more hardware processors, whether an optimal result is obtained using the identified retail algorithm by comparing an output of the identified retail algorithm with a predefined output threshold, wherein the optimal result is obtained when the output of the identified retail algorithm is in a predefined range of the predefined output threshold (Figure 2 and related text in Paragraph 0062, item 204, Processor; Paragraph 0107, If the answer at block 1010 is no, monitor module 112 proceeds to block 1012 to evaluate the performance of the machine learning model used in the previous forecast. With reference to FIG. 9, once the forecasting module 114 provides a forecast, the forecast is stored in the ML storage 106. Monitor module 112 evaluates the forecast on an ongoing basis by comparing the forecasted values with the actual values as the latter are uploaded to ML storage 106 on an ongoing basis. Evaluation methods known in the art may be used to evaluate the accuracy of the forecasted values, and a criterion may be selected to determine whether or not the forecast remains viable. In some embodiments, the evaluation method can be selected from mean absolute percentage error (MAPE); mean absolute scaled error (MASE), mean absolute error (MAE), and Weighted Mean Absolute Percentage Error (WMAPE). If the forecast is not deemed viable, then monitor module 112 flags the request to undergo a full model selection process at block 1006, which is subsequently sent to forecasting module 114 (see FIG. 9); Paragraph 0137, The best ML model may be selected according to a configuration set by the user, or any standard criteria such as MASE, MAE, WMAPE (Weighted Mean Absolute Percentage Error), etc.; Examiner interprets the criterion selected to determine whether or not the forecast remains viable as the predefined output threshold);
and fine-tuning, via the one or more hardware processors, a system using an algorithm calibrator to obtain an improved system upon determining [an expanded data set], wherein the improved system solves the retail business problem in an accurate manner (Figure 2 and related text in Paragraph 0062, item 204, Processor; Paragraph 0107, If the answer at block 1010 is no, monitor module 112 proceeds to block 1012 to evaluate the performance of the machine learning model used in the previous forecast. With reference to FIG. 9, once the forecasting module 114 provides a forecast, the forecast is stored in the ML storage 106. Monitor module 112 evaluates the forecast on an ongoing basis by comparing the forecasted values with the actual values as the latter are uploaded to ML storage 106 on an ongoing basis. Evaluation methods known in the art may be used to evaluate the accuracy of the forecasted values, and a criterion may be selected to determine whether or not the forecast remains viable. In some embodiments, the evaluation method can be selected from mean absolute percentage error (MAPE); mean absolute scaled error (MASE), mean absolute error (MAE), and Weighted Mean Absolute Percentage Error (WMAPE). If the forecast is not deemed viable, then monitor module 112 flags the request to undergo a full model selection process at block 1006, which is subsequently sent to forecasting module 114 (see FIG. 9); Paragraph 0121, Forecasting module 114 receives instructions from monitor module 112, as shown in FIG. 9, to either select a model (block 902), train/retrain (block 904), or forecast (block 906). In FIG. 12, block series 1222 describes a flowchart of the model selection process 1202 in an embodiment; block series 1224 describes a flowchart of the training process 1212 in an embodiment, and block 1220 refers to the forecasting of the trained ML model.; Paragraph 0126, Retraining of a selected ML model is described in block series 1224, in accordance with one embodiment. 
A selected ML model is first retrained on an expanded dataset at block 1214; it then makes a forecast corresponding to the period of a testing portion at block 1216, and its accuracy is evaluated, based on its performance in the testing portion, at block 1218. Details of the training/retraining vary slightly, depending on where in the overall process of FIG. 10, the selected model is being trained—within a model selection process (i.e. in block 1006, block 1006, ML storage 106 or 618); or within a retraining process alone (i.e. Block 1006)).
Although Khanafer et al. discloses evaluating accuracy of the model and fine-tuning/retraining the model upon determining that there’s new data available (e.g., retrain when there’s an expanded data set), Khanafer et al. does not specifically disclose fine-tuning/retraining the model upon determining that the optimal result is not obtained (e.g., accuracy below a threshold).
However, Joseph et al. discloses and fine-tuning, via the one or more hardware processors, a system using an algorithm calibrator to obtain an improved system upon determining that the optimal result is not obtained, wherein the improved system solves the retail business problem in an accurate manner (Paragraph 0043, The observer 325 can compare the demand forecasts output by the model 330 against the actual historical data once it is received to determine how accurate the demand forecast results were, and to adjust the machine learning model/algorithm in response thereto as appropriate. The observer unit 325 communicates with the model (re-)training component 111 via connection 328 to train/retrain the selected model 330 whenever the accuracy of the demand forecasts output by the model 330 falls to a specified level or threshold (or when the demand forecast satisfies some other user-configurable criteria). The level at which the tripping point is triggered can be configurable by users).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the method for fine-tuning/retraining and identifying a retail algorithm among the one or more retail algorithms that achieves highest accuracy (e.g., fine-tuning/retraining in response to detecting that there is new data) of the invention of Khanafer et al. to further incorporate wherein the fine-tuning/retraining is performed upon determining that the optimal result is not obtained (e.g., accuracy below a threshold), as taught by Joseph et al., because doing so would allow the method to retrain the selected model whenever the accuracy of the demand forecasts output by the model falls to a specified level or threshold (see Joseph et al., Paragraph 0043). Further, the claimed invention is merely a combination of old elements, and in combination each element would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
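The combined teaching can be summarized as: select the candidate model with the best accuracy measure (Khanafer et al.'s selection by criteria such as WMAPE) and flag retraining when that measure exceeds a configurable threshold (Joseph et al.'s threshold-triggered retraining). The sketch below is illustrative only and not of record; the function names, toy forecast values, and the 10% threshold are hypothetical and not taken from either reference:

```python
# Illustrative sketch (not of record): pick the forecasting model with the
# lowest WMAPE, then flag retraining when its error exceeds a threshold.

def wmape(actual, forecast):
    """Weighted Mean Absolute Percentage Error (lower is better)."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / sum(abs(a) for a in actual)

def select_best(models, actual):
    """Return the name of the model whose forecast achieves the lowest WMAPE."""
    return min(models, key=lambda name: wmape(actual, models[name]))

def needs_retraining(actual, forecast, threshold=0.10):
    """Flag a full model selection / retraining pass when error exceeds the threshold."""
    return wmape(actual, forecast) > threshold

actual = [100, 120, 90]                                # observed sales
models = {"gbm": [98, 118, 93], "svr": [110, 100, 80]}  # candidate forecasts
best = select_best(models, actual)
print(best, needs_retraining(actual, models[best]))  # prints: gbm False
```

Here "gbm" wins with a WMAPE of roughly 2.3%, and since that is below the 10% threshold no retraining is flagged; had the "svr" forecasts (about 12.9% WMAPE) been in use, the retraining pass would be triggered.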
Regarding claim 9, Khanafer et al. discloses a system, comprising: a memory storing instructions; one or more communication interfaces; and one or more hardware processors coupled to the memory via the one or more communication interfaces, wherein the one or more hardware processors are configured by the instructions to (Abstract, Systems and methods for dynamic demand sensing in a supply chain in which constantly-updated data is used to select a machine learning model or retrain a pre-selected machine learning model, for forecasting sales of a product at a specific location. The updated data includes product information and geographic information. Also disclosed are systems and methods relating to demand forecasting and readjusting forecasts based on forecast error; Figure 2 and related text in Paragraph 0062, System 200 includes a system server 202, ML storage 106, client data source 102 and external data source(s) 108. System server 202 can include a memory 206, a disk 208, a processor 204 and a dynamic demand sensing module 120. While one processor 204 is shown, the system server 202 can comprise one or more processors. In some embodiments, memory 206 can be volatile memory, compared with disk 208 which can be non-volatile memory. In some embodiments, system server 202 can communicate with ML storage 106, external data source(s) 108 and client data source 102 via network 210):
receive historical data associated with a retailer and a retail business problem, wherein the historical data comprises transaction data, and sale associated data of a predefined time period (Paragraph 0049, Historical data may be collected from a variety of sources. For example, data may be collected from a client/user that includes historical plus forwarding looking data such as campaigns. In some embodiments, historical client data can include point-of-sales data that provides information on the amount of product sold at a particular day at a particular location; and inventory of a particular product at a particular location. Other types of data can be mined from the web and social media, such as weather data, financial markets, and the like. Calendar data that includes local holidays, along with local event data may also be collected. Promotion campaign details for a particular product at a particular location can also be included, and other relevant events. In summary, any information that relates to, or impacts upon, the sales of a particular product at a particular location, can be used as part of the input dataset; Paragraph 0150, At block 1402, Method 1400 includes collecting historical data during a first time interval. For example, the retailer transmits sales data of the first product on a daily basis corresponding to each store location from Client server 1512 to Data store 1510. The sales data is collected and stored daily in Data store 1510 over a first time interval for future use. In this example, the first time interval includes 1 year. In practice, however, sales data may be collected over any time interval, e.g., weeks, months, years, etc. In this example, daily weather data corresponding to each day of that 1 year period is also collected; As stated in Paragraph 0036 of Applicant’s specification, sale associated data may include promotion details and weather);
process the historical data based on the retail business problem using one or more processing techniques to obtain a pre-processed historical data (Paragraph 0082, External data module 110 fetches data (at block 602) from external data source(s) 108 which can include raw data about weather, market indices, trends, etc. The external data source(s) 108 provide data that complements client data source 102 (of FIG. 1). The raw data is cleaned (or validated) to remove outliers, and transformed (at block 604) for storage, at block 606, in the ML storage 106; Paragraph 0083, Pre-processing may include transformation, validation, remediation, or any combination thereof, of the data; As stated in Paragraph 0045 of Applicant’s specification, processing techniques may include data cleaning);
generate one or more statistical features based on the pre-processed historical data using a feature generation technique (Paragraph 0075, FIG. 4 illustrates a transformation examples 400 in accordance with one embodiment. Examples of features 402 can include data related to: point of sales, weather, events/holidays, market index, web traffic and promotions. Features 402 may include additional categories of data, fewer, or different categories than those shown in FIG. 4.; Paragraph 0076, Example 1 404, shows how data related to a rare event, which is in binary form, is transformed to a form that includes integers, by specifying the number of days to the event. For example, the rare event can have the value ‘0’ to indicate the day a store is open (e.g. Mon-Sat) and ‘1’ to indicate the day a store is closed (e.g. Sunday). The series of ‘0’s and ‘1’s is transformed, instead, to a series of integers that indicate how many days away that a given day is to the rare event; Paragraph 0077, Example 2 406 shows an example of transforming consecutive dates to a tabular form that lists year (in one row); month (in a second row) and date (in the third row); Paragraph 0078, Example 3 408 shows an example of transforming temperature values on certain dates, to temperature values in relation to the lowest temperature reading (6° C.). The original 6° C. reading is transformed to ‘0’; 7° C. to ‘1’; 8° C. to ‘2’, and so forth. Graphical representations of transformations are discussed below; As stated in Paragraph 0046 of Applicant’s specification, statistical features may include weather. Also, the generation technique used for generating features includes transforming data);
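For illustration only, the feature transformations described in Paragraphs 0076 and 0078 of Khanafer et al. (binary rare-event flags converted to days-until-event, and temperatures converted to offsets from the lowest reading) can be sketched as follows. This sketch is not part of the cited reference; the function names are illustrative assumptions.

```python
# Illustrative sketch of the transformations in Khanafer, Paras. 0076 and 0078.

def days_to_event(flags):
    """Transform a binary rare-event series (1 = event day) into integers
    giving, for each day, the number of days until the next event."""
    out = []
    next_event = None
    for i in range(len(flags) - 1, -1, -1):  # scan backward to find next event
        if flags[i] == 1:
            next_event = i
        out.append(next_event - i if next_event is not None else None)
    return list(reversed(out))

def offset_from_min(temps):
    """Transform temperature readings into offsets from the lowest reading,
    so the minimum becomes 0 (e.g., 6, 7, 8 -> 0, 1, 2)."""
    lo = min(temps)
    return [t - lo for t in temps]

# Example: store closed on day 7 (the rare event), open the six prior days.
print(days_to_event([0, 0, 0, 0, 0, 0, 1]))  # [6, 5, 4, 3, 2, 1, 0]
print(offset_from_min([6, 7, 8]))            # [0, 1, 2]
```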
solve the retail business problem based on the one or more statistical features using one or more retail algorithms, wherein the one or more retail algorithms are accessed from an algorithm container, and wherein each retail algorithm uses a set of parameters and hyperparameters (Figure 2 and related text in Paragraph 0062, item 204, Processor; Paragraph 0047, The demand sensing method can provide predicted daily sales for a single products (for example, according to their stock keeping unit (SKU) identification codes) for single locations (e.g. retail stores) over some horizon (e.g. 13 weeks ahead) for a variety of purposes, including: allowance by the user to use the predictions to drive replenishment orders at the defined locations; and gaining an analytical understanding of the factors driving the predicted sales in order to plan for the future; Paragraph 0048, The data processing services are composed of various components of a machine learning pipeline. Per user request, features may be generated from the raw user-specific and public datasets. Then one or more quantile regression models can be trained with these features. Selection of features and hyperparameters can be achieved through the evaluation of each model on the same validation set. The evaluation comprises managing a simulated inventory for the period of time equivalent to the validation set, where orders are given based on simple heuristics and key performance metrics are measured, such as excessive inventory over a period of time and number of stock out days. Once a model is chosen (for best performance for an item and store combination), the contribution of each feature (on the demand predictions) may be evaluated through model interpretation techniques (e.g. SHapley Additive exPlantions). 
In a last step, data related to predictions, prediction quality, and prediction contributions may be gathered and illustrated to the user by a number of interactive visualizations that are found in user-application interfaces mentioned above; Paragraph 0103, If this is not the first time a forecasting request for this particular product and location is made, then monitor module 112 checks the ML storage 106 to see if any new class of relevant signal data has been added since the last forecast request for the particular product and location, at block 1008. If the answer is yes, then monitor module 112 flags the request to undergo a full model selection process at block 1006, which is subsequently sent to forecasting module 114 (see FIG. 9); Paragraph 0110, If the time threshold is not surpassed, monitor module 112 proceeds to instruct forecasting module 114 to forecast using the current model at block 1018, without any retraining; Examiner interprets the ML storage as the algorithm container since the ML storage stores one or more retail algorithms for a specific product and location);
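The Examiner's interpretation of the ML storage as the algorithm container (storing algorithms per specific product and location, cf. Khanafer, Paragraph 0103) can be sketched, for illustration only, as a repository keyed by the (SKU, store) combination. All class and identifier names below are assumptions, not drawn from the cited references.

```python
# Illustrative sketch: ML storage as a per-(product, location) model container.

class MLStorage:
    def __init__(self):
        self._models = {}

    def store(self, sku, store_id, model):
        """Store a trained model for a specific product/location combination."""
        self._models[(sku, store_id)] = model

    def fetch(self, sku, store_id):
        """Return the stored model, or None on a first-time request, which
        would trigger the full model selection process (cf. Para. 0103)."""
        return self._models.get((sku, store_id))

storage = MLStorage()
print(storage.fetch("SKU-123", "store-7"))       # None -> full model selection
storage.store("SKU-123", "store-7", "lightgbm")
print(storage.fetch("SKU-123", "store-7"))       # lightgbm
```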
identify a retail algorithm among the one or more retail algorithms that achieves highest accuracy, wherein the retail algorithm is identified by comparing accuracy measures of the one or more retail algorithms (Figure 2 and related text in Paragraph 0062, item 204, Processor; Paragraph 0107, If the answer at block 1010 is no, monitor module 112 proceeds to block 1012 to evaluate the performance of the machine learning model used in the previous forecast. With reference to FIG. 9, once the forecasting module 114 provides a forecast, the forecast is stored in the ML storage 106. Monitor module 112 evaluates the forecast on an ongoing basis by comparing the forecasted values with the actual values as the latter are uploaded to ML storage 106 on an ongoing basis. Evaluation methods known in the art may be used to evaluate the accuracy of the forecasted values, and a criterion may be selected to determine whether or not the forecast remains viable. In some embodiments, the evaluation method can be selected from mean absolute percentage error (MAPE); mean absolute scaled error (MASE), mean absolute error (MAE), and Weighted Mean Absolute Percentage Error (WMAPE). If the forecast is not deemed viable, then monitor module 112 flags the request to undergo a full model selection process at block 1006, which is subsequently sent to forecasting module 114 (see FIG. 9); Paragraph 0137, A number of ML models, such as gradient-boosted trees, ensemble of trees and support vector regression, were used during the initial training set. A gradient-boosted tree model, Light GBM, was selected during validation, and retrained on the dataset from September 2016 to Jan. 15, 2018. In this example, all the data, except for the last 20%, was used for training the selected model. In some embodiments, the testing dataset may be the smaller of the dataset of the period of the last 10-20 weeks and the last 20% of the entire dataset. 
In some embodiments, where the historical data set spans 1 year (52 weeks), the training/validation period can be 40-42 weeks, with remaining 10-12 weeks used for testing the selected model. In some embodiments, a nested validation scheme can be used. The best ML model may be selected according to a configuration set by the user, or any standard criteria such as MASE, MAE, WMAPE (Weighted Mean Absolute Percentage Error), etc.);
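The model-selection criterion described in Paragraphs 0107 and 0137 (comparing candidate models by a standard error measure, e.g., MAE or WMAPE, on a common validation set) can be sketched as follows. This is an illustrative sketch only; the model names and values are hypothetical.

```python
# Illustrative sketch of selecting the lowest-error model on a common
# validation set, using error measures named in Khanafer, Para. 0137.

def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def wmape(actual, forecast):
    """Weighted mean absolute percentage error."""
    return (sum(abs(a - f) for a, f in zip(actual, forecast))
            / sum(abs(a) for a in actual))

def select_best(models, actual, metric=wmape):
    """models: mapping of model name -> forecasted values on the validation set.
    Returns the name of the model with the lowest error."""
    return min(models, key=lambda name: metric(actual, models[name]))

actual = [100, 120, 90]
models = {"lightgbm": [98, 118, 93], "svr": [110, 100, 80]}
print(select_best(models, actual))  # lightgbm
```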
determine whether an optimal result is obtained using the identified retail algorithm by comparing an output of the identified retail algorithm with a predefined output threshold, wherein the optimal result is obtained when the output of the identified retail algorithm is in a predefined range of the predefined output threshold (Figure 2 and related text in Paragraph 0062, item 204, Processor; Paragraph 0107, If the answer at block 1010 is no, monitor module 112 proceeds to block 1012 to evaluate the performance of the machine learning model used in the previous forecast. With reference to FIG. 9, once the forecasting module 114 provides a forecast, the forecast is stored in the ML storage 106. Monitor module 112 evaluates the forecast on an ongoing basis by comparing the forecasted values with the actual values as the latter are uploaded to ML storage 106 on an ongoing basis. Evaluation methods known in the art may be used to evaluate the accuracy of the forecasted values, and a criterion may be selected to determine whether or not the forecast remains viable. In some embodiments, the evaluation method can be selected from mean absolute percentage error (MAPE); mean absolute scaled error (MASE), mean absolute error (MAE), and Weighted Mean Absolute Percentage Error (WMAPE). If the forecast is not deemed viable, then monitor module 112 flags the request to undergo a full model selection process at block 1006, which is subsequently sent to forecasting module 114 (see FIG. 9); Paragraph 0137, The best ML model may be selected according to a configuration set by the user, or any standard criteria such as MASE, MAE, WMAPE (Weighted Mean Absolute Percentage Error), etc.; Examiner interprets the criterion selected to determine whether or not the forecast remains viable as the predefined output threshold);
and fine-tune a system using an algorithm calibrator to obtain an improved system upon determining [an expanded data set], wherein the improved system solves the retail business problem in an accurate manner (Figure 2 and related text in Paragraph 0062, item 204, Processor; Paragraph 0107, If the answer at block 1010 is no, monitor module 112 proceeds to block 1012 to evaluate the performance of the machine learning model used in the previous forecast. With reference to FIG. 9, once the forecasting module 114 provides a forecast, the forecast is stored in the ML storage 106. Monitor module 112 evaluates the forecast on an ongoing basis by comparing the forecasted values with the actual values as the latter are uploaded to ML storage 106 on an ongoing basis. Evaluation methods known in the art may be used to evaluate the accuracy of the forecasted values, and a criterion may be selected to determine whether or not the forecast remains viable. In some embodiments, the evaluation method can be selected from mean absolute percentage error (MAPE); mean absolute scaled error (MASE), mean absolute error (MAE), and Weighted Mean Absolute Percentage Error (WMAPE). If the forecast is not deemed viable, then monitor module 112 flags the request to undergo a full model selection process at block 1006, which is subsequently sent to forecasting module 114 (see FIG. 9); Paragraph 0121, Forecasting module 114 receives instructions from monitor module 112, as shown in FIG. 9, to either select a model (block 902), train/retrain (block 904), or forecast (block 906). In FIG. 12, block series 1222 describes a flowchart of the model selection process 1202 in an embodiment; block series 1224 describes a flowchart of the training process 1212 in an embodiment, and block 1220 refers to the forecasting of the trained ML model.; Paragraph 0126, Retraining of a selected ML model is described in block series 1224, in accordance with one embodiment. 
A selected ML model is first retrained on an expanded dataset at block 1214; it then makes a forecast corresponding to the period of a testing portion at block 1216, and its accuracy is evaluated, based on its performance in the testing portion, at block 1218. Details of the training/retraining vary slightly, depending on where in the overall process of FIG. 10, the selected model is being trained—within a model selection process (i.e. in block 1006, block 1006, ML storage 106 or 618); or within a retraining process alone (i.e. Block 1006)).
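The monitor behavior described in Paragraphs 0107 and 0126 (evaluating the forecast against actual values under a selected viability criterion, and either flagging a full model selection or retraining the selected model on the expanded dataset) can be sketched, under illustrative assumptions, as a simple threshold check. The threshold value and function names below are hypothetical.

```python
# Illustrative sketch of the monitor logic in Khanafer, Paras. 0107 and 0126.

def wmape(actual, forecast):
    """Weighted mean absolute percentage error."""
    return (sum(abs(a - f) for a, f in zip(actual, forecast))
            / sum(abs(a) for a in actual))

def monitor_forecast(actual, forecast, viability_threshold=0.15):
    """Return the next action for the forecasting module: a full model
    selection if the forecast is no longer viable, otherwise retraining of
    the selected model on the expanded dataset."""
    error = wmape(actual, forecast)
    if error > viability_threshold:
        return "full_model_selection"
    return "retrain_on_expanded_dataset"

print(monitor_forecast([100, 100], [99, 101]))  # retrain_on_expanded_dataset
print(monitor_forecast([100, 100], [60, 150]))  # full_model_selection
```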
Although Khanafer et al. discloses evaluating the accuracy of the model and fine-tuning/retraining the model upon determining that new data is available (e.g., retraining when there is an expanded data set), Khanafer et al. does not specifically disclose fine-tuning/retraining the model upon determining that the optimal result is not obtained (e.g., accuracy below a threshold).
However, Joseph et al. discloses and fine-tune a system using an algorithm calibrator to obtain an improved system upon determining that the optimal result is not obtained, wherein the improved system solves the retail business problem in an accurate manner (Paragraph 0043, The observer 325 can compare the demand forecasts output by the model 330 against the actual historical data once it is received to determine how accurate the demand forecast results were, and to adjust the machine learning model/algorithm in response thereto as appropriate. The observer unit 325 communicates with the model (re-)training component 111 via connection 328 to train/retrain the selected model 330 whenever the accuracy of the demand forecasts output by the model 330 falls to a specified level or threshold (or when the demand forecast satisfies some other user-configurable criteria). The level at which the tripping point is triggered can be configurable by users).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the method for fine-tuning/retraining and identifying a retail algorithm among the one or more retail algorithms that achieves highest accuracy (e.g., fine-tuning/retraining in response to detecting that there is new data) of the invention of Khanafer et al. to further incorporate wherein the fine-tuning/retraining is upon determining that the optimal result is not obtained (e.g., accuracy below a threshold) of the invention of Joseph et al., because doing so would allow the method to retrain the selected model whenever the accuracy of the demand forecasts output by the model falls to a specified level or threshold (see Joseph et al., Paragraph 0043). Further, the claimed invention is merely a combination of old elements; in combination, each element would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
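The observer behavior relied upon from Paragraph 0043 of Joseph et al. (comparing demand forecasts against actual history and triggering retraining whenever accuracy falls to a user-configurable trip point) can be sketched as follows. This is an illustrative sketch only; the class name, the accuracy definition (1 − WMAPE), and the trip-point values are assumptions.

```python
# Illustrative sketch of the observer/trip-point behavior in Joseph, Para. 0043.

class Observer:
    def __init__(self, trip_point=0.80):
        # The trip point is configurable by the user (see Joseph, Para. 0043).
        self.trip_point = trip_point

    def accuracy(self, actual, forecast):
        """Accuracy taken here as 1 - WMAPE, floored at 0 for large errors."""
        err = (sum(abs(a - f) for a, f in zip(actual, forecast))
               / sum(abs(a) for a in actual))
        return max(0.0, 1.0 - err)

    def should_retrain(self, actual, forecast):
        """Trigger retraining when accuracy falls below the trip point."""
        return self.accuracy(actual, forecast) < self.trip_point

obs = Observer(trip_point=0.90)
print(obs.should_retrain([100, 100], [99, 101]))  # False (accuracy 0.99)
print(obs.should_retrain([100, 100], [70, 140]))  # True  (accuracy 0.65)
```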
Regarding claim 17, Khanafer et al. discloses one or more non-transitory machine-readable information storage mediums comprising one or more instructions which when executed by one or more hardware processors cause (Paragraph 0062, System 200 includes a system server 202, ML storage 106, client data source 102 and external data source(s) 108. System server 202 can include a memory 206, a disk 208, a processor 204 and a dynamic demand sensing module 120. While one processor 204 is shown, the system server 202 can comprise one or more processors. In some embodiments, memory 206 can be volatile memory, compared with disk 208 which can be non-volatile memory; Paragraph 0063, System 200 can also include additional features and/or functionality. For example, system 200 can also include additional storage (removable and/or non-removable) including, but not limited to, magnetic or optical disks or tape. Such additional storage is illustrated in FIG. 2 by memory 206 and disk 208. Storage media can include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Memory 206 and disk 208 are examples of non-transitory computer-readable storage media):
receiving historical data associated with a retailer and a retail business problem, wherein the historical data comprises transaction data, and sale associated data of a predefined time period (Paragraph 0049, Historical data may be collected from a variety of sources. For example, data may be collected from a client/user that includes historical plus forwarding looking data such as campaigns. In some embodiments, historical client data can include point-of-sales data that provides information on the amount of product sold at a particular day at a particular location; and inventory of a particular product at a particular location. Other types of data can be mined from the web and social media, such as weather data, financial markets, and the like. Calendar data that includes local holidays, along with local event data may also be collected. Promotion campaign details for a particular product at a particular location can also be included, and other relevant events. In summary, any information that relates to, or impacts upon, the sales of a particular product at a particular location, can be used as part of the input dataset; Paragraph 0150, At block 1402, Method 1400 includes collecting historical data during a first time interval. For example, the retailer transmits sales data of the first product on a daily basis corresponding to each store location from Client server 1512 to Data store 1510. The sales data is collected and stored daily in Data store 1510 over a first time interval for future use. In this example, the first time interval includes 1 year. In practice, however, sales data may be collected over any time interval, e.g., weeks, months, years, etc. In this example, daily weather data corresponding to each day of that 1 year period is also collected; As stated in Paragraph 0036 of Applicant’s specification, sale associated data may include promotion details and weather);
processing the historical data based on the retail business problem using one or more processing techniques to obtain a pre-processed historical data (Paragraph 0082, External data module 110 fetches data (at block 602) from external data source(s) 108 which can include raw data about weather, market indices, trends, etc. The external data source(s) 108 provide data that complements client data source 102 (of FIG. 1). The raw data is cleaned (or validated) to remove outliers, and transformed (at block 604) for storage, at block 606, in the ML storage 106; Paragraph 0083, Pre-processing may include transformation, validation, remediation, or any combination thereof, of the data; As stated in Paragraph 0045 of Applicant’s specification, processing techniques may include data cleaning);
generating one or more statistical features based on the pre-processed historical data using a feature generation technique (Paragraph 0075, FIG. 4 illustrates a transformation examples 400 in accordance with one embodiment. Examples of features 402 can include data related to: point of sales, weather, events/holidays, market index, web traffic and promotions. Features 402 may include additional categories of data, fewer, or different categories than those shown in FIG. 4.; Paragraph 0076, Example 1 404, shows how data related to a rare event, which is in binary form, is transformed to a form that includes integers, by specifying the number of days to the event. For example, the rare event can have the value ‘0’ to indicate the day a store is open (e.g. Mon-Sat) and ‘1’ to indicate the day a store is closed (e.g. Sunday). The series of ‘0’s and ‘1’s is transformed, instead, to a series of integers that indicate how many days away that a given day is to the rare event; Paragraph 0077, Example 2 406 shows an example of transforming consecutive dates to a tabular form that lists year (in one row); month (in a second row) and date (in the third row); Paragraph 0078, Example 3 408 shows an example of transforming temperature values on certain dates, to temperature values in relation to the lowest temperature reading (6° C.). The original 6° C. reading is transformed to ‘0’; 7° C. to ‘1’; 8° C. to ‘2’, and so forth. Graphical representations of transformations are discussed below; As stated in Paragraph 0046 of Applicant’s specification, statistical features may include weather. Also, the generation technique used for generating features includes transforming data);
solving the retail business problem based on the one or more statistical features using one or more retail algorithms, wherein the one or more retail algorithms are accessed from an algorithm container, and wherein each retail algorithm uses a set of parameters and hyperparameters (Paragraph 0047, The demand sensing method can provide predicted daily sales for a single products (for example, according to their stock keeping unit (SKU) identification codes) for single locations (e.g. retail stores) over some horizon (e.g. 13 weeks ahead) for a variety of purposes, including: allowance by the user to use the predictions to drive replenishment orders at the defined locations; and gaining an analytical understanding of the factors driving the predicted sales in order to plan for the future; Paragraph 0048, The data processing services are composed of various components of a machine learning pipeline. Per user request, features may be generated from the raw user-specific and public datasets. Then one or more quantile regression models can be trained with these features. Selection of features and hyperparameters can be achieved through the evaluation of each model on the same validation set. The evaluation comprises managing a simulated inventory for the period of time equivalent to the validation set, where orders are given based on simple heuristics and key performance metrics are measured, such as excessive inventory over a period of time and number of stock out days. Once a model is chosen (for best performance for an item and store combination), the contribution of each feature (on the demand predictions) may be evaluated through model interpretation techniques (e.g. SHapley Additive exPlantions). 
In a last step, data related to predictions, prediction quality, and prediction contributions may be gathered and illustrated to the user by a number of interactive visualizations that are found in user-application interfaces mentioned above; Paragraph 0103, If this is not the first time a forecasting request for this particular product and location is made, then monitor module 112 checks the ML storage 106 to see if any new class of relevant signal data has been added since the last forecast request for the particular product and location, at block 1008. If the answer is yes, then monitor module 112 flags the request to undergo a full model selection process at block 1006, which is subsequently sent to forecasting module 114 (see FIG. 9); Paragraph 0110, If the time threshold is not surpassed, monitor module 112 proceeds to instruct forecasting module 114 to forecast using the current model at block 1018, without any retraining; Examiner interprets the ML storage as the algorithm container since the ML storage stores one or more retail algorithms for a specific product and location);
identifying a retail algorithm among the one or more retail algorithms that achieves highest accuracy, wherein the retail algorithm is identified by comparing accuracy measures of the one or more retail algorithms (Paragraph 0107, If the answer at block 1010 is no, monitor module 112 proceeds to block 1012 to evaluate the performance of the machine learning model used in the previous forecast. With reference to FIG. 9, once the forecasting module 114 provides a forecast, the forecast is stored in the ML storage 106. Monitor module 112 evaluates the forecast on an ongoing basis by comparing the forecasted values with the actual values as the latter are uploaded to ML storage 106 on an ongoing basis. Evaluation methods known in the art may be used to evaluate the accuracy of the forecasted values, and a criterion may be selected to determine whether or not the forecast remains viable. In some embodiments, the evaluation method can be selected from mean absolute percentage error (MAPE); mean absolute scaled error (MASE), mean absolute error (MAE), and Weighted Mean Absolute Percentage Error (WMAPE). If the forecast is not deemed viable, then monitor module 112 flags the request to undergo a full model selection process at block 1006, which is subsequently sent to forecasting module 114 (see FIG. 9); Paragraph 0137, A number of ML models, such as gradient-boosted trees, ensemble of trees and support vector regression, were used during the initial training set. A gradient-boosted tree model, Light GBM, was selected during validation, and retrained on the dataset from September 2016 to Jan. 15, 2018. In this example, all the data, except for the last 20%, was used for training the selected model. In some embodiments, the testing dataset may be the smaller of the dataset of the period of the last 10-20 weeks and the last 20% of the entire dataset. 
In some embodiments, where the historical data set spans 1 year (52 weeks), the training/validation period can be 40-42 weeks, with remaining 10-12 weeks used for testing the selected model. In some embodiments, a nested validation scheme can be used. The best ML model may be selected according to a configuration set by the user, or any standard criteria such as MASE, MAE, WMAPE (Weighted Mean Absolute Percentage Error), etc.);
determining whether an optimal result is obtained using the identified retail algorithm by comparing an output of the identified retail algorithm with a predefined output threshold, wherein the optimal result is obtained when the output of the identified retail algorithm is in a predefined range of the predefined output threshold (Paragraph 0107, If the answer at block 1010 is no, monitor module 112 proceeds to block 1012 to evaluate the performance of the machine learning model used in the previous forecast. With reference to FIG. 9, once the forecasting module 114 provides a forecast, the forecast is stored in the ML storage 106. Monitor module 112 evaluates the forecast on an ongoing basis by comparing the forecasted values with the actual values as the latter are uploaded to ML storage 106 on an ongoing basis. Evaluation methods known in the art may be used to evaluate the accuracy of the forecasted values, and a criterion may be selected to determine whether or not the forecast remains viable. In some embodiments, the evaluation method can be selected from mean absolute percentage error (MAPE); mean absolute scaled error (MASE), mean absolute error (MAE), and Weighted Mean Absolute Percentage Error (WMAPE). If the forecast is not deemed viable, then monitor module 112 flags the request to undergo a full model selection process at block 1006, which is subsequently sent to forecasting module 114 (see FIG. 9); Paragraph 0137, The best ML model may be selected according to a configuration set by the user, or any standard criteria such as MASE, MAE, WMAPE (Weighted Mean Absolute Percentage Error), etc.; Examiner interprets the criterion selected to determine whether or not the forecast remains viable as the predefined output threshold);
and fine-tuning a system using an algorithm calibrator to obtain an improved system upon determining [an expanded data set], wherein the improved system solves the retail business problem in an accurate manner (Paragraph 0107, If the answer at block 1010 is no, monitor module 112 proceeds to block 1012 to evaluate the performance of the machine learning model used in the previous forecast. With reference to FIG. 9, once the forecasting module 114 provides a forecast, the forecast is stored in the ML storage 106. Monitor module 112 evaluates the forecast on an ongoing basis by comparing the forecasted values with the actual values as the latter are uploaded to ML storage 106 on an ongoing basis. Evaluation methods known in the art may be used to evaluate the accuracy of the forecasted values, and a criterion may be selected to determine whether or not the forecast remains viable. In some embodiments, the evaluation method can be selected from mean absolute percentage error (MAPE); mean absolute scaled error (MASE), mean absolute error (MAE), and Weighted Mean Absolute Percentage Error (WMAPE). If the forecast is not deemed viable, then monitor module 112 flags the request to undergo a full model selection process at block 1006, which is subsequently sent to forecasting module 114 (see FIG. 9); Paragraph 0121, Forecasting module 114 receives instructions from monitor module 112, as shown in FIG. 9, to either select a model (block 902), train/retrain (block 904), or forecast (block 906). In FIG. 12, block series 1222 describes a flowchart of the model selection process 1202 in an embodiment; block series 1224 describes a flowchart of the training process 1212 in an embodiment, and block 1220 refers to the forecasting of the trained ML model.; Paragraph 0126, Retraining of a selected ML model is described in block series 1224, in accordance with one embodiment. 
A selected ML model is first retrained on an expanded dataset at block 1214; it then makes a forecast corresponding to the period of a testing portion at block 1216, and its accuracy is evaluated, based on its performance in the testing portion, at block 1218. Details of the training/retraining vary slightly, depending on where in the overall process of FIG. 10, the selected model is being trained—within a model selection process (i.e. in block 1006, block 1006, ML storage 106 or 618); or within a retraining process alone (i.e. Block 1006)), saving the optimal result in an algorithm calibrator repository along with the historical data and the retail business problem upon determining that the optimal result is obtained, wherein the optimal result comprises of the determined retail algorithm, the one or more statistical features, and the set of parameters and the hyperparameters used in the determined retail algorithm (Paragraph 0048, The data processing services are composed of various components of a machine learning pipeline. Per user request, features may be generated from the raw user-specific and public datasets. Then one or more quantile regression models can be trained with these features. Selection of features and hyperparameters can be achieved through the evaluation of each model on the same validation set. The evaluation comprises managing a simulated inventory for the period of time equivalent to the validation set, where orders are given based on simple heuristics and key performance metrics are measured, such as excessive inventory over a period of time and number of stock out days. Once a model is chosen (for best performance for an item and store combination), the contribution of each feature (on the demand predictions) may be evaluated through model interpretation techniques (e.g. SHapley Additive exPlantions). 
In a last step, data related to predictions, prediction quality, and prediction contributions may be gathered and illustrated to the user by a number of interactive visualizations that are found in user-application interfaces mentioned above; Paragraph 0098, All results produced by forecasting module 114 are stored in ML storage 106. In some embodiments, this includes the selected, trained model and all of the features and hyperparameters associated thereof, along with the forecast results; Paragraph 0152, Next, at block 1406, Method 1400 includes training a machine learning algorithm using the feature data. For example, Processing Resource 1506, trains a machine learning algorithm, such as, a tree-based machine learning algorithm, using the generated feature data for forming a forecast model. Optionally, at block 1406, hyper-parameters of the forecast model may be tuned to improve accuracy of the model).
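The hyperparameter selection relied upon from Paragraphs 0048 and 0152 (evaluating candidate parameter sets on the same validation set and keeping the best-scoring set alongside the chosen model) can be sketched, for illustration only, as a simple grid search. The parameter names and the stand-in validation error below are hypothetical assumptions, not drawn from the cited references.

```python
# Illustrative grid-search sketch of hyperparameter tuning per Khanafer,
# Paras. 0048 and 0152.
from itertools import product

def tune(param_grid, evaluate):
    """param_grid: dict of hyperparameter name -> list of candidate values.
    evaluate: callable scoring one parameter combination (lower is better).
    Returns the best combination and its score."""
    best, best_score = None, float("inf")
    for combo in product(*param_grid.values()):
        params = dict(zip(param_grid.keys(), combo))
        score = evaluate(params)
        if score < best_score:
            best, best_score = params, score
    return best, best_score

# Stand-in validation error: pretend the optimum is max_depth=5, lr=0.1.
def fake_validation_error(p):
    return abs(p["max_depth"] - 5) + abs(p["learning_rate"] - 0.1)

grid = {"max_depth": [3, 5, 7], "learning_rate": [0.05, 0.1, 0.2]}
print(tune(grid, fake_validation_error))
```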
Although Khanafer et al. discloses evaluating the accuracy of the model and fine-tuning/retraining the model upon determining that new data is available (e.g., retraining when there is an expanded data set), Khanafer et al. does not specifically disclose fine-tuning/retraining the model upon determining that the optimal result is not obtained (e.g., accuracy below a threshold).
However, Joseph et al. discloses fine-tuning a system using an algorithm calibrator to obtain an improved system upon determining that the optimal result is not obtained, wherein the improved system solves the retail business problem in an accurate manner (Paragraph 0043, The observer 325 can compare the demand forecasts output by the model 330 against the actual historical data once it is received to determine how accurate the demand forecast results were, and to adjust the machine learning model/algorithm in response thereto as appropriate. The observer unit 325 communicates with the model (re-)training component 111 via connection 328 to train/retrain the selected model 330 whenever the accuracy of the demand forecasts output by the model 330 falls to a specified level or threshold (or when the demand forecast satisfies some other user-configurable criteria). The level at which the tripping point is triggered can be configurable by users).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method for fine-tuning/retraining and identifying a retail algorithm among the one or more retail algorithms that achieves highest accuracy (e.g., fine-tuning/retraining is in response to detecting that there’s new data) of the invention of Khanafer et al. to further incorporate wherein the fine-tuning/retraining is upon determining that the optimal result is not obtained (e.g., accuracy below a threshold) of the invention of Joseph et al. because doing so would allow the method to retrain the selected model whenever the accuracy of the demand forecasts output by the model falls to a specified level or threshold (see Joseph et al., Paragraph 0043). Further, the claimed invention is merely a combination of old elements, and in combination each element would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
Regarding claims 2 and 10, which are dependent of claims 1 and 9, the combination of Khanafer et al. and Joseph et al. discloses all the limitations in claims 1 and 9. Khanafer et al. further discloses wherein generating the one or more statistical features based on the pre-processed historical data using the feature generation technique comprises: performing, via the one or more hardware processors, one or more data treatments on the one or more statistical features to obtain one or more transformed statistical features (Figure 2 and related text in Paragraph 0062, item 204, Processor; Paragraph 0075, FIG. 4 illustrates a transformation examples 400 in accordance with one embodiment. Examples of features 402 can include data related to: point of sales, weather, events/holidays, market index, web traffic and promotions. Features 402 may include additional categories of data, fewer, or different categories than those shown in FIG. 4.; Paragraph 0076, Example 1 404, shows how data related to a rare event, which is in binary form, is transformed to a form that includes integers, by specifying the number of days to the event. For example, the rare event can have the value ‘0’ to indicate the day a store is open (e.g. Mon-Sat) and ‘1’ to indicate the day a store is closed (e.g. Sunday). The series of ‘0’s and ‘1’s is transformed, instead, to a series of integers that indicate how many days away that a given day is to the rare event; Paragraph 0077, Example 2 406 shows an example of transforming consecutive dates to a tabular form that lists year (in one row); month (in a second row) and date (in the third row); Paragraph 0078, Example 3 408 shows an example of transforming temperature values on certain dates, to temperature values in relation to the lowest temperature reading (6° C.). The original 6° C. reading is transformed to ‘0’; 7° C. to ‘1’; 8° C. to ‘2’, and so forth. Graphical representations of transformations are discussed below).
Regarding claims 3 and 11, which are dependent of claims 2 and 10, the combination of Khanafer et al. and Joseph et al. discloses all the limitations in claims 2 and 10. Khanafer et al. further discloses wherein solving the retail business problem based on the one or more statistical features using the one or more retail algorithms comprises: solving, via the one or more hardware processors, the retail business problem based on the one or more transformed statistical features using the one or more retail algorithms (Figure 2 and related text in Paragraph 0062, item 204, Processor; Paragraph 0075, FIG. 4 illustrates a transformation examples 400 in accordance with one embodiment. Examples of features 402 can include data related to: point of sales, weather, events/holidays, market index, web traffic and promotions. Features 402 may include additional categories of data, fewer, or different categories than those shown in FIG. 4.; Paragraph 0076, Example 1 404, shows how data related to a rare event, which is in binary form, is transformed to a form that includes integers, by specifying the number of days to the event. For example, the rare event can have the value ‘0’ to indicate the day a store is open (e.g. Mon-Sat) and ‘1’ to indicate the day a store is closed (e.g. Sunday). The series of ‘0’s and ‘1’s is transformed, instead, to a series of integers that indicate how many days away that a given day is to the rare event; Paragraph 0077, Example 2 406 shows an example of transforming consecutive dates to a tabular form that lists year (in one row); month (in a second row) and date (in the third row); Paragraph 0078, Example 3 408 shows an example of transforming temperature values on certain dates, to temperature values in relation to the lowest temperature reading (6° C.). The original 6° C. reading is transformed to ‘0’; 7° C. to ‘1’; 8° C. to ‘2’, and so forth. 
Graphical representations of transformations are discussed below; Paragraph 0104, As an example, in the intervening period between the first request and the subsequent request, ML storage 106 may have received weather data that includes a humidity index relevant to the location of the request, which was not present in the data used for the initial forecast. The humidity index is a new class of signal data that can be used in the machine learning forecasting of the particular product at the particular location. Note that if new humidity data has been received during the intervening period, but the new humidity data has no impact on the location of interest, then it is not considered as being relevant. For example, if ML storage 106 receives the humidity index for Washington, D.C., but not for Kanata ON (where the forecast is requested), then this is not considered as a relevant new class of signal data).
Regarding claims 4 and 12, which are dependent of claims 3 and 11, the combination of Khanafer et al. and Joseph et al. discloses all the limitations in claims 3 and 11. Khanafer et al. further discloses wherein fine-tuning the system using the algorithm calibrator comprises: fine-tuning, via the one or more hardware processors, the historical data used for solving the retail business problem by the algorithm calibrator to obtain a fine-tuned historical data (Figure 2 and related text in Paragraph 0062, item 204, Processor; Paragraph 0107, If the answer at block 1010 is no, monitor module 112 proceeds to block 1012 to evaluate the performance of the machine learning model used in the previous forecast. With reference to FIG. 9, once the forecasting module 114 provides a forecast, the forecast is stored in the ML storage 106. Monitor module 112 evaluates the forecast on an ongoing basis by comparing the forecasted values with the actual values as the latter are uploaded to ML storage 106 on an ongoing basis. Evaluation methods known in the art may be used to evaluate the accuracy of the forecasted values, and a criterion may be selected to determine whether or not the forecast remains viable. In some embodiments, the evaluation method can be selected from mean absolute percentage error (MAPE); mean absolute scaled error (MASE), mean absolute error (MAE), and Weighted Mean Absolute Percentage Error (WMAPE). If the forecast is not deemed viable, then monitor module 112 flags the request to undergo a full model selection process at block 1006, which is subsequently sent to forecasting module 114 (see FIG. 9); Paragraph 0121, Forecasting module 114 receives instructions from monitor module 112, as shown in FIG. 9, to either select a model (block 902), train/retrain (block 904), or forecast (block 906). In FIG. 
12, block series 1222 describes a flowchart of the model selection process 1202 in an embodiment; block series 1224 describes a flowchart of the training process 1212 in an embodiment, and block 1220 refers to the forecasting of the trained ML model.; Paragraph 0126, Retraining of a selected ML model is described in block series 1224, in accordance with one embodiment. A selected ML model is first retrained on an expanded dataset at block 1214; it then makes a forecast corresponding to the period of a testing portion at block 1216, and its accuracy is evaluated, based on its performance in the testing portion, at block 1218. Details of the training/retraining vary slightly, depending on where in the overall process of FIG. 10, the selected model is being trained—within a model selection process (i.e. in block 1006, block 1006, ML storage 106 or 618); or within a retraining process alone (i.e. Block 1006));
fine-tuning, via the one or more hardware processors, the one or more transformed statistical features that are used for solving the retail business problem by the algorithm calibrator to obtain a plurality of fine-tuned statistical features (Figure 2 and related text in Paragraph 0062, item 204, Processor; Paragraph 0075, FIG. 4 illustrates a transformation examples 400 in accordance with one embodiment. Examples of features 402 can include data related to: point of sales, weather, events/holidays, market index, web traffic and promotions. Features 402 may include additional categories of data, fewer, or different categories than those shown in FIG. 4.; Paragraph 0076, Example 1 404, shows how data related to a rare event, which is in binary form, is transformed to a form that includes integers, by specifying the number of days to the event. For example, the rare event can have the value ‘0’ to indicate the day a store is open (e.g. Mon-Sat) and ‘1’ to indicate the day a store is closed (e.g. Sunday). The series of ‘0’s and ‘1’s is transformed, instead, to a series of integers that indicate how many days away that a given day is to the rare event; Paragraph 0077, Example 2 406 shows an example of transforming consecutive dates to a tabular form that lists year (in one row); month (in a second row) and date (in the third row); Paragraph 0078, Example 3 408 shows an example of transforming temperature values on certain dates, to temperature values in relation to the lowest temperature reading (6° C.). The original 6° C. reading is transformed to ‘0’; 7° C. to ‘1’; 8° C. to ‘2’, and so forth. Graphical representations of transformations are discussed below; Paragraph 0104, As an example, in the intervening period between the first request and the subsequent request, ML storage 106 may have received weather data that includes a humidity index relevant to the location of the request, which was not present in the data used for the initial forecast. 
The humidity index is a new class of signal data that can be used in the machine learning forecasting of the particular product at the particular location. Note that if new humidity data has been received during the intervening period, but the new humidity data has no impact on the location of interest, then it is not considered as being relevant. For example, if ML storage 106 receives the humidity index for Washington, D.C., but not for Kanata ON (where the forecast is requested), then this is not considered as a relevant new class of signal data; Paragraph 0121, Forecasting module 114 receives instructions from monitor module 112, as shown in FIG. 9, to either select a model (block 902), train/retrain (block 904), or forecast (block 906));
and fine-tuning, via the one or more hardware processors, the one or more retail algorithms that are used for solving the retail business problem by the algorithm calibrator, wherein each retail algorithm of the one or more retail algorithms is fine-tuned using the plurality of fine-tuned statistical features and a plurality of combinations of the set of parameters, and the hyperparameters as an improvement process until the improved system is obtained (Paragraph 0048, The data processing services are composed of various components of a machine learning pipeline. Per user request, features may be generated from the raw user-specific and public datasets. Then one or more quantile regression models can be trained with these features. Selection of features and hyperparameters can be achieved through the evaluation of each model on the same validation set. The evaluation comprises managing a simulated inventory for the period of time equivalent to the validation set, where orders are given based on simple heuristics and key performance metrics are measured, such as excessive inventory over a period of time and number of stock out days. Once a model is chosen (for best performance for an item and store combination), the contribution of each feature (on the demand predictions) may be evaluated through model interpretation techniques (e.g. SHapley Additive exPlantions). In a last step, data related to predictions, prediction quality, and prediction contributions may be gathered and illustrated to the user by a number of interactive visualizations that are found in user-application interfaces mentioned above; Paragraph 0098, All results produced by forecasting module 114 are stored in ML storage 106. 
In some embodiments, this includes the selected, trained model and all of the features and hyperparameters associated thereof, along with the forecast results; Paragraph 0152, Next, at block 1406, Method 1400 includes training a machine learning algorithm using the feature data. For example, Processing Resource 1506, trains a machine learning algorithm, such as, a tree-based machine learning algorithm, using the generated feature data for forming a forecast model. Optionally, at block 1406, hyper-parameters of the forecast model may be tuned to improve accuracy of the model).
Regarding claims 5 and 13, which are dependent of claims 1 and 9, the combination of Khanafer et al. and Joseph et al. discloses all the limitations in claims 1 and 9. Khanafer et al. further discloses saving, via the one or more hardware processors, the optimal result in an algorithm calibrator repository along with the historical data and the retail business problem upon determining that the optimal result is obtained, wherein the optimal result comprises of the determined retail algorithm, the one or more statistical features, and the set of parameters and the hyperparameters used in the determined retail algorithm (Paragraph 0048, The data processing services are composed of various components of a machine learning pipeline. Per user request, features may be generated from the raw user-specific and public datasets. Then one or more quantile regression models can be trained with these features. Selection of features and hyperparameters can be achieved through the evaluation of each model on the same validation set. The evaluation comprises managing a simulated inventory for the period of time equivalent to the validation set, where orders are given based on simple heuristics and key performance metrics are measured, such as excessive inventory over a period of time and number of stock out days. Once a model is chosen (for best performance for an item and store combination), the contribution of each feature (on the demand predictions) may be evaluated through model interpretation techniques (e.g. SHapley Additive exPlantions). In a last step, data related to predictions, prediction quality, and prediction contributions may be gathered and illustrated to the user by a number of interactive visualizations that are found in user-application interfaces mentioned above; Paragraph 0098, All results produced by forecasting module 114 are stored in ML storage 106. 
In some embodiments, this includes the selected, trained model and all of the features and hyperparameters associated thereof, along with the forecast results; Paragraph 0152, Next, at block 1406, Method 1400 includes training a machine learning algorithm using the feature data. For example, Processing Resource 1506, trains a machine learning algorithm, such as, a tree-based machine learning algorithm, using the generated feature data for forming a forecast model. Optionally, at block 1406, hyper-parameters of the forecast model may be tuned to improve accuracy of the model).
Regarding claims 6 and 14, which are dependent of claims 5 and 13, the combination of Khanafer et al. and Joseph et al. discloses all the limitations in claims 5 and 13. Khanafer et al. further discloses using, via the one or more hardware processors, the algorithm calibrator repository for determining the optimal result for validation of a similar business problem for a new retailer (Figure 2 and related text in Paragraph 0062, item 204, Processor; Paragraph 0048, The evaluation comprises managing a simulated inventory for the period of time equivalent to the validation set, where orders are given based on simple heuristics and key performance metrics are measured, such as excessive inventory over a period of time and number of stock out days. Once a model is chosen (for best performance for an item and store combination), the contribution of each feature (on the demand predictions) may be evaluated through model interpretation techniques (e.g. SHapley Additive exPlantions); Paragraph 0107, If the answer at block 1010 is no, monitor module 112 proceeds to block 1012 to evaluate the performance of the machine learning model used in the previous forecast. With reference to FIG. 9, once the forecasting module 114 provides a forecast, the forecast is stored in the ML storage 106. Monitor module 112 evaluates the forecast on an ongoing basis by comparing the forecasted values with the actual values as the latter are uploaded to ML storage 106 on an ongoing basis. Evaluation methods known in the art may be used to evaluate the accuracy of the forecasted values, and a criterion may be selected to determine whether or not the forecast remains viable. In some embodiments, the evaluation method can be selected from mean absolute percentage error (MAPE); mean absolute scaled error (MASE), mean absolute error (MAE), and Weighted Mean Absolute Percentage Error (WMAPE). 
If the forecast is not deemed viable, then monitor module 112 flags the request to undergo a full model selection process at block 1006, which is subsequently sent to forecasting module 114 (see FIG. 9); Paragraph 0137, The best ML model may be selected according to a configuration set by the user, or any standard criteria such as MASE, MAE, WMAPE (Weighted Mean Absolute Percentage Error), etc.; Examiner interprets the standard criteria set by the user as the optimal result for validation, wherein the criteria set by the user may be for a specific item and store combination).
Regarding claims 7 and 15, which are dependent of claims 1 and 9, the combination of Khanafer et al. and Joseph et al. discloses all the limitations in claims 1 and 9. Khanafer et al. further discloses wherein the predefined output threshold and the predefined range of the predefined output threshold are decided based on one of: an optimal value suggested by a generative artificial intelligence (AI) based model present for the retail business problem, an algorithm calibrator repository maintained for the retail business problem, and a value suggested by at least one subject matter expert (SME) (Figure 2 and related text in Paragraph 0062, item 204, Processor; Paragraph 0048, The evaluation comprises managing a simulated inventory for the period of time equivalent to the validation set, where orders are given based on simple heuristics and key performance metrics are measured, such as excessive inventory over a period of time and number of stock out days. Once a model is chosen (for best performance for an item and store combination), the contribution of each feature (on the demand predictions) may be evaluated through model interpretation techniques (e.g. SHapley Additive exPlantions); Paragraph 0107, If the answer at block 1010 is no, monitor module 112 proceeds to block 1012 to evaluate the performance of the machine learning model used in the previous forecast. With reference to FIG. 9, once the forecasting module 114 provides a forecast, the forecast is stored in the ML storage 106. Monitor module 112 evaluates the forecast on an ongoing basis by comparing the forecasted values with the actual values as the latter are uploaded to ML storage 106 on an ongoing basis. Evaluation methods known in the art may be used to evaluate the accuracy of the forecasted values, and a criterion may be selected to determine whether or not the forecast remains viable. 
In some embodiments, the evaluation method can be selected from mean absolute percentage error (MAPE); mean absolute scaled error (MASE), mean absolute error (MAE), and Weighted Mean Absolute Percentage Error (WMAPE). If the forecast is not deemed viable, then monitor module 112 flags the request to undergo a full model selection process at block 1006, which is subsequently sent to forecasting module 114 (see FIG. 9); Paragraph 0137, The best ML model may be selected according to a configuration set by the user, or any standard criteria such as MASE, MAE, WMAPE (Weighted Mean Absolute Percentage Error), etc.; It can be noted that the claim language is written in alternative form. The limitation taught by Khanafer et al. is “based on a value suggested by at least one subject matter expert (SME).” Examiner interprets the standard criteria set by the user as the predefined output threshold suggested by at least one SME).
Regarding claims 8, 16, and 20, which are dependent of claims 1, 9, and 17, the combination of Khanafer et al. and Joseph et al. discloses all the limitations in claims 1, 9, and 17. Khanafer et al. further discloses wherein the one or more retail algorithms comprises one or more of open-source retail algorithms, licensed retail algorithms, and customized retail algorithms (Paragraph 0047, The demand sensing method can provide predicted daily sales for a single products (for example, according to their stock keeping unit (SKU) identification codes) for single locations (e.g. retail stores) over some horizon (e.g. 13 weeks ahead) for a variety of purposes, including: allowance by the user to use the predictions to drive replenishment orders at the defined locations; and gaining an analytical understanding of the factors driving the predicted sales in order to plan for the future; It can be noted that the claim language is written in alternative form. The limitation taught by Khanafer et al. is “customized retail algorithms.” Examiner interprets the retail algorithms customized for a specific product and location as the customized retail algorithms).
Regarding claim 18, which is dependent of claim 17, the combination of Khanafer et al. and Joseph et al. discloses all the limitations in claim 17. Khanafer et al. further discloses wherein generating the one or more statistical features based on the pre-processed historical data using the feature generation technique comprises performing one or more data treatments on the one or more statistical features to obtain one or more transformed statistical features, wherein solving the retail business problem based on the one or more statistical features using the one or more retail algorithms comprises solving, the retail business problem based on the one or more transformed statistical features using the one or more retail algorithms (Figure 2 and related text in Paragraph 0062, item 204, Processor; Paragraph 0075, FIG. 4 illustrates a transformation examples 400 in accordance with one embodiment. Examples of features 402 can include data related to: point of sales, weather, events/holidays, market index, web traffic and promotions. Features 402 may include additional categories of data, fewer, or different categories than those shown in FIG. 4.; Paragraph 0076, Example 1 404, shows how data related to a rare event, which is in binary form, is transformed to a form that includes integers, by specifying the number of days to the event. For example, the rare event can have the value ‘0’ to indicate the day a store is open (e.g. Mon-Sat) and ‘1’ to indicate the day a store is closed (e.g. Sunday). 
The series of ‘0’s and ‘1’s is transformed, instead, to a series of integers that indicate how many days away that a given day is to the rare event; Paragraph 0077, Example 2 406 shows an example of transforming consecutive dates to a tabular form that lists year (in one row); month (in a second row) and date (in the third row); Paragraph 0078, Example 3 408 shows an example of transforming temperature values on certain dates, to temperature values in relation to the lowest temperature reading (6° C.). The original 6° C. reading is transformed to ‘0’; 7° C. to ‘1’; 8° C. to ‘2’, and so forth. Graphical representations of transformations are discussed below; Paragraph 0104, As an example, in the intervening period between the first request and the subsequent request, ML storage 106 may have received weather data that includes a humidity index relevant to the location of the request, which was not present in the data used for the initial forecast. The humidity index is a new class of signal data that can be used in the machine learning forecasting of the particular product at the particular location. Note that if new humidity data has been received during the intervening period, but the new humidity data has no impact on the location of interest, then it is not considered as being relevant. For example, if ML storage 106 receives the humidity index for Washington, D.C., but not for Kanata ON (where the forecast is requested), then this is not considered as a relevant new class of signal data), wherein fine-tuning the system using the algorithm calibrator comprises:
fine-tuning, the historical data used for solving the retail business problem by the algorithm calibrator to obtain a fine-tuned historical data (Figure 2 and related text in Paragraph 0062, item 204, Processor; Paragraph 0107, If the answer at block 1010 is no, monitor module 112 proceeds to block 1012 to evaluate the performance of the machine learning model used in the previous forecast. With reference to FIG. 9, once the forecasting module 114 provides a forecast, the forecast is stored in the ML storage 106. Monitor module 112 evaluates the forecast on an ongoing basis by comparing the forecasted values with the actual values as the latter are uploaded to ML storage 106 on an ongoing basis. Evaluation methods known in the art may be used to evaluate the accuracy of the forecasted values, and a criterion may be selected to determine whether or not the forecast remains viable. In some embodiments, the evaluation method can be selected from mean absolute percentage error (MAPE); mean absolute scaled error (MASE), mean absolute error (MAE), and Weighted Mean Absolute Percentage Error (WMAPE). If the forecast is not deemed viable, then monitor module 112 flags the request to undergo a full model selection process at block 1006, which is subsequently sent to forecasting module 114 (see FIG. 9); Paragraph 0121, Forecasting module 114 receives instructions from monitor module 112, as shown in FIG. 9, to either select a model (block 902), train/retrain (block 904), or forecast (block 906). In FIG. 12, block series 1222 describes a flowchart of the model selection process 1202 in an embodiment; block series 1224 describes a flowchart of the training process 1212 in an embodiment, and block 1220 refers to the forecasting of the trained ML model.; Paragraph 0126, Retraining of a selected ML model is described in block series 1224, in accordance with one embodiment. 
A selected ML model is first retrained on an expanded dataset at block 1214; it then makes a forecast corresponding to the period of a testing portion at block 1216, and its accuracy is evaluated, based on its performance in the testing portion, at block 1218. Details of the training/retraining vary slightly, depending on where in the overall process of FIG. 10, the selected model is being trained—within a model selection process (i.e. in block 1006, block 1006, ML storage 106 or 618); or within a retraining process alone (i.e. Block 1006));
fine-tuning, the one or more transformed statistical features that are used for solving the retail business problem by the algorithm calibrator to obtain a plurality of fine-tuned statistical features (Figure 2 and related text in Paragraph 0062, item 204, Processor; Paragraph 0075, FIG. 4 illustrates a transformation examples 400 in accordance with one embodiment. Examples of features 402 can include data related to: point of sales, weather, events/holidays, market index, web traffic and promotions. Features 402 may include additional categories of data, fewer, or different categories than those shown in FIG. 4.; Paragraph 0076, Example 1 404, shows how data related to a rare event, which is in binary form, is transformed to a form that includes integers, by specifying the number of days to the event. For example, the rare event can have the value ‘0’ to indicate the day a store is open (e.g. Mon-Sat) and ‘1’ to indicate the day a store is closed (e.g. Sunday). The series of ‘0’s and ‘1’s is transformed, instead, to a series of integers that indicate how many days away that a given day is to the rare event; Paragraph 0077, Example 2 406 shows an example of transforming consecutive dates to a tabular form that lists year (in one row); month (in a second row) and date (in the third row); Paragraph 0078, Example 3 408 shows an example of transforming temperature values on certain dates, to temperature values in relation to the lowest temperature reading (6° C.). The original 6° C. reading is transformed to ‘0’; 7° C. to ‘1’; 8° C. to ‘2’, and so forth. Graphical representations of transformations are discussed below; Paragraph 0104, As an example, in the intervening period between the first request and the subsequent request, ML storage 106 may have received weather data that includes a humidity index relevant to the location of the request, which was not present in the data used for the initial forecast. 
The humidity index is a new class of signal data that can be used in the machine learning forecasting of the particular product at the particular location. Note that if new humidity data has been received during the intervening period, but the new humidity data has no impact on the location of interest, then it is not considered as being relevant. For example, if ML storage 106 receives the humidity index for Washington, D.C., but not for Kanata ON (where the forecast is requested), then this is not considered as a relevant new class of signal data; Paragraph 0121, Forecasting module 114 receives instructions from monitor module 112, as shown in FIG. 9, to either select a model (block 902), train/retrain (block 904), or forecast (block 906));
and fine-tuning, the one or more retail algorithms that are used for solving the retail business problem by the algorithm calibrator (Paragraph 0048, The data processing services are composed of various components of a machine learning pipeline. Per user request, features may be generated from the raw user-specific and public datasets. Then one or more quantile regression models can be trained with these features. Selection of features and hyperparameters can be achieved through the evaluation of each model on the same validation set. The evaluation comprises managing a simulated inventory for the period of time equivalent to the validation set, where orders are given based on simple heuristics and key performance metrics are measured, such as excessive inventory over a period of time and number of stock out days. Once a model is chosen (for best performance for an item and store combination), the contribution of each feature (on the demand predictions) may be evaluated through model interpretation techniques (e.g. SHapley Additive exPlantions). In a last step, data related to predictions, prediction quality, and prediction contributions may be gathered and illustrated to the user by a number of interactive visualizations that are found in user-application interfaces mentioned above; Paragraph 0098, All results produced by forecasting module 114 are stored in ML storage 106. In some embodiments, this includes the selected, trained model and all of the features and hyperparameters associated thereof, along with the forecast results; Paragraph 0152, Next, at block 1406, Method 1400 includes training a machine learning algorithm using the feature data. For example, Processing Resource 1506, trains a machine learning algorithm, such as, a tree-based machine learning algorithm, using the generated feature data for forming a forecast model. Optionally, at block 1406, hyper-parameters of the forecast model may be tuned to improve accuracy of the model).
Regarding claim 19, which is dependent on claim 17, the combination of Khanafer et al. and Joseph et al. discloses all the limitations in claim 17. Khanafer et al. further discloses wherein the one or more instructions which when executed by the one or more hardware processors further cause saving the optimal result in an algorithm calibrator repository along with the historical data and the retail business problem upon determining that the optimal result is obtained, wherein the optimal result comprises of the determined retail algorithm, the one or more statistical features, and the set of parameters and the hyperparameters used in the determined retail algorithm (Paragraph 0048, The data processing services are composed of various components of a machine learning pipeline. Per user request, features may be generated from the raw user-specific and public datasets. Then one or more quantile regression models can be trained with these features. Selection of features and hyperparameters can be achieved through the evaluation of each model on the same validation set. The evaluation comprises managing a simulated inventory for the period of time equivalent to the validation set, where orders are given based on simple heuristics and key performance metrics are measured, such as excessive inventory over a period of time and number of stock out days. Once a model is chosen (for best performance for an item and store combination), the contribution of each feature (on the demand predictions) may be evaluated through model interpretation techniques (e.g. SHapley Additive exPlanations). In a last step, data related to predictions, prediction quality, and prediction contributions may be gathered and illustrated to the user by a number of interactive visualizations that are found in user-application interfaces mentioned above; Paragraph 0098, All results produced by forecasting module 114 are stored in ML storage 106.
In some embodiments, this includes the selected, trained model and all of the features and hyperparameters associated thereof, along with the forecast results; Paragraph 0152, Next, at block 1406, Method 1400 includes training a machine learning algorithm using the feature data. For example, Processing Resource 1506, trains a machine learning algorithm, such as, a tree-based machine learning algorithm, using the generated feature data for forming a forecast model. Optionally, at block 1406, hyper-parameters of the forecast model may be tuned to improve accuracy of the model), wherein the predefined output threshold and the predefined range of the predefined output threshold are decided based on one of: an optimal value suggested by a generative artificial intelligence (AI) based model present for the retail business problem, an algorithm calibrator repository maintained for the retail business problem, and a value suggested by at least one subject matter expert (SME) (Figure 2 and related text in Paragraph 0062, item 204, Processor; Paragraph 0048, The evaluation comprises managing a simulated inventory for the period of time equivalent to the validation set, where orders are given based on simple heuristics and key performance metrics are measured, such as excessive inventory over a period of time and number of stock out days. Once a model is chosen (for best performance for an item and store combination), the contribution of each feature (on the demand predictions) may be evaluated through model interpretation techniques (e.g. SHapley Additive exPlanations); Paragraph 0107, If the answer at block 1010 is no, monitor module 112 proceeds to block 1012 to evaluate the performance of the machine learning model used in the previous forecast. With reference to FIG. 9, once the forecasting module 114 provides a forecast, the forecast is stored in the ML storage 106.
Monitor module 112 evaluates the forecast on an ongoing basis by comparing the forecasted values with the actual values as the latter are uploaded to ML storage 106 on an ongoing basis. Evaluation methods known in the art may be used to evaluate the accuracy of the forecasted values, and a criterion may be selected to determine whether or not the forecast remains viable. In some embodiments, the evaluation method can be selected from mean absolute percentage error (MAPE), mean absolute scaled error (MASE), mean absolute error (MAE), and Weighted Mean Absolute Percentage Error (WMAPE). If the forecast is not deemed viable, then monitor module 112 flags the request to undergo a full model selection process at block 1006, which is subsequently sent to forecasting module 114 (see FIG. 9); Paragraph 0137, The best ML model may be selected according to a configuration set by the user, or any standard criteria such as MASE, MAE, WMAPE (Weighted Mean Absolute Percentage Error), etc.; It can be noted that the claim language is written in alternative form. The limitation taught by Khanafer et al. is "based on a value suggested by at least one subject matter expert (SME)." Examiner interprets the standard criteria set by the user as the predefined output threshold suggested by at least one SME).
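For reference, the four evaluation criteria named in Paragraphs 0107 and 0137 (MAPE, MASE, MAE, WMAPE) have standard textbook definitions. The sketch below states each one; the numeric values are toy data chosen purely for illustration and do not come from the record.

```python
# Standard definitions of the forecast-error metrics named in the cited
# paragraphs. Toy values only; illustrative, not part of the record.
def mape(actual, forecast):
    # mean absolute percentage error, in percent
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)

def mae(actual, forecast):
    # mean absolute error
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def wmape(actual, forecast):
    # weighted MAPE: total absolute error over total actual demand, in percent
    return 100.0 * sum(abs(a - f) for a, f in zip(actual, forecast)) / sum(abs(a) for a in actual)

def mase(actual, forecast, train):
    # mean absolute scaled error: MAE scaled by the in-sample naive-forecast MAE
    naive_mae = sum(abs(train[i] - train[i - 1]) for i in range(1, len(train))) / (len(train) - 1)
    return mae(actual, forecast) / naive_mae

train = [100, 110, 105, 120]                 # hypothetical historical series
actual, forecast = [130, 125], [128, 120]    # hypothetical holdout period
print(round(mape(actual, forecast), 2), round(wmape(actual, forecast), 2))
```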
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.
Alagappan et al. (US 2024/0152775 A1) – discloses determining the accuracy of each model (i.e., validating each model) to determine which model is the optimal prediction model 122. The optimal prediction model 122 is the model that most accurately predicts the testing data 113B. As previously mentioned, the modeling engine 120 sends the plurality of prediction models 121 to the validator 125, and the validator 125 receives the plurality of prediction models 121 from the modeling engine 120. The validator 125 charts the prediction of each prediction model to testing data 113B to determine which model is most accurate (e.g., the optimal prediction model 122) (see at least Paragraphs 0063-0064).
Chen (CN 117035841 A) – discloses a sales prediction model constructed by different machine learning and statistical models, and the constructed sales prediction models are compared. Deep learning models such as Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM) can also be integrated. By attempting different models and evaluating their performance, the most suitable model or model combination can be found as a sales prediction model. An integrated (ensemble) learning method, such as random forest, gradient boosting tree, or stacking, combines the prediction results of multiple models to further improve the prediction accuracy and stability of the sales prediction model (see at least Pages 6-7).
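The model-combination idea Chen describes can be reduced to its simplest form: averaging the per-period forecasts of several models. The sketch below is a hypothetical illustration of that idea, not code from the reference; all values are invented.

```python
# Simplest form of the ensembling Chen describes: average the per-period
# forecasts of several models. Hypothetical values; not from the reference.
def ensemble_mean(predictions):
    # predictions: list of per-model forecast lists, all of equal length
    n_models = len(predictions)
    return [sum(p[i] for p in predictions) / n_models for i in range(len(predictions[0]))]

# three models' two-period sales forecasts
print(ensemble_mean([[100, 110], [120, 130], [110, 120]]))  # [110.0, 120.0]
```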
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARJORIE PUJOLS-CRUZ whose telephone number is (571)272-4668. The examiner can normally be reached Mon-Thurs 7:30 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Patricia H Munson can be reached at (571)270-5396. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MARJORIE PUJOLS-CRUZ/Examiner, Art Unit 3624