Prosecution Insights
Last updated: April 19, 2026
Application No. 18/038,998

APPARATUS AND METHOD FOR PROCESSING A DEPTH MAP COMPRISING DETERMINING AN UPDATED DEPTH VALUE FOR A PIXEL BASED ON A COST FUNCTION OF DEPTH VALUES IN A DIRECTION FROM THE PIXEL

Final Rejection (§101, §103)
Filed: May 26, 2023
Examiner: DICKERSON, CHAD S
Art Unit: 2683
Tech Center: 2600 (Communications)
Assignee: Koninklijke Philips N.V.
OA Round: 2 (Final)
Grant Probability: 63% (Moderate)
OA Rounds: 3-4
To Grant: 2y 9m
With Interview: 86%

Examiner Intelligence

Career Allow Rate: 63% (grants 63% of resolved cases; 376 granted / 600 resolved; +0.7% vs TC avg)
Interview Lift: +23.0% (strong lift among resolved cases with interview)
Avg Prosecution: 2y 9m typical timeline; 35 applications currently pending
Total Applications: 635 career history, across all art units

Statute-Specific Performance

§101: 8.8% (-31.2% vs TC avg)
§103: 55.5% (+15.5% vs TC avg)
§102: 14.9% (-25.1% vs TC avg)
§112: 18.1% (-21.9% vs TC avg)
Tech Center averages are estimates; based on career data from 600 resolved cases.

Office Action

Rejections: §101, §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments

Applicant's arguments, see page 9, filed 12/9/2025, with respect to the specification objection have been fully considered and are persuasive. The objection to the specification has been withdrawn.

Applicant's arguments filed 12/9/2025 have been fully considered, but they are not persuasive regarding the §101 rejection. The claims do not create a link between the improvement of an entire improved depth map and the claim language. The added claim amendments do not integrate the judicial exception into a practical application and are not considered additional elements that amount to significantly more than the judicial exception. Since the claims do not encompass the complete improved depth map that the specification identifies as the overall improvement and purpose of the invention, the §101 rejection is maintained.

Applicant's arguments filed 12/9/2025 have been fully considered, but they are not persuasive. The arguments state that the applied references do not disclose the features of: "wherein the plurality of first candidate depth values comprises a first candidate depth value along a first direction from the at least one first pixel, wherein none of the plurality of first candidate depth values along the first direction through a plurality of intervening pixels has a cost function which is less than the cost function for the first candidate depth value". The Examiner respectfully disagrees with this assertion and briefly explains why below.

Regarding the primary reference, figures 2A and 2B disclose pixels in a certain direction, and each pixel contains a depth value that is calculated, as taught in ¶ [74]-[76]. The cost value is further calculated by taking a minimum value from the group of pixels, and that minimum is used to calculate the new cost values for the other pixels in figure 2A or 2B. This means that the further cost values calculated are equal to or greater than the cost value of the pixel having the minimum cost value, which can be either pixel 1 or 2. This is explained in ¶ [70]-[75]. Because the cost values of the other pixels are not less than the minimum, those cost values are not less than the cost value of the pixel having the minimum cost value, for example where the minimum value is associated with pixel 1. Thus, the rejection of the claims is maintained.

Moreover, the added Konieczny reference performs a similar calculation to the primary reference. As stated in ¶ [124]-[126], the cost function for image fragments calculates a specific cost value by taking a previous cost value and either adding to an initial cost value or keeping it the same. In either case, the cost value is not less than the initial cost value, which would also perform the features above. In ¶ [140]-[147], the pixels or image fragments in figure 5 show different pixels, one of which can be considered a pixel associated with a first candidate depth value. In figure 5, box 505, pixels can lie in a certain direction and have a cost value that is equal to or greater than an initial cost value. Thus, this additional reference, together with the primary reference, can be applied to disclose the features of the claims. Therefore, based on the above, the features of the claims are addressed in the rejections below.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claim 1 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims recite the mathematical concepts of mathematical relationships and mathematical formulas or equations. In particular, determining a cost value based on a cost function is considered a mathematical formula or equation that involves mathematical calculations. Defining the first candidate depth values such that none of the candidate depth values along the first direction through a plurality of intervening pixels has a cost function less than that of the first candidate depth value, with the distance from a first pixel to the first candidate depth value being larger than the distance to the other intervening pixels, is an example of a mathematical relationship between distance and the value of the cost function.

This judicial exception is not integrated into a practical application because determining an updated depth value based on a first depth value does not transform the elements into another state or thing, nor does it provide an improvement to the overall invention. Nor does the further defining of intervening pixels along a direction, or another direction, integrate the judicial exception into a practical application. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception, because the receipt of a first depth map and the determination of first candidate depth values are data gathering steps that represent insignificant extra-solution activity.

The above rationale is applicable to claims 14 and 15 as well, with the only differences being the use of circuitry in claim 14 and of a computer program stored on a non-transitory medium and executed on a processor in claim 15. The addition of circuitry, or the execution of the computer program stored on a non-transitory medium on a processor, is considered a recitation of the technological environment in which to apply the invention and does not amount to significantly more than the judicial exception.

Claim 2 is considered an abstract idea of a mathematical relationship between the cost function's cost gradient and a certain threshold. This, taken alone or in combination, does not integrate the abstract idea into a practical application. Nor does this limitation provide any additional elements that would amount to significantly more than the judicial exception. This rationale applies to claim 16 as well.

Claim 3 contains additional elements that do not improve the invention so as to be considered as integrating the abstract idea into a practical application. In addition, the selecting of intervening pixels so as not to include depth values in the first candidate depth values is considered the selection of a particular type of data to be manipulated, which is a form of insignificant extra-solution activity and does not amount to significantly more than the judicial exception. This applies to claim 17 as well.

Claim 4 is considered an abstract idea of a mathematical relationship that defines a cost contribution dependent on a difference between image values and offset by a disparity value. This feature, taken alone or in combination with the abstract idea of claim 1, does not integrate the abstract idea into a practical application. The claim does not contain an additional element that would be considered significantly more than the judicial exception. This applies to claim 18 as well.

Claims 5 and 6 are considered to recite additional elements that do not integrate the abstract idea into a practical application, namely determining the gravity direction or defining a direction. These determinations appear to be a manner of gathering data, which is considered a data gathering step or insignificant extra-solution activity that does not amount to more than the judicial exception. The above applies to claims 19 and 20 as well.

Claims 7 and 8 are considered abstract ideas that express a mathematical relationship or mathematical calculation. For claim 7, the recited difference is considered a mathematical calculation. Claim 8 discloses the relationship of being asymmetric with respect to the value of a model depth value. Neither of these limitations, alone or in combination, integrates the abstract idea into a practical application. The additional element of the depth model for a scene is considered a data gathering step and does not amount to significantly more than the judicial exception.

Claims 9 and 10 are considered to recite additional elements that select a particular type of data to be manipulated, which is considered insignificant extra-solution activity. These additional elements do not integrate the abstract idea into a practical application, nor do they amount to significantly more than the judicial exception.

Claim 11 contains an abstract idea of the cost function, which is a mathematical function, being dependent on a type of depth value. The mathematical relationship is defined by this dependency. This limitation, taken alone or in combination with the abstract idea of claim 1, does not integrate the abstract idea into a practical application. The additional elements of the claim detailing the type of depth value are a form of selecting a particular type of data to be manipulated, which is considered insignificant extra-solution activity. These additional elements do not amount to significantly more than the judicial exception.

Claim 12 is considered to recite an additional element that does not integrate the abstract idea into a practical application, since this limitation selects a particular type of data to be manipulated. The iterative selection of this data is a form of insignificant extra-solution activity that does not improve the invention. Moreover, this limitation does not add significantly more than the judicial exception.

Claim 13 is considered to recite an additional element that does not integrate the abstract idea into a practical application, since this limitation selects a type of candidate value for a third pixel. Nor does this part of the claim amount to significantly more than the judicial exception. The last two claim limitations are considered a mathematical relationship between the cost function of the third candidate depth value and the distance outlined in the last limitation. These limitations, taken alone or in combination with the abstract idea in claim 1, do not integrate the abstract idea into a practical application.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1, 4, 6, 14, 15 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Verburgh (US Pub 2010/0182410) in view of Konieczny (US Pub 2018/0144491).

Re claim 1: (Currently Amended) Verburgh '410 discloses a method of processing a depth map, the method comprising: receiving a first depth map (e.g. the system discloses block 106 receiving a depth map from the depth from motion block 102, which is taught in ¶ [65] and [66].);

[0065] A depth from motion (DFM) block 102 provides a depth map based on an input motion vector field 124, for example using a known method (for examples, see WO 99/40726, U.S. Pat. No. 6,563,873, and WO 01/91468). Based on the incoming motion vector field 124 (i.e. coming from a motion estimator), the depth from motion block 102 calculates a depth cue. For example, one possible way to implement a DFM block 102 is by comparing the motion vector field with a background motion vector field. In that case, the background motion vector field is computed based on an overall movement in the image (e.g. pan-zoom parameters). Based on the motion vector field 124 and the background motion vector field, depth from motion is calculated. This is done by calculating the AVD (Absolute Vector Distance, i.e. the sum of the absolute values of the differences of X and Y components of the background motion vector and input motion vector) for vectors in the vector field. The result is a vector field representing a motion estimation of the video signal. However, the use of background vector field and pan-zoom parameters is optional. The motion estimation is converted into a depth map.

[0066] In block 106, the depth map is smoothed. In blocks 108 and 110, the input image is filtered by a Gauss-kernel. Block 108 handles the luminance (Y), and block 110 handles the chromaticity (U,V). Preferably an edge preserving filter is used. The result is a blurred image.

determining a plurality of first candidate depth values for at least one first pixel, wherein the plurality of first candidate depth values comprise depth values for at least a second pixel (e.g. the system discloses several pixels that are evaluated for a cost value. For each pixel, depth values are calculated based on using the cost values, which is taught in ¶ [70]-[74].);

[0070] In block 118, the partially blurred image output of block 116 and the smoothed depth-from-motion output of block 106 are combined to obtain an improved depth map. To this end, a cost value is computed relating to a pixel. In block 118, the lines of the image are scanned (meandered), starting at the uppermost line down to the bottom line, and alternating from left to right and from right to left. The cost value of a current pixel is the minimum value taken from a set of candidate cost values.
A candidate cost value is the sum of the cost value of another pixel and the absolute value of the difference between the other pixel's luminance value and the current pixel's luminance value. Instead of luminance values, any pixel parameters such as chromaticity-related values or any color related values may be used.

[0071] FIGS. 2A and 2B depict the candidate pixels that correspond to the candidate cost values. FIG. 2A shows the situation when a line is traversed from left to right, whereas FIG. 2B shows the situation when a line is traversed from right to left. In both Figs., the current pixel is indicated by a cross. The arrow indicates the order in which the pixels are processed. The numbered pixels 1-4 indicate the candidate pixels corresponding to the candidate cost values. Since the lines are processed from top to bottom, it can be seen that the pixels directly adjacent to the current pixel are used, insofar as they have already been processed. Consequently, the three pixels above the current pixel and the previously processed pixel are used to determine the candidate cost values.

[0072] The cost value may be computed as follows. First, the candidate values are computed, i.e., for each of the candidate pixels 1-4 as illustrated in FIGS. 2A or 2B, the candidate cost value is computed. This candidate cost value is the cost value of the candidate pixel plus the absolute difference between the luminance of the candidate pixel and the luminance of the current pixel, i.e.

[0073] candidate_cost = cost_of_candidate_pixel + ABS(luminance_of_candidate_pixel - luminance_of_current_pixel)

[0074] Of the four candidate cost values corresponding to the four candidate pixels 1-4, the minimum value is determined and this is the cost value assigned to the current pixel. The cost value for each pixel may then be used as a depth cue. For example the depth value of a pixel is derived from the cost function by dividing a fixed constant by the cost value.
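For concreteness, the meander-scan recursion quoted in ¶ [0070]-[0074] above can be sketched in Python as follows. This is a minimal sketch, not Verburgh's implementation: the function name propagate_costs, the seed_cost input, and the constant k for the cost-to-depth conversion are assumptions, and the way block 118 folds in the depth-from-motion cue is simplified to a running minimum.

import numpy as np

def propagate_costs(luma: np.ndarray, seed_cost: np.ndarray, k: float = 1000.0):
    # Meander-scan cost propagation in the style of Verburgh ¶ [0070]-[0074].
    # luma: 2-D luminance array; seed_cost: initial per-pixel costs (e.g. from
    # the smoothed depth-from-motion cue). Returns (cost, depth = k / cost).
    luma = luma.astype(float)
    h, w = luma.shape
    cost = seed_cost.astype(float).copy()
    for y in range(h):
        step = 1 if y % 2 == 0 else -1                    # alternate scan direction
        xs = range(w) if step == 1 else range(w - 1, -1, -1)
        for x in xs:
            # Candidates: the three already-processed pixels above, plus the
            # previously processed pixel on this line (pixels 1-4 in FIGS. 2A/2B).
            neighbors = []
            if y > 0:
                neighbors += [(y - 1, x + dx) for dx in (-1, 0, 1) if 0 <= x + dx < w]
            if 0 <= x - step < w:
                neighbors.append((y, x - step))
            if not neighbors:
                continue
            # candidate_cost = cost_of_candidate + |luma_candidate - luma_current|
            cands = [cost[ny, nx] + abs(luma[ny, nx] - luma[y, x]) for ny, nx in neighbors]
            cost[y, x] = min(cost[y, x], min(cands))      # simplification of block 118
    depth = k / np.maximum(cost, 1e-6)                    # avoid division by zero
    return cost, depth

Because each candidate adds a non-negative absolute luminance difference to an already-computed cost, no candidate along the scan can fall below the minimum cost in the group, which is the property the response to arguments relies on.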
determining a cost value for each of the plurality of first candidate depth values based on a cost function (e.g. a cost value corresponds to a depth value indicative of the distance away from a viewer, and the cost function is used to derive the depth value. A cost value is determined for each pixel and is associated with the depth value of that pixel, which is taught in ¶ [70]-[74] above and [13].);

[0013] As an intermediate step, a cost function is determined which is related to the depth value, for example the cost function may coincide with the depth or be inverse proportional to the depth. Preferably, a low cost function corresponds to a depth value indicative of a location far away from the viewer, whereas a high cost function corresponds to a depth value indicative of a location close to the viewer.

selecting at least one first depth value from the plurality of first candidate depth values based on the cost values for the plurality of first candidate depth values (e.g. when selecting a minimum cost candidate, a depth from motion value associated with the corresponding pixel of the candidate cost value is selected. The depth values are associated with each pixel, which is taught in ¶ [70]-[74] above, [75] and [76].); and

determining an updated depth value for the at least one first pixel based on the first depth value (e.g. a depth map contains depth values. An initial depth value is output from the depth from motion block (102). Later in the process, this initial depth map with its depth values is improved, or updated, based on the initial depth map, which is taught in ¶ [65], [66] and [70] above.);

wherein the plurality of first candidate depth values comprises a first candidate depth value along a first direction from the at least one first pixel (e.g. the candidate values are computed for pixels in a certain direction including one pixel, which is taught in ¶ [72]-[74] above.),

wherein none of the plurality of first candidate depth values along the first direction through a plurality of intervening pixels has a cost function which is less than the cost function for the first candidate depth value (e.g. the candidate cost values of the previous pixels, which can be considered as intervening pixels, are higher than the current pixel's cost value, the current pixel taking the minimum cost value, which is taught in ¶ [70]-[74] above. With the minimum over pixels 1-4 set as the current pixel's cost value, the cost value of an intervening pixel 2 or 3 comprises a candidate pixel's cost value plus an absolute luminance difference between pixels. This is interpreted as being equal to or greater than the minimum cost value within the group of pixels, because the absolute difference term for pixels 2 and 3 is non-negative. If pixel 1 has the minimum cost value, the cost values of the intervening pixels 2 and 3 are therefore not less than the cost value of pixel 1, based on the equation mentioned in ¶ [70] above.),

wherein the plurality of intervening pixels lie along the first direction at a first distance from the at least one first pixel (e.g. pixels 2 and 3 can be considered as intervening pixels that are between pixels 1 and 4 in figures 2A or 2B; the cost values of these pixels are explained in ¶ [70]-[74] above.).

However, Verburgh fails to specifically teach the feature wherein a second distance from the at least one first pixel to the first candidate depth value is larger than the first distance from the at least one first pixel to the plurality of intervening pixels. However, this is well known in the art as evidenced by Konieczny. Similar to the primary reference, Konieczny discloses determining updated depth information (same field of endeavor or reasonably pertinent to the problem).

Konieczny discloses wherein a second distance from the at least one first pixel to the first candidate depth value is larger than the first distance from the at least one first pixel to the plurality of intervening pixels (e.g. the invention discloses calculating a cost function for image fragments, which can be a pixel or a group of pixels as detailed in ¶ [18]. The invention discloses calculating costs for image fragments in ¶ [124]-[126] in a way that ensures a minimum value, with further values being the same or greater. As seen in figure 5, box 505, a first image fragment can have a distance from the first candidate depth value associated with dprev4 to the pixel associated with dprev3 that is greater than the distance from dprev3 to the leftmost fragment associated with dprev2. Figure 5 is explained in ¶ [140]-[147].).

[0018] The signal processing logic can be a processor, e.g. a multi-purpose processor or a digital signal processor (DSP), an ASIC, a FPGA, CPU, GPU and the like. The depth information value can be, for example, a depth value, a disparity value, or an index or label representing a depth value or a disparity value.
The fragment can be, for example, a pixel or a group of pixels of the current digital image and/or the digital reference image. The similarity measure can be, for example, a matching cost or a matching probability, wherein the matching cost is a measure indicating a difference between the current fragment and the reference fragment and increases with increasing difference and wherein the matching probability is a measure indicating a likelihood/probability that the current fragment and the reference fragment match and decreases with increasing difference. The previously processed fragment can be of the same current digital image or of a previously processed digital image of the same view (temporally preceding digital image) or of a different view (same or previous time instant).

[0124] In an embodiment, which is herein also referred to as PREV-BEST-1, the similarity measure is a matching cost, which may also shortly be referred to as cost, and the weighted similarity measure is a weighted matching cost, which may also shortly be referred to as weighted cost. More specifically, the weighting function is a conditional penalty function, wherein the weighted matching cost C_current(d) for a given image fragment and for a depth information value candidate d is defined as a sum of the image fragment matching cost M_current(d) and a constant penalty value, conditionally, if the given depth information value candidate is different from the depth information value d_prev selected for a previously processed fragment, i.e.:

C_current(d) = M_current(d) + penalty, if d ≠ d_prev
C_current(d) = M_current(d), if d = d_prev    (7)

As one person skilled in the art will appreciate, the above equation (7) can also be expressed as: C_current(d) = M_current(d) + T_potts(d, d_prev). In an embodiment, the matching cost M_current(d) can be, for instance, the sum of absolute differences (SAD) given by the above equation (2) or the sum of squared differences (SSD) given by the above equation (3). The matching cost M_current(d) is also referred to as local matching cost because it does not consider the depth information values selected for previously processed fragments, e.g. in the vicinity of the currently processed fragment, and thus also no changes or transitions between depth information values of previously processed fragments and the currently processed fragment.

[0125] As described by equations (7) and (8), such embodiments comprise a weighting function which penalizes the matching cost M_current(d) by a penalty value "penalty" when the depth information value candidate d (or d_i) is different from the previously selected depth information value d_prev (e.g. in the case of a transition or change of depth information values with regard to previously processed fragments, e.g. fragments in the vicinity of the currently processed fragment), and maintains the matching cost (i.e. C_current(d) = M_current(d)) when the depth information value candidate d (or d_i) is identical or equal to the previously selected depth information value d_prev (e.g. in the case of no transition or change of depth information values with regard to previously processed fragments, e.g. fragments in the vicinity of the currently processed fragment).

[0126] Generally, the previously selected depth information value d_prev can relate to any previously processed fragment. In an embodiment, the currently processed fragment is within the vicinity of the previously processed fragment, as will be explained in more detail further below in the context of FIG. 5. For example, d_prev can be an estimated depth information value for a pixel or a pixel group directly neighboring the currently processed pixel or pixel group, for which the weighted matching cost C_current(d) is calculated.

[0140] As one person skilled in the art will appreciate, in the algorithms shown in FIGS. 3 and 4 only a single depth information value d_prev selected for a previously processed fragment is used for calculating the weighted similarity measure for a depth information value candidate, which is schematically illustrated in box 501 of FIG. 5. In FIG. 5, x and y refer to the coordinates of the fragments in horizontal (x) and vertical (y) direction, and t refers to the time instant of the fragment, wherein the arrows are directed from the previously processed fragments towards the currently processed fragment. The examples depicted in FIG. 5 relate to a typical application, where the fragments are processed horizontally from left to right. Accordingly box 501 depicts an embodiment, in which only the depth information value d_prev of the horizontally left neighbor fragment of the same time instance (i.e. same current digital image) is used for calculating the weighted similarity measure of the depth information value candidate. Instead of the horizontally left neighbor fragment of the same time instance other previously processed fragments may be used, e.g. the upper left neighbor or lower left neighbor of the same time instance, the corresponding fragment (i.e. the fragment having the same x,y coordinates in the digital image) of a previous time instant (i.e. of a previous digital image, e.g. of the same view), the corresponding fragment (i.e. the fragment having the same x,y coordinates in the digital image) of a different view (i.e. of a digital image representing a different view or perspective to the same 3D scene), or neighbor fragments of the same or a different (e.g. previous) time instant and/or same or different views.

[0141] Typically neighbor fragments or corresponding fragments (i.e. same position/x,y coordinates in a different digital image) are most meaningful due to their spatial relationship to the currently processed fragment. However, in further embodiments the depth information values of other previously processed fragments may be used, e.g. when spatial characteristics of the digital image are known, for which depth information values of other previously processed fragments are more suitable to improve the depth information value estimation.

[0142] The same considerations apply for the PREV-BEST-2 embodiments described based on FIG. 4.

[0143] Embodiments of the present disclosure, furthermore, also cover implementations, where a plurality of depth information values of previously processed fragments are considered and used for calculating the weighted similarity measure, e.g. for determining whether the similarity measure shall be penalized or not.

[0144] For instance, in embodiments in accordance with the disclosure the weighted similarity measure (here the weighted matching cost) can be calculated using a set of P previously estimated depth information values d_prev_k as follows:

C_current(d) = M_current(d) + penalty, if d ≠ d_prev_k for all k = 1, ..., P
C_current(d) = M_current(d), otherwise
As schematically illustrated in box 503 of FIG. 5, in an embodiment, d_prev_1, d_prev_2, d_prev_3 can be three exemplary estimated/selected depth information values for fragments neighboring the currently processed fragment, for which the weighted matching cost C_current(d) is currently calculated (referred to as PREV-BEST-3). These neighboring fragments can be in various spatial relationships to the currently processed fragment; a further one is illustrated in box 505 of FIG. 5, where four exemplary depth information values d_prev_1...4 are considered (referred to as PREV-BEST-4).

[0145] Box 503 depicts an embodiment, in which the depth information values d_prev_1 to d_prev_3 of three left neighbor fragments of the same time instance (i.e. same current digital image) are used for calculating the weighted similarity measure, namely the top left neighbor (d_prev_1), the horizontal left neighbor (d_prev_2), and the bottom left neighbor (d_prev_3).

[0146] Box 505 depicts an embodiment, in which additionally to the depth information values d_prev_1 to d_prev_3 of the three left neighbor fragments of box 503 the depth information value d_prev_4 of a top vertical neighbor of the same time instance (e.g. same current digital image) is used for calculating the weighted similarity measure.

[0147] As mentioned before, in embodiments the set of previously selected depth information values d_prev_k can also include depth information values related to different time instances than the currently processed fragment, e.g. related to a previous (in time) image frame. In the exemplary PREV-BEST-3-T1 embodiment schematically illustrated in box 507 of FIG. 5, a set of four depth information values d_prev_1...4 is used, where d_prev_1...3 are spatial neighbors of the processed fragment in the same image (same as for box 503) and d_prev_4 is a co-located fragment (i.e. fragment having the same position or x,y coordinates in the digital image), neighboring the processed fragment in time (i.e. comprised in the directly preceding digital image).

Therefore, in view of Konieczny, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention to have the feature wherein a second distance from the at least one first pixel to the first candidate depth value is larger than the first distance from the at least one first pixel to the plurality of intervening pixels, incorporated in the device of Verburgh, in order to determine the distance or depth information of several image fragments, which can provide high fidelity in estimating depth information values in a computationally efficient manner (as stated in Konieczny ¶ [13]).
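To make the PREV-BEST weighting quoted above concrete, here is a minimal sketch of equation (7) and its set-based variant. This is illustrative only: local_cost stands in for the SAD/SSD matching cost M_current(d), and the function names, penalty value, and example numbers are assumptions rather than Konieczny's code.

def weighted_cost(local_cost: float, d: int, d_prev_set: set, penalty: float) -> float:
    # Conditional Potts-style penalty in the spirit of Konieczny eq. (7):
    # penalize candidate depth d unless it matches a previously selected
    # value (PREV-BEST-1 passes a single d_prev; PREV-BEST-3/4 pass a set).
    if d in d_prev_set:
        return local_cost            # no transition: cost unchanged
    return local_cost + penalty      # transition: add the constant penalty

def select_depth(local_costs: dict, d_prev_set: set, penalty: float) -> int:
    # Pick the depth candidate with the lowest weighted matching cost.
    return min(local_costs, key=lambda d: weighted_cost(local_costs[d], d, d_prev_set, penalty))

# Example with made-up numbers: the penalty keeps the estimate at the
# neighbors' depth unless a different depth is clearly cheaper.
costs = {3: 10.0, 4: 9.0, 5: 14.0}                       # M_current(d) per candidate d
print(select_depth(costs, d_prev_set={3}, penalty=2.0))  # -> 3 (d=4 would need to beat 10 by more than 2)

Because the penalty term is non-negative, a candidate's weighted cost is never below its local matching cost, which is the property the Examiner relies on in the response to arguments above.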
Re claim 4: (Currently Amended) Verburgh discloses the method of claim 1, wherein the cost function comprises a cost contribution (e.g. the candidate cost equation discloses a cost of the candidate pixel and an absolute value of the luminance difference between different pixels, either of which could be considered a cost contribution. This is taught in ¶ [72]-[74] above and [13].),

[0013] As an intermediate step, a cost function is determined which is related to the depth value, for example the cost function may coincide with the depth or be inverse proportional to the depth. Preferably, a low cost function corresponds to a depth value indicative of a location far away from the viewer, whereas a high cost function corresponds to a depth value indicative of a location close to the viewer.

wherein the cost contribution is dependent on a difference between image values of multi-view images for pixels that are offset by a disparity matching a first candidate depth value to the cost function as applied to the first candidate depth value (e.g. the difference between the luminance of a current pixel and a candidate pixel is found, and the absolute value of the difference is used as a cost contribution. The cost is offset using a cost-of-candidate-pixel value, which is the cost value of a selected candidate pixel, as taught in ¶ [70]-[74] above.).

Re claim 6: (Currently Amended) Verburgh discloses the method of claim 1, wherein the first direction is vertical direction in the first depth map (e.g. the pixels in figures 2A and 2B can have a horizontal or a vertical direction, as seen in those figures and explained in ¶ [70]-[74] above.).

Re claim 14: (Currently Amended) Verburgh discloses an apparatus comprising: a receiver circuit, wherein the receiver circuit is arranged to receive a first depth map (e.g. the system discloses block 106 receiving a depth map from the depth from motion block 102, which is taught in ¶ [65] and [66] above.); and a processor circuit (e.g. a processor is used to perform the steps of the invention, which is taught in ¶ [112].),

[0112] The method and system may also be implemented in a computer program product for computing a depth map comprising depth values representing distances to a viewer for respective pixels of an image. The computer program product comprises computer executable instructions for causing a processor to perform the step of determining a depth related cost value of a current pixel, the depth related cost value being a minimal cost value among a plurality of candidate cost values, wherein at least one of the candidate cost values is based on a depth related cost value of at least one pixel in a local neighborhood of the current pixel and on a difference between a color attribute of the at least one pixel in the local neighborhood and a corresponding color attribute of the current pixel, and at least one of the candidate cost values is based on a depth related cost value relating to at least one pixel outside the local neighborhood and on a difference between a color attribute of the at least one pixel outside the local neighborhood and a corresponding color attribute of the current pixel. The computer program product further comprises computer executable instructions for causing a processor to perform the step of assigning a depth value to the current pixel in dependence on the determined depth related cost value.

wherein the processor circuit is arranged to determine a plurality of first candidate depth values for at least one first pixel, wherein the plurality of first candidate depth values comprising depth values for at least a second pixel (e.g. the system discloses several pixels that are evaluated for a cost value. For each pixel, depth values are calculated based on using the cost values, which is taught in ¶ [70]-[74] above.); wherein the processor circuit is arranged to determine a cost value for each of the plurality of first candidate depth values based on a cost function (e.g. a cost value corresponds to a depth value indicative of the distance away from a viewer, and the cost function is used to derive the depth value. A cost value is determined for each pixel and is associated with the depth value of that pixel, which is taught in ¶ [13] and [70]-[74] above.); wherein the processor circuit is arranged to select a first depth value from the plurality of first candidate depth values based on the cost values for the plurality of first candidate depth values (e.g. when selecting a minimum cost candidate, a depth from motion value associated with the corresponding pixel of the candidate cost value is selected. The depth values are associated with each pixel, which is taught in ¶ [70]-[76] above.); and wherein the processor circuit is arranged to determine an updated depth value for the at least one first pixel based on the first depth value (e.g. a depth map contains depth values. An initial depth value is output from the depth from motion block (102). Later in the process, this initial depth map with its depth values is improved, or updated, based on the initial depth map, which is taught in ¶ [65], [66] and [70] above.); wherein the plurality of first candidate depth values comprises a first candidate depth value along a first direction from the at least one first pixel (e.g. the candidate values are computed for pixels in a certain direction including one pixel, which is taught in ¶ [72]-[74] above.), wherein none of the plurality of first candidate depth values along the first direction through a plurality of intervening pixels has a cost function which is less than the cost function for the first candidate depth value (e.g. the candidate cost values of the previous pixels, which can be considered as intervening pixels, are higher than the current pixel's cost value, the current pixel taking the minimum cost value, which is taught in ¶ [70] above.), wherein the plurality of intervening pixels lie along the first direction at a first distance from the at least one first pixel (e.g. pixels 2 and 3 can be considered as intervening pixels that are between pixels 1 and 4 in figures 2A or 2B; the cost values of these pixels are explained in ¶ [70]-[74] above.).

However, Verburgh fails to specifically teach the feature wherein a second distance from the at least one first pixel to the first candidate depth value is larger than the first distance from the at least one first pixel to the plurality of intervening pixels. However, this is well known in the art as evidenced by Konieczny. Similar to the primary reference, Konieczny discloses determining updated depth information (same field of endeavor or reasonably pertinent to the problem). Konieczny discloses wherein a second distance from the at least one first pixel to the first candidate depth value is larger than the first distance from the at least one first pixel to the plurality of intervening pixels (e.g. the invention discloses calculating a cost function for image fragments, which can be a pixel or a group of pixels as detailed in ¶ [18] above. The invention discloses calculating costs for image fragments in ¶ [124]-[126] above in a way that ensures a minimum value, with further values being the same or greater. As seen in figure 5, box 505, a first image fragment can have a distance from the first candidate depth value associated with dprev4 to the pixel associated with dprev3 that is greater than the distance from dprev3 to the leftmost fragment associated with dprev2. Figure 5 is explained in ¶ [140]-[147] above.).

Therefore, in view of Konieczny, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention to have the feature wherein a second distance from the at least one first pixel to the first candidate depth value is larger than the first distance from the at least one first pixel to the plurality of intervening pixels, incorporated in the device of Verburgh, in order to determine the distance or depth information of several image fragments, which can provide high fidelity in estimating depth information values in a computationally efficient manner (as stated in Konieczny ¶ [13]).

Re claim 15: (Currently Amended) Verburgh discloses a computer program stored on a non-transitory medium, wherein the computer program when executed on a processor performs the method as claimed in claim 1 (e.g. a computer program is stored on a computer program product to store executable instructions for execution on a processor, which is taught in ¶ [112] above and [114].).

[0114] The carrier of a computer program may be any entity or device capable of carrying the program. For example, the carrier may include a storage medium, such as a ROM, for example a CD ROM or a semiconductor ROM, or a magnetic recording medium, for example a floppy disc or hard disk. Further the carrier may be a transmissible carrier such as an electrical or optical signal, which may be conveyed via electrical or optical cable or by radio or other means. When the program is embodied in such a signal, the carrier may be constituted by such cable or other device or means. Alternatively, the carrier may be an integrated circuit in which the program is embedded, the integrated circuit being adapted for performing, or for use in the performance of, the relevant method.

Re claim 20: (New) Verburgh discloses the apparatus of claim 14, wherein the first direction is vertical direction in the first depth map (e.g. the pixels in figures 2A and 2B can have a horizontal or a vertical direction, as seen in those figures and explained in ¶ [70]-[74] above.).

Claims 2, 3 and 16-18 are rejected under 35 U.S.C. 103 as being unpatentable over Verburgh, as modified by Konieczny, as applied to claims 1 and 14 above, and further in view of Barone (US Pub 2015/0023586).

Re claim 2: (Currently Amended) Verburgh fails to specifically teach the features of the method of claim 1, wherein the cost function along the first direction has a monotonically increasing cost gradient as a function of distance from the at least one first pixel when the distance is less than a distance threshold, and wherein the cost function along the first direction has a monotonically decreasing cost gradient as a function of distance from the at least one first pixel when the distance is more than or equal to a threshold. However, this is well known in the art as evidenced by Barone. Similar to the primary reference, Barone discloses a cost function with a cost gradient increasing or decreasing based on a threshold level (same field of endeavor or reasonably pertinent to the problem). Barone discloses wherein the cost function along the first direction has a monotonically increasing cost gradient as a function of distance from the at least one first pixel when the distance is less than a distance threshold (e.g. when the disparity values or depths are below the value of 6, the cost function has an increasing cost gradient below the threshold of 6.
This is taught in ¶ [121]-[142].), and

[0121] In the embodiment considered, a disparity histogram is built for each tile TIL only with the disparity values of the candidates belonging to the respective tile.

[0122] For example, FIG. 10a schematically shows the tile TIL of FIG. 9, which comprises, e.g., a 10×10 block of pixels. Specifically, with each pixel PX of the tile TIL is associated a number of candidate pixels, and as mentioned in the foregoing, the number of candidate pixels could be fixed or variable. Moreover, with each pixel pair is associated a disparity or depth value d.

[0123] These disparity or depth values d for each pixel PX are used to build a histogram of disparity. Specifically, the histogram shows the occurrences of the disparity or depth value d for all pixels in the tile TIL. For example, in case 4 candidate pixels would be associated with each pixel PX of the 10×10 tile TIL, a total of 400 values would be distributed in the histogram. Thus, the histogram of disparity per tile facilitates discovering the outliers. In fact, even when a matching is very good from a similarity point of view, the association could not be the correct one. In fact, as shown in FIG. 10b, it is possible to recognize as outliers the candidates with low occurrences in the histogram.

[0124] In the embodiment considered, the histogram is then used to modify the results of the cost function calculated during the matching phase. For example, in an embodiment, the occurrences occ(d) are used as the inverse weight of the cost function. Accordingly, high occurrences decrease the cost function, so also a non-minimum original cost value could win:

E2(d) = E1(d) / occ(d)

[0125] Accordingly, in the embodiment considered, during the matching phase is determined for a given pair of pixels a disparity value d and a respective cost function E1(d). Conversely, during the filtering phase is determined the occurrence occ(d) of the disparity value d in the whole tile TIL associated with the respective pixel, and the occurrence occ(d) of the disparity value d is used to weight the initial cost function.

[0126] For example, considering the exemplary case wherein 4 candidate pixels having the following disparity values are associated with a pixel in the reference image:
[0127] the first pixel pair has a disparity d=4 and a cost function E1=20;
[0128] the second pixel pair has a disparity d=5 and a cost function E1=30;
[0129] the third pixel pair has a disparity d=6 and a cost function E1=15; and
[0130] the fourth pixel pair has a disparity d=7 and a cost function E1=25,
[0131] the disparity value d=6 would have the lowest value for the cost function E1.

[0132] Now, considering that the histogram for the respective tile would show the following occurrences for the above mentioned disparity values d:
[0133] occ(4)=4;
[0134] occ(5)=2;
[0135] occ(6)=5; and
[0136] occ(7)=25.
[0137] Accordingly, the modified cost function would have as final result:
[0138] E2(4)=20/4=5;
[0139] E2(5)=30/2=15;
[0140] E2(6)=15/5=3; and
[0141] E2(7)=25/25=1.

[0142] Thus, the filter stage would select indeed the fourth pixel pair with the disparity value d=7, which has the lowest value for the modified cost function.
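The occurrence weighting E2(d) = E1(d) / occ(d) and the worked example of ¶ [0126]-[0142] above can be reproduced with a short sketch. The function name is illustrative, and the tile list is synthesized only to yield the quoted counts occ(4)=4, occ(5)=2, occ(6)=5, occ(7)=25.

from collections import Counter

def occurrence_weighted_selection(candidates: dict, tile_disparities: list):
    # Re-rank candidates in the spirit of Barone ¶ [0124]: E2(d) = E1(d) / occ(d),
    # where occ(d) counts how often disparity d occurs in the tile's histogram.
    # 'candidates' maps disparity d -> matching cost E1(d).
    occ = Counter(tile_disparities)
    e2 = {d: e1 / occ[d] for d, e1 in candidates.items() if occ[d] > 0}
    best = min(e2, key=e2.get)
    return best, e2

# Worked example from ¶ [0126]-[0142]: the occurrences reverse the initial ranking.
candidates = {4: 20.0, 5: 30.0, 6: 15.0, 7: 25.0}  # E1(d)
tile = [4] * 4 + [5] * 2 + [6] * 5 + [7] * 25      # occ(4)=4, occ(5)=2, occ(6)=5, occ(7)=25
best, e2 = occurrence_weighted_selection(candidates, tile)
print(best, e2)  # -> 7, {4: 5.0, 5: 15.0, 6: 3.0, 7: 1.0}

As in the quoted example, d=6 wins on raw cost E1 but d=7 wins after occurrence weighting, which is how the filter stage suppresses low-occurrence outliers.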
wherein the cost function along the first direction has a monotonically decreasing cost gradient as a function of distance from the at least one first pixel when the distance is more than or equal to a threshold (e.g. when the disparity value is 6 or above, the cost gradient decreases after the disparity value of 6. This is taught in ¶ [121]-[142] above.).

Therefore, in view of Barone, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention to have the feature wherein the cost function along the first direction has a monotonically increasing cost gradient as a function of distance from the at least one first pixel when the distance is less than a distance threshold, and wherein the cost function along the first direction has a monotonically decreasing cost gradient as a function of distance from the at least one first pixel when the distance is more than or equal to a threshold, incorporated in the device of Verburgh, in order to determine whether the cost function gradient increases or decreases, which aids in the final selection of candidate pixels and the removal of outliers (as stated in Barone ¶ [114]).

Re claim 3: (Currently Amended) Verburgh fails to specifically teach the features of the method of claim 1, wherein the plurality of first intervening pixels has at least one depth value that is not included in the plurality of first candidate depth values. However, this is well known in the art as evidenced by Barone. Similar to the primary reference, Barone discloses a cost function with a cost gradient increasing or decreasing based on a threshold level (same field of endeavor or reasonably pertinent to the problem). Barone discloses wherein the plurality of first intervening pixels has at least one depth value that is not included in the plurality of first candidate depth values (e.g. each pixel pair is associated with a disparity or depth value d, which is taught in ¶ [122] above. The system identifies candidate pixels as outliers for removal in order to prevent wrong selection of the pixel associated with each candidate pixel, which is taught in ¶ [112]-[119].).

[0112] As mentioned in the foregoing, in various embodiments, the matching phase 2008 does not merely select the best matching block in the second image, as it is usual for local methods, but multiple candidate blocks with a low error are associated with each candidate pixel and processed by the filter stage 2014.

[0113] Generally, the number of pixels in the second image associated with each candidate pixel in the reference image may be fixed or variable. For example, only those pixel pairs may be selected for which the respective result of the cost function is below a given threshold. For example, in some embodiments, this threshold value and/or a maximum number of pixel pairs are configurable.

[0114] In some embodiments a DLEM (Disparity Local Energy Min) filter stage is used for the final selection and also for an outliers removal.

[0115] The filter may be based on the following energy function:

E(d) = E_data(d) + λ·E_smooth(d)

[0116] Specifically, in some embodiments, the calculation of the above equation is split into two separate cost functions:

E1(d) = E_data(d)
E2(d) = E_smooth(d)

[0117] In some embodiments, the first cost function E1(d) is used during the matching phase, e.g., during the step 3020 shown in FIG. 7. Thus, the matching phase 2008 provides a plurality of possible solutions representing the most similar pixel blocks between the two images, e.g., at the end of matching multiple candidate pixels in the second image are associated with each pixel of the reference image or the subset of pixels in case a pre-matching or another operation has been performed to reduce the number of pixels used during the matching phase.

[0118] Conversely, the second cost function is used during the filter step 2014 to define a weight for the final cost function. The filter stage modifies the result of the first cost function E1(d) and selects the best association for each pixel. For example, the pair of pixels may be selected which has the lowest value of the modified cost function.

[0119] In some embodiments, an outliers removal may be performed before the final selection is done in order to reduce the risk of wrong selections. For example, such an outliers removal may be performed based on disparity neighbors values.

Therefore, in view of Barone, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention to have the feature wherein the plurality of first intervening pixels has at least one depth value that is not included in the plurality of first candidate depth values, incorporated in the device of Verburgh, in order to determine whether a pixel with a disparity or depth value is associated with a candidate pixel or removed as an outlier, which aids in the final selection of candidate pixels and the removal of outliers (as stated in Barone '586 ¶ [114]).

Re claim 16: (New) Verburgh fails to specifically teach the features of the apparatus of claim 14, wherein the cost function along the first direction has a monotonically increasing cost gradient as a function of distance from the at least one first pixel when the distance is less than a distance threshold, wherein the cost function along the first direction has a monotonically decreasing cost gradient as a function of distance from the at least one first pixel when the distance is more than or equal to a threshold. However, this is well known in the art as evidenced by Barone. Similar to the primary reference, Barone discloses a cost function with a cost gradient increasing or decreasing based on a threshold level (same field of endeavor or reasonably pertinent to the problem). Barone discloses wherein the cost function along the first direction has a monotonically increasing cost gradient as a function of distance from the at least one first pixel when the distance is less than a distance threshold (e.g. when the disparity values or depths are below the value of 6, the cost function has an increasing cost gradient below the threshold of 6. This is taught in ¶ [121]-[142] above.), and wherein the cost function along the first direction has a monotonically decreasing cost gradient as a function of distance from the at least one first pixel when the distance is more than or equal to a threshold (e.g. when the disparity value is 6 or above, the cost gradient decreases after the disparity value of 6. This is taught in ¶ [121]-[142] above.).

Therefore, in view of Barone, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention to have the feature wherein the cost function along the first direction has a monotonically increasing cost gradient as a function of distance from the at least one first pixel when the distance is less than a distance threshold, and wherein the cost function along the first direction has a monotonically decreasing cost gradient as a function of distance from the at least one first pixel when the distance is more than or equal to a threshold, incorporated in the device of Verburgh, in order to determine whether the cost function gradient increases or decreases, which aids in the final selection of candidate pixels and the removal of outliers (as stated in Barone ¶ [114]).

Re claim 17: (New) Verburgh fails to specifically teach the features of the apparatus of claim 14, wherein the plurality of first intervening pixels has at least one depth value that is not included in the plurality of first candidate depth values. However, this is well known in the art as evidenced by Barone. Similar to the primary reference, Barone discloses a cost function with a cost gradient increasing or decreasing based on a threshold level (same field of endeavor or reasonably pertinent to the problem). Barone discloses wherein the plurality of first intervening pixels has at least one depth value that is not included in the plurality of first candidate depth values (e.g. each pixel pair is associated with a disparity or depth value d, which is taught in ¶ [122] above. The system identifies candidate pixels as outliers for removal in order to prevent wrong selection of the pixel associated with each candidate pixel, which is taught in ¶ [112]-[119] above.).

Therefore, in view of Barone, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention to have the feature wherein the plurality of first intervening pixels has at least one depth value that is not included in the plurality of first candidate depth values, incorporated in the device of Verburgh, in order to determine whether a pixel with a disparity or depth value is associated with a candidate pixel or removed as an outlier, which aids in the final selection of candidate pixels and the removal of outliers (as stated in Barone '586 ¶ [114]).

Re claim 18: (New) Verburgh discloses the apparatus of claim 14, wherein the cost function comprises a cost contribution (e.g. calculating the cost function includes a cost contribution considered as the cost-of-candidate-pixel value, which is taught in ¶ [73] above.), wherein the cost contribution is dependent on a difference between image values of multi-view images for pixels that are offset by a disparity matching a first candidate depth value to the cost function as applied to the first candidate depth value (e.g. the candidate cost value is determined by determining the difference between image values of pixels that are offset from one another, which is taught in ¶ [72]-[75] above.). However, Verburgh fails to specifically teach the feature of pixels that are offset by a disparity matching a first candidate depth value to the cost function as applied to the first candidate depth value. However, this is well known in the art as evidenced by Barone. Similar to the primary reference, Barone discloses a cost function with a cost gradient increasing or decreasing based on a threshold level (same field of endeavor or reasonably pertinent to the problem). Barone discloses pixels that are offset by a disparity matching a first candidate depth value to the cost function as applied to the first candidate depth value (e.g. the pixels are offset by disparity values. The first pixel pair can be considered to contain the first candidate depth value associated with a cost function that is calculated for the first candidate depth value, which is taught in ¶ [124]-[126] above.).

Therefore, in view of Barone, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention to have the feature of pixels that are offset by a disparity matching a first candidate depth value to the cost function as applied to the first candidate depth value, incorporated in the device of Verburgh, in order to determine whether a pixel with a disparity or depth value is associated with a candidate pixel or removed as an outlier, which aids in the final selection of candidate pixels and the removal of outliers (as stated in Barone '586 ¶ [114]).

Claims 5 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Verburgh, as modified by Konieczny, as applied to claims 1 and 14 above, and further in view of Peruch (US Pub 2019/0213389).

Re claim 5: (Currently Amended) Verburgh fails to specifically teach the features of the method of claim 1, further comprising determining the first direction as a gravity direction for the first depth map, wherein the gravity direction is a direction in the first depth map matching a direction of gravity in a scene represented by the first depth map. However, this is well known in the art as evidenced by Peruch. Similar to the primary reference, Peruch discloses determining the gravity direction of pixels based on the input depth maps (same field of endeavor or reasonably pertinent to the problem). Peruch discloses further comprising determining the first direction as a gravity direction for the first depth map, wherein the gravity direction is a direction in the first depth map matching a direction of gravity in a scene represented by the first depth map (e.g. the invention discloses determining the direction of gravity within the scene of a depth map. This is performed by determining the bottom pixels, which is explained in ¶ [130]-[134].).

[0130] FIG. 5C is a flowchart of a method 330 for computing a virtual ground plane according to one embodiment of the present invention. In operation 331, the processor analyzes the input depth map of the scene (e.g., with the object of interest 10 segmented) and identifies an orientation of the depth map to determine which direction corresponds to the direction of gravity (informally, the "down" direction). As noted above, the orientation information may be recorded from the IMU 118 at the time that the depth map is captured. In operation 333, the "bottom" pixels or points of the depth map are identified, where "bottom" refers to the portion of the image in the "down" direction identified in operation 331. The bottom of the depth map is assumed to correspond to the closest part of the ground plane 14, which extends away from the depth camera and "up" in the depth map (e.g., toward the top of the image).

[0131] For example, in the depth map shown in FIG. 4A, the "down" direction corresponds to the direction perpendicular to the ground plane and parallel to the vertical axis of the bottle, and the portion of the depth map corresponding to the "bottom" pixels or points is the orange strip at the lower edge of the image.
[0132] In some embodiments, the processor controls the width of the strip of bottom pixels that are identified in operation 333 based on known noise characteristics of the depth camera system 100 (e.g., noise as a function of distance or range of a pixel). The noise characteristics of the depth camera system 100 may include parameters that are stored in the memory of the depth camera system 100 and previously computed by measuring differences between depth maps captured by the depth camera system 100 (or substantially equivalent depth camera systems) and/or parameters computed based on, for example, theoretical predictions of noise in the camera image sensors (e.g., image sensors 102a, 104a, and 105a), characteristics of the pattern emitted by the projection source 106, the image resolutions of the image sensors, and constraints of the disparity matching technique. For example, a particular level of error may be acceptable for particular applications. Accordingly, in some embodiments, pixels from the bottom edge of the depth map up until the pixels represent distances that exceed the acceptable error threshold (based on the known noise characteristics of the depth camera as a function of distance or range) are selected as part of the ground plane (subtracting the points or pixels corresponding to the segmented object, if any such pixels were included in this process).

[0133] In operation 335, the processor uses the bottom points or pixels, which are assumed to lie on the same ground plane 14 that is supporting the object 10, to define a partial ground plane or partial plane. For example, in some embodiments, linear regression is applied to the selected bottom points (or depth pixels) along two directions (e.g., two horizontal directions perpendicular to the direction of gravity) to define a virtual ground plane (or an “ideal” virtual ground plane) in accordance with a linear function. In some embodiments of the present invention, outlier points or pixels (e.g., corresponding to noise or foreground clutter objects) are removed from the bottom points or pixels before computing the plane.

[0134] In operation 337, the virtual ground plane defined by the selected ones of the bottom pixels of the depth map is extended to the region under the object of interest 10. Accordingly, aspects of embodiments of the present invention relate to defining a virtual ground plane based on portions of the captured depth map (or 3-D model) that exhibit lower noise (e.g., a portion of the ground 14 that is closer to the depth camera system 100). Based on the assumption that the ground 14 is substantially planar or flat between the low noise portion of the ground 14 closest to the depth camera system 100 and the parts of the ground 14 at the object 10, this virtual ground plane can be extended to the region under the object 10. This increases the accuracy of the measurements of the dimensions of the object in later operations 350 and 370, as described in more detail below.
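To make the ¶ [0130]-[0134] pipeline concrete, here is a minimal sketch of the bottom-strip selection and least-squares plane fit (the array shapes, the strip width, and the use of raw pixel coordinates rather than calibrated camera-space points are simplifying assumptions, not Peruch's actual implementation):

    import numpy as np

    def bottom_strip_points(depth: np.ndarray, rows: int = 10) -> np.ndarray:
        """Collect (u, v, z) samples from the lowest `rows` image rows,
        i.e. the "down" direction identified from the IMU orientation."""
        h, w = depth.shape
        v, u = np.mgrid[h - rows:h, 0:w]
        z = depth[h - rows:h, :]
        mask = z > 0                      # drop invalid / zero-depth pixels
        return np.column_stack([u[mask], v[mask], z[mask]]).astype(float)

    def fit_virtual_ground_plane(points: np.ndarray) -> np.ndarray:
        """Least-squares fit of z = a*u + b*v + c over the bottom points;
        the returned plane can then be extended under the object."""
        A = np.column_stack([points[:, 0], points[:, 1], np.ones(len(points))])
        coeffs, *_ = np.linalg.lstsq(A, points[:, 2], rcond=None)
        return coeffs                     # plane coefficients (a, b, c)

Peruch additionally removes outlier points (noise or foreground clutter) before the fit; a robust estimator such as RANSAC could stand in for the plain least squares used here.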
Therefore, in view of Peruch, it would have been obvious to one of ordinary skill at the time the invention was made to have the feature of further comprising determining the first direction as a gravity direction for the first depth map[[;]], wherein the gravity direction is a direction in the first depth map matching a direction of gravity in a scene represented by the first depth map, incorporated in the device of Verburgh, in order to detect the gravity direction within an input depth map that is used to determine a virtual ground plane, which increases the accuracy of the measurements of the dimensions of an object (as stated in Peruch ¶ [134]).

Re claim 19: (New) However, Verburgh fails to specifically teach the features of the apparatus of claim 14, further comprising determining the first direction as a gravity direction for the first depth map, wherein the gravity direction is a direction in the first depth map matching a direction of gravity in a scene represented by the first depth map. However, this is well known in the art as evidenced by Peruch. Similar to the primary reference, Peruch discloses determining the direction of gravity from an input depth map (same field of endeavor or reasonably pertinent to the problem). Peruch discloses further comprising determining the first direction as a gravity direction for the first depth map, wherein the gravity direction is a direction in the first depth map matching a direction of gravity in a scene represented by the first depth map (e.g. the invention discloses determining the direction of gravity within the scene of a depth map. This is performed by determining the bottom pixels, which is explained in ¶ [130]-[134] above.).

Therefore, in view of Peruch, it would have been obvious to one of ordinary skill at the time the invention was made to have the feature of further comprising determining the first direction as a gravity direction for the first depth map, wherein the gravity direction is a direction in the first depth map matching a direction of gravity in a scene represented by the first depth map, incorporated in the device of Verburgh, in order to detect the gravity direction within an input depth map that is used to determine a virtual ground plane, which increases the accuracy of the measurements of the dimensions of an object (as stated in Peruch ¶ [134]).

Claim(s) 7-9 is/are rejected under 35 U.S.C. 103 as being unpatentable over Verburgh, as modified by Konieczny, as applied to claim 1 above, and further in view of Davison (US Pub 2019/0080463).

Re claim 7: (Currently Amended) However, Verburgh fails to specifically teach the features of the method of claim 1, further comprising determining a depth model for a portion of a scene represented by the first depth map, and wherein the cost function for a depth value is dependent on a difference between the depth value and a model depth value determined from the depth model. However, this is well known in the art as evidenced by Davison. Similar to the primary reference, Davison discloses updating a cost function based on a difference (same field of endeavor or reasonably pertinent to the problem). Davison discloses further comprising determining a depth model for a portion of a scene represented by the first depth map, and wherein the cost function for a depth value is dependent on a difference between the depth value and a model depth value determined from the depth model (e.g.
the measured depth map is compared to a predicted depth map in order to calculate a cost function associated with the difference between the values, which is taught in ¶ [54], [55] and [67]. Since the cost function is updated based on the difference between the values, this is considered to have a dependent relationship.). [0054] As outlined previously with reference to FIG. 2, the mapping engine 330 of the apparatus 300 uses preliminary estimates of the conditions of the 3D space (in the form of initial geometry, appearance and camera pose values—such as there being a predominant reference plane, or the height of the camera above the reference plane) to generate an initial surface model of the 3D space (block 530). This initial surface model, along with the camera pose data retrieved by the pose data interface 320, is used by the differentiable renderer 340 to render a predicted depth map of the observed scene (block 540). An important element of the method is that, given the initial surface model and camera pose data, the differentiable renderer 340 can calculate the (partial) derivatives of the depth values with respect to the model parameters (block 550), as well as render a predicted image and depth for every pixel, at almost no extra computational cost. This allows the apparatus to perform gradient-based minimization in real-time by exploiting parallelisation. The rendered depth map of the frame is compared directly to the measured depth map retrieved from the depth map processor 430 by the depth data interface 310, and a cost function of the error between the two maps is calculated. The partial derivative values calculated by the differentiable rendering process (block 550) are subsequently used to reduce the cost function of the difference/error between the predicted 250 and the measured 240 depth maps (block 560), and therefore optimize the depth map. The initial surface model is updated with the values for the geometric parameters derived from the reduced cost function (block 570) and optimized depth map. [0055] The updated surface model along with the initial camera pose data (from block 520) is subsequently used by the differentiable renderer 340 to render an updated predicted depth map of the observed scene (block 540). The updated rendered depth map of the frame is compared directly to the original measured depth map for the frame (from block 510), and a cost function (including the error between the two maps) is reduced using the partial derivative values calculated by the differentiable rendering process (block 550). The surface model is updated, again, following optimization and the process (blocks 540, 550, 560, 570) is repeated, iteratively, until the optimization of the rendered depth map converges. The optimization may, for example, continue until the error term between the rendered and measured depth maps falls below a pre-determined threshold value. [0067] FIG. 8 shows a processor 800 equipped to execute instructions stored on a non-transitory computer-readable storage medium. 
When executed by the processor, the instructions cause a computing device to obtain an observed depth map for a space (block 810); obtain a camera pose corresponding to the observed depth map (block 820); obtain a surface model (in this example comprising a mesh of triangular elements, each triangular element having height values associated with vertices of the element, the height values representing a height above a reference plane) (block 830); render a model depth map based upon the surface model and the obtained pose, the rendering including computing partial derivatives of rendered depth values with respect to height values of the surface model (block 840); compare the model depth map to the observed depth map, including determining an error between the model depth map and the observed depth map (block 850); and determine an update to the surface model based on the error and the computed partial derivatives (block 860). For each observed depth map (i.e. captured image/frame), the final four steps may be repeated, iteratively, until the rendered depth map optimization (i.e. through minimization of the error between the rendered and the observed depth maps) converges. The convergence of the optimization process may involve the error value between the rendered and the measured depth maps falling below a predetermined threshold.

Therefore, in view of Davison, it would have been obvious to one of ordinary skill at the time the invention was made to have the feature of further comprising determining a depth model for a portion of a scene represented by the first depth map, and wherein the cost function for a depth value is dependent on a difference between the depth value and a model depth value determined from the depth model, incorporated in the device of Verburgh, in order to calculate a cost function based on a difference between depth values, which allows for further optimization of depth maps (as stated in Davison ¶ [55]).

Re claim 8: (Currently Amended) However, Verburgh fails to specifically teach the features of the method of claim 7, wherein the cost function is asymmetric with respect to whether the depth value exceeds the model depth value or is less than or equal to the model depth value. However, this is well known in the art as evidenced by Davison. Similar to the primary reference, Davison discloses updating a cost function based on a difference (same field of endeavor or reasonably pertinent to the problem). Davison discloses wherein the cost function is asymmetric with respect to whether the depth value exceeds the model depth value or is less than or equal to the model depth value (e.g. the cost function is not symmetric with respect to whether the model depth value is exceeded or is less than a calculated depth value, which is taught in ¶ [54], [55] and [67] above.).

Therefore, in view of Davison, it would have been obvious to one of ordinary skill at the time the invention was made to have the feature of wherein the cost function is asymmetric with respect to whether the depth value exceeds the model depth value or is less than or equal to the model depth value, incorporated in the device of Verburgh, in order to calculate a cost function based on a difference between depth values, which allows for further optimization of depth maps (as stated in Davison ¶ [55]).
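Davison's ¶ [0054]-[0055] and [0067] describe a render-compare-update loop driven by the partial derivatives that the differentiable renderer provides. A minimal sketch of that loop (the `render` callable, the learning rate, and the plain gradient-descent update are assumptions; the reference does not prescribe a particular optimizer here):

    import numpy as np

    def optimize_surface(params, pose, measured, render,
                         lr=0.1, tol=1e-4, max_iters=100):
        """params: surface-model parameters (e.g. vertex heights), shape (P,).
        render(params, pose) is assumed to return a predicted depth map of
        shape (H, W) plus its Jacobian of shape (H, W, P)."""
        for _ in range(max_iters):
            predicted, jacobian = render(params, pose)
            residual = predicted - measured          # per-pixel error map
            cost = 0.5 * np.sum(residual ** 2)       # cost function of the error
            if cost < tol:                           # converged below threshold
                break
            # Chain rule: gradient of the cost w.r.t. the model parameters
            grad = jacobian.reshape(residual.size, -1).T @ residual.ravel()
            params = params - lr * grad              # update the surface model
        return params

The key design point the rejection leans on is that the cost is a function of the difference between the rendered (model) depth and the measured depth, so the model update is, by construction, dependent on that difference.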
Re claim 9: (Currently Amended) Verburgh discloses the method of claim 7, wherein the depth model is a background model for the scene (e.g. the predefined depth model is used to process a background in order to determine a cost value, which is taught by ¶ [24].).

[0024] An embodiment comprises a selective blurring unit for blurring at least one region of the image that is in a background according to a predefined depth model before determining the cost value by the optimization unit. The quality of the depth estimate is increased if background objects are somewhat blurred, because the differences in the color attribute are smaller in blurred regions of the image.

Claim(s) 10 and 12 is/are rejected under 35 U.S.C. 103 as being unpatentable over Verburgh, as modified by Konieczny, as applied to claim 1 above, and further in view of Varekamp (WO-2018197630 (Pub date: 11/1/2018)).

Re claim 10: (Currently Amended) However, Verburgh fails to specifically teach the features of the method of claim 1, further comprising including at least one second candidate depth value[[s]] in the plurality set of first candidate depth values, wherein the at least one second candidate depth value is not from the first depth map, wherein the at least one second candidate depth value is selected from the group consisting of a depth value from a second depth map of a temporal sequence of depth maps, a depth value independent of a scene, and a depth value determined based on an offset of a depth value for the at least one first pixel, wherein the temporal sequence of depth maps comprises the first depth map. However, this is well known in the art as evidenced by Varekamp. Similar to the primary reference, Varekamp discloses processing depth values (same field of endeavor or reasonably pertinent to the problem). Varekamp discloses further comprising including at least one second candidate depth value[[s]] in the plurality set of first candidate depth values (e.g. a second candidate depth value can be determined from a set of candidate values, which is taught in page 17, ll. 27-page 19, ll. 24.),

(51) The second processor 207 receives the first processed depth map and generates a second processed depth map by processing pixels in the first processed depth map in a top to bottom direction. The second processor 207 may thus perform a similar iterative or sequential process but does so in the opposite direction of processing by the first processor 205. The first processor 205 may thus be seen as performing a first pass, with the second processor 207 performing a second pass seeking to address potential errors or artefacts that may have resulted from the first pass.

(52) The second processor 207 is arranged to process the first processed depth map by setting the depth map value of the second processed depth map as either the corresponding pixel in the first processed depth map or as a depth value for a pixel of the first processed depth map which is above the current pixel. Thus, the depth value in the second processed depth map may be set as the depth value for the corresponding depth value for the pixel in the first processed depth map or it may be set to a depth value in the first processed depth map of a pixel above the current pixel. The selection of which depth value to use is based on a depth step criterion which considers both the input depth map and the first processed depth map.

(53) In more detail, for a given pixel, referred to as the second pixel, the second processor 207 determines a depth value D.sub.i,j″ in the second processed depth map.
One candidate for this depth value D.sub.i,j″ is the corresponding depth value D.sub.i,j′ in the first processed depth map. This depth value may be referred to as the first candidate depth value D.sub.i,j′ and if this is selected it corresponds to the depth value for the second pixel not being changed. (54) Alternatively, the depth value D.sub.i,j″ in the second processed depth map may be set to the depth value of the first processed map for a third pixel which is above the second pixel. This depth value may be referred to as a second candidate depth value D.sub.i−1,k′. In many embodiments, the third pixel/the second candidate depth value may be selected from a, typically small, set of pixels above the second pixel. For example, the second candidate depth value D.sub.i−1,k′ may be selected from a set of pixels with k in the interval of [j−N;j+N], where N is typically a small number, e.g. N may be in the interval from 1 to 7. (55) The approach by the second processor 207 thus processes the first processed depth map in a top to bottom direction and either maintains the depth value of the first processed depth map (D.sub.i,j″←D.sub.i,j′) or allows the depth value to be one propagating down from pixels above the current pixels (D.sub.i,j″←D.sub.i−1,k′). (56) The choice of whether to maintain the depth value or to replace it is as previously mentioned based on a depth step criterion which is dependent on depth steps both in the input depth map and in the first processed depth map. (57) Specifically, the depth step criterion may require that the absolute value of a depth step in the input depth map is not above a first threshold and that a backwards depth step in the first processed image is not below a second threshold. (58) In such an embodiment, the first processed depth map may accordingly be substituted by a depth value from a pixel above if the second processor 207 detects that there is only a small depth step in the input depth map but that there is a large backwards depth step in the first processed depth map (when moving to the current pixel). In this case, the large depth step in the first processed depth map may indicate that the first processor 205 has introduced a depth transition which is not there in the original input depth map. Accordingly, the second processor 207 allows the depth value which is further forward to propagate from the third pixel to the current pixel (the second pixel). (59) More specifically, the depth step criterion may comprise a requirement that an absolute depth step in the input depth map between the third pixel and a pixel above the second pixel is not above a third threshold (|D.sub.i−1,k−D.sub.i−1,j|≤t.sub.3) and/or that a maximum absolute depth step in the input image between the second pixel and a pixel above the second pixel is not above a fourth threshold (|D.sub.i−1,j−D.sub.i,j|≤t.sub.4). These requirements may provide advantageous indications that there are no corresponding depth steps in the input image in either the horizontal or vertical direction with respect to the current pixel (the second pixel) and the target pixel (the third pixel). Accordingly, these requirements provide a good indication that the original depth map had no local depth transitions/steps/discontinuities and that therefore any such local transitions/steps/discontinuities is likely to have been introduced by the processing of the first processor 205. 
(60) The depth step criterion may in many embodiments comprise a requirement that a (size of a) backwards depth step in the first processed image from the third pixel to the second pixel is not below a fifth threshold, i.e. that stepping to the current pixels results in a sufficiently large depth step away from the camera/view point. In embodiments where an increasing depth value D is indicative of the pixel being closer to the camera/view point, this can be expressed by (D.sub.i,k′−D.sub.i−1,j′≤t.sub.5). This may provide a suitable indication that a backwards depth step, which may potentially have been incurred by the processing of the second processor 207, is present.

wherein the at least one second candidate depth value is not from the first depth map (e.g. a candidate depth value can be a depth value from another depth map that is not the first depth map, which is taught in page 21, ll. 9-page 22, ll. 5.),

(69) In the system of FIG. 2, the second processor 207 is coupled to a third processor 209 which is fed the second processed depth map and which in response generates a third processed depth map. The third processor 209 essentially performs a symmetric operation to that of the second processor 207 but in a bottom-to-top direction.

(70) The third processor 209 may thus receive the second processed depth map and generate a third processed depth map by processing pixels in the second processed depth map in a bottom to top direction. The third processor 209 may thus perform a similar iterative or sequential process as the second processor 207 but does so in the opposite direction. The third processor 209 may thus be considered to perform a third pass seeking to address potential errors or artefacts that may have resulted from specifically the first pass.

(71) The third processor 209 is arranged to process the second processed depth map by setting the depth map value of the third processed depth map as either the corresponding pixel in the second processed depth map or as a depth value for a pixel of the second processed depth map which is below the current pixel. Thus, the depth value in the third processed depth map may be set as the depth value for the corresponding depth value for the pixel in the second processed depth map or to a depth value in the second processed depth map of a pixel above the current pixel. The selection of which depth value to use is again based on a depth step criterion which considers both the input depth map and the first processed depth map.

(72) In more detail, for a given pixel, referred to as the fourth pixel, the third processor 209 determines a depth value D.sub.i,j′″ in the third processed depth map. One candidate for this depth value D.sub.i,j′″ is the corresponding depth value D.sub.i,j″ in the second processed depth map. This depth value may be referred to as the third candidate depth value D.sub.i,j″.

(73) Alternatively, the depth value D.sub.i,j′″ in the third processed depth map may be set to the depth value of the second processed map for a fifth pixel which is below the second pixel. This depth value may be referred to as a fourth candidate depth value D.sub.i+1,k″. In many embodiments, the fifth pixel/the fourth candidate depth value may be selected from a, typically small, set of pixels above the fourth pixel. For example, the fourth candidate depth value D.sub.i+1,k″ may be selected from a set of pixels with k in the interval of [j−N;j+N] where N is typically a small number, e.g. N may be in the interval from 1 to 7.
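Varekamp's passes amount to a conditional propagation: keep the current value unless a depth-step criterion, evaluated on both the input map and the previously processed map, indicates the earlier pass introduced a spurious step. A minimal sketch of the top-to-bottom second pass (the threshold values, the single-column neighborhood instead of Varekamp's k in [j−N;j+N], and propagation from the pixel directly above are simplifying assumptions):

    import numpy as np

    def second_pass(D: np.ndarray, D1: np.ndarray,
                    t_input: float = 5.0, t_back: float = 10.0) -> np.ndarray:
        """D: input depth map; D1: first-pass output. Returns the second-pass map."""
        D2 = D1.copy()
        rows, cols = D.shape
        for i in range(1, rows):                        # top to bottom
            for j in range(cols):
                # Small vertical step in the *input* map between the pixel
                # above and the current pixel...
                small_input_step = abs(D[i - 1, j] - D[i, j]) <= t_input
                # ...but a large backwards step in the *processed* map,
                # suggesting the first pass introduced a false transition.
                large_back_step = (D2[i - 1, j] - D1[i, j]) >= t_back
                if small_input_step and large_back_step:
                    D2[i, j] = D2[i - 1, j]             # propagate from above
        return D2

The bottom-to-top third pass in (69)-(73) is the mirror image: iterate from the last row upward and propagate from the pixel below.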
wherein the at least one second candidate depth value is selected from the group consisting of a depth value from a second depth map of a temporal sequence of depth maps (e.g. the invention discloses selecting a candidate depth value from one of the depth maps that are determined from different areas of the pixels, which is taught in page 21, ll. 9-page 22, ll. 5 above.), a depth value independent of a scene, and a depth value determined based on an offset of a depth value for the at least one first pixel (e.g. the offset depth map contains depth values that are determined from an offset of these values from an already processed map. This is taught in page 24, ll. 1-page 25, ll. 16.),

(85) In some embodiments, the depth processing apparatus of FIG. 2 may further comprise a depth offset processor 211 which may process a depth map that has been processed by the first processor 205 and optionally also by the second processor 207 and/or the third processor 209 to generate an offset depth map in which some depth values may be offset to restore some of the forward depth step that has been reduced by the previous processing. Thus, the depth map processed by the depth offset processor 211 may be an already processed depth map which is derived from the first processed depth map, i.e. the processing by the first processor 205 has been part of the process resulting in the depth map. Accordingly, the already processed depth map has been derived from the first processed depth map, e.g. by the processing by the second processor 207 and/or the third processor 209. It will be appreciated that the derived depth map includes the straightforward derivation of simply using the first processed depth map without changing this, i.e. the already processed depth map being fed to the depth offset processor 211 may be the first processed depth map itself in some embodiments.

(86) An example of the approach of the depth offset processor 211 may be elucidated with reference to FIG. 7 which first shows a possible depth profile D.sub.i,j for a column in the input depth map (with increasing y values corresponding to higher rows in the image, i.e. y increases towards the top). The next profile shows the resulting depth profile D.sub.i,j′ of the first processed depth map generated by the first processor 205. As can be seen, no forwards depth steps for increasing y values are allowed and thus the depth values are decreased where such depth steps occur.

(87) The lowest chart of FIG. 7 illustrates a possible resulting depth profile D.sub.i,j″ after an exemplary depth processing by the depth offset processor 211. In the example, some of the forward depth steps have been re-introduced. Specifically, in the example, the original depth values, and thus depth steps, have been reinstated subject to a maximum depth step value D.sub.maxstep. Specifically, the depth offset processor 211 may perform the following operation:

(88) TABLE-US-00003

    1. for row i ← 1 ... N.sub.rows
    2.   for column j ← 1 ... N.sub.cols−1
    3.     D″.sub.i,j ← D′.sub.i,j + min(D.sub.i,j − D′.sub.i,j, D.sub.maxstep)

where D.sub.i,j is the depth value of the original input depth map and D.sub.i,j′ is the result of processing by the first processor 205 (and optionally the second processor 207 and third processor 209). D.sub.i,j″ is the final output of the depth offset processor 211. Parameter D.sub.maxstep is the maximum allowed depth difference (typically set to e.g. 5 for a depth map with dynamic range of 0 . . . 255).
(89) Thus, in this approach, some, typically small, forward depth steps are allowed/reintroduced. This approach may in many scenarios result in improved perceived quality. For example, it may introduce some depth texture and variation in image objects (thereby e.g. preventing or mitigating a flat/cardboard look of e.g. the moving person) while still ensuring that these are perceived to not stand-out/float over e.g. the ground.

(90) The depth offset processor 211 may accordingly offset the depth values of the received depth map (which may e.g. be received from any of the first processor 205, the second processor 207, or the third processor 209 depending on the specific embodiment). The offset for a given pixel may be determined based on the difference between the depth value for this pixel in respectively the received processed depth map and the original input depth map.

wherein the temporal sequence of depth maps comprises the first depth map (e.g. when creating multiple depth values and maps, a previous depth map with depth values is offset to create the depth maps, which is taught in page 24, ll. 1-page 25, ll. 16 above.).

Therefore, in view of Varekamp, it would have been obvious to one of ordinary skill at the time the invention was made to have the feature of further comprising including at least one second candidate depth value[[s]] in the plurality set of first candidate depth values, wherein the at least one second candidate depth value is not from the first depth map, wherein the at least one second candidate depth value is selected from the group consisting of a depth value from a second depth map of a temporal sequence of depth maps, a depth value independent of a scene, and a depth value determined based on an offset of a depth value for the at least one first pixel, wherein the temporal sequence of depth maps comprises the first depth map, incorporated in the device of Verburgh, in order to provide a low complexity and low resource demanding approach for selecting and determining depth values using an offset, which can improve depth map consistency and accuracy (as stated in Varekamp page 4, ll. 18-21).
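The (88) listing reduces to a single clamped elementwise update. A minimal sketch (vectorized over the whole map rather than Varekamp's explicit row/column loop, and assuming float arrays of equal shape):

    import numpy as np

    def reintroduce_depth_steps(D: np.ndarray, D1: np.ndarray,
                                max_step: float = 5.0) -> np.ndarray:
        """D: original input depth map; D1: already-processed map (D').
        Restores forward depth steps up to max_step (D.sub.maxstep):
        D'' = D' + min(D - D', max_step), applied elementwise."""
        return D1 + np.minimum(D - D1, max_step)

With max_step = 5 on a 0...255 map, small forward steps (depth texture) survive, while large steps that would make objects appear to float above the ground stay suppressed.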
Re claim 12: (Currently Amended) However, Verburgh fails to specifically teach the features of the method of claim 1, further comprising processing a plurality of pixels of the first depth map by iteratively selecting at least one third pixel, wherein the at least one third pixel is a portion of the at least one first pixel. However, this is well known in the art as evidenced by Varekamp. Similar to the primary reference, Varekamp discloses processing depth values (same field of endeavor or reasonably pertinent to the problem). Varekamp discloses further comprising processing a plurality of pixels of the first depth map by iteratively selecting at least one third pixel, wherein the at least one third pixel is a portion of the at least one first pixel (e.g. a third pixel can be selected from a first depth map that is also a part of a second depth map. This is iteratively selected when performing the development of depth maps in different passes, which is taught in page 5, ll. 19-28 and page 21, ll. 13-18.).

(32) In accordance with an optional feature of the invention, the apparatus further comprises a second processor for generating a second processed depth map by processing a plurality of second pixels in the first processed depth map in a top to bottom direction; wherein generating a depth value for each second pixel in the second processed depth map comprises, if a depth step criterion is met, setting the depth value in the second processed depth map for the second pixel as a depth value of the first processed map for a third pixel for the second pixel, the third pixel being above the second pixel, and otherwise setting the depth value in the second processed depth map for the second pixel as a depth value for the second pixel in the first processed depth map; wherein the depth step criterion is dependent on depth steps both in the input depth map and in the first processed depth map.

(70) The third processor 209 may thus receive the second processed depth map and generate a third processed depth map by processing pixels in the second processed depth map in a bottom to top direction. The third processor 209 may thus perform a similar iterative or sequential process as the second processor 207 but does so in the opposite direction. The third processor 209 may thus be considered to perform a third pass seeking to address potential errors or artefacts that may have resulted from specifically the first pass.

Therefore, in view of Varekamp, it would have been obvious to one of ordinary skill at the time the invention was made to have the feature of further comprising processing a plurality of pixels of the first depth map by iteratively selecting at least one third pixel, wherein the at least one third pixel is a portion of the at least one first pixel, incorporated in the device of Verburgh, in order to provide a low complexity and low resource demanding approach for selecting and determining depth values using an offset, which can improve depth map consistency and accuracy (as stated in Varekamp page 4, ll. 18-21).

Claim(s) 11 and 13 is/are rejected under 35 U.S.C. 103 as being unpatentable over Verburgh, as modified by Konieczny, as applied to claim 1 above, and further in view of Barone and Varekamp.

Re claim 11: (Currently Amended) However, Verburgh fails to specifically teach the features of the method of claim 1, wherein the cost function for a depth value is dependent on a type of the depth value, wherein the type of depth values is selected from the group consisting of a depth value of the first depth map[[;]], a depth value of the first depth map closer than a distance threshold[[;]], a depth value of the first depth map farther away than a distance threshold[[;]], a depth value from a second depth map of a temporal sequence of depth maps[[;]], a depth value having a scene independent depth value offset relative to a depth value of the first depth value[[;]], a depth value independent of a scene is represented by the first depth map[[;]], a depth value determined based on an offset of a depth value for the at least one first pixel, wherein the temporal sequence of depth maps comprises the first depth map. However, this is well known in the art as evidenced by Barone. Similar to the primary reference, Barone discloses a cost function with a cost gradient increasing or decreasing based on a threshold level (same field of endeavor or reasonably pertinent to the problem). Barone discloses wherein the cost function for a depth value is dependent on a type of the depth value (e.g.
the cost function is dependent on a disparity value, which is taught in ¶ [124]-[126] above.), wherein the type of depth values is selected from the group consisting of a depth value of the first depth map (e.g. the depth maps contain depth values that are associated with candidate pixels selected, which is taught in ¶ [43] and [54]. The disparity is considered as the value used in association with candidates of a tile, which is taught in ¶ [121]-[123].),

[0043] In the embodiment considered, the processing unit 20 processes the images IMG and generates a depth map image DMI. For example, the processing unit 20 may process the images IMG acquired from the cameras 10 and provide a depth map, wherein each pixel of the depth map is identified by a depth value. For example, such a depth map may be considered as being a grayscale image, wherein the darkest value is the furthest while the lightest value is the closest (or vice versa).

[0054] Thus, during the pre-matching step 2010 a subset of candidate pixels is selected, and the matching step 2012 is performed for the candidate pixels. This means that holes may be created, but these may be filled later on during a refinement phase. Thus, the pre-matching step 2010 is optional insofar as the subsequent matching step 2012 could also be performed on all pixels of the reference image.

[0056] Thus, generally, at least some disclosed methods for generating a depth map DMI from a plurality of images IMG have in common that a plurality of reference pixels are selected in a first image, e.g., either all pixels or only a subset of pixels. Next, with each reference pixel is associated a respective pixel in the second image and the disparity between each reference pixel and the respective pixel in the second image is determined. Finally, a depth value may be calculated for each reference pixel as a function of the respective disparity.

[0121] In the embodiment considered, a disparity histogram is built for each tile TIL only with the disparity values of the candidates belonging to the respective tile.

[0122] For example, FIG. 10a schematically shows the tile TIL of FIG. 9, which comprises, e.g., a 10×10 block of pixels. Specifically, with each pixel PX of the tile TIL is associated a number of candidate pixels, and as mentioned in the foregoing, the number of candidate pixels could be fixed or variable. Moreover, with each pixel pair is associated a disparity or depth value d.

[0123] These disparity or depth values d for each pixel PX are used to build a histogram of disparity. Specifically, the histogram shows the occurrences of the disparity or depth value d for all pixels in the tile TIL. For example, in case 4 candidate pixels would be associated with each pixel PX of the 10×10 tile TIL, a total of 400 values would be distributed in the histogram. Thus, the histogram of disparity per tile facilitates discovering the outliers. In fact, even when a matching is very good from a similarity point of view, the association might not be the correct one. In fact, as shown in FIG. 10b, it is possible to recognize as outliers the candidates with low occurrences in the histogram.
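Barone's ¶ [0121]-[0123] outlier test is essentially a vote count. A minimal sketch (the tile size, disparity range, and minimum-count threshold are illustrative assumptions, not values from the reference):

    import numpy as np

    def tile_outlier_mask(cands: np.ndarray, max_disp: int = 64,
                          min_count: int = 5) -> np.ndarray:
        """cands: integer disparities for one tile, shape (tile_h, tile_w, n_cand).
        Builds the per-tile disparity histogram and flags candidates whose
        disparity occurs rarely within the tile as likely outliers."""
        hist = np.bincount(cands.ravel(), minlength=max_disp + 1)
        return hist[cands] < min_count    # True where a candidate is suspect

For a 10×10 tile with 4 candidates per pixel, 400 disparity votes populate the histogram, so even a candidate that matches well by similarity is discarded if its disparity is rare within the tile.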
a depth value of the first depth map closer than a distance threshold (e.g. the disparity or depth value can be below or above a certain distance threshold in order to determine a specific cost function value, which is taught in ¶ [124]-[142] above.), a depth value of the first depth map farther away than a distance threshold (e.g. the disparity or depth value can be below or above a certain distance threshold in order to determine a specific cost function value, which is taught in ¶ [124]-[142] above.).

Therefore, in view of Barone, it would have been obvious to one of ordinary skill at the time the invention was made to have the feature of wherein the cost function for a depth value is dependent on a type of the depth value, wherein the type of depth values is selected from the group consisting of a depth value of the first depth map, a depth value of the first depth map closer than a distance threshold, a depth value of the first depth map farther away than a distance threshold, incorporated in the device of Verburgh, in order to determine whether a pixel with a disparity or depth value is associated with a candidate pixel or removed as an outlier, which aids in final selection of candidate pixels and removal of outliers (as stated in Barone ‘586 ¶ [114]).

However, the combination above fails to specifically teach the features of a depth value from a second depth map of a temporal sequence of depth maps, a depth value having a scene independent depth value offset relative to a depth value of the first depth value, a depth value independent of a scene is represented by the first depth map, a depth value determined based on an offset of a depth value for the at least one first pixel, wherein the temporal sequence of depth maps comprises the first depth map. However, this is well known in the art as evidenced by Varekamp. Similar to the primary reference, Varekamp discloses processing depth values (same field of endeavor or reasonably pertinent to the problem). Varekamp discloses a depth value from a second depth map of a temporal sequence of depth maps (e.g. a second depth map contains depth values that are used or processed for further depth map processing, which is taught in page 17, ll. 27-page 18, ll. 20.),

(51) The second processor 207 receives the first processed depth map and generates a second processed depth map by processing pixels in the first processed depth map in a top to bottom direction. The second processor 207 may thus perform a similar iterative or sequential process but does so in the opposite direction of processing by the first processor 205. The first processor 205 may thus be seen as performing a first pass, with the second processor 207 performing a second pass seeking to address potential errors or artefacts that may have resulted from the first pass.

(52) The second processor 207 is arranged to process the first processed depth map by setting the depth map value of the second processed depth map as either the corresponding pixel in the first processed depth map or as a depth value for a pixel of the first processed depth map which is above the current pixel. Thus, the depth value in the second processed depth map may be set as the depth value for the corresponding depth value for the pixel in the first processed depth map or it may be set to a depth value in the first processed depth map of a pixel above the current pixel. The selection of which depth value to use is based on a depth step criterion which considers both the input depth map and the first processed depth map.

(53) In more detail, for a given pixel, referred to as the second pixel, the second processor 207 determines a depth value D.sub.i,j″ in the second processed depth map. One candidate for this depth value D.sub.i,j″ is the corresponding depth value D.sub.i,j′ in the first processed depth map.
This depth value may be referred to as the first candidate depth value D.sub.i,j′ and if this is selected it corresponds to the depth value for the second pixel not being changed.

(54) Alternatively, the depth value D.sub.i,j″ in the second processed depth map may be set to the depth value of the first processed map for a third pixel which is above the second pixel. This depth value may be referred to as a second candidate depth value D.sub.i−1,k′. In many embodiments, the third pixel/the second candidate depth value may be selected from a, typically small, set of pixels above the second pixel. For example, the second candidate depth value D.sub.i−1,k′ may be selected from a set of pixels with k in the interval of [j−N;j+N], where N is typically a small number, e.g. N may be in the interval from 1 to 7.

a depth value having a scene independent depth value offset relative to a depth value of the first depth value (e.g. a depth value that is offset from a first depth map is processed by the depth offset processor, which is taught in page 24, ll. 1-page 25, ll. 16.),

(85) In some embodiments, the depth processing apparatus of FIG. 2 may further comprise a depth offset processor 211 which may process a depth map that has been processed by the first processor 205 and optionally also by the second processor 207 and/or the third processor 209 to generate an offset depth map in which some depth values may be offset to restore some of the forward depth step that has been reduced by the previous processing. Thus, the depth map processed by the depth offset processor 211 may be an already processed depth map which is derived from the first processed depth map, i.e. the processing by the first processor 205 has been part of the process resulting in the depth map. Accordingly, the already processed depth map has been derived from the first processed depth map, e.g. by the processing by the second processor 207 and/or the third processor 209. It will be appreciated that the derived depth map includes the straightforward derivation of simply using the first processed depth map without changing this, i.e. the already processed depth map being fed to the depth offset processor 211 may be the first processed depth map itself in some embodiments.

(86) An example of the approach of the depth offset processor 211 may be elucidated with reference to FIG. 7 which first shows a possible depth profile D.sub.i,j for a column in the input depth map (with increasing y values corresponding to higher rows in the image, i.e. y increases towards the top). The next profile shows the resulting depth profile D.sub.i,j′ of the first processed depth map generated by the first processor 205. As can be seen, no forwards depth steps for increasing y values are allowed and thus the depth values are decreased where such depth steps occur.

(87) The lowest chart of FIG. 7 illustrates a possible resulting depth profile D.sub.i,j″ after an exemplary depth processing by the depth offset processor 211. In the example, some of the forward depth steps have been re-introduced. Specifically, in the example, the original depth values, and thus depth steps, have been reinstated subject to a maximum depth step value D.sub.maxstep. Specifically, the depth offset processor 211 may perform the following operation:

(88) TABLE-US-00003

    1. for row i ← 1 ... N.sub.rows
    2.   for column j ← 1 ... N.sub.cols−1
    3.     D″.sub.i,j ← D′.sub.i,j + min(D.sub.i,j − D′.sub.i,j, D.sub.maxstep)

where D.sub.i,j is the depth value of the original input depth map and D.sub.i,j′ is the result of processing by the first processor 205 (and optionally the second processor 207 and third processor 209). D.sub.i,j″ is the final output of the depth offset processor 211. Parameter D.sub.maxstep is the maximum allowed depth difference (typically set to e.g. 5 for a depth map with dynamic range of 0 . . . 255).

(89) Thus, in this approach, some, typically small, forward depth steps are allowed/reintroduced. This approach may in many scenarios result in improved perceived quality. For example, it may introduce some depth texture and variation in image objects (thereby e.g. preventing or mitigating a flat/cardboard look of e.g. the moving person) while still ensuring that these are perceived to not stand-out/float over e.g. the ground.

(90) The depth offset processor 211 may accordingly offset the depth values of the received depth map (which may e.g. be received from any of the first processor 205, the second processor 207, or the third processor 209 depending on the specific embodiment). The offset for a given pixel may be determined based on the difference between the depth value for this pixel in respectively the received processed depth map and the original input depth map.

a depth value independent of a scene is represented by the first depth map (e.g. a first depth map is considered along with other depth maps offset from the first depth map, which is taught in page 24, ll. 1-page 25, ll. 16 above.), a depth value determined based on an offset of a depth value for the at least one first pixel (e.g. an offset of a depth value is used for determining another depth value processed by the depth offset processor, which is taught in page 24, ll. 1-page 25, ll. 16 above.), wherein the temporal sequence of depth maps comprises the first depth map (e.g. a sequence of multiple depth maps can be developed, and the multiple depth maps include the first depth map, which is taught in page 24, ll. 1-page 25, ll. 16 above.).

Therefore, in view of Varekamp, it would have been obvious to one of ordinary skill at the time the invention was made to have the feature of a depth value from a second depth map of a temporal sequence of depth maps, a depth value having a scene independent depth value offset relative to a depth value of the first depth value, a depth value independent of a scene is represented by the first depth map, a depth value determined based on an offset of a depth value for the at least one first pixel, wherein the temporal sequence of depth maps comprises the first depth map, incorporated in the device of Verburgh, in order to provide a low complexity and low resource demanding approach for selecting and determining depth values using an offset, which can improve depth map consistency and accuracy (as stated in Varekamp page 4, ll. 18-21).
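Claim 11 ties the cost function to the provenance "type" of a candidate depth value. A minimal sketch of what such type-dependent weighting could look like (the categories and weight values are hypothetical, chosen only to illustrate the claim structure; neither Barone nor Varekamp discloses these numbers):

    def weighted_cost(base_cost: float, kind: str,
                      distance: float = 0.0, threshold: float = 8.0) -> float:
        """Scale a candidate's matching cost by where the candidate came from."""
        if kind == "first_map":
            # First-map candidates: cheaper when closer than the threshold.
            weight = 1.0 if distance < threshold else 1.5
        elif kind == "previous_map":   # from an earlier map in the sequence
            weight = 2.0
        elif kind == "offset":         # scene-independent offset of the pixel's value
            weight = 2.5
        else:
            raise ValueError(f"unknown candidate type: {kind}")
        return weight * base_cost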
Re claim 13: (Currently Amended) However, Verburgh fails to specifically teach the features of the method of claim 1, wherein the plurality of first candidate depth values for a third direction from the at least one first pixel does not comprise a third candidate depth value, wherein the third candidate depth values is for at least one third pixel along the third direction, wherein the cost function is less than the cost function for the third candidate depth value, wherein a distance from the at least one first pixel to the third candidate depth value is larger than a distance from the at least one first pixel to the third pixel. However, this is well known in the art as evidenced by Barone. Similar to the primary reference, Barone discloses a cost function with a cost gradient increasing or decreasing based on a threshold level (same field of endeavor or reasonably pertinent to the problem). Barone discloses wherein the third candidate depth values is for at least one third pixel along the third direction, wherein the cost function is less than the cost function for the third candidate depth value (e.g. the invention discloses several candidate pixels associated with disparity or depth values that are used for specific cost function values. The candidate depth value (d=7) has a cost function value of 25, which is greater than the cost function value of 15 for the third pixel pair, which can be considered as the third pixel in a third direction. This is taught in ¶ [124]-[126] above.).

Therefore, in view of Barone, it would have been obvious to one of ordinary skill at the time the invention was made to have the feature of wherein the third candidate depth values is for at least one third pixel along the third direction, wherein the cost function is less than the cost function for the third candidate depth value, incorporated in the device of Verburgh, in order to determine whether a pixel with a disparity or depth value is associated with a candidate pixel or removed as an outlier, which aids in final selection of candidate pixels and removal of outliers (as stated in Barone ‘586 ¶ [114]).

However, the combination above fails to specifically teach the features of wherein the plurality of first candidate depth values for a third direction from the at least one first pixel does not comprise a third candidate depth value, wherein a distance from the at least one first pixel to the third candidate depth value is larger than a distance from the at least one first pixel to the third pixel. However, this is well known in the art as evidenced by Varekamp. Similar to the primary reference, Varekamp discloses processing depth values (same field of endeavor or reasonably pertinent to the problem). Varekamp discloses wherein the plurality of first candidate depth values for a third direction from the at least one first pixel does not comprise a third candidate depth value (e.g. the first candidate values of the first pixel in a top to bottom direction in a third direction can be considered to not comprise a third candidate depth value, which is taught in page 17, ll. 27-page 18, ll. 20 above.), wherein a distance from the at least one first pixel to the third candidate depth value is larger than a distance from the at least one first pixel to the third pixel (e.g. a distance between a third candidate depth value associated with a third depth map can be greater than a distance between the first pixel and a third pixel within the same depth map, which is described in page 17, ll. 27-page 18, ll. 20 above and page 21, ll. 19-page 22, ll. 23).
(71) The third processor 209 is arranged to process the second processed depth map by setting the depth map value of the third processed depth map as either the corresponding pixel in the second processed depth map or as a depth value for a pixel of the second processed depth map which is below the current pixel. Thus, the depth value in the third processed depth map may be set as the depth value for the corresponding depth value for the pixel in the second processed depth map or to a depth value in the second processed depth map of a pixel above the current pixel. The selection of which depth value to use is again based on a depth step criterion which considers both the input depth map and the first processed depth map.

(72) In more detail, for a given pixel, referred to as the fourth pixel, the third processor 209 determines a depth value D.sub.i,j′″ in the third processed depth map. One candidate for this depth value D.sub.i,j′″ is the corresponding depth value D.sub.i,j″ in the second processed depth map. This depth value may be referred to as the third candidate depth value D.sub.i,j″.

(73) Alternatively, the depth value D.sub.i,j′″ in the third processed depth map may be set to the depth value of the second processed map for a fifth pixel which is below the second pixel. This depth value may be referred to as a fourth candidate depth value D.sub.i+1,k″. In many embodiments, the fifth pixel/the fourth candidate depth value may be selected from a, typically small, set of pixels above the fourth pixel. For example, the fourth candidate depth value D.sub.i+1,k″ may be selected from a set of pixels with k in the interval of [j−N;j+N] where N is typically a small number, e.g. N may be in the interval from 1 to 7.

(74) The approach by the third processor 209 thus processes the second processed depth map in a bottom to top direction and either maintains the depth value of the second processed depth map (D.sub.i,j′″←D.sub.i,j″) or allows the depth value to be propagating up from pixels below the current pixels (D.sub.i,j′″←D.sub.i+1,k″).

(75) The choice of whether to maintain the depth value or to replace it is as previously mentioned based on a depth step criterion which is dependent on depth steps both in the input depth map and in the second processed depth map.

(76) Specifically, the depth step criterion may require that the absolute value of a depth step in the input depth map is not above a first threshold and that a backwards depth step in the second processed image is not below a second threshold.

(77) In such an embodiment, the second processed depth map may accordingly be substituted by a depth value from a pixel above if the third processor 209 detects that there is only a small depth step in the input depth map but that there is a large backwards depth step in the second processed depth map (when moving to the current pixel). In this case, the large depth step in the second processed depth map may indicate that the first processor 205 has introduced a depth transition which is not there in the original input depth map. Accordingly, the third processor 209 allows the further forward depth value to propagate from the fifth pixel to the current pixel (the fourth pixel).
Therefore, in view of Varekamp, it would have been obvious to one of ordinary skill at the time the invention was made to have the feature of further comprising processing a plurality of pixels of the first depth map by iteratively selecting at least one third pixel, wherein the at least one third pixel is a portion of the at least one first pixel, incorporated in the device of Verburgh, in order to provide a low complexity and low resource demanding approach for selecting and determining depth values using an offset, which can improve depth map consistency and accuracy (as stated in Varekamp page 4, ll. 18-21).

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Redert discloses creating a depth map.

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHAD S DICKERSON whose telephone number is (571)270-1351. The examiner can normally be reached Monday-Friday 10AM-6PM EST. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abderrahim Merouan, can be reached at 571-270-5254. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/CHAD DICKERSON/
Primary Examiner, Art Unit 2682

Prosecution Timeline

May 26, 2023
Application Filed
Sep 20, 2025
Non-Final Rejection — §101, §103
Dec 09, 2025
Response Filed
Mar 07, 2026
Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602908
INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND RECORDING MEDIUM
2y 5m to grant Granted Apr 14, 2026
Patent 12603960
IMAGE ANALYSIS APPARATUS, IMAGE ANALYSIS SYSTEM, IMAGE ANALYSIS METHOD, PROGRAM, AND NON-TRANSITORY COMPUTER READABLE RECORDING MEDIUM COMPRISING READING A PRINTED MATTER, ANALYZING CONTENT RELATED TO READING OF THE PRINTED MATTER AND ACQUIRING SUPPORT INFORMATION BASED ON AN ANALYSIS RESULT OF THE CONTENT FOR DISPLAY TO ASSIST A USER IN FURTHER READING OPERATIONS
2y 5m to grant Granted Apr 14, 2026
Patent 12579817
Vehicle Control Device and Control Method Thereof for Camera View Control Based on Surrounding Environment Information
2y 5m to grant Granted Mar 17, 2026
Patent 12522110
APPARATUS AND METHOD OF CONTROLLING THE SAME COMPRISING A CAMERA AND RADAR DETECTION OF A VEHICLE INTERIOR TO REDUCE A MISSED OR FALSE DETECTION REGARDING REAR SEAT OCCUPATION
2y 5m to grant Granted Jan 13, 2026
Patent 12519896
IMAGE READING DEVICE COMPRISING A LENS ARRAY INCLUDING FIRST LENS BODIES AND SECOND LENS BODIES, A LIGHT RECEIVER AND LIGHT BLOCKING PLATES THAT ARE BETWEEN THE LIGHT RECEIVER AND SECOND LENS BODIES, THE THICKNESS OF THE LIGHT BLOCKING PLATES EQUAL TO OR GREATER THAN THE SECOND LENS BODIES THICKNESS
2y 5m to grant Granted Jan 06, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

3-4
Expected OA Rounds
63%
Grant Probability
86%
With Interview (+23.0%)
2y 9m
Median Time to Grant
Moderate
PTA Risk
Based on 600 resolved cases by this examiner. Grant probability derived from career allow rate.
