Conversation

@kumarUjjawal (Contributor) commented Jan 21, 2026

Which issue does this PR close?

Rationale for this change

SELECT ROUND('999.9'::DECIMAL(4,1)) incorrectly returned 100.0 instead of 1000.

When rounding a decimal value causes a carry-over that increases the number of digits (e.g., 999.9 → 1000.0), the result overflows the original precision constraint. The overflow was silently truncated during display, producing incorrect results.

What changes are included in this PR?

  • Fixes ROUND on DECIMAL so carry-over doesn't silently truncate or produce wrong results.
  • When decimal_places is a constant (including omitted/NULL), ROUND now reduces the output scale to min(input_scale, max(decimal_places, 0)) (Spark/DuckDB-style), reclaiming precision for the integer part.
  • When decimal_places is not a constant (e.g. a column/array), ROUND keeps the original scale and may increase precision by 1 (capped); if precision is already at the maximum and the rounded value can't fit, it returns an overflow error instead of a wrong value.
  • Adds/updates sqllogictests for these behaviors and edge cases.
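The two return-type rules above can be sketched as follows. This is a minimal illustration with made-up function names, not DataFusion's actual implementation; `MAX_PRECISION` assumes Decimal128's limit of 38.

```rust
const MAX_PRECISION: u8 = 38; // Decimal128's maximum precision

// Constant dp: reduce the output scale to min(input_scale, max(dp, 0)),
// keeping precision, so the digits freed from the fractional part can
// absorb a carry-over (e.g. DECIMAL(4,1) -> DECIMAL(4,0)).
fn constant_dp_output_type(precision: u8, scale: i8, dp: i32) -> (u8, i8) {
    let new_scale = (scale as i32).min(dp.max(0)) as i8;
    (precision, new_scale)
}

// Non-constant dp (column/array): keep the scale and bump precision by 1,
// capped at the maximum; past the cap, overflow must error at runtime.
fn column_dp_output_type(precision: u8, scale: i8) -> (u8, i8) {
    ((precision + 1).min(MAX_PRECISION), scale)
}
```

For example, `constant_dp_output_type(4, 1, 0)` yields `(4, 0)`, matching the `Decimal128(4, 0)` return type this PR produces for `ROUND('999.9'::DECIMAL(4,1))`.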

Are these changes tested?

  • Added new sqllogictest case for the specific bug scenario
  • Updated 2 existing tests with new expected precision values
  • All existing tests pass

Are there any user-facing changes?

| Aspect | Before | After |
|---|---|---|
| `ROUND('999.9'::DECIMAL(4,1))` | 100.0 (wrong) | 1000 (correct) |
| Return type (default/literal dp) | `Decimal128(4, 1)` | `Decimal128(4, 0)` |
  • Return type for DECIMAL inputs can change: with literal dp it generally reduces scale; with non-literal dp it keeps scale and may increase precision by 1.
  • New error case: when precision is already max and dp is non-literal, ROUND may now error on overflow rather than return a truncated/wrong decimal.

@github-actions bot added the sqllogictest (SQL Logic Tests (.slt)) and functions (Changes to functions implementation) labels Jan 21, 2026
@kumarUjjawal (Contributor Author) commented:
Other options considered:

  1. Increase output precision by 1 (PostgreSQL behavior): The return_type should increase precision by 1 to accommodate potential carry-over from rounding. For DECIMAL(p, s) with decimal_places=0, return DECIMAL(p+1, s).

  2. Return error on overflow: Keep same precision but return an error when the rounded value exceeds the precision (current code uses mul_checked but the overflow happens silently in display).

  3. Dynamic precision based on scale reduction: When reducing scale (e.g., from 1 to 0), calculate the minimum precision needed.

I chose option 1 as the most user-friendly; I would like to hear if there are other suggestions for this fix.

@Jefffrey (Contributor) commented:
> Other options considered:
>
> 1. Increase output precision by 1 (PostgreSQL behavior): The `return_type` should increase precision by 1 to accommodate potential carry-over from rounding. For `DECIMAL(p, s)` with `decimal_places=0`, return `DECIMAL(p+1, s)`.
>
> 2. Return error on overflow: Keep same precision but return an error when the rounded value exceeds the precision (current code uses `mul_checked` but the overflow happens silently in display).
>
> 3. Dynamic precision based on scale reduction: When reducing scale (e.g., from 1 to 0), calculate the minimum precision needed.
>
> I chose option 1 as the most user-friendly; I would like to hear if there are other suggestions for this fix.

Option 1 sounds good. Option 2 would be too strict, and I'm not sure I fully understand what option 3 means here.

One thing to keep in mind is that we still need to consider the edge case where precision is already maxed out. Would this bug occur again?

@kumarUjjawal (Contributor Author) commented:
> Option 1 sounds good. Option 2 would be too strict, not sure I fully understand what option 3 means here.
>
> One thing to keep in mind is we still need to consider edge case where precision is already maxed out. Would this bug occur again?

You were right to point that out. I did some more tests and found it was overflowing after 38 digits; I handled that as an error.


```rust
// Validate the result fits within the precision
// The max value for a given precision is 10^precision - 1
let max_value = ten.pow_checked(precision as u32).map_err(|_| {
```
A Contributor commented on this hunk:
```
round('1234.5678'::decimal(50,4), 2);
----
Decimal256(50, 4) 1234.57
Decimal256(51, 4) 1234.57
```
@martin-g (Member) commented Jan 22, 2026:
I was going to ask whether the increase of the precision could be done only when really needed, so I went and checked what the other DBs do:

1. Postgres v18

```
postgres=# SELECT pg_typeof(round(999.9::DECIMAL(4,1))), round(999.9::DECIMAL(4,1));
 pg_typeof | round
-----------+-------
 numeric   |  1000
(1 row)
```

(Not helpful)

2. Apache Spark (v4.0.1)

```
spark-sql (default)> SELECT typeof(round(999.9::DECIMAL(4,1))), round(999.9::DECIMAL(4,1));
decimal(4,0)    1000
Time taken: 0.032 seconds, Fetched 1 row(s)
```

Reduced the scale!

3. DuckDB

```
SELECT typeof(round(999.9::DECIMAL(4,1))), round(999.9::DECIMAL(4,1));
┌────────────────────────────────────────────┬────────────────────────────────────┐
│ typeof(round(CAST(999.9 AS DECIMAL(4,1)))) │ round(CAST(999.9 AS DECIMAL(4,1))) │
│                  varchar                   │            decimal(4,0)            │
├────────────────────────────────────────────┼────────────────────────────────────┤
│ DECIMAL(4,0)                               │                1000                │
└────────────────────────────────────────────┴────────────────────────────────────┘
```

Reduced the scale!

@kumarUjjawal (Contributor Author) replied:

This is really nice, thank you.

@kumarUjjawal (Contributor Author) commented:

| Case | Input | Result | Type |
|---|---|---|---|
| dp column | 123.4567, dp=1 | 123.5000 | DECIMAL(10,4) |
| carry-over | 999.9999, dp=0 | 1000.0000 | DECIMAL(10,4) |
| constant dp | 999.9 | 1000 | DECIMAL(4,0) |
| scale reduction | 1234.5678, dp=2 | 1234.57 | DECIMAL(50,2) |
| negative dp | 12345.67, dp=-2 | 12300 | DECIMAL(7,0) |
| dp > scale | 123.4, dp=5 | 123.4 | DECIMAL(4,1) |
| negative values | -999.9, dp=0 | -1000 | DECIMAL(4,0) |
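The carry-over cases above come from rounding the decimal's raw integer representation (999.9 at scale 1 is stored as the integer 9999). A minimal half-up sketch of that arithmetic, not the PR's actual code:

```rust
// Half-up rounding of a raw Decimal128 value by dropping `scale_reduction`
// fractional digits: add half the divisor (subtract for negatives, since
// Rust's integer division truncates toward zero), then divide.
fn round_raw_half_up(raw: i128, scale_reduction: u32) -> i128 {
    let divisor = 10i128.pow(scale_reduction);
    let half = divisor / 2;
    if raw >= 0 {
        (raw + half) / divisor
    } else {
        (raw - half) / divisor
    }
}
```

For instance, reducing 9999 (i.e. 999.9 at scale 1) by one digit gives 1000, the carry-over that originally overflowed `DECIMAL(4,1)`.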

```rust
// Get decimal_places from scalar_arguments
// If dp is not a constant scalar, we must keep the original scale because
// we can't determine a single output scale for varying per-row dp values.
let (decimal_places, dp_is_scalar): (i32, bool) =
```
A Member commented:
Here `decimal_places` is initialized as `i32`, but below it is cast to `i8` with `as i8`, and this may lead to problems. A validation is needed that it is not bigger than `i8::MAX`.
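A minimal sketch of the suggested validation, using `i8::try_from` instead of `as i8`; the `String` error type is purely for illustration:

```rust
// Safely narrow the i32 decimal_places to i8, erroring for values
// outside i8's range instead of silently wrapping as `as i8` would.
fn dp_to_i8(decimal_places: i32) -> Result<i8, String> {
    i8::try_from(decimal_places)
        .map_err(|_| format!("decimal_places {decimal_places} out of range for i8"))
}
```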

```rust
Ok(Arc::new(Field::new(
    self.name(),
    return_type,
    input_field.is_nullable(),
```
A Member commented:

This should also take into account the decimal_places arg.

Postgres:

```
postgres=# SELECT pg_typeof(round(999.9::DECIMAL(4,1))), round(999.9::DECIMAL(4,1), NULL);
 pg_typeof | round
-----------+-------
 numeric   |
(1 row)
```

Apache Spark:

```
spark-sql (default)> SELECT typeof(round(999.9::DECIMAL(4,1))), round(999.9::DECIMAL(4,1), NULL);
decimal(4,0)    NULL
Time taken: 0.055 seconds, Fetched 1 row(s)
```

DuckDB:

```
D SELECT typeof(round(999.9::DECIMAL(4,1))), round(999.9::DECIMAL(4,1), NULL);
┌────────────────────────────────────────────┬──────────────────────────────────────────┐
│ typeof(round(CAST(999.9 AS DECIMAL(4,1)))) │ round(CAST(999.9 AS DECIMAL(4,1)), NULL) │
│                  varchar                   │                  int32                   │
├────────────────────────────────────────────┼──────────────────────────────────────────┤
│ DECIMAL(4,0)                               │                   NULL                   │
└────────────────────────────────────────────┴──────────────────────────────────────────┘
```

```rust
if args.scalar_arguments.len() > 1 {
    match args.scalar_arguments[1] {
        Some(ScalarValue::Int32(Some(v))) => (*v, true),
        Some(ScalarValue::Int64(Some(v))) => (*v as i32, true),
```
A Member commented:

Better to use a safe cast and return an `Err` if the number is bigger than `i32::MAX` (or even `i8::MAX`).

```rust
    decimal_places: Option<ArrayRef>,
) -> Result<ArrayRef, DataFusionError> {
    let number_rows = value.len();
    let return_type = value.data_type().clone();
```
@martin-g (Member) commented Jan 23, 2026:

This could be wrong for decimals whose scale is reduced. No need to fix/improve it; just a comment acknowledging that is enough.

@Jefffrey (Contributor) left a review:

I plan to take a look at this again soon; in the meantime could we update the PR body since it seems outdated compared to the recent changes?

```rust
None => (0, true),        // No dp argument means default to 0
Some(None) => (0, false), // dp is a column
Some(Some(ScalarValue::Int32(Some(v)))) => (*v, true),
Some(Some(ScalarValue::Int64(Some(v)))) => {
```
A Contributor commented:

I don't think we should have `i64` handling code, since the signature expects only `i32` for the 2nd argument.

@kumarUjjawal (Contributor Author) replied:
The Int64 handling isn’t “accepting Int64 dp” at runtime; it’s handling how literals are represented during planning.

A Contributor replied:

I'm not sure I follow; does that mean scalar_arguments types don't have type coercion applied to them?

@kumarUjjawal (Contributor Author) replied:

Yes: `ReturnFieldArgs.scalar_arguments` are taken directly from the original `Expr::Literal` values, so they can be `ScalarValue::Int64` even though the function signature/coercion will ensure the argument is evaluated as `Int32` at runtime.
Type coercion is applied when building the physical expression / evaluating the arguments, but `scalar_arguments` is just "what literal was written in the query", not the coerced value. That's why we handle `Int64` there: to (a) treat it as a constant dp for return-type computation and (b) produce a clear out-of-range error for literals that can't fit in `i32`.
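The planning-time narrowing described here can be sketched as follows; the function name and `String` error type are illustrative, not the PR's actual code:

```rust
// Narrow an Int64 literal from scalar_arguments to i32 with try_from,
// so a literal that cannot fit in i32 fails with a clear error during
// planning instead of wrapping silently with `as i32`.
fn dp_from_i64_literal(v: i64) -> Result<i32, String> {
    i32::try_from(v)
        .map_err(|_| format!("decimal_places literal {v} does not fit in i32"))
}
```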

@kumarUjjawal (Contributor Author) commented:
> I plan to take a look at this again soon; in the meantime could we update the PR body since it seems outdated compared to the recent changes?

I will update it. Thanks!



Successfully merging this pull request may close these issues:

  • round decimal incorrect result