Conversation

@till-m till-m commented Nov 18, 2025

This introduces a user-facing .predict() method to the optimizer to simplify making predictions with the GP. I think that with the support for optimization over non-float parameters, calling .predict() on the GP now requires a bit too much parameter transformation and mangling. Rather than forcing users to handle this complexity themselves, this change provides an interface that handles all necessary conversions internally.
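To make the shape semantics concrete, here is a minimal self-contained sketch of the dict-vs-iterable normalization the new method is described as performing (a single dict maps to one row and scalar outputs; an iterable of dicts maps to a 2D array and 1D outputs). The helper name and internals are illustrative stand-ins, not the library's actual code:

```python
import numpy as np

def normalize_params(params, keys):
    """Normalize predict()-style input to a 2D array plus a single-point flag."""
    if isinstance(params, dict):
        # One point -> one row; outputs should later be squeezed to scalars.
        return np.array([[params[k] for k in keys]]), True
    # Exclude strings so they are not treated as iterables of characters.
    if isinstance(params, str) or not hasattr(params, "__iter__"):
        raise TypeError("params must be a dict or an iterable of dicts")
    # Many points -> one row each; outputs stay 1D arrays.
    return np.array([[p[k] for k in keys] for p in params]), False

X, single = normalize_params({"x": 1.0, "y": 2.0}, ["x", "y"])
batch, _ = normalize_params([{"x": 0.0, "y": 0.0}, {"x": 1.0, "y": 2.0}], ["x", "y"])
# X has shape (1, 2); batch has shape (2, 2)
```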

Summary by CodeRabbit

  • New Features

    • Added a predict() method to generate predictions for specified parameters with optional standard-deviation or covariance outputs and automatic GP fitting.
  • Documentation

    • Updated maximize() docs to reference the new predict() workflow with automatic GP fitting.
  • Tests

    • Added test helpers for perimeter-constrained triangle sampling and area calculation, and expanded tests covering prediction behavior across parameter types.


codecov bot commented Nov 18, 2025

Codecov Report

❌ Patch coverage is 93.93939% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 97.78%. Comparing base (c410d51) to head (b35705e).
⚠️ Report is 2 commits behind head on master.

Files with missing lines             Patch %    Lines
bayes_opt/bayesian_optimization.py   93.93%     2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #593      +/-   ##
==========================================
- Coverage   97.89%   97.78%   -0.11%     
==========================================
  Files          10       10              
  Lines        1185     1218      +33     
==========================================
+ Hits         1160     1191      +31     
- Misses         25       27       +2     


till-m commented Nov 19, 2025

@coderabbitai full review


coderabbitai bot commented Nov 19, 2025

✅ Actions performed

Full review triggered.


coderabbitai bot commented Nov 19, 2025

📝 Walkthrough

Walkthrough

Added a public BayesianOptimization.predict(...) method that accepts single or multiple parameter inputs, optionally fits the internal GP, and returns mean predictions with optional std or covariance. Tests added a FixedPerimeterTriangleParameter and an area_of_triangle helper for perimeter-constrained triangle sampling and evaluation.

Changes

Cohort / File(s) Summary
Core Feature
bayes_opt/bayesian_optimization.py
Added predict(params, return_std=False, return_cov=False, fit_gp=True) to accept a dict or iterable of dicts, normalize inputs to a 2D array, optionally fit the GP via _fit_gp() (raises RuntimeError if no observations when fitting), and return mean with optional std or covariance while ensuring consistent output shapes for single vs multiple inputs. Updated maximize() docs to reference predict() with fit_gp=True.
Tests & Utilities
tests/test_bayesian_optimization.py
Added FixedPerimeterTriangleParameter (subclass of BayesParameter) for Dirichlet-based sampling constrained to a fixed perimeter, and area_of_triangle(sides) using Heron’s formula. Expanded tests to exercise prediction behavior across scalar/vector, integer/float, and categorical parameter types.
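The two helpers can be sketched as follows. The bodies here are a hedged reconstruction from this summary, not the actual test code; note this sketch uses the standard semiperimeter form of Heron's formula:

```python
import numpy as np

def sample_fixed_perimeter_triangles(perimeter, n, rng):
    # Dirichlet draws sum to 1, so scaling by `perimeter` fixes a + b + c.
    return rng.dirichlet(np.ones(3), size=n) * perimeter

def area_of_triangle(sides):
    # Heron's formula with the SEMIperimeter s = (a + b + c) / 2.
    a, b, c = np.moveaxis(np.asarray(sides, dtype=float), -1, 0)
    s = 0.5 * (a + b + c)
    return np.sqrt(s * (s - a) * (s - b) * (s - c))

rng = np.random.default_rng(42)
sides = sample_fixed_perimeter_triangles(perimeter=1.0, n=5, rng=rng)
print(sides.sum(axis=-1))            # each row sums to 1.0
print(area_of_triangle([3, 4, 5]))   # → 6.0
```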

Sequence Diagram

sequenceDiagram
    participant User
    participant BO as BayesianOptimization
    participant GP as GaussianProcess

    User->>BO: predict(params, return_std?, return_cov?, fit_gp?)
    alt fit_gp == True
        BO->>BO: check if observations exist
        alt no observations
            BO-->>User: RuntimeError (no observations to fit GP)
        else observations exist
            BO->>GP: _fit_gp() (train/fit)
            GP-->>BO: fitted
        end
    end
    BO->>BO: normalize params -> 2D array
    BO->>GP: predict(mean, and std/cov if requested)
    GP-->>BO: prediction results
    BO->>BO: adjust return shapes (scalar/1D/2D as needed)
    alt return_std or return_cov
        BO-->>User: (mean, uncertainty)
    else
        BO-->>User: mean
    end
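The guard logic in the diagram can be sketched as plain Python. The function, argument names, and error messages below are illustrative, not the library's implementation:

```python
def predict_flow(n_observations, fit_gp=True, return_std=False, return_cov=False):
    """Skeleton of the predict() control flow from the sequence diagram."""
    if return_std and return_cov:
        # The review notes predict() enforces mutual exclusivity of these flags.
        raise ValueError("return_std and return_cov cannot both be True")
    if fit_gp and n_observations == 0:
        # Matches the diagram: fitting the GP with no observations is an error.
        raise RuntimeError("no observations available to fit the GP")
    if return_std:
        return ("mean", "std")
    if return_cov:
        return ("mean", "cov")
    return ("mean",)

print(predict_flow(n_observations=10, return_std=True))  # → ('mean', 'std')
```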

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 I hopped through code with eager feet,

Predictions ready, neat and sweet.
Triangles measured, perimeters kept,
Means and variances gently swept.
A burrow of tests—ready to meet! 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name           Status      Explanation / Resolution
Docstring Coverage   ⚠️ Warning  Docstring coverage is 63.33%, below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name          Status     Explanation
Description Check   ✅ Passed  Check skipped: CodeRabbit’s high-level summary is enabled.
Title check         ✅ Passed  The title accurately and concisely describes the main change: adding a new .predict() method to the BayesianOptimization optimizer.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (2)
tests/test_bayesian_optimization.py (2)

19-64: Custom triangle parameter and area helper: clarify intent and edge conditions

  • Triangle sampling vs. perimeter and bounds.
    FixedPerimeterTriangleParameter.random_sample draws Dirichlet samples scaled by self.perimeter and only enforces self.bounds. For the current tests (perimeter=1.0, bounds [0, 1] for each side) this is fine, but for inconsistent combinations (e.g. perimeter > 3 * upper_bound) the while loop could run indefinitely because no samples satisfy the bounds. If this helper is ever reused with different parameters, consider either:

    • Asserting a compatible relationship between perimeter and bounds at construction time, or
    • Adding a max-tries safeguard with a clear error message.
  • Heron’s formula vs. current area_of_triangle implementation.
    Standard Heron’s formula uses the semiperimeter s = (a + b + c) / 2, whereas here s is the full perimeter and the comment explicitly says # perimeter. This function is only used as a test objective, so correctness up to a constant factor may be acceptable, but the name area_of_triangle is misleading as‑is. You might want to either:

    • Implement the usual formula:

      -    s = np.sum(sides, axis=-1)  # perimeter
      +    s = 0.5 * np.sum(sides, axis=-1)  # semiperimeter
           return np.sqrt(s * (s - a) * (s - b) * (s - c))
  • Or rename/comment this as a synthetic triangle-like objective to avoid confusion.

537-590: Custom-parameter save/load test: avoid debug prints and private GP calls

  • The print statements for pbounds (Lines 546–548) look like leftover debugging and add noise to test output. They can safely be removed without losing coverage.

  • Forcing a GP fit via optimizer._gp.fit(...) (Lines 557–558) reaches into a private attribute. With the new public predict(..., fit_gp=True) hook, you could instead trigger a GP fit in a more idiomatic way (e.g., call optimizer.predict(...) once) or rely on the existing logic that fits the GP inside maximize(). If the extra fit is really needed for this scenario, a brief comment explaining why would help future maintainers.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c410d51 and 1e51b28.

📒 Files selected for processing (2)
  • bayes_opt/bayesian_optimization.py (2 hunks)
  • tests/test_bayesian_optimization.py (2 hunks)
🔇 Additional comments (1)
bayes_opt/bayesian_optimization.py (1)

369-376: maximize() warning update looks good

The updated warning that points users to predict(..., fit_gp=True) after maximize() is accurate given the new API and clarifies GP fitting expectations without changing behavior.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (2)
tests/test_bayesian_optimization.py (2)

616-643: Fix array/scalar mismatches: predict([point]) returns array, not scalar.

Lines 628-629, 632-634, 637-638, and 640-642 call optimizer.predict([test_point]) with a list input, which returns a 1D array per the shape semantics (validated by test_predict_shape_semantics_dict_vs_list). However, the assertions treat the result as a scalar:

means = optimizer.predict([test_point])  # returns 1D array
assert np.isclose(means, 0, atol=1e-3)  # comparing array to scalar

This makes np.isclose return a one-element boolean array rather than a scalar bool; the bare assert only passes via implicit array-to-bool conversion, and would raise an "ambiguous truth value" error as soon as more than one point is passed.

🔎 Proposed fix

Either index the array or pass a dict instead of a list:

Option 1: Index the result

-    means = optimizer.predict([test_point])
-    assert np.isclose(means, 0, atol=1e-3)
+    means = optimizer.predict([test_point])
+    assert np.isclose(means[0], 0, atol=1e-3)

Option 2: Use dict input for scalar return

-    means = optimizer.predict([test_point])
+    means = optimizer.predict(test_point)
     assert np.isclose(means, 0, atol=1e-3)

Apply similar fixes to lines 632-634, 637-638, and 640-642.
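The pitfall is easy to reproduce with NumPy alone, independent of the optimizer:

```python
import numpy as np

# A 1D result (as from predict([point])) compared against a scalar yields a
# boolean array; asserting on it only works, via implicit truthiness, when
# the array has exactly one element.
single = np.isclose(np.array([0.0004]), 0, atol=1e-3)
print(single, bool(single))  # passes only through implicit conversion

multi = np.isclose(np.array([0.0, 1.0]), 0, atol=1e-3)
try:
    bool(multi)  # what a bare `assert multi` would attempt
except ValueError as e:
    print(e)  # "The truth value of an array with more than one element is ambiguous..."

# Robust forms: index the single-element result, or aggregate explicitly.
assert np.isclose(np.array([0.0004]), 0, atol=1e-3)[0]
assert multi.any()
```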


685-720: Fix array/scalar mismatches in loop assertions.

Lines 711-712 and 717-719 have the same issue: optimizer.predict([rounded_point]) returns a 1D array, but the comparison treats it as a scalar:

mean_rounded = optimizer.predict([rounded_point])  # returns 1D array
assert np.isclose(means_float[i], mean_rounded, atol=1e-1)  # scalar vs array
🔎 Proposed fix

Option 1: Index the array result

         mean_rounded = optimizer.predict([rounded_point])
-        assert np.isclose(means_float[i], mean_rounded, atol=1e-1)
+        assert np.isclose(means_float[i], mean_rounded[0], atol=1e-1)

Option 2: Use dict input

-        mean_rounded = optimizer.predict([rounded_point])
+        mean_rounded = optimizer.predict(rounded_point)
         assert np.isclose(means_float[i], mean_rounded, atol=1e-1)

Apply the same fix pattern to lines 717-719.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1e51b28 and b35705e.

📒 Files selected for processing (2)
  • bayes_opt/bayesian_optimization.py
  • tests/test_bayesian_optimization.py
🧰 Additional context used
🧬 Code graph analysis (2)
tests/test_bayesian_optimization.py (2)
bayes_opt/parameter.py (28)
  • BayesParameter (45-152)
  • bounds (59-61)
  • is_continuous (65-66)
  • is_continuous (171-173)
  • is_continuous (258-260)
  • is_continuous (355-357)
  • random_sample (68-87)
  • random_sample (262-281)
  • random_sample (359-381)
  • to_float (90-97)
  • to_float (175-183)
  • to_float (283-291)
  • to_float (383-394)
  • to_param (100-112)
  • to_param (185-198)
  • to_param (293-306)
  • to_param (396-409)
  • kernel_transform (115-126)
  • kernel_transform (222-234)
  • kernel_transform (308-320)
  • kernel_transform (434-449)
  • to_string (128-147)
  • to_string (200-220)
  • to_string (411-432)
  • dim (151-152)
  • dim (237-239)
  • dim (323-325)
  • dim (452-454)
bayes_opt/util.py (1)
  • ensure_rng (8-30)
bayes_opt/bayesian_optimization.py (2)
bayes_opt/target_space.py (3)
  • params (149-156)
  • params_to_array (308-324)
  • res (654-685)
bayes_opt/acquisition.py (1)
  • _fit_gp (79-86)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Python 3.12 - numpy >=1.25,<2
  • GitHub Check: Python 3.13 - numpy >=2
  • GitHub Check: Python 3.12 - numpy >=2
  • GitHub Check: Python 3.11 - numpy >=1.25,<2
  • GitHub Check: Python 3.10 - numpy >=1.25,<2
  • GitHub Check: Python 3.9 - numpy >=2
  • GitHub Check: Python 3.10 - numpy >=2
  • GitHub Check: Python 3.11 - numpy >=2
  • GitHub Check: Python 3.9 - numpy >=1.25,<2
🔇 Additional comments (5)
bayes_opt/bayesian_optimization.py (3)

11-11: LGTM: Import appropriately supports new predict() method.

The Iterable import is used correctly in the type annotation for the predict() method parameter.


179-263: LGTM: predict() implementation addresses all prior review concerns.

The implementation now includes:

  • Proper input validation with TypeError for invalid types (lines 222-224)
  • Correct return type annotation supporting both scalar and tuple returns (line 185)
  • Accurate docstring with correct default values (line 193)
  • Mutual exclusivity enforcement for return_std and return_cov (lines 226-229)
  • Clear shape semantics: dict input returns scalars, iterable input returns 1D arrays (lines 248-263)

The string exclusion in the iterable check (line 218) correctly prevents treating strings as iterables of characters.


393-393: LGTM: Documentation correctly references new predict() API.

The updated warning appropriately guides users to call predict() with fit_gp=True for post-optimization predictions.

tests/test_bayesian_optimization.py (2)

19-58: LGTM: FixedPerimeterTriangleParameter correctly implements custom parameter interface.

The class properly implements the BayesParameter interface for perimeter-constrained triangle sampling:

  • Uses Dirichlet distribution for valid triangle generation
  • Correctly normalizes values via to_param and kernel_transform
  • Properly reports dimensionality as 3

722-908: LGTM: Comprehensive test coverage with correct shape handling.

The remaining predict() tests correctly handle array/scalar semantics:

  • Use appropriate assertions for list inputs (arrays)
  • Test edge cases: invalid types, tuple inputs, mutual exclusivity
  • Excellent shape semantics tests validate dict vs. list behavior across different return modes

The mutual exclusivity test (lines 826-839) properly addresses the prior review suggestion.

@till-m till-m merged commit 8688445 into bayesian-optimization:master Dec 27, 2025
13 of 15 checks passed
