DisCo-DSO: Joint Optimization in Hybrid Discrete-Continuous Spaces
In this blog post, we introduce DisCo-DSO (Discrete-Continuous Deep Symbolic Optimization), a novel approach for joint optimization in hybrid discrete-continuous spaces. DisCo-DSO leverages autoregressive models and deep reinforcement learning to optimize discrete tokens and continuous parameters simultaneously. This unified approach leads to more efficient optimization, robustness to non-differentiable objectives, and superior performance in tasks like decision tree learning and symbolic regression. Let’s dive into the key innovations, applications, and results of DisCo-DSO.
Optimization in hybrid discrete-continuous spaces is a fundamental challenge in AI, with applications in decision tree learning, symbolic regression, and interpretable AI. These problems often involve:
- Discrete tokens (e.g., symbolic structures or decision points).
- Continuous parameters (e.g., thresholds or constant values).
Traditional methods optimize these components separately, leading to inefficiencies and suboptimal solutions. DisCo-DSO addresses this by optimizing the discrete and continuous variables jointly.
By leveraging autoregressive models and deep reinforcement learning (RL), DisCo-DSO learns the joint distribution of these hybrid spaces. This unified approach results in fewer function evaluations, robustness to non-differentiable objectives, and superior performance.
Here’s a preview of the method in action:

*[Animation: the method in action]*
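To make the sampling procedure concrete, here is a minimal sketch of hybrid autoregressive sampling in PyTorch. Everything in it (the token library, the GRU policy, the single Gaussian head) is an illustrative assumption, not the paper's implementation:

```python
import torch
import torch.nn as nn

TOKENS = ["add", "mul", "sin", "x", "const"]  # illustrative token library
CONST = TOKENS.index("const")

class HybridPolicy(nn.Module):
    """Autoregressive policy with a discrete head and a continuous head."""
    def __init__(self, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(len(TOKENS), hidden)
        self.rnn = nn.GRUCell(hidden, hidden)
        self.token_head = nn.Linear(hidden, len(TOKENS))  # discrete: token logits
        self.value_head = nn.Linear(hidden, 2)            # continuous: mean, log-std

    def sample(self, max_len=8):
        h = torch.zeros(1, self.rnn.hidden_size)
        prev = torch.zeros(1, dtype=torch.long)  # reuse index 0 as a start token
        sequence, log_prob = [], torch.zeros(1)
        for _ in range(max_len):  # a real sampler would stop when the design is complete
            h = self.rnn(self.embed(prev), h)
            token_dist = torch.distributions.Categorical(logits=self.token_head(h))
            token = token_dist.sample()
            log_prob = log_prob + token_dist.log_prob(token)
            value = None
            if token.item() == CONST:
                # Continuous parameter sampled conditionally on the discrete token.
                mean, log_std = self.value_head(h).unbind(-1)
                value_dist = torch.distributions.Normal(mean, log_std.exp())
                value = value_dist.sample()
                log_prob = log_prob + value_dist.log_prob(value)
            sequence.append((TOKENS[token.item()], None if value is None else value.item()))
            prev = token
        return sequence, log_prob

design, logp = HybridPolicy().sample()
```

Because a single `log_prob` accumulates over both heads, one policy-gradient update moves the discrete and continuous choices together.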
Key Innovations
DisCo-DSO builds upon the principles of deep learning and reinforcement learning, introducing three key extensions:
- Extension of Autoregressive Models
- DisCo-DSO extends the autoregressive model's output head so that, at each step, it predicts a discrete token and, conditioned on that token, any associated continuous parameter (as in the sketch above). This enables joint generation of the discrete and continuous components of a solution.
- Extension of the Risk-Seeking Policy Gradient
- DisCo-DSO extends the risk-seeking policy gradient to hybrid sequences, handling high-variance, black-box reward functions. This keeps the optimization robust in challenging environments with non-differentiable objectives (see the estimator after this list).
- Sequential Optimization for Decision Trees
- DisCo-DSO formulates decision tree policy search in control tasks as sequential discrete-continuous optimization, including a method for sequentially narrowing the feasible ranges of the continuous thresholds at decision nodes. This improves both the efficiency and the interpretability of the learned policies.
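For intuition, the risk-seeking estimator from the deep symbolic optimization line of work, which DisCo-DSO extends to hybrid sequences, has roughly this form for a batch of $N$ sampled designs $\tau^{(i)}$, where $\tilde{R}_\varepsilon$ is the empirical $(1-\varepsilon)$-quantile of batch rewards:

$$
\nabla_\theta J_{\mathrm{risk}}(\theta; \varepsilon) \approx \frac{1}{\varepsilon N} \sum_{i=1}^{N} \left( R(\tau^{(i)}) - \tilde{R}_\varepsilon \right) \mathbf{1}\!\left[ R(\tau^{(i)}) \geq \tilde{R}_\varepsilon \right] \nabla_\theta \log p(\tau^{(i)} \mid \theta)
$$

Only the top $\varepsilon$ fraction of samples contributes, so the policy optimizes for best-case rather than average-case reward. In the hybrid setting, $\log p(\tau \mid \theta)$ sums the log-probabilities of both the discrete tokens and the continuous parameters in the sequence.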
Applications
DisCo-DSO is a versatile framework that applies to a wide range of tasks requiring optimization in hybrid discrete-continuous spaces. Such hybrid problems are common across AI, for example in decision tree learning for control tasks and in equation discovery for scientific applications.
Decision Tree Learning for Control
In interpretable control tasks, decision trees must optimize discrete structural decisions (e.g., splitting rules) alongside continuous parameters (e.g., thresholds).
DisCo-DSO formulates this as a sequential optimization problem, generating compact, interpretable policies. The animation below shows a learned decision tree acting in a control task, with the discrete structure and continuous thresholds generated jointly:

*[Animation: decision tree policy in a control task]*
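To illustrate the idea (with hypothetical helper names, not the paper's code): a decision-tree policy is a nesting of (feature, threshold) splits with actions at the leaves, and each ancestor split narrows the interval from which deeper thresholds for the same feature should be sampled:

```python
def act(tree, obs):
    """Traverse a (feature, threshold, left, right) tree; leaves are actions."""
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = left if obs[feature] < threshold else right
    return tree

def narrowed_bounds(bounds, feature, threshold, went_left):
    """After splitting on `feature` at `threshold`, shrink the interval used
    to sample thresholds for that feature deeper in the tree."""
    lo, hi = bounds[feature]
    new = dict(bounds)
    new[feature] = (lo, threshold) if went_left else (threshold, hi)
    return new

# A hand-written CartPole-style policy: discrete choices (tree shape, which
# feature to split on) interleaved with continuous thresholds.
policy = (2, 0.02,           # pole angle < 0.02 rad?
          (3, -0.1, 0, 1),   #   yes: angular velocity < -0.1? push left : right
          1)                 #   no: push right
print(act(policy, [0.0, 0.0, 0.05, 0.0]))  # -> 1
```

Sequentially narrowing the bounds keeps every sampled threshold reachable, so the policy never wastes capacity on splits that no trajectory can trigger.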
Equation Discovery for Scientific Applications
In symbolic regression, the goal is to discover mathematical equations from data. DisCo-DSO's joint modeling yields symbolic expressions whose constants are optimized together with the expression's structure, significantly improving accuracy over approaches that fit constants in a separate inner loop.
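As a toy contrast (the data, skeleton, and optimizer choice below are illustrative assumptions): a decoupled pipeline samples a discrete skeleton first, then spends extra function evaluations fitting its constants in an inner optimization:

```python
import numpy as np
from scipy.optimize import minimize

X = np.linspace(-1, 1, 100)
y = 1.5 * np.sin(X) + 0.7  # hidden ground truth

# Discrete skeleton "c0 * sin(x) + c1" with constants left as placeholders.
def skeleton(c, x):
    return c[0] * np.sin(x) + c[1]

# Decoupled approach: run an inner optimization (e.g., BFGS) over the
# constants for every candidate skeleton.
mse = lambda c: np.mean((skeleton(c, X) - y) ** 2)
res = minimize(mse, x0=[1.0, 0.0])
print(res.x)  # ~ [1.5, 0.7]
```

In DisCo-DSO there is no inner `minimize` call: the constants are part of the sampled sequence, and the policy gradient improves them together with the skeleton.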
Results
We evaluated DisCo-DSO on a diverse set of tasks. By jointly optimizing hybrid designs, it achieves:
- Higher accuracy.
- Reduced computational cost.
- Improved interpretability.
Conclusion
DisCo-DSO represents a significant step forward in hybrid discrete-continuous optimization:
- Joint Optimization: Discrete and continuous variables are optimized together, leading to holistic designs.
- Efficiency: Fewer objective evaluations reduce computational burden.
- Flexibility: Robust to black-box, non-differentiable tasks.
This method holds immense potential for advancing interpretable AI, symbolic regression, and decision tree learning, pushing the boundaries of hybrid generative design.
📄 Read the full preprint: preprint