SOCP constraint vs Quadratic constraint - trust region methods
I'm implementing code for sequential convex optimization based on trust regions, but I have some doubts because I haven't found a single standard approach in the literature.
Suppose we want to minimize a convex function
$$J(x)$$
subject to
$$f(x) = 0$$
$$g(x) \leq 0$$
and that we replace these constraints with
$$f(x_k)+\nabla f(x_k)(x-x_k) = 0$$
$$g(x_k)+\nabla g(x_k)(x-x_k) \leq 0.$$
To get meaningful results we impose
$$\left\|x - x_k\right\| \leq s,$$
where $x \in \mathbb{R}^n$ is the solution obtained at each iteration, and $s$ bounds the trust region (it can either be fixed or included in the problem as a slack variable). This ensures that the algorithm moves towards better solutions while keeping the steps small enough that the linearized forms remain good approximations of the original nonlinear constraints.
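To make the role of the trust region concrete, here is a throwaway numerical sketch (the constraint $g$ and the numbers are made up for illustration, not taken from my actual problem) showing that the linearization above stays accurate for steps inside a small trust region:

```python
import numpy as np

# Hypothetical smooth constraint g(x) = x0^2 + x1^2 - 1; any smooth g would do
def g(x):
    return x[0]**2 + x[1]**2 - 1.0

def grad_g(x):
    return np.array([2.0 * x[0], 2.0 * x[1]])

x_k = np.array([1.0, 0.5])   # current iterate
s = 0.1                      # trust-region radius

def g_lin(x):
    # Linearized constraint: g(x_k) + grad g(x_k)^T (x - x_k)
    return g(x_k) + grad_g(x_k) @ (x - x_k)

x = x_k + np.array([0.05, -0.05])     # a step inside the trust region
assert np.linalg.norm(x - x_k) <= s
err = abs(g(x) - g_lin(x))            # linearization error, O(||x - x_k||^2)
```

For this quadratic $g$ the error is exactly the second-order term, $\|x-x_k\|^2 = 0.005$, which is small precisely because the step is confined to the trust region.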
My question is the following: what's the difference between implementing
$$|x - x_k| \leq s \quad \text{(i.e., component-wise)}$$
or
$$\left\|x - x_k\right\| \leq s$$
or
$$(x - x_k)^T(x - x_k) \leq s?$$
Since the square root is a monotonic function, the last two should make no difference in terms of the final solution, I guess? Or am I mistaken?
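For the last two forms, a quick numeric check (a disposable sketch, not production code) confirms that the norm ball of radius $s$ and the quadratic form coincide exactly, with the caveat that the quadratic bound has to be $s^2$, not $s$:

```python
import numpy as np

rng = np.random.default_rng(0)
s = 0.3
agree = True
for _ in range(1000):
    d = rng.uniform(-1.0, 1.0, size=4)     # candidate step d = x - x_k
    in_socp_ball = np.linalg.norm(d) <= s  # SOCP form: ||d|| <= s
    in_quad_ball = d @ d <= s**2           # quadratic form: d^T d <= s^2
    agree = agree and (in_socp_ball == in_quad_ball)
# agree stays True: the two constraints select exactly the same steps
```

With the bound written as $\leq s$ instead of $\leq s^2$, the quadratic form instead describes a ball of radius $\sqrt{s}$, so the two constraints define trust regions of different sizes even though each is still a Euclidean ball.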
Moreover, I have seen in the literature (starting from Nocedal's book) that some researchers use an updating rule for the radius of the trust region based on the ratio between the change of the augmented cost function in the actual nonlinear model and its change in the prediction model:
$$\rho = \frac{\delta \tilde J_{\text{nonlin}}}{\delta \tilde J_{\text{lin}}},$$
where $\tilde J$ is an augmented cost function that accounts for violations of the constraints in their nonlinear and linearized forms, respectively.
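For reference, a common form of that update rule (sketched here after Algorithm 4.1 in Nocedal and Wright; the thresholds $0.25$/$0.75$ and the shrink/expand factors are their textbook defaults, not a prescription) looks like:

```python
def update_radius(rho, step_norm, radius, eta=0.25, radius_max=10.0):
    """Trust-region radius update: shrink on poor model agreement,
    expand when the model is good and the step hit the boundary;
    accept the step only if rho exceeds the threshold eta."""
    if rho < 0.25:
        new_radius = 0.25 * radius                 # poor prediction: shrink
    elif rho > 0.75 and abs(step_norm - radius) < 1e-12:
        new_radius = min(2.0 * radius, radius_max)  # good prediction at boundary: expand
    else:
        new_radius = radius                        # otherwise keep the radius
    return new_radius, rho > eta

# Good agreement with the step on the boundary: radius doubles, step accepted
r, accepted = update_radius(rho=0.9, step_norm=1.0, radius=1.0)
```

The same skeleton works whether the model reduction comes from a quadratic model or, as here, from the linearized augmented cost.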
Others include the radius as a further variable in the subproblem, which means the slack $s$ appears in the cost function so that the radius of the trust region is limited directly when the subproblem is solved.
In this last approach, however, there is no clear choice of how to scale the slack with respect to the original cost function: if we define the new cost as
$$J_{\text{new}} = J + k s,$$
what's a proper choice for $k$? Is there any rule I can adopt from the literature (e.g., maybe also based on sequential quadratic programming)?
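For context, one heuristic I have come across in penalty-based methods is to grow $k$ whenever the subproblem solution is still infeasible for the original constraints (this is a hypothetical sketch; the growth factor and cap are arbitrary tuning choices, not values from a specific reference):

```python
def update_penalty(k, violation, tol=1e-6, growth=10.0, k_max=1e6):
    """Grow the penalty weight while the constraints are still violated,
    so that feasibility eventually dominates the cost. The growth factor
    and the cap k_max are arbitrary tuning choices."""
    if violation > tol:
        return min(growth * k, k_max)
    return k

k = update_penalty(k=1.0, violation=0.1)  # still violated, so k grows
```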
Thanks a million!
optimization convex-optimization nonlinear-optimization numerical-optimization non-convex-optimization
The conditions $\|x-x_k\| \leq s$ and $(x-x_k)^T(x-x_k) \leq s^2$ are obviously the same. As for parameter values (such as $k$), sometimes you can find a value in the literature, but you may have to tune it yourself to trade off stability against performance.
– LinAlg
Jan 9 at 14:16
asked Jan 9 at 6:00 by venom