Alternating optimization is an iterative procedure for optimizing a real-valued function jointly over all its parameters by alternating restricted optimization over parameter partitions.
Usage
ao(
f,
initial,
target = NULL,
npar = NULL,
gradient = NULL,
...,
partition = "sequential",
new_block_probability = 0.3,
minimum_block_number = 2,
minimize = TRUE,
lower = -Inf,
upper = Inf,
iteration_limit = Inf,
seconds_limit = Inf,
tolerance_value = 1e-06,
tolerance_parameter = 1e-06,
tolerance_parameter_norm = function(x, y) sqrt(sum((x - y)^2)),
tolerance_history = 1,
base_optimizer = Optimizer$new("stats::optim", method = "L-BFGS-B"),
verbose = FALSE,
hide_warnings = TRUE
)
Arguments
- `f` (`function`)
  A function to be optimized, returning a single `numeric` value. The first argument of `f` should be a `numeric` of the same length as `initial`, optionally followed by any other arguments specified via the `...` argument. If `f` is to be optimized over an argument other than the first, or over more than one argument, this has to be specified via the `target` argument.
- `initial` (`numeric()` or `list()`)
  The starting parameter values for the target argument(s). This can also be a `list` of multiple starting parameter values, see details.
- `target` (`character()` or `NULL`)
  The name(s) of the argument(s) over which `f` gets optimized. These can only be `numeric` arguments. Can be `NULL` (default), in which case it is the first argument of `f`.
- `npar` (`integer()`)
  The length of each target argument. Must be specified if more than one target argument is given via the `target` argument. Can be `NULL` if there is only one target argument, in which case `npar` is set to `length(initial)`.
- `gradient` (`function` or `NULL`)
  A function that returns the gradient of `f`. The function call of `gradient` must be identical to that of `f`. Can be `NULL`, in which case a finite-difference approximation is used.
- `...`
  Additional arguments to be passed to `f` (and `gradient`).
- `partition` (`character(1)` or `list()`)
  Defines the parameter partition. Can be either `"sequential"` for treating each parameter separately, `"random"` for a random partition in each iteration, `"none"` for no partition (which is equivalent to joint optimization), or a `list` of vectors of parameter indices, specifying a custom partition for the alternating optimization process. This can also be a `list` of multiple partition definitions, see details.
- `new_block_probability` (`numeric(1)`)
  Only relevant if `partition = "random"`. The probability for a new parameter block when creating random partitions. Values close to 0 result in larger parameter blocks; values close to 1 result in smaller parameter blocks.
- `minimum_block_number` (`integer(1)`)
  Only relevant if `partition = "random"`. The minimum number of blocks in random partitions.
- `minimize` (`logical(1)`)
  Whether to minimize during the alternating optimization process. If `FALSE`, maximization is performed.
- `lower`, `upper` (`numeric()`)
  Optional lower and upper parameter bounds.
- `iteration_limit` (`integer(1)` or `Inf`)
  The maximum number of iterations through the parameter partition before the alternating optimization process is terminated. Can be `Inf` for no iteration limit.
- `seconds_limit` (`numeric(1)`)
  The time limit in seconds before the alternating optimization process is terminated. Can be `Inf` for no time limit. Note that this stopping criterion is only checked after a sub-problem is solved, not while a sub-problem is being solved, so the actual computation time can exceed this limit.
- `tolerance_value` (`numeric(1)`)
  A non-negative tolerance value. The alternating optimization terminates if the absolute difference between the current function value and the one from `tolerance_history` iterations before is smaller than `tolerance_value`. Can be `0` for no value threshold.
- `tolerance_parameter` (`numeric(1)`)
  A non-negative tolerance value. The alternating optimization terminates if the distance between the current estimate and the one from `tolerance_history` iterations before is smaller than `tolerance_parameter`. Can be `0` for no parameter threshold. By default, the distance is measured using the Euclidean norm, but another norm can be specified via the `tolerance_parameter_norm` argument.
- `tolerance_parameter_norm` (`function`)
  The norm that measures the distance between the current estimate and the one from the last iteration. If the distance is smaller than `tolerance_parameter`, the procedure terminates. It must be of the form `function(x, y)` for two vector inputs `x` and `y`, and return a single `numeric` value. By default, the Euclidean norm `function(x, y) sqrt(sum((x - y)^2))` is used.
- `tolerance_history` (`integer(1)`)
  The number of iterations to look back to determine whether `tolerance_value` or `tolerance_parameter` has been reached.
- `base_optimizer` (`Optimizer` or `list()`)
  An `Optimizer` object, which can be created via `Optimizer`. It numerically solves the sub-problems. By default, the `optim` optimizer is used. If another optimizer is specified, the arguments `gradient`, `lower`, and `upper` are ignored. This can also be a `list` of multiple base optimizers, see details.
- `verbose` (`logical(1)`)
  Whether to print tracing details during the alternating optimization process.
- `hide_warnings` (`logical(1)`)
  Whether to hide warnings during the alternating optimization process.
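As an illustration of the `tolerance_parameter_norm` interface, any function mapping two vectors to a single numeric value works; a minimal sketch using the maximum (supremum) norm as an alternative to the default Euclidean norm (the name `max_norm` is illustrative only):

```r
# Hypothetical custom norm: the largest coordinate-wise absolute difference.
max_norm <- function(x, y) max(abs(x - y))

# Takes two vectors, returns a single numeric value:
max_norm(c(1, 2, 3), c(1.5, 2, 2))
#> [1] 1
```

Such a function could then be passed as `tolerance_parameter_norm = max_norm`.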
Value
A `list` with the following elements:
- `estimate` is the parameter vector at termination.
- `value` is the function value at termination.
- `details` is a `data.frame` with full information about the procedure: for each iteration (column `iteration`) it contains the function value (column `value`), the parameter values (columns starting with `p` followed by the parameter index), the active parameter block (columns starting with `b` followed by the parameter index, where `1` stands for a parameter contained in the active parameter block and `0` for one that is not), and the computation times in seconds (column `seconds`).
- `seconds` is the overall computation time in seconds.
- `stopping_reason` is a message why the procedure has terminated.
In the case of multiple threads, the output changes slightly, see details.
Details
Multiple threads
Alternating optimization can suffer from local optima. To increase the likelihood of reaching the global optimum, you can specify:
multiple starting parameters
multiple parameter partitions
multiple base optimizers
Use the `initial`, `partition`, and/or `base_optimizer` arguments to provide
a `list` of possible values for each of these settings. Each combination of
initial values, parameter partitions, and base optimizers creates a separate
alternating optimization thread.
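The threads thus form the cross-product of the supplied lists; a minimal base-R sketch of that counting logic (the objects here are illustrative, not part of the package API):

```r
# Two starting values and three partition strategies, as one might pass
# to the initial and partition arguments:
initial <- list(c(0, 0), c(-3, 3))
partition <- list("sequential", "random", "none")

# Each combination spawns one thread: 2 x 3 = 6 threads.
threads <- expand.grid(
  initial = seq_along(initial),
  partition = seq_along(partition)
)
nrow(threads)
#> [1] 6
```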
Output value
In the case of multiple threads, the output changes slightly in comparison
to the standard case. It is a list
with the following elements:
- `estimate` is the optimal parameter vector over all threads.
- `estimates` is a `list` of the optimal parameters in each thread.
- `value` is the optimal function value over all threads.
- `values` is a `list` of the optimal function values in each thread.
- `details` combines the details of the single threads and has an additional column `thread` with an index for the different threads.
- `seconds` gives the computation time in seconds for each thread.
- `stopping_reason` gives the termination message for each thread.
- `threads` gives details on how the different threads were specified.
Parallel computation
By default, threads run sequentially. However, since they are independent,
they can be parallelized. To enable parallel computation, use the
{future}
framework. For example, run the
following before the ao()
call:
future::plan(future::multisession, workers = 4)
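To return to sequential execution afterwards, the plan can be reset via the standard {future} API:

```r
future::plan(future::sequential)
```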
Progress updates
When using multiple threads, setting verbose = TRUE
to print tracing
details during alternating optimization is not supported. However, you can
still track the progress of threads using the
{progressr}
framework. For example,
run the following before the ao()
call:
progressr::handlers(global = TRUE)
progressr::handlers(
progressr::handler_progress(":percent :eta :message")
)
Examples
# Example 1: Minimization of Himmelblau's function --------------------------
himmelblau <- function(x) (x[1]^2 + x[2] - 11)^2 + (x[1] + x[2]^2 - 7)^2
ao(f = himmelblau, initial = c(0, 0))
#> $estimate
#> [1] 3.584428 -1.848126
#>
#> $value
#> [1] 9.606386e-12
#>
#> $details
#> iteration value p1 p2 b1 b2 seconds
#> 1 0 1.700000e+02 0.000000 0.000000 0 0 0.000000000
#> 2 1 1.327270e+01 3.395691 0.000000 1 0 0.017777920
#> 3 1 1.743664e+00 3.395691 -1.803183 0 1 0.003668547
#> 4 2 2.847290e-02 3.581412 -1.803183 1 0 0.002969742
#> 5 2 4.687468e-04 3.581412 -1.847412 0 1 0.002536774
#> 6 3 7.368057e-06 3.584381 -1.847412 1 0 0.002232075
#> 7 3 1.164202e-07 3.584381 -1.848115 0 1 0.016813755
#> 8 4 1.893311e-09 3.584427 -1.848115 1 0 0.001713753
#> 9 4 9.153860e-11 3.584427 -1.848124 0 1 0.001406908
#> 10 5 6.347425e-11 3.584428 -1.848124 1 0 0.001410484
#> 11 5 9.606386e-12 3.584428 -1.848126 0 1 0.001447678
#>
#> $seconds
#> [1] 0.05197763
#>
#> $stopping_reason
#> [1] "change in function value between 1 iteration is < 1e-06"
#>
# Example 2: Maximization of 2-class Gaussian mixture log-likelihood --------
# target arguments:
# - class means mu (2, unrestricted)
# - class standard deviations sd (2, must be non-negative)
# - class proportion lambda (only 1 for identification, must be in [0, 1])
normal_mixture_llk <- function(mu, sd, lambda, data) {
c1 <- lambda * dnorm(data, mu[1], sd[1])
c2 <- (1 - lambda) * dnorm(data, mu[2], sd[2])
sum(log(c1 + c2))
}
set.seed(123)
ao(
f = normal_mixture_llk,
initial = runif(5),
target = c("mu", "sd", "lambda"),
npar = c(2, 2, 1),
data = datasets::faithful$eruptions,
partition = list("sequential", "random", "none"),
minimize = FALSE,
lower = c(-Inf, -Inf, 0, 0, 0),
upper = c(Inf, Inf, Inf, Inf, 1)
)
#> $estimate
#> [1] 2.0186094 4.2733455 0.2356261 0.4370622 0.3484057
#>
#> $estimates
#> $estimates[[1]]
#> [1] 4.2733454 2.0186103 0.4370607 0.2356267 0.6515943
#>
#> $estimates[[2]]
#> [1] 2.0186094 4.2733455 0.2356261 0.4370622 0.3484057
#>
#> $estimates[[3]]
#> [1] 0.3258511 3.4877826 0.5350353 1.1392719 0.0000000
#>
#>
#> $value
#> [1] -276.36
#>
#> $values
#> $values[[1]]
#> [1] -276.36
#>
#> $values[[2]]
#> [1] -276.36
#>
#> $values[[3]]
#> [1] -421.417
#>
#>
#> $details
#> thread iteration value p1 p2 p3 p4
#> 1 1 0 -2477.6918 0.2875775 0.7883051 0.4089769 0.8830174
#> 2 1 1 -552.2990 4.2761882 0.7883051 0.4089769 0.8830174
#> 3 1 1 -455.6788 4.2761882 2.0595448 0.4089769 0.8830174
#> 4 1 1 -455.6715 4.2761882 2.0595448 0.4059967 0.8830174
#> 5 1 1 -381.0753 4.2761882 2.0595448 0.4059967 0.2349577
#> 6 1 1 -278.2759 4.2761882 2.0595448 0.4059967 0.2349577
#> 7 1 2 -278.2682 4.2800480 2.0595448 0.4059967 0.2349577
#> 8 1 2 -277.1965 4.2800480 2.0228520 0.4059967 0.2349577
#> 9 1 2 -276.3996 4.2800480 2.0228520 0.4363664 0.2349577
#> 10 1 2 -276.3938 4.2800480 2.0228520 0.4363664 0.2372488
#> 11 1 2 -276.3899 4.2800480 2.0228520 0.4363664 0.2372488
#> 12 1 3 -276.3737 4.2740378 2.0228520 0.4363664 0.2372488
#> 13 1 3 -276.3625 4.2740378 2.0190903 0.4363664 0.2372488
#> 14 1 3 -276.3625 4.2740378 2.0190903 0.4363871 0.2372488
#> 15 1 3 -276.3606 4.2740378 2.0190903 0.4363871 0.2359442
#> 16 1 3 -276.3606 4.2740378 2.0190903 0.4363871 0.2359442
#> 17 1 4 -276.3605 4.2735130 2.0190903 0.4363871 0.2359442
#> 18 1 4 -276.3604 4.2735130 2.0187633 0.4363871 0.2359442
#> 19 1 4 -276.3601 4.2735130 2.0187633 0.4369209 0.2359442
#> 20 1 4 -276.3601 4.2735130 2.0187633 0.4369209 0.2357025
#> 21 1 4 -276.3601 4.2735130 2.0187633 0.4369209 0.2357025
#> 22 1 5 -276.3601 4.2733852 2.0187633 0.4369209 0.2357025
#> 23 1 5 -276.3601 4.2733852 2.0186436 0.4369209 0.2357025
#> 24 1 5 -276.3600 4.2733852 2.0186436 0.4370296 0.2357025
#> 25 1 5 -276.3600 4.2733852 2.0186436 0.4370296 0.2356437
#> 26 1 5 -276.3600 4.2733852 2.0186436 0.4370296 0.2356437
#> 27 1 6 -276.3600 4.2733545 2.0186436 0.4370296 0.2356437
#> 28 1 6 -276.3600 4.2733545 2.0186222 0.4370296 0.2356437
#> 29 1 6 -276.3600 4.2733545 2.0186222 0.4370549 0.2356437
#> 30 1 6 -276.3600 4.2733545 2.0186222 0.4370549 0.2356318
#> 31 1 6 -276.3600 4.2733545 2.0186222 0.4370549 0.2356318
#> 32 1 7 -276.3600 4.2733475 2.0186222 0.4370549 0.2356318
#> 33 1 7 -276.3600 4.2733475 2.0186133 0.4370549 0.2356318
#> 34 1 7 -276.3600 4.2733475 2.0186133 0.4370603 0.2356318
#> 35 1 7 -276.3600 4.2733475 2.0186133 0.4370603 0.2356277
#> 36 1 7 -276.3600 4.2733475 2.0186133 0.4370603 0.2356277
#> 37 1 8 -276.3600 4.2733454 2.0186133 0.4370603 0.2356277
#> 38 1 8 -276.3600 4.2733454 2.0186103 0.4370603 0.2356277
#> 39 1 8 -276.3600 4.2733454 2.0186103 0.4370607 0.2356277
#> 40 1 8 -276.3600 4.2733454 2.0186103 0.4370607 0.2356267
#> 41 1 8 -276.3600 4.2733454 2.0186103 0.4370607 0.2356267
#> 42 2 0 -2477.6918 0.2875775 0.7883051 0.4089769 0.8830174
#> 43 2 1 -675.9936 2.0788729 4.3066515 0.4089769 0.8830174
#> 44 2 1 -666.3126 2.0788729 4.3066515 0.2588645 0.8830174
#> 45 2 1 -278.8295 2.0788729 4.3066515 0.2588645 0.4195133
#> 46 2 2 -277.1289 2.0284017 4.3066515 0.2588645 0.4195133
#> 47 2 2 -276.4427 2.0284017 4.2749893 0.2385454 0.4345961
#> 48 2 2 -276.4290 2.0284017 4.2749893 0.2385454 0.4345961
#> 49 2 3 -276.3615 2.0187642 4.2749893 0.2358722 0.4368407
#> 50 2 3 -276.3601 2.0187642 4.2734285 0.2358722 0.4368407
#> 51 2 4 -276.3601 2.0187642 4.2734285 0.2358722 0.4368407
#> 52 2 4 -276.3601 2.0187642 4.2734285 0.2357131 0.4368407
#> 53 2 4 -276.3601 2.0186370 4.2734285 0.2357131 0.4370236
#> 54 2 4 -276.3600 2.0186370 4.2733688 0.2357131 0.4370236
#> 55 2 5 -276.3600 2.0186117 4.2733688 0.2356314 0.4370578
#> 56 2 5 -276.3600 2.0186117 4.2733471 0.2356314 0.4370578
#> 57 2 6 -276.3600 2.0186110 4.2733471 0.2356277 0.4370578
#> 58 2 6 -276.3600 2.0186110 4.2733461 0.2356277 0.4370623
#> 59 2 7 -276.3600 2.0186094 4.2733461 0.2356277 0.4370622
#> 60 2 7 -276.3600 2.0186094 4.2733461 0.2356277 0.4370622
#> 61 2 7 -276.3600 2.0186094 4.2733455 0.2356261 0.4370622
#> 62 3 0 -2477.6918 0.2875775 0.7883051 0.4089769 0.8830174
#> 63 3 1 -421.4170 0.3258511 3.4877825 0.5350353 1.1392719
#> 64 3 2 -421.4170 0.3258511 3.4877826 0.5350353 1.1392719
#> p5 b1 b2 b3 b4 b5 seconds
#> 1 0.9404673 0 0 0 0 0 0.000000000
#> 2 0.9404673 1 0 0 0 0 0.005186796
#> 3 0.9404673 0 1 0 0 0 0.002609968
#> 4 0.9404673 0 0 1 0 0 0.011002779
#> 5 0.9404673 0 0 0 1 0 0.009052038
#> 6 0.6486229 0 0 0 0 1 0.004248619
#> 7 0.6486229 1 0 0 0 0 0.002267122
#> 8 0.6486229 0 1 0 0 0 0.002669811
#> 9 0.6486229 0 0 1 0 0 0.022426605
#> 10 0.6486229 0 0 0 1 0 0.002651215
#> 11 0.6511875 0 0 0 0 1 0.009426832
#> 12 0.6511875 1 0 0 0 0 0.002666235
#> 13 0.6511875 0 1 0 0 0 0.002647877
#> 14 0.6511875 0 0 1 0 0 0.001681566
#> 15 0.6511875 0 0 0 1 0 0.003135920
#> 16 0.6515107 0 0 0 0 1 0.002669573
#> 17 0.6515107 1 0 0 0 0 0.002175093
#> 18 0.6515107 0 1 0 0 0 0.002654076
#> 19 0.6515107 0 0 1 0 0 0.003134251
#> 20 0.6515107 0 0 0 1 0 0.003120184
#> 21 0.6515742 0 0 0 0 1 0.002184868
#> 22 0.6515742 1 0 0 0 0 0.002201080
#> 23 0.6515742 0 1 0 0 0 0.002177477
#> 24 0.6515742 0 0 1 0 0 0.002182007
#> 25 0.6515742 0 0 0 1 0 0.002666712
#> 26 0.6515897 0 0 0 0 1 0.001688480
#> 27 0.6515897 1 0 0 0 0 0.001698971
#> 28 0.6515897 0 1 0 0 0 0.001722574
#> 29 0.6515897 0 0 1 0 0 0.001719236
#> 30 0.6515897 0 0 0 1 0 0.001691103
#> 31 0.6515931 0 0 0 0 1 0.001685143
#> 32 0.6515931 1 0 0 0 0 0.001693487
#> 33 0.6515931 0 1 0 0 0 0.001701593
#> 34 0.6515931 0 0 1 0 0 0.001736641
#> 35 0.6515931 0 0 0 1 0 0.001676083
#> 36 0.6515940 0 0 0 0 1 0.001680613
#> 37 0.6515940 1 0 0 0 0 0.001677275
#> 38 0.6515940 0 1 0 0 0 0.001685381
#> 39 0.6515940 0 0 1 0 0 0.002222061
#> 40 0.6515940 0 0 0 1 0 0.001789570
#> 41 0.6515943 0 0 0 0 1 0.001671314
#> 42 0.9404673 0 0 0 0 0 0.000000000
#> 43 0.9404673 1 1 0 0 0 0.008172989
#> 44 0.9404673 0 0 1 0 0 0.009584665
#> 45 0.3539259 0 0 0 1 1 0.014842510
#> 46 0.3539259 1 0 0 0 0 0.007028818
#> 47 0.3539259 0 1 1 1 0 0.011126995
#> 48 0.3491053 0 0 0 0 1 0.003321409
#> 49 0.3491053 1 0 1 1 0 0.009912014
#> 50 0.3484435 0 1 0 0 1 0.005136967
#> 51 0.3484435 0 0 0 0 1 0.001705408
#> 52 0.3484435 0 0 1 0 0 0.003152847
#> 53 0.3484435 1 0 0 1 0 0.004318714
#> 54 0.3484435 0 1 0 0 0 0.002247572
#> 55 0.3484435 1 0 1 1 0 0.007146597
#> 56 0.3484061 0 1 0 0 1 0.003495693
#> 57 0.3484062 1 0 1 0 1 0.003862619
#> 58 0.3484062 0 1 0 1 0 0.002729416
#> 59 0.3484062 1 0 0 1 0 0.002663374
#> 60 0.3484057 0 0 0 0 1 0.001688719
#> 61 0.3484057 0 1 1 0 0 0.003008842
#> 62 0.9404673 0 0 0 0 0 0.000000000
#> 63 0.0000000 1 1 1 1 1 0.030427694
#> 64 0.0000000 1 1 1 1 1 0.007467985
#>
#> $seconds
#> [1] 0.13457823 0.10514617 0.03789568
#>
#> $stopping_reason
#> [1] "change in function value between 1 iteration is < 1e-06"
#> [2] "change in function value between 1 iteration is < 1e-06"
#> [3] "change in function value between 1 iteration is < 1e-06"
#>
#> $threads
#> initial partition
#> 1 0.2875775, 0.7883051, 0.4089769, 0.8830174, 0.9404673 sequential
#> 2 0.2875775, 0.7883051, 0.4089769, 0.8830174, 0.9404673 random
#> 3 0.2875775, 0.7883051, 0.4089769, 0.8830174, 0.9404673 none
#> base_optimizer
#> 1 <environment: 0x55e63e3d2980>
#> 2 <environment: 0x55e63e3d2980>
#> 3 <environment: 0x55e63e3d2980>
#>