Optimal deterministic and robust selection of electricity contracts

We address the Electricity Contract Selection Problem (ECSP): finding the best parameters of an electricity contract for a client, based on his/her past records of electricity consumption over a fixed time period. The objective is to minimize the electricity bill, composed of a fixed cost, the cost of the subscription of the electricity contract, and penalties due to overpowering, i.e. when consumption exceeds the subscribed power. The ECSP can be formulated as a convex separable optimization problem subject to total order constraints. Due to this special structure, ECSP is a special case of two well-known classes of convex separable optimization problems, namely minimum network flow under convex separable cost and minimization of convex separable functions under chain constraints. Both classes are well treated in the literature and can be solved in polynomial time (Ahuja and Orlin in Oper Res 49(5):784–789, 2001; Ahuja et al. in Manag Sci 49(7):950–964, 2003; Best et al. in SIAM J Optim 10:658–672, 2000; Karzanov and McCormick in SIAM J Comput 26(4):1245–1275, 1997; Minoux in Eur J Oper Res 18(3):377–387, 1984; Minoux, in Gallo and Sandi (eds) Netflow at pisa, mathematical programming studies, Springer, Berlin, 1986). In particular, the algorithm in Ahuja and Orlin (2001) achieves the best theoretical time complexity assuming that computing the objective function value at one specific point can be done in constant time. However, when working on a large amount of historical data, as in ECSP, the time required for evaluating the objective function can no longer be assumed to be O(1). In this paper, we propose a new algorithm for ECSP which is specially designed to reduce the computational effort over large-scale historical data. We present numerical results showing that our algorithm outperforms the algorithm in Ahuja and Orlin (2001) when applied to consumption data of various types of clients.
A robust version of ECSP based on a Seasonal and Trend decomposition approach for modelling consumption uncertainty is also investigated. The resulting worst-case cost minimization problem is shown to be efficiently solvable using the same algorithm as for deterministic ECSP.


Introduction
Optimally selecting an energy contract is an important issue for many industrial customers, which typically hold one electricity contract per delivery point. Contracts may need to fix the maximum total energy consumption in kilowatt-hours and the maximum peak demand in kilowatts over a long period, typically 1 year.
The electricity contract problem discussed in [6] can be formulated as a linear program and solved as such in polynomial time. In the problem discussed in [6], there is a fixed cost when the peak demand does not exceed the contract capacity and some additional cost for excess demand.
Combining the optimal energy contract selection problem with a lot sizing problem by considering renewable energy sources has been investigated in [16].
Pricing based on the so-called "subscribed power" encourages customers to adapt their electricity consumption behaviour in order to match their subscription [18]. The corresponding bill is composed of the cost of the electricity contract, the cost of penalties for overpowering with respect to the maximum peak demand defined in the contract, and some other fixed costs.
In this paper, we investigate an optimal electricity contract selection problem (ECSP) based on subscribed power which is formulated as minimizing a convex separable function subject to total order constraints. The problem is addressed both in the deterministic case and in the case when consumption data are subject to uncertainty. The paper is organized as follows. Section 2 presents the context and proposes a formulation of the optimization problem considered in its deterministic version. Section 3 presents an efficient algorithm based on an active constraint set approach to solve the optimal selection of electricity contracts. Section 4 investigates a robust version of the problem in which uncertainties on consumption are taken into account. Section 5 presents a series of computational results obtained, both for the deterministic case and for the robust version of the problem. Section 6 presents the main conclusions and some perspectives for future research.

Mathematical formulation of ECSP
In France, ENEDIS is responsible for the management of 95% of the electricity distribution network. The tariffs for using the electricity grid are referred to as TURPE (see enedis.fr).
Over the year, consumption periods are classified into several predefined categories, where each category may have its own consumption behaviour. Under the 5th version of TURPE, denoted TURPE5, K = 5 categories are considered, namely: winter peak hours (1), winter full hours (2), winter off-peak hours (3), summer full hours (4) and summer off-peak hours (5). Electricity used during peak hours is billed at a higher rate than electricity used during off-peak hours, creating an incentive for homeowners to reduce energy usage during peak hours. Let T be the total number of 10-min periods (e.g. T = 52,560 for a year) and T_i the total number of 10-min periods in category i. Let M = {1, . . . , 12} be the set of all months of the study period and, for all i ∈ {1, . . . , K}, let M_i be the subset of M corresponding to category i. For example, for the winter peak hours category (i = 1), M_1 = {1, 2, 12}, which corresponds to the 3 months of winter. Each 10-min period t belongs to a category i and a month m ∈ M_i, which is noted t ∈ T_i,m, and for each t there is a required consumption level c_t.
Selecting an electricity contract defined by TURPE5 amounts to choosing, for each consumption category i, a value of subscribed power x_i ≥ 0 (in kW). The problem of selecting a best contract thus consists in determining a K-dimensional non-negative vector x = (x_1, . . . , x_K) minimizing the bill due by the consumer. For each category i, the subscribed power x_i represents a consumption threshold, viewed as a commitment on the maximum peak demand for each 10-min period.
There is an annual subscription cost equal to Σ_i s_i·x_i (for given s_i > 0 in €/kW/year); this cost is an increasing linear function of x. Excess demand, called overpowering, is allowed but incurs extra penalty costs. The overpowering quantity for period t is δ_t(x_i) = max(0, c_t − x_i). For each category i and month m ∈ M_i, the penalty cost is p_i·√(Σ_{t∈T_i,m} δ_t(x_i)²) (for given p_i > 0 in €/kWh), which is a decreasing function of x_i. There is also a fixed cost unrelated to subscribed power (e.g. consumption, transport and taxes) which plays no role in our analysis. The objective function of the problem is the total annual cost F(x), defined as the sum of the annual costs F_i(x_i) over the categories i ∈ {1, . . . , K}:

F_i(x_i) = s_i·x_i + p_i · Σ_{m∈M_i} √( Σ_{t∈T_i,m} δ_t(x_i)² )   (1)

(see Fig. 1 for a graphical representation of a function F_i(x_i)).
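As an illustration, the per-category cost F_i can be evaluated directly from the recorded consumption data. The sketch below is ours, not the paper's code; it assumes the monthly penalty takes the Euclidean-norm form suggested by the norm-based convexity argument of Proposition 1, and the function and parameter names are hypothetical:

```python
import math

def category_cost(x_i, s_i, p_i, months):
    """Annual cost F_i(x_i) for one category: a linear subscription cost
    plus, for each month, a penalty proportional to the Euclidean norm of
    the overpowering quantities delta_t(x_i) = max(0, c_t - x_i)."""
    subscription = s_i * x_i
    penalty = 0.0
    for consumptions in months:  # one list of c_t values per month m in M_i
        sq = sum(max(0.0, c - x_i) ** 2 for c in consumptions)
        penalty += math.sqrt(sq)
    return subscription + p_i * penalty
```

Increasing x_i raises the subscription term linearly while shrinking every δ_t, so the penalty term is non-increasing in x_i, in line with the trade-off described above.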
In order to ensure the stability of the network load, TURPE5 imposes that subscribed powers in high consumption categories (e.g. winter peak hours) must be lower than subscribed powers in low consumption categories (e.g. summer off-peak hours). This results in a series of K − 1 constraints of the form x_i ≤ x_{i+1} for all i < K, defining a total order on the x_i values.
Let c_i (resp. c̄_i) be the smallest (resp. greatest) value of consumed power during the time slots in category i. We then denote by c (resp. c̄) the smallest (resp. greatest) consumed power over the whole instance. The x_i variables should thus meet the bound constraints 0 ≤ c ≤ x_i ≤ c̄. Let C = c̄ − c + 1 be the number of integer values among which the optimal solution is searched. This parameter C, together with the number of categories K, will be useful for expressing the worst-case time complexity of our algorithm.
The ECSP can then be formulated as the following mathematical program P:

minimize F(x) = Σ_{i=1}^{K} F_i(x_i)   (2)
subject to x_i ≤ x_{i+1}, i = 1, . . . , K − 1   (3)
c ≤ x_i ≤ c̄, i = 1, . . . , K   (4)

The constraints (3) are called the order constraints and the constraints (4) the bound constraints. A solution of P is a K-dimensional vector x = (x_1, . . . , x_K)^T satisfying constraints (3) and (4). The objective function (2) is separable and convex, as stated in the following Proposition 1.

Proposition 1 Each F_i is convex with respect to x_i.
Proof For all t ∈ {1, . . . , T}, let g_t(x_i) be a univariate convex function such that g_t(x_i) ≥ 0 for all x_i ≥ 0, and let g(x_i) denote the vector (g_1(x_i), . . . , g_T(x_i)). Consider G(x_i) = ||g(x_i)||, where ||·|| denotes the Euclidean norm. For any x_i, y_i ≥ 0 and λ ∈ [0, 1], the convexity of each g_t yields, componentwise,

g_t(λx_i + (1 − λ)y_i) ≤ λg_t(x_i) + (1 − λ)g_t(y_i).

Since the norm is a non-decreasing function on the set of non-negative vectors, it holds that

G(λx_i + (1 − λ)y_i) ≤ ||λg(x_i) + (1 − λ)g(y_i)||.

Now, from the convexity of the Euclidean norm, it follows that

||λg(x_i) + (1 − λ)g(y_i)|| ≤ λG(x_i) + (1 − λ)G(y_i).

Combining the two inequalities, we conclude that G(λx_i + (1 − λ)y_i) ≤ λG(x_i) + (1 − λ)G(y_i), which proves the desired convexity property of G.
For each category i, the subscription cost is linear in x_i and the penalty cost is of the form G above, with g_t = δ_t (each δ_t being convex and non-negative); F_i is therefore convex as a sum of convex functions.

State of the art and motivation
The ECSP can be reformulated as a minimum separable convex cost network optimization problem in a graph featuring K nodes and 2K − 2 arcs (see "Appendix"). Thus, in order to solve P, one might well consider applying some of the existing solution procedures, in particular: -the algorithm proposed in [13,14] which, in our case, would lead to a worst-case complexity O(K³ log C); -the algorithms proposed in [1] and in [11], which would lead to a worst-case complexity O(K² log(K) log(KC)).
For a survey of solution algorithms for convex cost network optimization problems and related nonlinear optimization problems, we refer the reader to [9]. Note that all the previous algorithms use various versions of minimum cost network flow algorithms running on various specific types of graphs. Some of these graphs can be very large (e.g. the number of arcs is KC in the graph considered in [1]) and the construction of these graphs is not taken into account in the time complexity of the algorithms. Moreover, the algorithms in [1,11] need advanced data structures such as dynamic trees to achieve the time complexity O(K² log(K) log(KC)). Without them, the worst-case complexity would remain O(K³ log C).
The ECSP also belongs to the class of convex separable function minimization problems under total order constraints. This class of problems appears in the context of isotonic regression, a well-known problem in statistics. Given K data points c_1, . . . , c_K, the Isotonic Regression Problem (IRP) consists in finding K values x_1, . . . , x_K minimizing the ℓ_p-norm distance between the solution and the data, subject to the order constraints x_1 ≤ · · · ≤ x_K.
Formally, it is formulated as follows:

minimize Σ_{i=1}^{K} |x_i − c_i|^p subject to x_1 ≤ x_2 ≤ · · · ≤ x_K.

The generalized IRP is obtained when the objective function is replaced by any convex separable function. One of the most efficient algorithms for solving the generalized IRP has been proposed by Ahuja and Orlin [2] under the name 'Scaling_PAV'. Assuming that computing (1) at any x in R^K can be done in constant time O(1), its complexity is O(K log(C)). In Scaling_PAV, instead of calculating the exact optimal solution of each F_i(x_i) at each iteration, one only updates the intervals to which the optimal solution should belong. The K intervals are all initially set equal to (c, c̄) and halved after each iteration. As new order constraints are successively detected and saturated, the number of intervals progressively decreases as variables are merged. After log₂(C/ε) iterations, all the intervals have length less than ε, and an ε-optimal solution is found. Since, in each iteration, one works with at most K intervals, the worst-case time complexity of Scaling_PAV is O(K log(C)) (taking ε = 1 for integer-valued data).
Obviously, the Scaling_PAV algorithm could be applied to the ECSP addressed here. However, in this context of application, the evaluation of the resulting complexity has to be completely reconsidered, because careful examination reveals that the computational effort required to compute each value F_i(x_i) involved in the objective function (2) cannot be assumed to be O(1). Indeed, from the expression given in (1), it is readily seen that computing F_i(x_i) for any given value of x_i requires O(N⁺(x_i)) arithmetic operations, where N⁺(x_i) is defined as the number of time steps t in T_i such that c_t > x_i. Thus, for large values of x_i (i.e. values close to max{c_t : t ∈ T_i}), both N⁺(x_i) and the computation time are small. By contrast, for smaller values of x_i, N⁺(x_i) can be of the same order of magnitude as |T_i|, the total number of time steps in the definition of F_i, thus leading to possibly much larger computation times. For further discussion and illustration of this dependence of N⁺(x_i) on x_i, please refer to the forthcoming Sect. 3.2. Clearly, since |T_i| (the number of consumption data in category i) can be much larger than K, the number of decision variables, this parameter turns out to be a key factor to be taken into account in the evaluation of an algorithm for solving P. Thus, in the search for computational efficiency in the solution of ECSP, the dependence of N⁺(x_i) (a measure of the computational effort) with respect to x_i has to be taken into account.
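To make the dependence of the evaluation cost on N⁺(x_i) concrete, here is a small sketch (ours, not part of the paper's algorithm): after sorting the consumption data of a category once and precomputing prefix sums, N⁺(x) and the corresponding sum of squared overpowering quantities can be obtained for any threshold x in O(log |T_i|) time:

```python
from bisect import bisect_right
from itertools import accumulate

def preprocess(c):
    """Sort the consumption data once; prefix sums of c and c^2 then allow
    tail sums above any threshold to be read off in O(log |c|)."""
    c = sorted(c)
    ps = [0.0] + list(accumulate(c))                 # prefix sums of c_t
    ps2 = [0.0] + list(accumulate(v * v for v in c)) # prefix sums of c_t^2
    return c, ps, ps2

def n_plus_and_sq(c_sorted, ps, ps2, x):
    """Return N+(x) = #{t : c_t > x} and sum over those t of (c_t - x)^2,
    expanding (c - x)^2 = c^2 - 2*x*c + x^2 over the tail."""
    k = bisect_right(c_sorted, x)        # number of periods with c_t <= x
    n_plus = len(c_sorted) - k
    sum_c = ps[-1] - ps[k]
    sum_c2 = ps2[-1] - ps2[k]
    return n_plus, sum_c2 - 2.0 * x * sum_c + n_plus * x * x
```

This also illustrates numerically why large x_i are cheap to evaluate: the tail {t : c_t > x} shrinks as x grows.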
In the present paper, we propose a special-purpose algorithm for solving ECSP which turns out to be practically more efficient than Scaling_PAV, thanks to a proper exploitation of the specific features of the problem pointed out above. The proposed algorithm, which will be called Optim_SP, is based on the same scheme of iterative active constraint set detection as Scaling_PAV. However, its implementation makes essential use of an extended asymmetric binary search procedure, thanks to which the computational effort required in each iteration to minimize the relevant part of the objective function (2) can be significantly reduced, as compared with usual binary search. It is also worth mentioning here that this extended binary search procedure could not be applied within the Scaling_PAV algorithm, for the following reason: when running Scaling_PAV, at any given step all the intervals must have the same length, which cannot hold when using the proposed extended binary search.

Optim_SP: an efficient algorithm based on an active constraint set approach
For k ∈ {1, . . . , K − 1}, let the kth order constraint denote the constraint x_k ≤ x_{k+1}. For k ∈ {0, . . . , K − 1}, let P_k denote the relaxation of the problem P where the last K − 1 − k order constraints are relaxed, i.e. the order constraints in P_k are only the first k order constraints x_1 ≤ x_2, . . . , x_k ≤ x_{k+1}. Note that P_0 is the relaxation of P without any order constraints and P_{K−1} is P itself. More generally, for B ⊆ {1, . . . , K − 1}, let P(B) denote the relaxation of P where all the order constraints are relaxed except those indexed in B. In addition, we denote by P_=(B) the restriction of P(B) where the order constraints in B are set to equality, i.e. x_i = x_{i+1} for all i ∈ B. Note that, if B = ∅, then P_=(B) = P_0.

Algorithm's overview
Optim_SP initially solves P_0 and then performs K − 1 iterations, thus solving successively P_1, P_2, . . . , P_{K−1}. Let x^k = (x^k_1, . . . , x^k_K)^T denote the optimal solution of P_k found by Optim_SP and let B_k denote the set of the indices of the active order constraints in P_k associated with x^k, i.e. the order constraints in P_k that x^k satisfies at equality. At each iteration k = 1, . . . , K − 1, Optim_SP starts from x = x^{k−1} and B = B_{k−1}. Then Optim_SP iteratively finds the order constraint of greatest index i in P_k violated by x, updates B ← B ∪ {i} and finds a new x by solving P_=(B). If no such index can be found, then Optim_SP sets x^k ← x and B_k ← B and terminates iteration k.
Optim_SP can be formally stated as Algorithm 1.
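Since Algorithm 1 is not reproduced here, the active-constraint-set scheme described above can be sketched as follows. This is a simplified paraphrase of ours, with a plain symmetric integer binary search standing in for A_Search, and the names `optim_sp` and `argmin_int` are hypothetical:

```python
def argmin_int(f, lo, hi):
    """Exact integer minimizer of a convex univariate function on [lo, hi],
    found by binary search on the discrete slope sign."""
    while lo < hi:
        mid = (lo + hi) // 2
        if f(mid) <= f(mid + 1):   # minimizer is at mid or to its left
            hi = mid
        else:
            lo = mid + 1
    return lo

def optim_sp(F, lo, hi):
    """Active-set sketch: solve P_0 by independent minimizations, then restore
    the order constraints x_1 <= ... <= x_K, fusing the variables of each
    order constraint that becomes active and re-minimizing the fused sum."""
    blocks = [[i] for i in range(len(F))]   # each block shares one value

    def solve(block):
        return argmin_int(lambda x: sum(F[i](x) for i in block), lo, hi)

    vals = [solve(b) for b in blocks]
    i = 0
    while i < len(blocks) - 1:
        if vals[i] > vals[i + 1]:           # violated order constraint: fuse
            blocks[i] += blocks.pop(i + 1)
            vals.pop(i + 1)
            vals[i] = solve(blocks[i])
            if i > 0:
                i -= 1                      # fusion may violate the previous constraint
        else:
            i += 1
    x = [0] * len(F)
    for b, v in zip(blocks, vals):
        for j in b:
            x[j] = v
    return x
```

For instance, with quadratic components F_i(x) = (x − c_i)², the sketch reproduces classical isotonic regression of the c_i values.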

Adjustable binary search for optimizing univariate convex functions F(x)
Each step of Algorithm 1 requires the solution of a problem of univariate convex function minimization over a given interval. To achieve this, we propose an extended binary search procedure called A_Search, formally stated as Algorithm 2.
Please note that the standard binary search procedure corresponds to the choice a = 1 in A_Search. For this value of a, ymid is the middle of the interval (y, ȳ). For a > 1, ymid is no longer the middle of the interval, and we refer to this case as 'asymmetric binary search'. The motivation behind the use of a > 1 stems from the observation that the computational effort O(N⁺_i(x_i)) required to compute the value of one (individually aggregated) component of the objective function as defined in (1) decreases as x_i increases: by placing ymid closer to the upper end ȳ of the interval, the first function evaluations are performed at large values of x_i, where they are cheap. Note that when a > 1 no longer allows the binary search to continue on the smaller subinterval to the right of ymid, we set the value of a to 1 (see line 9 of Algorithm 2) and resume the last iterations with the standard symmetric binary search. With this switch on the value of a, since our binary search scheme performs at least as well as the symmetric binary search scheme on all iterations except perhaps one, it can be checked that the number of iterations in A_Search is at most log₂(ȳ − y) + 1.
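A simplified variant of the asymmetric search idea can be sketched as follows. This paraphrase of ours places the probe point a fraction 1/2^a of the interval length below the upper end, so that early evaluations occur at large (cheap) values of x; it illustrates the bias of the search but does not reproduce the exact probe placement, fallback rule or iteration bound of Algorithm 2:

```python
def a_search(f, lo, hi, a=2):
    """Asymmetric integer search for the minimizer of a convex function f on
    [lo, hi].  With a = 1 this is standard binary search; with a > 1 the probe
    point mid is shifted towards hi, where evaluating f is assumed cheaper."""
    while lo < hi:
        step = max(1, (hi - lo) >> a)   # 1/2**a of the interval, at least 1
        mid = hi - step                 # probe close to the upper end
        if f(mid) <= f(mid + 1):
            hi = mid                    # minimizer is at mid or to its left
        else:
            lo = mid + 1                # minimizer is strictly to the right
    return lo
```

Because f is convex, each comparison of f(mid) with f(mid + 1) safely discards one side of the probe, so the search always converges to the integer minimizer; only the number of iterations depends on a.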

Solving P = (B)
To perform line 2 in Optim_SP, which consists in solving P_=(∅), i.e. solving P_0, we call the A_Search function for minimizing each F_i over [c, c̄]. More generally, for a block I of consecutive indices fused by the equality constraints of P_=(B), A_Search is called to minimize F_I(x) = Σ_{i∈I} F_i(x). The function call returns an integer value yopt, and the update of x (in line 10 of Algorithm 1) is done as follows: x_j is left unchanged for j ∉ I and x_j ← yopt for j ∈ I.

Correctness of OPTIM_SP
A partition J = {J_1, . . . , J_k} of the index set {1, 2, . . . , K} (of the variables) is called a block partition if its elements are subsets (or blocks) of consecutive indices. Given x ∈ R^K, x conforms to J if x_i = x_j for all i and j belonging to the same block of J. The correctness of Optim_SP is based on well-known results in the literature. In particular, we quote here the following lemma (Lemma 2 in [2], which uses results in [5]), appropriately restated to fit our context.

Lemma 1 [2]
If J = {J_1, . . . , J_k} is a block partition of {1, 2, . . . , K} such that there exists an x ∈ R^K conforming to J which is also feasible for P, then x is an optimal solution for P.
Then we can state: The solution x output by Optim_SP is optimal for P.
Proof Since we cannot find a greatest index 1 ≤ i ≤ K − 1 such that x_i > x_{i+1}, x is feasible for P. Let B̄ denote the last value of B after termination of Optim_SP; then B̄ is the set of order constraints that x satisfies at equality. Let us build a block partition J of {1, 2, . . . , K} from B̄ as follows: -For each block of consecutive indices I in B̄, with i and ī respectively the smallest and greatest indices in I, let {i, i + 1, . . . , ī, ī + 1} be a block of J. -For an index 1 ≤ j ≤ K, if neither j nor j − 1 belongs to B̄, let {j} be a singleton block of J. It is then easy to see that x is a feasible solution of P conforming to J. Hence, by Lemma 1, x is optimal for P.

Proposition 2 The worst-case time complexity of Optim_SP is O(K² log(C)), provided that the evaluation of each F_i(x) at a specific point x can be done in O(1).
Proof The worst case occurs when the set B in line 10 of Optim_SP successively contains 1, 2, . . . , K − 1 elements and Optim_SP calls A_Search(F_I(x), x_ī, x_i, a) K − 1 times in line 10, with the sets I (as defined in Sect. 3.3) successively containing 2, 3, . . . , K elements. Hence, the total number of times that the functions F_i(x) (i = 1, . . . , K) are minimized in the executions of A_Search is K(K + 1)/2 − 1. We should also include the minimization of each F_i(x) in the K calls of A_Search for solving P_0 at line 2 of Optim_SP. Hence, overall there are K(K + 1)/2 − 1 + K = O(K²) such minimizations, and in each call to A_Search the function is evaluated at most log₂(C) + 1 times, as pointed out in Sect. 3.2. Hence, the number of evaluations of the functions F_i(x) is O(K² log(C)) in the worst case.

Optim_SP for a robust version of the problem
Assume now that the input data represent forecasted consumption; the actual realisation of electricity consumption can then be expected to differ from the forecast. Deviations from the predicted consumption can create greater overpowering, the consequence being an increase of the electricity bill. In the sequel, we discuss a robust version of the optimal electricity contract selection problem and show how it can still be solved using the Optim_SP algorithm.

Modelling customer demand
Customer demands are often represented as stochastic processes based on influencing factors. These influencing factors are typically external conditions (temperature, climate, …), physical characteristics of the dwelling (type, age, …), devices and occupants (occupation and behaviour, …), and prices and subscriptions [3,12,21]. Works on lighting energy in large office buildings show that consumption can be accurately simulated by considering occupant behaviour and seasonal variations, and can be described using e.g. Poisson and normal distributions [21]. Consumption models are used to predict consumption in the short term (1 day), medium term (3 days), long term (7 days) [19] and very long term (1 year) [3]. Many works are based on precise modelling of each device in the household [10,15,17]. The authors of [8] present an approach based on hidden Markov chains for the modelling and statistical analysis of electricity consumption curves. The "Seasonal and Trend decomposition using Loess" (STL) method [7] decomposes the consumption as:

raw_data(t) = trend(t) + seasonal(t) + remainder(t).
The 'trend' component corresponds to non-stationary long-term evolution of consumption, the 'seasonal' component corresponds to periodical evolution (e.g. period = 1 week), and the remainder (or the residual) variation is what is left over after fitting the model and will be viewed as a realisation of some random process.
In the following, consumption is viewed as a time series in which a sample is collected every 10 min. For each category i, we apply the STL method to decompose the time series; the important parameter for characterizing uncertainty is taken to be the standard deviation σ_i of the remainder. Then, for each 10-min period t ∈ {1, . . . , T_i}, we will typically consider that consumption can deviate from the trend + seasonal forecast by a value represented as an independent realisation of the appropriate probability distribution. In Sect. 5 below, it will be assumed that the residuals are independent realizations of some known probability distribution (typically a truncated normal distribution) with zero mean and known standard deviation.
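The STL decomposition itself is standard; the following toy stand-in of ours (a centred moving average for the trend and periodic means for the seasonal part, whereas the real STL uses Loess smoothing) shows how the three components and the remainder's standard deviation σ_i would be extracted:

```python
import numpy as np

def naive_stl(y, period):
    """Very simplified stand-in for STL: centred moving average as the trend,
    periodic means of the detrended series as the seasonal component, and the
    remainder as what is left over (real STL uses Loess smoothing instead)."""
    y = np.asarray(y, dtype=float)
    kernel = np.ones(period) / period
    trend = np.convolve(y, kernel, mode="same")          # moving-average trend
    detrended = y - trend
    one_cycle = np.array([detrended[i::period].mean() for i in range(period)])
    seasonal = np.resize(one_cycle, y.size)              # tile over the series
    remainder = y - trend - seasonal
    return trend, seasonal, remainder
```

The uncertainty parameter of a category would then be read off as `sigma_i = remainder.std()`.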

Robust problem formulation
We propose here to define a robust version of the problem using a concept of uncertainty set similar to the one introduced in [4]. Following the latter, the uncertainty set corresponding to ν uncertain parameters μ_1, . . . , μ_ν, each taking bounded values between 0 and β, and having a very small probability of all simultaneously taking their maximum value β, would typically be the polyhedron defined by the inequalities 0 ≤ μ_i ≤ β and Σ_{i=1}^{ν} μ_i ≤ Γβ, where Γβ is the so-called budget of uncertainty. Increasing Γ increases the range of possible scenarios against which robustness is to be achieved, thus improving the robustness of the solutions obtained, but at the expense of increasing the solution cost.
For each category i, let v = (v_1, . . . , v_{|T_i|}) be the vector of possible deviations, where deviations are typically bounded as −b_t ≤ v_t ≤ b_t. However, in view of the max operator in the expression of the penalty cost, negative deviations cannot increase the objective function value. We consider that, for a given category i, all b_t (t ∈ T_i) are equal, and we denote by b_i this common value. Finally, we consider that each deviation is bounded as 0 ≤ v_t ≤ b_i, where b_i is chosen to be of the same order of magnitude as the standard deviation σ_i (typically b_i = r·σ_i with r chosen between 2 and 3).
In line with [4], denoting by B_i the budget of uncertainty for category i, we allow a total deviation at most equal to B_i; thus the deviation v has to satisfy the constraint Σ_{t=1}^{|T_i|} v_t ≤ B_i. As an indication about how to choose B_i, consider n realisations of a random variable normally distributed with zero mean and standard deviation σ_i. The sum of those n realisations has variance σ² = nσ_i², i.e. its standard deviation is σ = √n·σ_i. In line with this remark, taking V_i to be an integer value of the same order as √T_i and setting B_i = V_i·b_i, we define the uncertainty set U_i as:

U_i = { v : 0 ≤ v_t ≤ b_i for all t ∈ T_i, Σ_{t∈T_i} v_t ≤ V_i·b_i }.

For the robust version of the problem, the overpowering quantity for period t becomes δ_t(x_i, v) = max(0, c_t + v_t − x_i). For given x_i, the worst-case penalty cost for category i is obtained as the maximum value over the uncertainty set U_i of the penalty cost function:

Π^r_i(x_i) = max_{v∈U_i} Σ_{m∈M_i} √( Σ_{t∈T_i,m} δ_t(x_i, v)² )   (5)

The objective function for the robust formulation is F^r(x) = Σ_i F^r_i(x_i) = Σ_i [ s_i·x_i + p_i·Π^r_i(x_i) ], and the robust version of ECSP consists in minimizing F^r(x) subject to the order constraints (3) and the bound constraints c ≤ x_i ≤ c̄ + b_i. We note that, in the above formulation, the upper bounds on the variables x_i have been changed to take into account the possible increase of the maximum consumption by b_i for category i.
Proposition 3 For each category i, the worst-case penalty cost Π^r_i is convex in x_i.

Proof For any fixed v ∈ U_i, we know from Proposition 1 that the function x_i ↦ Σ_{m∈M_i} √(Σ_{t∈T_i,m} δ_t(x_i, v)²) is convex in x_i. Now, since the maximum value in (5) is attained over the (finite) set of extreme points of U_i, Π^r_i is recognized as the pointwise maximum of a finite collection of functions, each of which is convex in x_i. This proves the claimed result.

Using OPTIM_SP to solve the robust version of the problem
Since the robust version of the optimum contract selection problem has the same structure as the deterministic version, the Optim_SP algorithm turns out to be readily applicable to it. However, in each call to the A_Search function, for a given category i and a given value x_i, whenever we have to evaluate F^r_i(x_i), we now need to solve the sub-problem (5).

Proposition 4 For a given x_i, any optimal solution to the sub-problem (5) features V_i components equal to b_i, all the other components being 0.
Proof For a category i, the objective of the sub-problem is to maximize a convex function with respect to v, and it is a well-known fact that the maximum is reached at an extreme point of the polyhedron representing the uncertainty set U_i. Now, we just have to observe that, since V_i has been assumed to be an integer value, each extreme point of the polyhedron U_i features exactly V_i non-zero components, equal to their upper bound value b_i.
For category i and given x_i, according to Proposition 4 the maximization sub-problem (5) is solved once the V_i uncertain parameters that maximize the objective function are found, i.e. we need to find how to best dispatch those V_i deviations over the months in M_i. This optimum dispatch problem (5) can be cast into a Multiple-Choice Knapsack Problem (MCKP) with integer variables which, according to [20], can be solved in worst-case time O(N), where N = V_i·|M_i| is the number of items. Let us explain how this MCKP is formulated.
Let z_{m,n} = 1 if n ∈ {0, . . . , V_i} deviations are assigned to month m ∈ M_i, and z_{m,n} = 0 otherwise. Let T^n_{i,m} denote the subset of the periods t ∈ T_i,m achieving the n highest values of c_t.
Proposition 5 For given x_i, month m ∈ M_i and a number n of deviations assigned to month m, the worst-case penalty cost for month m is obtained by setting v_t = b_i for the n periods in T^n_{i,m}.

Proof For a given category i and month m, the penalty cost function is maximized by assigning the maximum deviation b_i to the periods t with the highest non-zero δ_t values; i.e. when exactly n deviation parameters take their upper bound value b_i, the maximum cost is obtained by setting v_t = b_i for the n periods of T^n_{i,m}.

According to Proposition 5, when n parameters are set equal to their upper bound b_i, the partial objective cost for month m is:

u_{m,n} = √( Σ_{t∈T^n_{i,m}} max(0, c_t + b_i − x_i)² + Σ_{t∈T_i,m \ T^n_{i,m}} δ_t(x_i)² ).

The MCKP to be solved in order to determine the worst-case cost is then:

maximize Σ_{m∈M_i} Σ_{n=0}^{V_i} u_{m,n}·z_{m,n}
subject to Σ_{n=0}^{V_i} z_{m,n} = 1 for all m ∈ M_i; Σ_{m∈M_i} Σ_{n=0}^{V_i} n·z_{m,n} ≤ V_i; z_{m,n} ∈ {0, 1}.

Recall that M_i ⊆ M and V_i is an integer value of the same order of magnitude as √T_i (≤ √T); thus, for a given x_i, determining the optimal value of (5) takes O(|M|·√T) worst-case time. Therefore we can state:

Proposition 6
The worst-case time complexity of Optim_SP applied to the robust version of the optimum contract selection problem is O(K²·|M|·√T·log C).
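For illustration, the MCKP of Sect. 4.3 can also be solved by a generic dynamic program over months. This sketch of ours runs in O(|M_i|·V_i²), so it is slower than the specialised O(N) method of [20], but it is simple; the profits u[m][n] (one list of V + 1 values per month) are assumed to be precomputed:

```python
def mckp_worst_case(u, V):
    """Generic DP for the multiple-choice knapsack: pick exactly one option
    n in {0..V} per month, with total sum of n at most V, maximizing the sum
    of the chosen profits u[m][n]."""
    NEG = float("-inf")
    best = [0.0] + [NEG] * V          # best[k] = max profit using k deviations
    for month in u:                   # month = [u[m][0], ..., u[m][V]]
        new = [NEG] * (V + 1)
        for k in range(V + 1):
            if best[k] == NEG:
                continue
            for n in range(V + 1 - k):        # assign n deviations to this month
                cand = best[k] + month[n]
                if cand > new[k + n]:
                    new[k + n] = cand
        best = new
    return max(best)
```

For example, with two months whose profit lists are [0, 5, 6] and [0, 4, 9] and a budget V = 2, the best dispatch puts both deviations in the second month.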

Experiments on the deterministic version of ECSP
This section is devoted to a detailed computational study of the Optim_SP algorithm on a series of typical real instances of the ECSP problem. For each instance, the evolution of CPU time as a function of the value chosen for the a parameter is analyzed. The results obtained are also compared against those which would be obtained by using the Scaling_PAV algorithm on the same data. In our computational experiments, we consider 5 data sets referred to as D_1 to D_5. Each data set is defined by specifying the 52,560 values of electricity consumption of the customer under consideration, recorded for every 10-min period along a full year (2017). This set of values is decomposed into 5 subsets (time categories) T_1 to T_5, and each T_i is in turn decomposed into several subsets corresponding to the relevant months during which consumption has been observed (these subsets are denoted T_i,m for m in M_i). Data set D_1 corresponds to a big factory in the food industry with subscribed power typically in the range 5000-7500 kW; data set D_2 corresponds to a company providing maintenance of transportation equipment with subscribed power typically in the range 1500-2500 kW; data set D_3 corresponds to an industrial bakery with subscribed power typically in the range 1000-1500 kW; D_4 and D_5 correspond to two big hotels with subscribed power typically in the range 500-1500 kW.
The main characteristics of each data set are shown in Table 1, namely: -for each time category i, the minimum and maximum values of c_t for t ∈ T_i; -for each time category i, the minimizer x⁰_i of F_i(x_i), the component of the objective function corresponding to time category i; it is observed that in all cases the resulting 5 values do not meet the order constraints imposed in ECSP; -the five components x*_1, . . . , x*_5 of the optimum solution x* to the ECSP problem, which can be observed to satisfy the order constraints x_i ≤ x_{i+1} (i = 1, . . . , 4). Table 2 displays the computational results obtained with Scaling_PAV (first line of the table, labelled 'sPAV') and Optim_SP (in the following lines, for values of a ranging from 1 to 8) for the various instances corresponding to the data sets D_1 to D_5. For each instance, the column labelled '#evalF' provides the number of calls to the A_Search procedure; the column labelled 'Total_N⁺' provides the evaluation of the resulting total number of arithmetic operations, as measured using the N⁺_i(x_i) values; the column labelled 'cpu' indicates the resulting total cpu time (in milliseconds).
All tests reported have been carried out using Python 3.6.8 on an 8-thread quad-core processor with 16 GB RAM and a 2.8 GHz CPU running Windows 10 (64 bits). The results shown in Table 2 call for the following comments: -Optim_SP achieves improved efficiency as compared with one of the best previously known algorithms for solving problems of minimizing a convex separable function under order constraints; -the performance of Optim_SP varies only moderately with the a parameter, which shows that Optim_SP features a good deal of robustness with respect to a, and that the choice of a particular value for a is not a critical issue: a practical consequence is that a good value for a can easily be obtained on an experimental basis, by observing the behaviour of the Optim_SP algorithm on a sample of a few typical instances of the problem for a few values of a in the range (2, 8); -if, for each result shown in Table 2, one computes the ratio cpu/Total_N⁺, it is observed that all the values obtained typically range between 5e−4 and 7e−4, which shows that cpu is, up to small fluctuations, proportional to Total_N⁺. This confirms the relevance of our analysis based on the use of Total_N⁺ as a measure of computational effort.

Results with OPTIM_SP on the robust version of ECSP
This section presents various computational experiments carried out on the robust version of ECSP.

Analysis of uncertainties through STL decomposition
Information on the categories and on the standard deviations (σ_i) of the remainder obtained by applying STL decomposition to the data set D1 is shown in Table 3 (similar results would be observed for the other data sets). For example, the winter peak hours cover 3 months, from December to February, with |T_i| = 1488 10-min periods, and the standard deviation of the remainder is σ_i = 116. Statistical tests on the remainders reveal that they are uncorrelated, have zero mean and are normally distributed; a truncated normal distribution between −3σ_i and 3σ_i therefore gives a correct representation of the deviation of consumption. In our experiments, the robustness parameter b_i is chosen as 3σ_i (r = 3), and different values of the budget parameter q ∈ {0, 0.2, 0.5, 1} are considered. Notice that for q = 0 the deterministic version of the problem is obtained. For the summer off-peak hours, the different values of q considered correspond to the values 0, 26, 65 and 130 for V_i. Table 4 provides comparative information about the costs of various optimal or sub-optimal solutions to both the deterministic version (q = 0) and the robust version (q > 0) of ECSP for data set D1. Table 4 decomposes into three parts:

Numerical results concerning the robust version for the data set D 1
- the first part of the table concerns the contract currently used by the company (a big factory in the food industry); the corresponding solution, (6200, 7000, 7000, 7300, 7300), is called the reference solution; the deterministic cost of this reference solution, as well as its costs in terms of the robust objective function for q = 0.2, 0.5, 1, are provided;
- the second part concerns the optimal solutions to the deterministic ECSP (for q = 0) and to the robust versions of ECSP (for q = 0.2, 0.5, 1);
- the third part provides:

The results in Table 4 suggest the following comments:

(i) the comparison of the costs of the optimal solutions (part 2 of the table) with the costs of the reference solution shows that the latter is suboptimal in all cases: the difference is 4.2% for q = 0, 0.4% for q = 0.2, and 0.5% for q = 0.5 and q = 1. The reference solution thus turns out to be a fairly good approximation (to within 0.5% or so) of the optimal robust solutions for all the (strictly positive) values of q considered.

(ii) the comparison between the last three lines of part 2 and the last three lines of part 3 shows the benefit provided by the optimal robust solution over the optimal deterministic solution in the presence of uncertainty: when q = 0.2, the optimal robust objective function value is 113,750, whereas the robust objective function value of the optimal deterministic solution is 118,212, showing that the former leads to an improvement in cost of 3.8% over the latter. The same comparison for q = 0.5 and q = 1 leads to improvements of 6.3% and 7.6% respectively.

(iii) the comparison between the first line of part 2 of the table and the first line of part 3 shows that the difference between the deterministic objective function value of the optimal robust solution for q = 0.5 and q = 1 and the optimal deterministic solution value (111,296 − 107,576 = 3720) represents an increase of 3.3%. This value can be interpreted as the price of robustness.
(iv) the cpu times required for solving the robust versions of ECSP are about 100 times longer than for solving the deterministic version; however, they typically do not exceed 2 or 3 seconds, which is quite acceptable from the point of view of practical applicability.
To conclude, let us point out that, although the results in Table 4 only concern the instance D1, quite similar conclusions would be obtained by analyzing the other data sets D2, . . . , D5.

Computational efficiency of OPTIM_SP on the robust ECSP
In addition to the results discussed in Sect. 5.2.2, we provide in Table 5 results showing the computational efficiency of Optim_SP applied to the robust version of ECSP, in a form similar to Table 2. In these experiments, the value q = 0.5 has been taken. The main comments suggested by the results shown in Table 5 are the following:

- since the computational effort required to evaluate each component of the objective function of the robust problem is significantly bigger than in the deterministic case, it is hardly surprising to observe significantly increased cpu values; on average, they can be seen to be multiplied by a factor in the range 50–100;
- concerning the influence of varying the parameter a in the range a = 1 to a = 8, a behavior similar to the one observed in the deterministic case can be noticed: for all instances (except D5), Optim_SP outperforms Scaling_PAV for a wide range of values of a;
- the relationship between cpu and the Total_N+ measure of computational effort, which was almost linear in the deterministic case, is more intricate in the case of the robust ECSP; this is mainly due to the significant overhead induced by the need of repeatedly solving the MCKP discussed in Sect. 4.3.
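To give a flavor of why evaluating the robust objective is costlier than the deterministic one, the sketch below computes a worst-case overpower penalty for one time category under a deliberately simplified additive model: each period's consumption may rise by b, for at most V periods (budgeted uncertainty). In this simplified model the worst case is obtained greedily; the paper's full bill structure instead requires solving the MCKP of Sect. 4.3. The cost model and names here are illustrative assumptions, not the paper's formulation:

```python
def worst_case_penalty(c, x, b, V, pen=1.0):
    """Worst-case overpower penalty for one time category, assuming a
    purely additive per-period penalty pen * max(0, c_t - x) and at most
    V periods deviating upward by b.  Since per-period penalties add up
    independently, the worst case simply raises the V periods with the
    largest marginal penalty increase."""
    base = sum(pen * max(0.0, ct - x) for ct in c)
    gains = sorted((pen * (max(0.0, ct + b - x) - max(0.0, ct - x)) for ct in c),
                   reverse=True)
    return base + sum(gains[:V])
```

Even in this toy model, each evaluation scans and sorts the whole category, which hints at why the robust cpu times grow by a large factor.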

Conclusion and future work
This paper proposes an efficient polynomial-time algorithm based on an active constraint set approach for the electricity contract selection problem (ECSP). This problem is formulated as minimizing a convex separable function subject to total order constraints. The computational results obtained show that the algorithm features improved CPU time as compared with Scaling_PAV [2], one of the best existing methods for this problem. The robust version of the problem has also been investigated. The construction of an uncertainty set representing realistic scenarios of deviation between realized consumption and forecasted values has been proposed, based on the so-called Seasonal and Trend decomposition method (STL). It has been shown that the robust version of ECSP can be efficiently solved using the same algorithm as for the deterministic version. However, taking uncertainty into account requires computing worst-case values of the objective function. The latter problem is reduced to solving a multiple-choice knapsack problem, which is polynomially solvable.
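For illustration, a textbook dynamic program for the multiple-choice knapsack problem (pick exactly one item per class under a weight capacity) is sketched below. Note that this generic DP is pseudo-polynomial, whereas the specific MCKP instances arising in the worst-case computation are, as stated above, polynomially solvable; the sketch only illustrates the problem structure:

```python
def mckp(classes, capacity):
    """Multiple-choice knapsack: pick exactly one (weight, profit) item
    per class so that total weight <= capacity, maximizing total profit.
    Dynamic program over classes and capacity; returns the best profit,
    or None if no feasible selection exists."""
    NEG = float("-inf")
    dp = [0.0] * (capacity + 1)  # best profit before any class is processed
    for items in classes:        # items: list of (weight, profit) pairs
        new = [NEG] * (capacity + 1)
        for w in range(capacity + 1):
            for wi, pi in items:
                if wi <= w and dp[w - wi] > NEG:
                    new[w] = max(new[w], dp[w - wi] + pi)
        dp = new
    return dp[capacity] if dp[capacity] > NEG else None
```

For example, with two classes `[[(1, 2), (2, 3)], [(1, 1), (3, 5)]]` and capacity 3, the best feasible choice is (2, 3) plus (1, 1), for a profit of 4.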
An interesting direction for future research will be to generalize the robustness approach. In the present paper, it has been assumed that each consumption of a given category i can only deviate by a value b_i which is the same for all the time periods in the category. The next step is to carry out a more detailed evaluation of the σ values within each category, since the standard deviation may differ from one month to the next (e.g. the consumption during summer off-peak hours in June may be quite different from the one in September). Another interesting subject for future investigation would be to use more accurate stochastic models, possibly capturing statistical dependence from one time instant to the next, to better represent uncertainties.

Table 5 Computational results on the robust version of ECSP (q = 0.5) for the data sets D1, …, D5

- the set of secondary arcs is composed of K − 1 arcs of the form (1, j) for j = 2, . . . , K.
The flow value s_j on each of these arcs has to meet the bound constraints 0 ≤ s_j ≤ C, and the associated cost function is identically 0.

Problem P then reduces to determining a minimum cost circulation on the above graph. We observe that the flow values s_j on the secondary arcs play the role of slack variables for the constraints x_{j−1} ≤ x_j for j = 2, . . . , K.
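The slack-variable observation can be made concrete with a tiny sketch (helper names are ours): the chain x_1 ≤ . . . ≤ x_K holds exactly when every slack s_j = x_j − x_{j−1} is nonnegative, which is what the bound constraints on the secondary arcs enforce in the circulation formulation:

```python
def chain_slacks(x):
    """Slack values s_j = x_j - x_{j-1}, j = 2..K, one per secondary arc."""
    return [x[j] - x[j - 1] for j in range(1, len(x))]

def chain_feasible(x):
    """The chain constraints x_1 <= ... <= x_K hold iff all slacks are >= 0."""
    return all(s >= 0 for s in chain_slacks(x))
```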