Introduction

We consider the following situation. In a sufficiently well-defined decision context, a decision process has been started. A manager and/or a ‘task force’ team, called the decision maker, is confronted with the problem of making ‘the best possible’ decision. In view of this challenging task, the decision maker has appointed a team of consultants, called the analyst, which is expected to clarify the decision situation and to provide some recommendations. We suppose that:

  • The goal of the decision process, as well as possible ways of achieving it, has been discussed sufficiently well to define a set of potential actions (alternatives).

  • The need to account for multiple and somehow conflicting points of view has been recognized and, based on this, a family of criteria or attributes has been outlined.

  • The analyst has acquired a good knowledge of the decision context, and of the possibilities of interactions she may have with the decision maker (or his representative), and other stakeholders, in order to get all necessary information.

Consequently, we suppose that the analyst has arrived at the stage of reflection where she is about to choose the most suitable multicriteria method to be used within the decision process. In our opinion, this method should be seen as a tool for going deeper into the decision problem, for exploring various possibilities, interpreting them, debating and arguing, rather than a tool able to make the decision. We suppose further that the model of preferences used by the method is, at least partially, co-constructed through interaction between the analyst and the decision maker (or his representative). This co-construction should account for the consequences on which the actions will be judged and for value systems related to the decision context. It should also help the analyst to formulate and express the working hypotheses on which her advice will be based. Moreover, once the method has been chosen, the analyst will collaborate with the decision maker (or his representative) to specify certain characteristics (notably the values attributed to the different parameters) of the preference model that the method requires.

It follows from the above that, in our understanding, the method to be chosen is not expected to discover any good approximation of an objectively best decision, taking into account a pre-existing preference system of the decision maker, but rather more modestly, it is expected to provide the decision maker with results that follow from an adopted way of reasoning, consistent with the working hypotheses (Roy 1999, 2010).

This paper presents questions to guide the choice of a multicriteria decision aiding method. These questions are presented in a hierarchical order. In our opinion, answering these questions in this order may help the analyst choose the most appropriate method for the decision context. In “A number of actual decision contexts”, we present a number of such contexts to illustrate the diversity of decision situations and to demonstrate further how they condition the analyst’s answers to subsequent questions. The first of these questions, which seems to us to be crucial, is analyzed in “A crucial question conditioning the choice of the method by the analyst”. “Five other key questions to choose the right method” presents the first series of key questions, which have to be tackled in relation to the first one. In “Secondary questions”, we present another series of questions that may be addressed to the analyst, which, although realistic, seem to be less general than the previous ones. In “Conclusions”, we highlight two difficulties that the analyst can face in her choice of a multicriteria decision aiding method.

It should be emphasized that we do not aim at providing the reader with strict conditions under which such or such method is the most appropriate, but rather at providing guidelines facilitating the choice of a multicriteria decision aiding method well adapted to the specific context of a case study.

A number of actual decision contexts

Here, we briefly present a number of actual decision contexts to which we will refer later on. Most of these concern a real case that one of us has studied. Note that the list of contexts presented below does not claim to be representative of the use of MCDA methodology in real world decision problems: it only aims at helping the reader to better understand the arguments that are put forward in the second part of the paper.

On a first reading, this section can be skipped. The reader will be referred back to each of these contexts in the following sections.

Context no. 1: Commuter rail line. In a large urbanized area, the policy makers have decided to improve the public transportation system. To increase access to different zones, especially for residents of a growing suburb, while using the existing infrastructure as much as possible, the decision to build a rail link (tramway) between the suburb and the employment zone has been taken. Providing such a service does not present any major problem in terms of the line’s layout. The difficulties arise, rather, from design issues related to the number and location of stations as well as capacity-related features. A dozen variants have already been considered, and all the interested parties have already agreed on five general objectives: (1) minimize investment and operating costs, (2) minimize access time to the stations and line haul times along the rail line, (3) improve the compatibility among urban development, employment, and the transportation system, (4) maximize the well-being of the transport users (increase comfort, safety, etc.), (5) avoid environmental disruption as much as possible. More details about this problem context can be found in example 2 of Roy (1996) and in Labbouz et al. (2008).

Context no. 2: Siting of a nuclear power-plant. The public power supply authority of a country planned to build a nuclear power-plant on the coast. At the time the consultants’ bureau was called to work on this problem, nine potential sites for a technically feasible placement of the power-plant had already been identified. In order to evaluate and compare these sites, six points of view were chosen: (1) the health and security of the population in the surrounding region, (2) the loss of salmonids in streams absorbing the heat from the power-plant, (3) the biological effects on the surrounding region, (4) the socio-economic impact of the installation, (5) the esthetic impact of the power lines, and (6) the investment and operating costs of the power-plant. Further details may be found in Keeney and Nair (1976), Keeney and Robillard (1977), and Roy and Bouyssou (1986).

Context no. 3: Location of a municipal waste incineration plant. Swiss federal law charges the cantons with responsibility for installing depolluting plants, in particular, municipal waste incineration plants (MWIP). To encourage cantons to build these plants, the Swiss government gives them a subsidy for installing an MWIP. This resulted in an overcapacity of MWIPs at the country level. In consequence, when at the end of the 1990s two neighboring cantons, Vaud and Fribourg, applied to the government for a subsidy for building their own MWIP, the government asked them to co-ordinate their projects and consider using overcapacity from the neighboring cantons, or to extend the MWIP in Vaud instead of building two new MWIPs. To work out a recommendation for the consensus decision, a body representing four stakeholders was established. These were: the Swiss Federal Agency for the Environment, Forests and Landscape, the cantonal environmental offices of Vaud and Fribourg, and a representative of an MWIP in Geneva having a large overcapacity. This body invited help from two facilitators (the analyst) from the Federal Polytechnic School of Lausanne (D. Bollinger and R. Słowiński). Five basic scenarios were considered:

  • no new MWIP is built, and the waste of Vaud and Fribourg is distributed over existing plants,

  • two MWIPs are built, one in Vaud and another in Fribourg,

  • one MWIP is built, in Vaud only,

  • one MWIP is built, in Fribourg only,

  • one MWIP is built, at the frontier of Vaud and Fribourg.

Considering different ways of distributing the waste, and a possible extension of the existing MWIP in Vaud (TRIDEL in Lausanne), the ‘task force’ team constructed 17 potential actions. The construction of these potential actions was followed by construction of a consistent family of criteria. This family included 20 criteria, covering diversified aspects such as: ecology (3), economy (5), organization (4), law (3), and psychology (5). One can notice a strongly non-compensatory character of such heterogeneous criteria. Moreover, the four stakeholders had different views on the relative importance of each criterion. The construction of actions and criteria required a substantial amount of work by the body of the four stakeholders helped by the facilitators and by an engineers’ office employed for this purpose. More details can be found in Bollinger et al. (1997) and Maystre and Bollinger (1999).

Context no. 4: Line extension of the Paris metro. Due to the significant development of metropolitan Paris, the metro lines set up in the past needed to be extended. At the end of the 1980s, the Régie Autonome des Transports Parisiens (RATP) elaborated 12 new metro extension projects into the suburbs, which added up to 42.6 km, and consisted of the construction of 38 new stations, doubling the number of stations in the suburbs. The decision about a priority ranking of these projects was not within the purview of the RATP: it was the responsibility of the national and regional agencies in charge of planning and programming the infrastructure. RATP was chartered, however, to elaborate all the technical, financial, economic and social considerations necessary for pointing out all the priorities which may validly be set, independently of any value system pertaining to RATP. To clarify the decisions concerning the time-dependent construction of the various segments of such line extensions, the department of Operational Research of the RATP was in charge of their evaluation and, to the extent possible, of their classification, taking into account the following six criteria, each of which reflects a specific point of view of one of the stakeholders: (1) the number of residents and jobs served by the project, (2) the number of daily passengers entering the stations on the line concerned, (3) the capital cost of the project per 1 km of line, (4) the internal rate of return, (5) the advisability of the extension with respect to the general organization of the transit systems in the considered sector, and (6) the structuring effects on urban development. In order to fix a performance on a particular criterion for each extension, it was necessary to rely on many estimations of the population affected, cost elements, value of time and discount rate. One also had to adopt prospective views on such issues as urban development and the behavior of public transport users. It is thus evident that the validity of any such assessment was subject to a margin of error. A report of this study can be found in Roy and Hugonnard (1982).

Context no. 5: Supplier selection. A postal company sought to equip its regional centers with parcel sorting machines. Toward this end, it announced an international tender for commissioning a prototype (it reckoned that this type of machine was not yet quite ready). This tender consisted of specifications and an upper bound on the cost of the basic version of the machine. The selected supplier was to receive the order for supplying all sorting centers. Nine responses to the tender were preselected and evaluated on a set of 12 qualitative and quantitative criteria, such as: quality of workstations, noise pollution, capital cost, operating cost, sorting speed, ease of use, maintenance cost, ease of installation on the spot, possibility of sorting bar-coded parcels, quality of service, confidence in the supplier (Roy and Bouyssou 1993, chapter 8). The quantitative criteria involved very heterogeneous scales. A consulting firm was appointed to help the decision maker, which was a board composed of four company directors: technical, financial, commercial, and human resources. The board requested that the consulting firm carefully take into account the extent to which:

  1. The evaluation of responses on some criteria was partially uncertain or even arbitrary.

  2. The four directors had different views on the relative importance of each criterion.

Context no. 6: Responses to tenders. A big company carries out a considerable part of its research and innovation work by replying to, and winning, some of the tenders it regularly receives. Replying to a tender requires, however, many months of work, and sometimes calls for starting up research which has high material costs. For this reason, every week, a committee chaired by the sales manager in charge of the tender budget examines the “new business” files received. Each of these files comes from a service within the company, which has suggested replying to a tender it has received. Each of these is treated as a potential action. The committee must decide for each of them whether to accept or refuse and, in the case of acceptance, how much money to allocate to the service in question for developing a response on behalf of the company. Using information in the file, the reply proposal is evaluated with regard to nine criteria that span three structure-providing points of view: (1) chances of being awarded the contract, (2) strategic interest for the company, and (3) economic interest. In order to organize the decision process, the sales manager has asked the research unit of the company to develop a software tool which makes it possible to sort the “new business” files every week and to assign each file to one of the following four categories:

  • no restriction concerning acceptance,

  • some hesitation concerning acceptance (a doubtful “yes”),

  • hesitation concerning refusal (a doubtful “no”),

  • unhesitating refusal.

To foster a debate within the committee, each of its members gets information about the nine evaluations of each file, and about the category to which each of them was assigned. Any member of the committee may query any item of the information leading to the assignment (assessment of certain risks, allocation of resources other than those requested, etc.). When some items of information are thus modified, new evaluations are immediately recalculated and the resulting reassignment is made known to committee members.

Context no. 7: Management of highway assets. A road network is managed by a central agency that needs to coordinate and control the activities of many local districts spread over a wide geographic area. Considering that the funds required to satisfy the needs of maintenance usually exceed the available resources, the decision about annual budget allocation for routine maintenance is crucial to achieve the best possible efficiency of the road network. In this decision context, there are many decision levels and a hierarchy of stakeholders with different perspectives and specific objectives. The large amount of data makes the decision process very complex. Thus, highway agencies need tools for the coordinated management of their assets that allow interactions between the stakeholders and the analyst in the course of the allocation of available funds according to their preferences.

A case study referring to this context is described by Augeri et al. (2011). Specifically, this study concerns the distribution of maintenance resources from a central administration to regional districts and takes into account local agencies’ maintenance needs and the central authority’s goals in a short-term planning period. A network composed of a number of road sections belonging to an Italian highway agency was used as a pilot study for the proposed methodology. The considered maintenance activity concerned the pavement. The family of criteria was composed of 11 very heterogeneous criteria representing the following aspects of evaluation of the road sections:

  • type of distress recorded during the periodical survey,

  • geometric characteristics of the road section,

  • road functional class,

  • intensity of traffic,

  • accident rate.

The road sections were sorted by road experts into four categories of urgency. The task was to develop an intelligible decision model that would reproduce the expert decisions on the road sections and would facilitate prospective decisions about the degree of urgency of pavement maintenance on new road sections.

Context no. 8: Programming of water supply systems for rural areas. The construction of a new water supply system is usually preceded by regional planning which accounts for long-term water resources management and the design of water supply installations. The intermediate stage between planning and designing is called the programming stage. At this stage, a medium-term decision problem is to be considered which involves socio-economic criteria on the one hand and technical criteria on the other hand. In Poland, at the beginning of the 1990s, some rural areas, in particular in the eastern part of the country, did not have water supply systems (WSS). Regional agencies of rural investments then faced a hard decision problem, that is, how to program the construction of a WSS in a given area, so as to connect water users (understood as topographically compact groups of receivers, like villages, big farms or food processing plants) in a priority order respecting the urgency of their needs. Roy et al. (1992) consider a case study in which this complex task has been decomposed into two problems:

  (a) setting up a priority order in which water users should be connected to a new WSS, taking into account economic, agricultural, and sociological consequences of the investment,

  (b) choosing the best variant of technical construction of the regional WSS evaluated from technical and economic viewpoints and from the viewpoint of relationship with the priority order of users coming from problem (a).

The study took into account both purely technical and socio-economic aspects of WSS programming in the form of distinct criteria. A single WSS typically concerns a set of 20–40 users. In problem (a), they were evaluated with respect to the following criteria: water deficiency, farm production potential, function and activity of the user, structure of settlements, water demand, share of water supply installations in all investments concerning the user, possibility of connecting the user to another existing WSS. The data used to calculate user performances on these heterogeneous criteria were affected by some imprecision and indetermination. As to problem (b), there can be hundreds of technical variants, because it involves different types and locations of water sources, types and capacities of system components, and feasible structures of the distribution network. Each feasible variant was characterized by four criteria: (1) investment and (2) operating costs, (3) reliability and (4) a distance between the socio-economic priority order of users, and the precedence order of users connected to the WSS constructed according to a given variant. The last criterion played a coordinating role between both problems of the programming task. The recommended variant was the one which ensured the best compromise between the four criteria.
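The coordinating criterion (4) measures how far the order in which a technical variant connects the users departs from the socio-economic priority order. The text above does not say which distance was used in the case study, so the sketch below assumes, purely for illustration, the Kendall tau distance (the number of pairs of users that the two orders rank in opposite ways). The function name and the user labels are hypothetical.

```python
from itertools import combinations


def kendall_tau_distance(priority_order, connection_order):
    """Count the pairs of users placed in opposite relative order by the two lists.

    Assumption: both lists are complete orders, without ties, of the same users.
    """
    if sorted(priority_order) != sorted(connection_order):
        raise ValueError("both orders must contain exactly the same users")
    if len(set(priority_order)) != len(priority_order):
        raise ValueError("an order must not contain a user twice")

    # Position of each user in the order induced by the technical variant.
    position = {user: rank for rank, user in enumerate(connection_order)}

    discordant = 0
    for first, second in combinations(priority_order, 2):
        # 'first' has priority over 'second'; the pair is discordant
        # when the variant connects 'second' before 'first'.
        if position[first] > position[second]:
            discordant += 1
    return discordant


# Hypothetical example: four users, one variant that swaps the first two.
priority = ["U1", "U2", "U3", "U4"]
variant = ["U2", "U1", "U3", "U4"]
print(kendall_tau_distance(priority, variant))
```

A distance of zero would mean that the variant connects the users exactly in the priority order coming from problem (a); larger values indicate a stronger disagreement between the two problems.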

A similar problem was considered by Słowiński (1986), although with respect to the development planning of a jointly operated urban water supply and wastewater treatment system over a 20-year planning horizon. This problem was formulated as a multiobjective LP problem, where the variables were daily water flows on main pipeline connections between sources and users, and daily inflows of wastewater to discharging treatment plants, both of them in consecutive time periods. The objectives were: expansion and operating cost of (separately) water intakes, recycling treatment plants and discharging treatment plants, reliability of water supply, and environmental quality. Because there was no precise data about cost and reliability coefficients, water pollution indices, discount factors and the users’ demands, the experts agreed to specify for each of them an interval of the most possible values, included in an interval of the least possible although realistic values. This corresponded to the definition of trapezoidal fuzzy numbers and led to a fuzzy multiobjective LP formulation.
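The two nested intervals given by the experts can be read as the core (most possible values) and the support (least possible although realistic values) of a trapezoidal fuzzy number. The following minimal sketch computes the corresponding membership degree; the parameter names a, b, c, d and the numerical values are assumptions made for illustration, not data from the cited study.

```python
def trapezoidal_membership(x, a, b, c, d):
    """Membership degree of x in a trapezoidal fuzzy number.

    [a, d] is the interval of least possible but realistic values (support),
    [b, c] is the interval of the most possible values (core).
    """
    if not (a <= b <= c <= d):
        raise ValueError("expected a <= b <= c <= d")
    if b <= x <= c:
        return 1.0  # inside the core: fully possible
    if x <= a or x >= d:
        return 0.0  # outside the support: not possible
    if x < b:
        return (x - a) / (b - a)  # rising edge between a and b
    return (d - x) / (d - c)  # falling edge between c and d


# Hypothetical cost coefficient: most possible in [10, 12], realistic in [8, 16].
for value in (7, 9, 11, 14, 20):
    print(value, trapezoidal_membership(value, 8, 10, 12, 16))
```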

Context no. 9: Clinical decision support in emergency room. Abdominal pain in childhood is a highly prevalent symptom caused by organic diseases, psychosocial disturbances and emotional disorders. In many cases, the exact cause is never known. Medical staff must focus on identifying a minority of cases in need of urgent treatment. The child who complains about abdominal pain is initially examined in the emergency room by a medical intern. The possible outcomes of this evaluation are: ‘discharge’, ‘surgical consult’, and ‘in-hospital observation’. The limited number of clinical signs, symptoms and tests available at the early stage makes such a triage very difficult. To increase the accuracy of the triage, the Children’s Hospital of Eastern Ontario in Ottawa (CHEO) called for a study aiming at developing a clinical decision support system that would assist medical interns in the emergency room. In order to learn diagnostic patterns from past diagnoses made by surgeons, CHEO provided a data set including records of 647 patients with abdominal pain seen during a 3-year period in the emergency room of CHEO. The patients were described by 12 early stage symptoms, called attributes. The data were collected as part of a retrospective chart study, and thus were not complete (10–20 % of missing values). Most of the attributes were nominal (gender, type of pain), some were numerical (age, duration of pain, temperature, white blood cell count), some were binary (vomiting, recent visit in emergency room, muscle guarding, rebound tenderness), and some indicated a location of a condition on the patient’s abdomen (location of pain, location of tenderness). According to medical practice, patient records stored in the data set were sorted into three categories: ‘discharge’, ‘surgical consult’, and ‘in-hospital observation’. The focus of the study, which is reported in Michalowski et al. (2003) and Wilk et al. (2005), was on inducing from the data set an intelligible decision model consistent with the past decisions of specialists, involving the most relevant attributes from among the 12 available attributes. This decision model has been designed to support emergency room staff and has been embedded into a decision support system on a mobile platform, called MET (Mobile Emergency Triage).

Context no. 10: Credit granting. Every day, bank B receives several credit applications from various firms wishing, for example, to set up a hotel or a clinic, or to buy machines for some public works. These files are submitted to a credit evaluator whose mission is to consider all of them and decide the fate of each particular application. This credit evaluator would like to get help from the operational research department of B, by asking it to develop a software tool that would make a preliminary sorting of the incoming files. He wants the sorting to be based on the performances of each file on a family of criteria that takes into account a list of viewpoints he provides. These viewpoints cover three main concerns: profitability, the risk of mortgage non-payment, and commercial impact. The family of criteria has to take into account what the application looks like (in particular, its motivation, requested amount and duration), various characteristics of the personality and situation of the applicant (in particular, legal status of the applicant’s enterprise, age of managers, balance sheets from recent years), as well as the history of possible past applications and commercial relations of bank B with the applicant. As soon as a new application for credit arrives, an assistant of the credit evaluator analyzes it in view of defining its performances on the family of criteria. Taking into account the imprecision, or even the lack, of some data, which makes the definition of performances highly subjective, the assistant of the credit evaluator can be led to fix not one but two performances per criterion, which correspond to an optimistic and a pessimistic value, respectively. The credit evaluator requires that on this basis each file is assigned to one of the following categories:

Category C1: Files apparently bad that should be rejected after a quick verification.

Category C2: Files rather bad that, for some commercial reasons, could be nevertheless accepted and that have to be transferred to another department.

Category C3: Files rather good that nevertheless need to be completed with some additional information before being carefully examined.

Category C4: Files apparently good that should be accepted after a quick verification.

If necessary, the credit evaluator would like a fifth category to be appended:

Category C5: Atypical files that could hardly be assigned to one of the above four categories.

The multicriteria decision aiding method used to assist the credit evaluator in the above context was implemented as a computer program which was in use for many years and remained confidential. Even if it was more complex, this method inspired the ELECTRE TRI method (Yu 1992; Roy and Bouyssou 1993).

Context no. 11: Monitoring of risk zones. We consider here a territory T where some old iron mines exploited using the ‘room and pillar’ caving method present a risk of collapse or ground subsidence causing major damage to surface buildings and infrastructures. T has been partitioned into zones which exhibit homogeneous characteristics of the underground and of the surface infrastructure (segments of highways and other roads, schools, commercial centers, public buildings, apartment buildings, entertainment parks, etc.). For more details, see Merad et al. (2004). A family of criteria has been defined to assess the gravity of risk that a zone may present based on its characteristics. The risk prevention authority responsible for the monitoring of territory T wishes each zone to be assigned to one of the following categories:

Category C1: Very low risk zones requiring reference leveling (topographic surveys) only.

Category C2: Low risk zones requiring reference and annual leveling.

Category C3: Zones of risk sufficiently high for an in-depth investigation including geological boring if necessary.

Category C4: High risk zones requiring long-term continuous monitoring based on recording of underground microseismic activity.

Criteria used for evaluation of the homogeneous zones are grouped under two headings: ‘susceptibility of the mine to collapse’ and ‘surface sensitivity’. The first refers to the ‘probability of rupture’ and the second to both the ‘intensity of the rupture’ and the ‘value and vulnerability of assets’. In the first group, there are criteria such as: mean stress applied on pillars, existence of fault, superimposition of pillars, size and regularity of pillars, and sensitivity of rock to flooding. The second group has criteria such as: depth of the top mined layer, maximum expected subsidence, expected surface deformation, zone extent, and vulnerability of buildings, roads, railways, bridges and various networks (electricity, water, gas). For some criteria the risk grows with the performance, and for the others, it decreases.

This is a complex decision-making problem in which the available information is uncertain (missing information, such as geological data) and imprecise (mining works maps), and in which knowledge is incomplete (e.g., soil-structure interaction). The risk prevention authority would like to be supported by a method permitting identification and sorting of homogeneous zones into the four predefined risk categories.

Context no. 12: Engineering design of a chemical reactor. In the engineering design of a chemical reactor for the production of p-xylene, it is necessary to determine the most appropriate parameters for the efficient production of this compound. It is used as an intermediate in the production of various plastics, such as polyesters and polyamides. p-Xylene is to be produced by isomerization of o-xylene over H-mordenite catalyst in a flow reactor. A kinetic model of this reaction, defined by a set of differential equations, is a basis for formulation of a multiobjective nonlinear programming problem. The decision (design) variables are: temperature and pressure of the process, reactor volume, feed flow rate and catalyst weight. The feasible values of the decision variables are constrained by technological requirements and by equations following from the kinetic model. The values of the decision variables should give the best compromise between four conflicting objectives. The first objective (reactor volume, to be minimized) represents the designer’s wish to reduce the size of the chemical installation. It influences both investment and operating costs of the reactor. The second objective (catalyst weight, to be minimized) expresses the designer’s aspiration to reduce the weight of the H-mordenite used in the reactor. The third objective (mass production of p-xylene) is to be maximized. The last objective (concentration ratio between p-xylene and o-xylene, to be maximized) is correlated with the quality of the final product and with the level of transformation of o-xylene, i.e. with the efficiency of the production process (Jaszkiewicz et al. 1995). One can notice that the objectives involve heterogeneous scales. Moreover, as the kinetic model of the reaction is an approximation of the real process, the performances of feasible solutions on particular objectives should also be considered as approximate during the search for the best compromise solution.
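Since the four objectives conflict, a best compromise can only be sought among designs that are not dominated by another design. The sketch below illustrates this notion of dominance for two objectives to be minimized and two to be maximized. It is not the procedure used in the cited study, and the three designs and their performances are invented for illustration only.

```python
# Direction of each objective, in the order used in the text:
# reactor volume (min), catalyst weight (min),
# mass production of p-xylene (max), concentration ratio (max).
SENSES = (-1, -1, +1, +1)  # -1: to be minimized, +1: to be maximized


def dominates(u, v, senses=SENSES):
    """True if design u is at least as good as v everywhere and strictly better once."""
    at_least_as_good = all(s * a >= s * b for s, a, b in zip(senses, u, v))
    strictly_better = any(s * a > s * b for s, a, b in zip(senses, u, v))
    return at_least_as_good and strictly_better


def non_dominated(designs):
    """Keep the designs that no other design dominates."""
    return {
        name: perf
        for name, perf in designs.items()
        if not any(
            dominates(other, perf)
            for other_name, other in designs.items()
            if other_name != name
        )
    }


# Hypothetical performances (volume, catalyst weight, production, ratio).
designs = {
    "A": (2.0, 50.0, 100.0, 3.0),
    "B": (2.5, 60.0, 90.0, 2.5),  # worse than A on all four objectives
    "C": (1.5, 70.0, 80.0, 3.5),  # smaller volume than A, but more catalyst
}
print(sorted(non_dominated(designs)))
```

Because the performances are only approximate, a strict comparison such as this one should be read with caution: two designs whose performances differ by less than the model error are hardly distinguishable.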

A crucial question conditioning the choice of the method by the analyst

To make the choice of a multicriteria decision aiding method, the analyst should, in our opinion, start with reflecting on the best or even the only way of answering the following essential question:

Taking into account the context of the decision process, what type(s) of results is the method expected to bring, so as to allow elaboration of relevant answers to questions asked by the decision maker?

The type of results produced is a feature which distinguishes various methods of multicriteria decision aiding. Depending on the decision context, it is not always the same type of results that brings useful information able to guide the decision aiding process in the right way, and to work out some conclusions, or even a recommendation. Moreover, the type of results conditions the way in which the analysis is inserted in the decision process. The analyst should keep all this in mind when answering the above question.

A review of multicriteria decision aiding methods (well defined and sufficiently operational to be taken into account by the analyst) led us to distinguish five main types of results that the analyst may want to consider in relation to methods clearly associated with them. Other types of results corresponding to different decision problems could also be considered (Bana e Costa 1996; Tsoukiàs 2007) but, to our knowledge, there is no well-established method that could produce them.

At this stage of reflection, the analyst may hesitate among several types of suitable results, and may therefore retain more than one type for the moment. Even if she selects only one type, the method producing this type of results may not be unique. Consequently, in many cases, the answer to the above initial question may lead at this stage to a short list of more than one method. The final choice of a particular method will result from the answers to questions formulated in “Five other key questions to choose the right method” and “Secondary questions”.

  (a) Type 1: A numerical value (utility, score) is assigned to each potential action.

It is possible that this type of result is imposed by the decision maker. This could be the case in the contexts Commuter rail line (1), Siting of a nuclear power-plant (2), and possibly Responses to tenders (6). Many methods can produce this type of results: MAVT (Keeney and Raiffa 1976), MAUT (Dyer 2005), UTA (Jacquet-Lagrèze and Siskos 1982), MACBETH (Bana e Costa et al. 2005), AHP (Saaty 2005), SMART (Winterfeld and Edwards 1986), TOPSIS (Hwang and Yoon 1981), Choquet Integral (Grabisch and Labreuche 2005), representative value function of UTAGMS (Kadziński et al. 2012), and others.

The application of these methods requires (either in course of applying the method, or even before) that the scale of each evaluation criterion is an interval scale. Sometimes, these scales have to be identical. Constructing such scales calls for interacting with the decision maker (or his representative) in a way which is specific for each of these methods. The analyst should keep this requirement in mind when drawing up a short list of methods.
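As a rough illustration of a type 1 result, the sketch below assigns a score to each action with a weighted additive model. It is only a minimal sketch under strong assumptions: all criteria are of the gain type, their scales are interval scales that can be linearly rescaled to a common range, and the weights, ranges, and action names are hypothetical. It does not reproduce the elicitation procedure of any of the methods cited above.

```python
def additive_scores(performances, weights, ranges):
    """Return one score per action under a weighted additive model.

    Assumptions (hypothetical, for illustration only): gain-type criteria,
    interval scales, weights summing to 1, and known (low, high) ranges.
    """
    scores = {}
    for action, perf in performances.items():
        total = 0.0
        for criterion, value in perf.items():
            low, high = ranges[criterion]
            # Linear rescaling of the original scale to a common 0-1 scale.
            total += weights[criterion] * (value - low) / (high - low)
        scores[action] = total
    return scores


# Hypothetical data: two actions evaluated on two criteria.
performances = {
    "a": {"cost_saving": 40, "comfort": 6},
    "b": {"cost_saving": 100, "comfort": 2},
}
weights = {"cost_saving": 0.5, "comfort": 0.5}
ranges = {"cost_saving": (0, 100), "comfort": (0, 10)}

scores = additive_scores(performances, weights, ranges)
print(scores)  # b obtains a higher score than a with these data
```

The rescaling step is precisely where the requirement of interval scales enters: on a purely ordinal scale, the differences used in the rescaling would have no meaning.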

In some contexts, for example, Location of a municipal waste incineration plant (3), Line extension of the Paris metro (4), or Supplier selection (5), one should take into account some viewpoints for which the definition of associated criteria, as well as the data necessary for defining performances of each action on these criteria, can involve some arbitrariness, uncertainty or, more generally, some indetermination. Such an indetermination can be handled using probabilistic or fuzzy modeling; however, this modeling can also be arbitrary to some extent [see, e.g., Fuzzy AHP (Wang et al. 2008) and Fuzzy TOPSIS (Wang et al. 2003)]. This difficulty can be bypassed through sensitivity analysis when the impact of this indetermination needs to be handled for only one or two viewpoints. It can happen, however, that the diversity and importance of the sources of indetermination make it difficult to assign a numerical value (or even a small interval) to each of the potential actions. For this reason, the analyst may give up the idea of requesting this type of results.

  (b) Type 2: The set of actions is ranked (without associating a numerical value to each of them) as a complete or partial weak order.

This type of result can be considered only if the set A of potential actions is known a priori. It is not convenient when the potential actions are examined as they arrive (week after week, month after month, etc.); this is the case of such contexts as Clinical decision support in emergency room (9), or Credit granting (10). It seems well adapted to such contexts as Line extension of the Paris metro (4), or Programming of water supply systems for rural areas (8).

The methods relevant here are: ELECTRE III, IV (Figueira et al. 2005, 2013), PROMETHEE I and II (Brans and Mareschal 2005), all Robust Ordinal Regression methods (Greco et al. 2010b) producing necessary and possible rankings, like UTAGMS (Greco et al. 2008c), GRIP (Figueira et al. 2009b), Extreme Ranking Analysis (Greco et al. 2012), RUTA (Kadziński et al. 2013), ELECTREGKMS and PROMETHEEGKMS (Greco et al. 2011), and moreover, the Dominance-based Rough Set Approach to ranking (Greco et al. 2001; Słowiński et al. 2009; Szeląg et al. 2013), and Machine Learning approach (Dembczyński et al. 2010).

Note, moreover, that the results of type 1 and 2 are well adapted to the case when the expected result is a list of k-best actions that are diverse enough and should be analyzed further by the decision maker.
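To make the idea of a type 2 result concrete, the following sketch computes PROMETHEE II net flows and derives a complete ranking from them. It is a simplified illustration, assuming gain-type criteria, weights summing to 1, and the "usual" preference function (any positive difference counts as full preference); the data and function names are hypothetical.

```python
def promethee_ii_net_flows(performances, weights):
    """Net outranking flows with the 'usual' preference function (assumed here)."""
    actions = list(performances)
    n = len(actions)

    def pi(a, b):
        # Aggregated preference of a over b: sum of the weights of the
        # criteria on which a performs strictly better than b.
        return sum(w for criterion, w in weights.items()
                   if performances[a][criterion] > performances[b][criterion])

    flows = {}
    for a in actions:
        positive = sum(pi(a, x) for x in actions if x != a) / (n - 1)
        negative = sum(pi(x, a) for x in actions if x != a) / (n - 1)
        flows[a] = positive - negative
    return flows


def ranking(flows):
    """Actions sorted by decreasing net flow; equal flows mean indifference."""
    return sorted(flows, key=flows.get, reverse=True)


# Hypothetical data: three actions, two criteria.
performances = {
    "a": {"g1": 3, "g2": 1},
    "b": {"g1": 2, "g2": 2},
    "c": {"g1": 1, "g2": 3},
}
weights = {"g1": 0.6, "g2": 0.4}

flows = promethee_ii_net_flows(performances, weights)
print(ranking(flows))  # ['a', 'b', 'c']
```

Although net flows are numbers, they depend on the whole set A, which is why such a result is read as a ranking rather than as a score attached to each action independently.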

  (c) Type 3: A subset of actions, as small as possible, is selected in view of a final choice of one or, at first, few actions.

As in the case of type 2, this type of result is not convenient when the set A of potential actions is not known a priori. It is also not convenient in the contexts like Line extension of the Paris metro (4), Management of highway assets (7), Programming of water supply systems for rural areas (8), Clinical decision support in emergency room (9), or Monitoring of risk zones (11). In these contexts, many potential actions, and not only one, are intended for joint execution. The type of result considered here is convenient when the potential actions are modeled as alternatives, i.e. such that a joint execution of any two of them is excluded. This is the case of such contexts as Commuter rail line (1), Siting of a nuclear power-plant (2), Location of a municipal waste incineration plant (3), Supplier selection (5), or Engineering design of a chemical reactor (12).

This type of results is produced directly by such methods as ELECTRE I and IS (Figueira et al. 2005, 2013), PROMETHEE V (Brans and Mareschal 2005), and Rubis (Bisdorff et al. 2008). The methods of multiobjective optimization (Branke et al. 2008) also lead to this type of results; however, they are applied when the actions of the set A are vectors of variables subject to some mathematical programming constraints. These methods will also be considered later for type 5 of results.

Note that this type of result is also relevant to the case of a multicriteria choice of the best portfolio of objects with cardinality and cost constraints. In this case, the set A is composed of alternatives that represent those combinations of objects which respect cardinality and cost constraints. The methods designed for this case are discussed, e.g., by Liesiö et al. (2008), Metaxiotis and Liagkouras (2012) and Greco et al. (2013a).

Note, moreover, that the methods considered for type 1 and type 2 of results can also be used in this case: the top-ranked actions can be seen as a result of type 3. Thus, when the set A is defined as a set of alternatives, the analyst can put quite a few methods on the short list.

  (d) Type 4: Each action is assigned to one or several categories, given that the set of categories has been defined a priori.

It is possible that this type of results is imposed by the decision maker. This could be the case of such contexts as: Responses to tenders (6), Management of highway assets (7), Clinical decision support in emergency room (9), Credit granting (10), or Monitoring of risk zones (11). This type of results is particularly well adapted to the contexts where the set A is not defined a priori, like Clinical decision support in emergency room (9) and Credit granting (10). It can also be convenient to presort when a large number of potential actions have been listed at the starting point of the decision process (Jaszkiewicz and Ferhat 1999). This could happen in such contexts as: Commuter rail line (1), or Siting of a nuclear power-plant (2). Such a presorting is also used in some interactive multiobjective optimization procedures, where it concerns a number of non-dominated solutions proposed for evaluation to the decision maker in each dialogue phase (Greco et al. 2008a).

A wide variety of methods provide this type of results. Let us mention those that rely on Dominance-based Rough Set Approach (Greco et al. 2001, 2002a, 2005; Błaszczyński et al. 2007; Dembczyński et al. 2009; Słowiński et al. 2009), UTADIS (Devaud et al. 1980), PREFDIS (Zopounidis and Doumpos 2000), UTADISGMS (Greco et al. 2010a), ELECTRE TRI-B (initially ELECTRE TRI) (Figueira et al. 2005; Yu 1992), ELECTRE TRI-C (Almeida-Dias et al. 2010), ELECTRE TRI-NC (Almeida-Dias et al. 2012), filtering method (Perny 1998), PROAFTN (Belacel 2000), TRINOMFC (Léger and Martel 2002), PAIRCLASS (Doumpos and Zopounidis 2004), THESEUS (Fernandez and Navarro 2011), among others. These methods are distinguished by many features, in particular by: ordered or non-ordered categories, the way of defining the categories, the hypotheses and logical foundations of the assignment procedure, and the nature of the requested preference information. One can observe, however, a lack of sorting methods that take into account some additional constraints on the categories, like a balanced composition of the categories (for example, men and women). Recently, however, a sorting method called DIS-CARD has been proposed (Kadziński and Słowiński 2012), which takes into account desired cardinalities of the categories. Another attempt at handling constrained sorting problems has been made by Mousseau et al. (2003).

  (e) Type 5: A subset of potential actions enjoying some remarkable properties is provided to serve as a base in the following stage of the decision aiding process.

This type of result may be required when the set of alternatives A contains a very large number of actions (more than one hundred). This is, for example, the case when the actions are defined by vectors of variables subject to some mathematical programming constraints. Then, it may be interesting to get a restricted subset of A, called A′, composed of actions enjoying some remarkable properties, and then replace A by A′ at later stages of the decision aiding process. In multiobjective optimization, the set A′ is a set of non-dominated actions (also called efficient or Pareto-optimal solutions) or an approximation of this set. Mathematical foundations of the completeness and constructiveness of parametric characterization of the set of non-dominated actions have been given by Wierzbicki (1986). From a practical point of view, evolutionary algorithms have proved particularly effective in finding a good approximation of the set of non-dominated actions in multiobjective optimization (Deb 2008).
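When A is a finite list of actions, the non-dominated subset A′ can be obtained by a simple pairwise filter. The sketch below is a minimal illustration, assuming gain-type criteria and a small hypothetical set of actions; for sets defined by mathematical programming constraints, dedicated optimization or evolutionary algorithms are needed instead.

```python
def dominates(u, v):
    """True if u is at least as good as v on every criterion and strictly
    better on at least one (all criteria assumed to be of the gain type)."""
    return (all(x >= y for x, y in zip(u, v))
            and any(x > y for x, y in zip(u, v)))


def non_dominated(actions):
    """Return the subset A' of actions not dominated by any other action."""
    return {
        name: perf
        for name, perf in actions.items()
        if not any(dominates(other, perf)
                   for other_name, other in actions.items()
                   if other_name != name)
    }


# Hypothetical set A with two gain-type criteria.
A = {"a1": (5, 1), "a2": (3, 3), "a3": (1, 5), "a4": (2, 2), "a5": (3, 1)}

A_prime = non_dominated(A)
print(sorted(A_prime))  # ['a1', 'a2', 'a3']
```

Here a4 is dominated by a2 and a5 by a1, so only three actions remain for the later stages of the process.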

This type of result is particularly welcome in interactive multiobjective optimization, where the set A′, being a complete set of non-dominated actions or its (approximate) representation, is a base for an interactive procedure leading to some best compromise actions. Interactive procedures are composed of two alternating phases: a computation phase and a dialogue phase. In the computation phase, one or several non-dominated actions are found in A′ and presented to the decision maker. Then, in the dialogue phase, the decision maker criticizes the proposed actions unless one of them is completely satisfactory. In the latter case the procedure stops. Otherwise, the critical evaluation of the proposed actions is used as preference information to guide the search for one or several non-dominated actions belonging to A′ in the next computation phase, with the intention of better fitting the decision maker’s preferences. The way of using the set A′ in this procedure distinguishes two major categories of interactive procedures:

  (1) procedures based on exploration of A′,

  (2) procedures based on progressive contraction of A′.

Typical examples of category (1) are procedures using reference points defined by aspiration levels in the criteria space. The reference points are projected onto the set A′ in order to find the ‘closest’ non-dominated actions to be proposed to the decision maker. The projection is done using Chebyshev-like achievement (scalarizing) functions. By changing the reference point, one can browse the whole set A′. The reference point approaches have been described by Wierzbicki (1999). Some implementations of the projection principle give the user the impression of driving a vehicle over the non-dominated set; this is, for example, the case of Pareto race (Korhonen and Wallenius 1988) or the NIMBUS method (Miettinen 1999).
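The projection of a reference point can be sketched on a finite representation of A′ with an augmented Chebyshev-type achievement function. This is only an illustrative sketch: the criteria are assumed to be of the gain type, the scaling coefficients and the small augmentation parameter `rho` are hypothetical choices, and the particular form used below is one common variant rather than the one used by any specific procedure cited above.

```python
def achievement(z, reference, scales, rho=1e-3):
    """Augmented Chebyshev-type achievement function (to be maximized).

    z, reference: performance vectors on gain-type criteria.
    scales: positive scaling coefficients (hypothetical choice).
    rho: small augmentation weight favoring properly non-dominated actions.
    """
    terms = [(zi - ri) / si for zi, ri, si in zip(z, reference, scales)]
    return min(terms) + rho * sum(terms)


def project(reference, front, scales):
    """Name of the action of the finite set `front` 'closest' to the reference point."""
    return max(front, key=lambda name: achievement(front[name], reference, scales))


# Hypothetical finite representation of A' with two gain-type criteria.
front = {"a1": (5, 1), "a2": (3, 3), "a3": (1, 5)}
scales = (1, 1)

print(project((4, 4), front, scales))  # a2: balanced aspiration levels
print(project((6, 2), front, scales))  # a1: aspiration shifted toward criterion 1
```

Moving the reference point from (4, 4) to (6, 2) changes the proposed action, which is how the decision maker browses the set A′.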

Interactive procedures of category (2) present to the decision maker in each dialogue phase a sample of non-dominated actions picked from a gradually reduced region of the set A′. A typical example is the method of Choo and Atkins (1980). The interesting subregion of the set A′ is often delimited by a polyhedral cone with the origin in a reference point, oriented toward the non-dominated set. The spread of this cone is controlled by interval values of weights assigned to criteria in the weighted Chebyshev achievement function, like in the cone contraction method (Steuer 1986). In the ‘Light Beam Search’ method (Jaszkiewicz and Słowiński 1999), the spread of the cone is defined by an outranking relation between a non-dominated action, called middle point, resulting from projection of a reference point onto the set A′, and its neighborhood actions, such that the subregion of A′ is composed of non-dominated actions which are not worse than the middle point, i.e. outrank the middle point. In the robust cone contraction method (Kadziński and Słowiński 2012), the spread of the cone with the origin in a reference point is defined by directions of the isoquants of all weighted Chebyshev achievement functions compatible with pairwise comparisons of some non-dominated actions from the current set A′, specified by the decision maker. In successive iterations, each new pairwise comparison contracts the cone which is zooming on a subset of non-dominated actions of greatest interest for the decision maker. It is also worth mentioning the NEMO method which combines an evolutionary multiobjective optimization with robust ordinal regression in an interactive procedure (Branke et al. 2010); this combination allows speeding up convergence to the most preferred subregion of the set A′.

Note that, independently of category (1) or (2), the final result of interactive multiobjective optimization methods is of type 3.

The context fully adequate to type 5 of results is Engineering design of a chemical reactor (12). It may also be useful in Programming of water supply systems for rural areas (8), in part (b) concerning the choice of the best variant of technical construction of the regional WSS, and, particularly, in development planning of a jointly operated urban water supply and wastewater treatment system, formulated as a multiobjective fuzzy LP problem.

Five other key questions to choose the right method

The analyst is advised to answer these questions while considering successively various methods short-listed in reply to the initial question about the type of expected results. The way these questions are ordered below does not suggest any priority in answering them. The decision context, in which the analyst plays her role, and the methods she has short-listed, can influence the order of examination of these questions.

Question 1a:

Do the original performance scales have all required properties for a rightful application of the considered method?

Some methods (in particular those mentioned in “A crucial question conditioning the choice of the method by the analyst” in point (a)) cannot directly handle evaluation criteria whose performances are located on verbal scales, or even on numerical but purely ordinal scales. When the answer to question 1a is “no”, the analyst has to check if it is possible to code or transform the original scales in a meaningful way, such that the properties of scales required by the considered method are satisfied. The analyst can carry out this check by looking for additional information, especially in the course of interaction with the decision maker (or his representative). In order to decide about keeping or rejecting the considered method, the analyst will have to examine to what extent the transformations are arbitrary, and how much the numerical coding of performances masks their meaning. If among the short-listed methods there are some that do not require this type of transformation, she will have to assess the trade-off between the advantage of keeping the original scales (for interpretation of results and organization of a debate about them) and the drawbacks that these alternative methods may have.

In the case study referring to the context of Siting of a nuclear power-plant (2), Keeney and Nair (1976) had to recode the original performances such that the selected method (MAUT) could be applied in a meaningful way. Roy and Bouyssou (1986) have shown that it would have been possible to avoid this recoding if ELECTRE III had been used, taking into account imperfect knowledge through discrimination thresholds without invoking probability distributions. In cases considered within the contexts of Management of highway assets (7), and Clinical decision support in emergency room (9), the analyst came to the conclusion that it is not possible to make a meaningful recoding of original performances into a common scale, which would permit the use of, e.g., the Sugeno integral (Grabisch and Labreuche 2005), and thus she selected the DRSA method based on rough sets (Augeri et al. 2011; Wilk et al. 2005; Słowiński et al. 2009). This method uses a preference model in the form of a set of logical “if…, then…” decision rules that express conditions on particular criteria in terms of their original scales (see question 1d).

Question 1b:

Is it simple or hard (even impossible) to get preference information that the method requires?

In order to make the results provided by a short-listed method pertinent for decision aiding, it is necessary to adjust some of its characteristics so as to take into account in the best possible way the preferences of the decision maker; these are usually some wishes underlying the value system of the decision maker. The analyst has thus to acquire what is called preference information. This information takes different forms for various methods: ordering of criteria, ordering of some actions, acceptable trade-offs, pairwise comparisons of some actions, assignment of some actions to categories, comparisons of some actions with respect to intensity of preference, assessment of lotteries, specification of the relative importance of criteria, presence of veto, etc. To acquire this information, the analyst has to interact with the decision maker (or with his representative), in view of co-constructing the model of preferences that the considered method exploits to work out expected results. A key issue is to organize this interaction such that the analyst is able to elaborate meaningful results. This implies that the interaction protocol or the software tool involved should be compatible with the way in which the analyst has been inserted in the decision process, with the way of reasoning of the inquired people, and with their meaning of useful results. This protocol or software tool has to ensure, moreover, an intelligibility and a traceability of the impact of the preference information on the results. If such an interaction appears to be impossible, the considered method has to be rejected. In all cases the analyst has to assess the part of arbitrariness that the acquired information may contain. She has to remember that it will be necessary to analyze its impact on the results provided by the method (sensitivity analysis, robustness concern).

In the case study Line extension of the Paris metro (4), the impossibility of acquiring preference information about the relative importance of criteria led to the elaboration of a new method: ELECTRE IV (Figueira et al. 2005). In cases considered within the contexts of Management of highway assets (7), and Clinical decision support in emergency room (9), direct elicitation of preference information about values of parameters of a preference model (in particular, the relative importance weights of criteria) also appeared impossible, which put forward the DRSA method based on rough sets. DRSA accepts sorting examples as input preference information (Greco et al. 2005). Note that providing sorting examples, i.e. assignments of some well-known actions to decision categories, does not require as much cognitive effort from the decision maker as the direct specification of preference model parameters required by many other methods.

On the other hand, in the case study Engineering design of a chemical reactor (12), it was natural to select a method taking into account the preference information expressed in terms of aspiration levels on the four objectives related to: reactor cost, production cost, volume of production, and efficiency. These aspiration levels define a reference point in the objective space, which can be projected onto the non-dominated set, indicating a candidate for the compromise action, together with its neighborhood that can then be explored by the decision maker (Jaszkiewicz and Słowiński 1999).

Question 1c:

Should the part of imprecision, uncertainty or indetermination in the definition of performances be taken into account, and if so, in what way?

It is rare that the performances of the considered actions can be evaluated on each criterion without any ambiguity. The way in which a criterion is modeling preferences related to a specific point of view, or the role that an attribute is supposed to play, can be ill-determined or contain a part of arbitrariness. Some data used to construct criteria can be imprecise or defined in an ambiguous way. When it is important to take into account such an imperfect knowledge, the analyst should examine carefully the possibility of handling this knowledge by the short-listed methods.

We note that, when performance data are ambiguous, this can cause inconsistency in the indirect preference information given by the decision maker in the form of decision examples (sorting examples or pairwise comparisons). In case of inconsistency, the rough set concept is useful for discerning certain from possible knowledge in reasoning about ordinal data (Słowiński et al. 2012).

Let us observe, moreover, that, independently of the method, it is always possible to handle the imperfect knowledge:

  • Either by sensitivity or robustness analysis of results provided by the method. This approach becomes quickly too onerous when the sources of imperfect knowledge affect more than one or two criteria, because this sensitivity or robustness analysis has often to be combined with a similar analysis taking into account imperfect preference information (partial, inconsistent, vague, etc.; see question 1b).

  • Or, in an indirect way, by modeling the imperfect knowledge using a probability or possibility distribution of actions’ performances on the considered criteria. In the probabilistic case, the action’s performance on a criterion is set to the expected value of the probability distribution. In the possibilistic case, it is set to a “mean value” of the fuzzy number representing the possibility distribution (Dubois and Prade 1987). This approach assumes, however, that it is possible to model the indetermination in a probabilistic or possibilistic way on the basis of a relatively objective assessment. The frequent use of the Gaussian distribution in the probabilistic approach, and of triangular fuzzy numbers in the possibilistic approach, often lacks a sound foundation, given that the performances can vary in a rather narrow interval and that extending it for very low probabilities or possibilities is rather arbitrary. To validate a probabilistic approach, the analyst can sometimes be tempted to ask the decision maker (or his representative) to compare some lotteries in view of revealing his perception of preference or indifference. Proceeding in this way, the analyst is led to translate the indetermination of performances into the terms of preference information (see question 1b), which involves an attitude toward risk. These questions can, however, confuse the decision maker.

The methods that take into account pseudo-criteria are able to deal with imperfect knowledge about performances through the use of discrimination thresholds (Roy and Vincke 1987): indifference and preference thresholds. In order to assign values to these thresholds, the analyst has to investigate what sources of imprecision, uncertainty and indetermination affect the performance of an action on the considered criterion. These sources can originate from imperfect representation of the specific viewpoint by this criterion as well as from imperfect knowledge of data to be used for definition of the performance. Being conscious of these sources, the analyst should be able to assess the smallest difference between two performances which, when growing, becomes significant for preference of the action with the better evaluation over the action with the worse evaluation (this is the preference threshold), as well as the greatest difference between two performances which, when decreasing, becomes insignificant and leads to indifference of the two actions (this is the indifference threshold). These two values are not necessarily equal. The analyst must be convinced that the values assigned to these thresholds are appropriate for a proper handling of the imperfect knowledge or, in other words, for giving to the comparison of two performances the meaning it deserves (for more details, see Roy and Figueira 2013).
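The role of the two thresholds can be illustrated by a small comparison function for a single pseudo-criterion. This is a minimal sketch, assuming a gain-type criterion and constant thresholds `q` (indifference) and `p` (preference) with 0 ≤ q ≤ p; in practice the thresholds may vary with the performance level, and the numerical values below are hypothetical.

```python
def compare(ga, gb, q, p):
    """Compare performances ga and gb of actions a and b on one pseudo-criterion.

    q: indifference threshold, p: preference threshold (constant values
    assumed here for simplicity, with 0 <= q <= p).
    """
    if not 0 <= q <= p:
        raise ValueError("thresholds must satisfy 0 <= q <= p")
    diff = ga - gb
    if abs(diff) <= q:
        return "indifference"
    better = "a" if diff > 0 else "b"
    if abs(diff) > p:
        return f"strict preference for {better}"
    # Differences between q and p form a zone of hesitation.
    return f"weak preference for {better}"


# Hypothetical thresholds: differences up to 2 are insignificant,
# differences above 5 are clearly significant.
print(compare(50, 49, q=2, p=5))  # indifference
print(compare(50, 46, q=2, p=5))  # weak preference for a
print(compare(40, 50, q=2, p=5))  # strict preference for b
```

The intermediate zone between the two thresholds is what distinguishes a pseudo-criterion from a true criterion, for which any difference, however small, would count as strict preference.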

Among the methods short-listed by the analyst, there can be methods which have been developed for the handling of a particular form of imperfect knowledge. Taking into account that reasoning based on imperfect knowledge can lead to uncertain conclusions, three kinds of uncertainty can be distinguished (Zadeh 1999): (1) uncertainty following from a random change of some variables, called veristic, which can be modeled by probability, (2) uncertainty following from subjective judgments, called possibilistic, which can be modeled by fuzzy sets, and (3) uncertainty caused by granularity of information, called inconsistency, which can be modeled by rough sets. The theories standing behind these concepts of uncertainty are: (1) probability theory, (2) possibility theory (Dubois and Prade 1988) and fuzzy set theory (Słowiński 1998), and (3) rough set theory (Pawlak 1991; Greco et al. 2001). Sometimes the uncertainty is more complex and needs hybridization of the above theories, e.g., probabilistic-fuzzy (van den Berg et al. 2004), fuzzy-rough (Greco et al. 2008b), or probabilistic-rough (Kotłowski et al. 2008). Some comparative studies have been carried out between stochastic and fuzzy approaches on the ground of multiobjective optimization (Słowiński and Teghem 1990).

If some methods accepting pseudo-criteria have been short-listed together with some methods specialized in handling a particular type of imperfect knowledge, the analyst must check which one of them will fit the context under study.

In the case studies Supplier selection (5), Programming of water supply systems for rural areas (8), Credit granting (10), Monitoring of risk zones (11), and Engineering design of a chemical reactor (12), the handling of imprecision, uncertainty and indetermination in the definition of performances of actions on some criteria was crucial. The methods applied in these studies dealt with imperfect knowledge about performances through application of pseudo-criteria. In the development planning of a jointly operated urban water supply and wastewater treatment system, considered in context (8), imperfect knowledge about cost and reliability coefficients, water pollution indices, discount factors and the user’s demands was modeled by fuzzy numbers; in consequence, the problem has been formulated as a multiobjective fuzzy LP problem. Finally, in the decision contexts Management of highway assets (7) and Clinical decision support in emergency room (9), the crucial problem related to imperfect knowledge was the inconsistency in the indirect preference information given in the form of sorting examples; for this reason the rough set approach (DRSA) has been used in these studies.

Question 1d:

Is the compensation of bad performances on some criteria by good ones on other criteria acceptable?

In the context of multicriteria decision aiding methods, compensation means that the method offers possibilities of the following type. Let a be an action strictly preferred to another action b, both having the same performances on all but one criterion i, on which b is significantly worse. One says that the method offers a possibility of compensation if, by improving one or more performances of b on criteria other than i, it is possible to define an action c indifferent to a. These improvements compensate for the bad performance of b compared to a on the i-th criterion. Such possibilities of compensation are offered in many ways by additive methods. In a lexicographic method, they exist only if the i-th criterion is not the most important. In ELECTRE type methods, they exist under extremely restrictive conditions.
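The definition above can be checked on a small numerical example. The sketch below contrasts a weighted additive model, in which the action c built from b becomes indifferent to a, with a lexicographic comparison in which criterion i is the most important and no improvement on the other criteria can offset the loss. The weights, performances, and importance order are hypothetical.

```python
def additive_value(perf, weights):
    """Weighted sum of performances (gain-type criteria on a common scale assumed)."""
    return sum(w * g for w, g in zip(weights, perf))


def lexicographic_key(perf, importance_order):
    """Performances reordered from the most to the least important criterion;
    Python compares such tuples lexicographically."""
    return tuple(perf[i] for i in importance_order)


weights = (0.5, 0.25, 0.25)   # hypothetical weights
a = (10, 5, 5)
b = (4, 5, 5)                 # same as a, except much worse on criterion i = 0
c = (4, 11, 11)               # b improved on the two other criteria

# Additive model: the improvements fully compensate the loss on criterion 0.
print(additive_value(a, weights), additive_value(b, weights), additive_value(c, weights))
# 7.5 4.5 7.5

# Lexicographic model with criterion 0 as the most important: a stays preferred to c.
order = (0, 1, 2)
print(lexicographic_key(a, order) > lexicographic_key(c, order))  # True
```

The weights were chosen so that all products are exactly representable in floating point, which makes the equality between the values of a and c exact in this example.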

On the one hand, the methods that rely on aggregation of criteria into a synthetic criterion assigning a numerical value (utility, score) to each potential action use this kind of compensation extensively and systematically. On the other hand, methods that rely on multicriteria aggregation involving the concepts of concordance and discordance in view of elaborating outranking relations allow a very limited compensation in particular conditions. Moreover, methods that rely on the concept of rough set represent preferences in terms of “if…, then…” decision rules which do not admit any compensation. In the condition part of these rules there is a conjunction of elementary conditions concerning either performances of a single action on a subset of criteria (in case of sorting) or differences of performances for pairs of actions on a subset of criteria (in case of choice or ranking); for example, denoting the set of identifiers of criteria by $I$, criteria (of the gain type) by $g_i$, $g_j$, and actions by $a$, $b$, the rules have the following syntax (Greco et al. 2005):

  • in case of sorting:

    “if $g_{i1}(a) \ge r_{i1}$ and $g_{i2}(a) \ge r_{i2}$ and … and $g_{ik}(a) \ge r_{ik}$, then $a$ is assigned to category $t$ or better”,

    “if $g_{j1}(a) \le r_{j1}$ and $g_{j2}(a) \le r_{j2}$ and … and $g_{jh}(a) \le r_{jh}$, then $a$ is assigned to category $t$ or worse”,

    where $r_{i1}, r_{i2}, \ldots, r_{ik}$ and $r_{j1}, r_{j2}, \ldots, r_{jh}$ are some threshold performances on criteria $i1, \ldots, ik, j1, \ldots, jh \in I$, found during induction of rules from preference information given in the form of sorting examples and structured using the rough set concept,

  • in case of choice and ranking:

    “if $g_{i1}(a) \ge r_{i1}$ and $g_{i1}(b) \le s_{i1}$ and … and $g_{ik}(a) \ge r_{ik}$ and $g_{ik}(b) \le s_{ik}$, then $a$ is outranking $b$”,

    “if $g_{j1}(a) \le r_{j1}$ and $g_{j1}(b) \ge s_{j1}$ and … and $g_{jh}(a) \le r_{jh}$ and $g_{jh}(b) \ge s_{jh}$, then $a$ is not outranking $b$”,

    where $r_{i1}, r_{i2}, \ldots, r_{ik}$, $r_{j1}, r_{j2}, \ldots, r_{jh}$ and $s_{i1}, s_{i2}, \ldots, s_{ik}$, $s_{j1}, s_{j2}, \ldots, s_{jh}$ are some threshold performances on criteria $i1, \ldots, ik, j1, \ldots, jh \in I$, found during induction of rules from preference information given in the form of pairwise comparisons of some actions and structured using the rough set concept.

We note that each rule is a scenario of a causal relationship between performances on a subset of criteria and a comprehensive judgment. The rules are non-compensatory aggregators that do not convert ordinal scales of criteria into a richer (interval or ratio) scale.
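The non-compensatory character of such rules can be seen by applying a few "category t or better" rules to some actions. The sketch below is only an illustration of rule application in the sorting case: the rules, criteria names, and ordinal codes are hypothetical, and the induction of rules from sorting examples, which is the core of the rough set approach, is not shown.

```python
# Each rule: (conditions, t) meaning
# "if g_c(a) >= r_c for every criterion c in conditions, then a is in category t or better".
# Criteria are of the gain type and coded on ordinal scales (hypothetical codes).
AT_LEAST_RULES = [
    ({"safety": 3, "comfort": 2}, 2),
    ({"safety": 2}, 1),
]


def lower_bound_category(action, rules, worst=0):
    """Best 'category t or better' conclusion among the rules matched by the action.

    Only order comparisons with thresholds are used, so the ordinal scales
    are never converted into interval or ratio scales.
    """
    matched = [t for conditions, t in rules
               if all(action[c] >= r for c, r in conditions.items())]
    return max(matched, default=worst)


x = {"safety": 3, "comfort": 2}
y = {"safety": 3, "comfort": 1}
z = {"safety": 1, "comfort": 5}

print(lower_bound_category(x, AT_LEAST_RULES))  # 2
print(lower_bound_category(y, AT_LEAST_RULES))  # 1
print(lower_bound_category(z, AT_LEAST_RULES))  # 0
```

Action z has an excellent performance on comfort, yet it matches no rule because its safety level is below every threshold: a good performance on one criterion does not make up for a bad one on another.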

There are also other methods using non-compensatory aggregation, such as the lexicographic method and the method using the Sugeno integral. A comparison of these methods at an axiomatic level has been done by Greco et al. (2003, 2004), and by Słowiński et al. (2002).

Among the twelve contexts described in “A number of actual decision contexts”, Siting of a nuclear power-plant (2), Programming of water supply systems for rural areas (8), but only in part (b) concerning the choice of the best variant of technical construction of the regional WSS, and Engineering design of a chemical reactor (12), are those in which the compensation of a very bad performance on one criterion by a series of good performances on other criteria seems conceivable. In all other contexts, the decision maker would rather be reluctant to accept such compensation, and thus the analyst would be obliged to eliminate compensatory methods.

Question 1e:

Is it necessary to take into account some forms of interaction among criteria?

A great majority of the methods available nowadays do not account for any form of interaction. Let us recall that interaction is a complex concept (Roy 1996, chapter 10; Roy 2009). For this reason, in general, the analyst is interested in designing the family of criteria so that any interaction among these criteria is excluded. If such a design appears impossible, the analyst and the decision maker (or his representative) should examine together the forms of interaction that should be handled in the course of a multicriteria aggregation. This examination can be done either a priori or a posteriori; in the former case, identification of the form of interaction to be handled by the preference model makes it possible to select a method designed for this form of interaction; in the latter case, the examination may resort to an analysis of the compatibility of preference information, provided in the form of sorting examples or pairwise comparisons of some actions, with a preference model that either accounts for interaction or not. Then, considering all short-listed methods, the analyst will hesitate between:

  • a special form of MAUT, involving either a decomposition and graphical representation of additive value functions under the ceteris paribus assumption, like GAI-networks (Gonzales and Perny 2005) and UCP-networks (Boutilier et al. 2001), or a multilinear value function (Keeney and Raiffa 1976), or a Choquet or Sugeno integral (Grabisch and Labreuche 2005),

  • the robust ordinal regression methods with an additive utility function augmented by components accounting for positive and negative synergy of pairs of criteria (Greco et al. 2013b),

  • the ELECTRE method designed to handle interactions (Figueira et al. 2009a), and

  • the methods using a set of “if…, then…” decision rules as a preference model, since this model based on a very simple syntax of rules is able to handle the most complex interactions (Greco et al. 2002b, 2004; Słowiński et al. 2002).
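As an illustration of how one of the listed models can represent interaction, here is a minimal sketch of the discrete Choquet integral for two criteria. The criterion names g1 and g2, the capacity values and the two actions are hypothetical; the capacity is chosen so that the pair of criteria weighs more than the sum of its members, which expresses a positive synergy.

```python
def choquet(values, capacity):
    """Discrete Choquet integral of non-negative scores.

    values:   dict mapping criterion -> score
    capacity: dict mapping frozenset of criteria -> weight in [0, 1]
    """
    order = sorted(values, key=values.get)  # criteria by increasing score
    total, previous = 0.0, 0.0
    for i, criterion in enumerate(order):
        # Coalition of criteria whose score is at least values[criterion].
        coalition = frozenset(order[i:])
        total += (values[criterion] - previous) * capacity[coalition]
        previous = values[criterion]
    return total


# Assumed capacity: each criterion alone weighs 0.3, both together weigh 1.0,
# i.e. more than 0.3 + 0.3 (positive synergy).
capacity = {
    frozenset(): 0.0,
    frozenset({"g1"}): 0.3,
    frozenset({"g2"}): 0.3,
    frozenset({"g1", "g2"}): 1.0,
}

balanced = {"g1": 5, "g2": 5}
unbalanced = {"g1": 10, "g2": 0}

print(choquet(balanced, capacity))    # 5 * 1.0
print(choquet(unbalanced, capacity))  # 0 * 1.0 + 10 * 0.3
```

Under these assumptions the balanced action scores 5.0 and the unbalanced one 3.0, whereas a weighted sum with equal weights of 0.5 would score both at 5. The difference comes entirely from the weight given to the coalition of the two criteria.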

None of the cases considered in “A number of actual decision contexts” required a priori recognition of some form of interaction among criteria. However, in the case studies referring to the contexts Management of highway assets (7) and Clinical decision support in emergency room (9), the rough set approach was used; it is able to handle interactions through decision rules, should such interactions appear in the preference information.

Secondary questions

Before making the final choice, especially if hesitating between various methods, the analyst may consider the following secondary questions:

Question 2a:

Is the method able to properly satisfy the needs of comprehension on the part of the stakeholders involved in the decision process?

The needs of comprehension can come not only from the decision maker (or his representative) but also from other stakeholders. If they are not properly satisfied, they can compromise a good insertion of the analyst in the decision process. Thus, she must try to assess the level of detail at which she should explain how the method, as a tool, works. Depending on the level at which the requests are placed, she must also assess whether those making the requests are ready to devote enough time to listening to the explanation. In many cases, what matters is the possibility of explaining the link between the information and data provided, on the one hand, and the results obtained by the method, on the other hand, without entering into the details of the method. As an example, let us mention the ‘even swaps’ method, which typically provides a non-technical explanation of this link for an additive preference model (Hammond et al. 1998).

This question was considered when choosing the methods in the contexts: Responses to tenders (6), Credit granting (10) and Monitoring of risk zones (11). In these cases, the needs of comprehension of the selected method were satisfied by showing on some well-chosen practical examples the type of results these methods produce. An answer to this question also influenced the selection of the method using a preference model in the form of “if…, then…” decision rules in the contexts of Management of highway assets (7), and Clinical decision support in emergency room (9)—intelligibility and traceability of the feedback between the preference information and the recommendations given by rules were convincing arguments for choosing in these contexts the DRSA method based on rough sets (Słowiński et al. 2009).

Question 2b:

Is an axiomatic characterization of the method available, and if so, is it acceptable in the considered decision context?

This characterization, if available, provides a set of axioms that justify the application of the method for every analyst and decision maker who find this set of axioms adequate to the considered decision context. A good example of the case where the preferences expressed by the decision maker are not compatible with a set of axioms underlying a multicriteria aggregation procedure is the example of mayor’s preferences presented by Vincke (1982). An axiomatic characterization can serve as a scientific backing to the theorist playing the role of the analyst but she should not overestimate its importance. The scientific guarantee brought by this characterization has to be put into perspective of the following reasoning:

  1.

    Suppose first that the analyst accepts the hypothesis that the decision maker has in his mind a relatively well-defined system of preferences, and that the analyst wants to know whether every axiom characterizing the short-listed method is satisfied by this pre-existing system of preferences. A classic way to answer this question is to involve the decision maker in a series of choice situations and to ask him for precise answers (in practice, the person being questioned is rarely the decision maker; usually, it is somebody who represents the decision maker, sometimes a member of the ‘task force’ team). The decision maker is not necessarily familiar with these situations. He may find them artificial and thus unrealistic. Consequently, the answers given by the decision maker should be interpreted cautiously. Nevertheless, if in a given situation an answer clearly violated the considered axiom, the analyst would be allowed to conclude that the process which determines the decision maker’s preferences with respect to this choice situation violates the axiom in question. If, however, no such violation occurred in any of the presented choice situations, the analyst still could not be certain that the axiom would not be violated in other choice situations. The analyst should also take into account the fact that the process which determines the answer of the decision maker in choice situation n can be influenced by the way in which he was led to think and answer in the n − 1 previous situations. Interrogation protocols which do not escape the trap of this influence obscure the conditions in which the analyst can check whether a set of axioms conforms with a system of preferences that pre-exists in the decision maker’s mind (Roy 2010).

  2.

    Let us suppose now that the analyst is not interested in knowing whether the decision maker has in his mind a system of preferences concordant with the considered set of axioms, but, more modestly, whether he accepts each axiom as a working hypothesis. Such a question can be posed directly to the decision maker only if the axioms are sufficiently simple and easy for him to understand. For any other axiom, the analyst will have to present situations that are as realistic as possible and check the decision maker’s reaction to them. These situations should be conceived so as to show the decision maker that accepting the axiom implies one type of answer, and rejecting it implies another. Here too, the way of conceiving the situations and the underlying questions is not neutral. Consequently, the responses obtained have to be seen as constructs rather than a reflection of an objective reality. It follows that the analyst influences, more or less consciously, the acceptance or rejection of an axiom as a working hypothesis.

  3.

    Accepting the axioms of a certain set one by one does not ensure that the set as a whole would be accepted by the decision maker. Considering all axioms of a set jointly may reveal emergent phenomena that the decision maker cannot see when analyzing the axioms sequentially. It would thus be wrong to suggest that accepting each axiom from a set implies the validity of all results following from the set of axioms as a whole. This wrong conclusion stems from a confusion between two levels of logic: that of the parts and that of the whole. This probably explains why the answers obtained by Maurice Allais to the choice situations presented to the adherents of utility theory (and, in particular, to many of its founders) appeared to be discordant with the results following from the set of axioms on which this theory is based (Allais 1953; Allais and Hagen 1979).
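The discordance mentioned in the last point can be made concrete with the version of Allais's choice situations that is usually quoted in the literature. The amounts and probabilities below are those of this commonly cited version, not figures taken from the present paper, and u denotes an assumed von Neumann-Morgenstern utility function with amounts expressed in millions.

```latex
% Choice situation 1: A = 1 million with certainty
%                     B = 5 million (prob. 0.10), 1 million (0.89), nothing (0.01)
% Choice situation 2: C = 1 million (prob. 0.11), nothing (0.89)
%                     D = 5 million (prob. 0.10), nothing (0.90)
\begin{align*}
A \succ B &\iff u(1) > 0.10\,u(5) + 0.89\,u(1) + 0.01\,u(0) \\
          &\iff 0.11\,u(1) > 0.10\,u(5) + 0.01\,u(0), \\
D \succ C &\iff 0.10\,u(5) + 0.90\,u(0) > 0.11\,u(1) + 0.89\,u(0) \\
          &\iff 0.10\,u(5) + 0.01\,u(0) > 0.11\,u(1).
\end{align*}
```

The two final inequalities contradict each other, so no utility function u can represent the pair of answers "A preferred to B" and "D preferred to C", although each answer taken alone looks reasonable to many respondents.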

The above considerations should not lead one to believe that axiomatic work is not useful. On the contrary, it can give the analyst a better understanding of how the method actually behaves and how it compares to other methods at the axiomatic level. Such an analysis made it possible, for example, to compare the capacity of preference representation of a general utility function and three of its special cases (associative operator, Sugeno integral and ordered weighted maximum) on one hand, and a set of rough set decision rules on the other hand (Greco et al. 2004). The formal proof that the decision-rule aggregation (preference) model is the most general among the known aggregation functions is a useful conclusion of this study.

As to the influence of this question on the choice of a multicriteria decision aiding method in the twelve contexts described in “A number of actual decision contexts”, no such influence was observed, with the possible exception of the way the case of Siting of a nuclear power-plant (2) was treated by Keeney and Nair (1976) and Keeney and Robillard (1977).

Question 2c:

Can the weak points of the method affect the final choice?

All methods of decision aiding have some weak points. The analyst should know them if she is familiar with the method. She has to examine the possible impact of these weak points in the considered decision context.

An example of a weak point of many methods that involve pairwise comparisons of actions to build a preference model is known as “rank reversal”. This weakness can compromise a method that one would like to use in a decision context where the set of actions may be modified incrementally, as when ranking an unstable set of actions. Another weak point could be the calculation time, if it is incompatible with the conditions of using the method in the considered decision context; for example, calculations that take too long between dialogue phases in a multiobjective optimization method.
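Rank reversal can be reproduced with a very small sketch. The scoring rule used here, a Borda-like count of pairwise wins over all criteria, is only an assumed stand-in for the family of methods built on pairwise comparisons, and the three actions and their performances on five criteria are hypothetical.

```python
def pairwise_scores(performances):
    """Score each action by counting, over all criteria, how many times it
    strictly outperforms another action (a Borda-like pairwise count)."""
    scores = {}
    for name, perf in performances.items():
        scores[name] = sum(
            1
            for other, other_perf in performances.items()
            if other != name
            for p, q in zip(perf, other_perf)
            if p > q
        )
    return scores


def ranking(performances):
    """Return the actions ordered from best to worst score."""
    scores = pairwise_scores(performances)
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical performances on five criteria, all to be maximized.
initial = {
    "a": (9, 9, 9, 5, 5),
    "b": (8, 8, 8, 7, 7),
}
# Action c is worse than b on every criterion, yet its arrival changes
# the relative position of a and b.
extended = dict(initial, c=(7, 7, 7, 6, 6))

print(ranking(initial))   # a is ranked ahead of b
print(ranking(extended))  # b is now ranked ahead of a
```

With these assumed data, a scores 3 against 2 for b when only two actions are compared. Once c is added, a scores 6 and b scores 7, so the order of a and b is reversed even though their own performances have not changed.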

The weak points of some popular methods have been clearly shown and discussed in various publications, e.g.: for ELECTRE methods in Figueira et al. (2013), for AHP in Bana e Costa and Vansnick (2008) and in Bouyssou et al. (2000, point 6.3.2), for TOPSIS in Martel and Roy (2006), for MAUT in McCord and de Neufville (1983), for methods based on Choquet integral in Roy (2009) and in Greco et al. (2013b), and for methods based on Sugeno integral in Słowiński et al. (2002).

In all the considered decision contexts, the weak points of the selected methods have been examined to provide reassurance about the final choice.

Conclusions

In this paper, we have reviewed the questions which, in our view, should be answered by an analyst before choosing the method to be used in a decision context. We have proposed a hierarchy of questions which, although quite general, may appear incomplete or inadequate in some specific cases.

After answering all these questions, the analyst can face two difficulties:

  (a)

    Her answers to the questions lead her to select a method which she does not master well enough or for which there is no software implementation available. If the analyst does not have enough time to learn the method, or if she is not able to develop or commission its software implementation, then she will be obliged to disregard this method and reconsider her answers to these questions.

  (b)

    Taking into account her answers to the questions, there is no appropriate method. If the analyst can afford it (in terms of competence, time and financing), she can try to design and implement an appropriate method. Otherwise, she must revisit her answers to these questions and accept some less satisfactory answers so that a method can be found.

The content of the questions, and the diversity of answers that can be given with respect to the decision context, lead us to the conclusion that it is not possible to conceive a family of criteria which would permit a multicriteria formulation of the problem of choosing a multicriteria decision aiding method. The few attempts known from the literature do not seem to have been successful (Ozernoy 1988, 1992; Guitouni and Martel 1998). The literature confirms that we are not alone in claiming that the choice of an appropriate method is one of the most difficult problems with which the analyst is confronted in multicriteria decision aiding (Belton and Stewart 2002).

The fact that in real-world contexts the answers to the nine questions formulated above do not boil down to “yes” or “no”, together with the fact that the short-listed methods are usually more or less adequate to the considered context, explains why so many methods have been developed, and why it is difficult to compare one method to another in an insightful way. Unfortunately, many researchers are tempted to compare different multicriteria decision aiding methods by basing their conclusions mainly on a comparison of the results obtained by these methods. The arguments set out in this article highlight the fact that such a comparison is ill-founded.