Dynamic Treatment Regimes (DTRs) are useful for modelling how decision-makers should act in order to achieve their goals. To estimate an optimal DTR for a particular task, one typically makes the simplifying assumption that the decision-maker's goal is to optimize the expectation of a scalar-valued outcome, for example symptom level, side-effect burden, or some fixed combination of these that reflects the decision-maker's preferences. Furthermore, when estimating the optimal action for a given time-point, one also assumes that the decision-maker will act optimally in the future. We discuss how these two assumptions are related and describe a practical algorithm for weakening them in order to produce a more realistic decision support system. Our motivating application is clinical decision support, where the presence of competing objectives can make both assumptions unreasonable.