To execute non-repetitive tasks, robots need to learn during the task itself in order to improve task performance, because a performance model cannot be built in advance for such tasks. The robot can execute a small portion of the task with a given set of process parameters and iteratively update those parameters based on the observed task performance. To make the learning process efficient, the trade-off between exploration and exploitation should be considered explicitly. Too much exploration wastes time without significantly improving task performance. On the other hand, stopping exploration prematurely may lead to suboptimal task performance. This paper describes a sequential decision-making approach for selecting the process parameters that improve task performance. The overall learning approach uses feasibility-biased sampling, surrogate model construction, and greedy optimization. We implement our approach in a simulation of robotic sanding and compare it with other design-of-experiments methods.
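
The loop of feasibility-biased sampling, surrogate model construction, and greedy optimization can be illustrated with a minimal sketch. It is not the implementation evaluated in this paper: the single process parameter (a sanding force), the `run_trial` performance function, the `is_feasible` limit, the inverse-distance-weighted surrogate, and all numeric settings are hypothetical choices made only for illustration.

```python
import random


def run_trial(force):
    """Hypothetical stand-in for executing a small portion of the task.
    Returns task performance (higher is better); unknown to the learner."""
    return -(force - 6.0) ** 2


def is_feasible(force):
    """Hypothetical feasibility limit, e.g. forces above 8.0 damage the part."""
    return 0.0 <= force <= 8.0


def sample_candidates(feasible_seen, n, rng, lo=0.0, hi=10.0, bias=0.8, spread=1.0):
    """Feasibility-biased sampling: with probability `bias`, perturb a parameter
    already known to be feasible; otherwise draw uniformly from the full range."""
    candidates = []
    for _ in range(n):
        if feasible_seen and rng.random() < bias:
            center = rng.choice(feasible_seen)
            x = min(hi, max(lo, rng.gauss(center, spread)))
        else:
            x = rng.uniform(lo, hi)
        candidates.append(x)
    return candidates


def surrogate(x, observations, eps=1e-9):
    """Assumed surrogate: inverse-distance-weighted average of observed performance."""
    num = den = 0.0
    for xi, yi in observations:
        w = 1.0 / ((x - xi) ** 2 + eps)
        num += w * yi
        den += w
    return num / den


def learn(n_trials=25, n_candidates=30, start=2.0, seed=0):
    """Sequentially pick parameters; `start` is assumed to be a known feasible setting."""
    rng = random.Random(seed)
    observations = []   # (parameter, performance) pairs from feasible trials
    feasible_seen = []  # parameters that turned out to be feasible
    history = []        # every parameter that was tried, in order
    x = start
    for _ in range(n_trials):
        history.append(x)
        if is_feasible(x):
            observations.append((x, run_trial(x)))
            feasible_seen.append(x)
        # Greedy step: choose the candidate with the best surrogate prediction.
        candidates = sample_candidates(feasible_seen, n_candidates, rng)
        x = max(candidates, key=lambda c: surrogate(c, observations))
    best_x, best_y = max(observations, key=lambda o: o[1])
    return best_x, best_y, history


if __name__ == "__main__":
    best_x, best_y, history = learn()
    print(f"best force: {best_x:.2f}, performance: {best_y:.3f}, trials: {len(history)}")
```

In this sketch, exploration comes only from the uniform draws in `sample_candidates`, so the assumed `bias` parameter is the single knob that trades exploration against exploitation.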