Extreme Learning Machine (ELM) is a powerful tool for regression and classification on large data sets. Because ELM does not need to learn the parameters of its hidden neurons, it can train a thousand times faster than conventional popular learning algorithms. Since the parameters in the hidden layers are randomly generated, a natural question arises: what is the optimal randomness? The Lévy distribution, a heavy-tailed distribution, has been shown to be the optimal randomness for finding targets in an unknown environment. Using the Lévy distribution to generate the hidden-layer parameters therefore makes it more likely that the optimal parameters are reached, and it yields better computational results. Since the Lévy distribution is a special case of the Mittag-Leffler distribution, in this paper we use the Mittag-Leffler distribution with the aim of obtaining better performance. We first show how to generate Mittag-Leffler random numbers and then give the training algorithm that uses them. The experimental results show that the Mittag-Leffler distribution performs similarly to the Lévy distribution, and that both outperform the conventional method. Finally, we discuss the experimental results in detail.
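To make the idea concrete, the following is a minimal sketch of an ELM whose hidden-layer parameters are drawn from a Mittag-Leffler distribution. It is an illustration under stated assumptions, not necessarily the procedure used in this paper. The sampler uses the standard inversion formula for one-sided Mittag-Leffler random numbers with index 0 < alpha <= 1. The random sign flip, the tanh activation, the pseudoinverse solution for the output weights, and all function and parameter names (`mittag_leffler_rvs`, `elm_fit`, `elm_predict`, `alpha`, `scale`, `n_hidden`) are choices made here for illustration only.

```python
import numpy as np


def mittag_leffler_rvs(alpha, scale, size, rng):
    """Draw one-sided (non-negative) Mittag-Leffler samples, 0 < alpha <= 1.

    Uses tau = -scale * ln(u) * (sin(alpha*pi*(1-v)) / sin(alpha*pi*v))**(1/alpha),
    an algebraically equivalent form of the usual
    sin(alpha*pi)/tan(alpha*pi*v) - cos(alpha*pi) expression.
    alpha = 1 reduces to the exponential distribution.
    """
    u = 1.0 - rng.random(size)  # uniform on (0, 1], avoids log(0)
    v = 1.0 - rng.random(size)  # uniform on (0, 1], avoids division by zero
    a = alpha * np.pi
    ratio = np.sin(a * (1.0 - v)) / np.sin(a * v)
    return -scale * np.log(u) * ratio ** (1.0 / alpha)


def hidden_output(X, W):
    """Hidden-layer output; the last row of W holds the biases."""
    # tanh is an assumed activation; it saturates without overflow
    # even when heavy-tailed weights produce very large pre-activations.
    return np.tanh(X @ W[:-1] + W[-1])


def elm_fit(X, Y, n_hidden, alpha=0.9, scale=1.0, seed=0):
    """Train an ELM: random hidden parameters, least-squares output weights."""
    rng = np.random.default_rng(seed)
    shape = (X.shape[1] + 1, n_hidden)  # input weights plus one bias row
    magnitude = mittag_leffler_rvs(alpha, scale, shape, rng)
    # Assumption: a random sign makes the weights symmetric around zero.
    sign = rng.choice([-1.0, 1.0], size=shape)
    W = sign * magnitude
    H = hidden_output(X, W)
    beta = np.linalg.pinv(H) @ Y  # minimum-norm least-squares solution
    return W, beta


def elm_predict(X, W, beta):
    """Apply the fixed hidden layer and the learned output weights."""
    return hidden_output(X, W) @ beta


# Usage on a toy regression problem (synthetic data, for illustration only).
X_demo = np.linspace(-3.0, 3.0, 200).reshape(-1, 1)
y_demo = np.sin(X_demo).ravel()
W_demo, beta_demo = elm_fit(X_demo, y_demo, n_hidden=50, alpha=0.9)
y_hat = elm_predict(X_demo, W_demo, beta_demo)
```

Only the output weights `beta` are fitted; the hidden parameters `W` stay at their random values, which is what makes ELM training fast. Lowering `alpha` makes the tail heavier, so occasional very large weights appear. Very small `alpha` values can overflow in the power `1/alpha`, so the sketch is best used with moderate values.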