Multi-armed bandit (MAB) is a well-known reinforcement learning technique that has shown strong performance in recommendation systems and other areas. Metaheuristic algorithms, in turn, have gained popularity for their performance on complex problems with very large search spaces. The Pendulum Search Algorithm (PSA) is a recently proposed metaheuristic inspired by the harmonic motion of a pendulum. Its main limitation is that it operates in a continuous domain and therefore cannot directly solve combinatorial optimization problems, whose variables take values in a discrete domain. To overcome this limitation, we propose using a two-step binarization technique, which offers a large number of possible combinations, each of which we call a scheme. We use MAB to learn and recommend a binarization scheme online, that is, during the execution of the iterations. Our experiments show that this approach delivers better results on the Set Covering Problem than using a fixed binarization scheme.
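To make the idea concrete, the following Python sketch shows one way a bandit could pick a two-step binarization scheme at each iteration. It is only an illustration under stated assumptions: the two transfer functions, the two binarization rules, the epsilon-greedy policy, the reward value, and every name in the code (`TRANSFER`, `RULES`, `SCHEMES`, `binarize`, `EpsilonGreedyBandit`) are hypothetical choices and not necessarily those used in this work.

```python
import math
import random
from itertools import product

# Step 1 (assumed examples): transfer functions map a continuous value
# to a probability in [0, 1].
TRANSFER = {
    # S-shaped sigmoid, written with tanh to avoid overflow for large |x|.
    "S": lambda x: 0.5 * (1.0 + math.tanh(x / 2.0)),
    # V-shaped transfer function.
    "V": lambda x: abs(math.tanh(x)),
}

# Step 2 (assumed examples): binarization rules turn that probability into a bit.
RULES = {
    "standard": lambda p, bit, rng: 1 if rng.random() < p else 0,
    "complement": lambda p, bit, rng: 1 - bit if rng.random() < p else bit,
}

# A scheme is one (transfer function, binarization rule) pair.
SCHEMES = list(product(TRANSFER, RULES))


def binarize(x, bits, scheme, rng):
    """Apply a two-step scheme to a continuous vector x and its current bits."""
    transfer_name, rule_name = scheme
    transfer, rule = TRANSFER[transfer_name], RULES[rule_name]
    return [rule(transfer(xi), b, rng) for xi, b in zip(x, bits)]


class EpsilonGreedyBandit:
    """Epsilon-greedy bandit (an assumed policy) with one arm per scheme."""

    def __init__(self, n_arms, epsilon=0.1, rng=None):
        self.epsilon = epsilon
        self.rng = rng or random.Random()
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms  # running mean reward per arm

    def select(self):
        # Explore with probability epsilon, otherwise exploit the best arm.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        # Incremental update of the mean reward observed for this arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


# One illustrative iteration on a toy continuous position.
rng = random.Random(0)
bandit = EpsilonGreedyBandit(len(SCHEMES), rng=rng)
arm = bandit.select()
new_bits = binarize([0.3, -1.2, 2.0], [0, 1, 0], SCHEMES[arm], rng)
# Assumed reward: 1.0 if the binarized solution improved the best fitness.
bandit.update(arm, reward=1.0)
```

In a full run, the metaheuristic would call `select` before binarizing its population at each iteration and `update` once the fitness of the resulting binary solutions is known, so that schemes that tend to produce improvements are chosen more often.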