Cardiovascular Disease (CVD) is one of the leading causes of death worldwide. Early detection could prevent deaths associated with cardiac problems. In this work, we propose a methodology based on data pre-processing and Machine Learning (ML) techniques for predicting cardiovascular disease using the Sleep Heart Health Study (SHHS) dataset. First, principal component analysis and lowest p-value logistic regression are applied to select optimal features that could be related to CVD. The selected features are then used to train four ML algorithms: Naïve Bayes (NB), Feed Forward Neural Networks (NN), Support Vector Machine (SVM) and Random Forest (RF). A binary feature was used as the output of the proposed models, and the Synthetic Minority Over-sampling Technique (SMOTE) was used to balance the training set. Among the proposed methods, NN provided the best accuracy (0.81) and AUC (0.76), outperforming the results reported in other studies.
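To illustrate the balancing step, SMOTE creates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority-class neighbours. The following is a minimal pure-Python sketch of that idea, not the implementation used in this work: the function name `smote_oversample`, the choice `k=2`, the random seed, and the toy data are all assumptions made for illustration. In practice, a library implementation such as the one in imbalanced-learn would typically be used.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration only.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest minority neighbours (Euclidean, excluding itself).
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up data: 6 majority (label 0) vs 3 minority (label 1) samples.
majority = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.0, 0.4], [0.4, 0.1]]
minority = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.3]]

# Generate enough synthetic minority points to match the majority class size.
new_points = smote_oversample(minority, n_new=len(majority) - len(minority), k=2)
balanced_X = majority + minority + new_points
balanced_y = [0] * len(majority) + [1] * (len(minority) + len(new_points))
```

Consistent with the setup described above, such oversampling would be applied to the training set only, so that no synthetic samples enter the data used for evaluation.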