A Voting Classifier is an ensemble meta-classifier that fits multiple base classifiers on the dataset and uses their average predicted probabilities (for soft voting) or majority vote (for hard voting) to predict the class labels. This can be useful for combining the strengths of different algorithms to achieve better overall performance.
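To make the two aggregation rules concrete, here is a minimal plain-NumPy sketch (the probability values are hypothetical) showing how three classifiers' outputs for a single sample are combined, and how the two rules can disagree when one classifier is very confident:

```python
import numpy as np

# Hypothetical class-probability outputs of three base classifiers
# for one sample over two classes (rows: classifiers, cols: classes).
probas = np.array([
    [0.90, 0.10],
    [0.40, 0.60],
    [0.45, 0.55],
])

# Hard voting: each classifier casts one vote for its most likely class,
# and the majority class wins.
votes = probas.argmax(axis=1)            # [0, 1, 1]
hard_pred = np.bincount(votes).argmax()  # class 1 (two votes to one)

# Soft voting: average the probabilities across classifiers, then argmax.
soft_pred = probas.mean(axis=0).argmax()  # class 0 (first classifier's
                                          # high confidence outweighs the
                                          # two narrow majorities)

print("hard:", hard_pred, "soft:", soft_pred)
```

Soft voting weights each classifier by its confidence, which is why it often outperforms hard voting when the base models produce well-calibrated probabilities.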
Here's how to use the Voting Classifier in scikit-learn:
```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
```
```python
# For demonstration, let's use the iris dataset
iris = datasets.load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```
```python
log_clf = LogisticRegression()
dt_clf = DecisionTreeClassifier()
svm_clf = SVC(probability=True)  # probability=True is required for soft voting
```
You can choose between hard voting (voting='hard') or soft voting (voting='soft'):
```python
voting_clf = VotingClassifier(
    estimators=[('lr', log_clf), ('dt', dt_clf), ('svc', svm_clf)],
    voting='hard'
)
voting_clf.fit(X_train, y_train)
```
```python
from sklearn.metrics import accuracy_score

y_pred = voting_clf.predict(X_test)
print("Voting classifier accuracy:", accuracy_score(y_test, y_pred))
```
This step is optional but useful for checking whether the Voting Classifier actually provides an advantage over its individual classifiers.
```python
for clf in (log_clf, dt_clf, svm_clf, voting_clf):
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    print(clf.__class__.__name__, "accuracy:", accuracy_score(y_test, y_pred))
```
Remember: soft voting requires every base classifier to be able to estimate class probabilities (i.e., to expose a predict_proba method in scikit-learn). In many scenarios, especially when the individual classifiers are weak or only moderately accurate, the Voting Classifier can yield better results by leveraging the strengths of each individual model.