Table 1 Fitness of self-contained classifiers to address characteristics and issues in enterprise data.

Classifiers	Imbalance	Mixed features	Heterogeneity	Sparsity	Inconsistency	Dynamics	Data quality issues
KNN	\(\checkmark\)
Naive Bayes	\(\checkmark\)	\(\checkmark\)
SVM	\(\checkmark\)		\(\checkmark\)	\(\checkmark\)
Decision Tree	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)
Random Forest	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)
XGBoost	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)
DNN		\(\checkmark\)	\(\checkmark\)	\(\checkmark\)
Table2Vec	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)	\(\checkmark\)

Typical classifiers focus only on classification, so each analyst must repeat substantial effort to handle similar data quality issues. Table2Vec instead addresses data quality issues and representation learning together, enabling end-to-end, automated enterprise data science.
