Out of the Box: Machine Learning
The Talend (Real-Time) Big Data Platform includes a large assortment of Machine Learning components that allow analysis to be performed directly in Talend Studio without custom coding.
The table below lists the Machine Learning components with a description of each. For complete information on the Machine Learning components, see the documentation.
Icon | Name | Component | Description |
---|---|---|---|
![]() | ALS (Alternating Least Squares) | tALSModel | Processes information received from Spark and performs ALS computations on these datasets to generate and write a product recommender model (user ranking) in Parquet format |
![]() | Bayes (Naïve) | tNaiveBayesModel | Analyzes datasets, applies Bayes’ law with the naïve assumption that features are independent, and generates a classification model in PMML (Predictive Model Markup Language) format |
![]() | Classification | tClassify | Uses a given model to classify elements in the dataset |
![]() | Classification (Support Vector Machine) | tClassifySVM | Uses an SVM (Support Vector Machine) model to classify elements in the dataset |
![]() | Decision Trees | tDecisionTreeModel | Uses the Decision Tree algorithm to generate a classification model |
![]() | Gradient Boosted Tree Model | tGradientBoostedTreeModel | Generates a binary classification model |
![]() | K-Means | tKMeansModel | Analyzes incoming datasets and applies the K-means algorithm to produce a clustering model |
![]() | K-Means (Streaming) | tKMeansStrModel | Analyzes incoming datasets and applies the K-means algorithm in real time |
![]() | Linear Regression | tLinearRegressionModel | Builds a linear regression model using a training dataset |
![]() | Logistic Regression | tLogisticRegressionModel | Analyzes incoming datasets and applies the Logistic Regression algorithm to produce a classification model |
![]() | Model Encoder | tModelEncoder | Applies a wide range of feature processing algorithms: HashingTF, Inverse document frequency, Word2Vector, CountVectorizer, Binarizer, Bucketizer, Discrete Cosine Transform (DCT), MinMaxScaler, N-gram, Normalizer, One hot encoder, PCA, Polynomial expansion, Quantile Discretizer, Regex tokenizer, Tokenizer, SQL Transformer, Standard scaler, StopWordsRemover, String indexer, Vector indexer, Vector assembler, ChiSQSelector, RFormula, VectorSlicer |
![]() | Predict | tPredict | Uses a given classification, clustering or relationship model to analyze datasets |
![]() | Predict (Cluster) | tPredictCluster | Uses a given clustering model to assign the elements of a dataset to clusters |
![]() | Random Forest Model | tRandomForestModel | Analyzes incoming datasets and applies the Random Forest algorithm |
![]() | Recommend | tRecommend | Analyzes incoming data in conjunction with ALS computations using a user-defined recommendation model |
![]() | SVM (Support Vector Machine) | tSVMModel | Applies the SVM algorithm to analyze feature vectors |
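
Several rows in the table pair a model-building component with a component that applies the resulting model, for example tKMeansModel with tPredictCluster. The following is a minimal pure-Python sketch of that train-then-predict pattern using K-means (Lloyd's algorithm). It is an illustration only, not the code Talend or Spark generates; the function names `kmeans` and `predict`, the sample data, and the choice to seed centroids with the first `k` points are all assumptions made for this example.

```python
# Illustrative only: a tiny pure-Python K-means, not Talend's or Spark's implementation.
from math import dist  # Euclidean distance, available in Python 3.8+


def kmeans(points, k, iterations=20):
    """Cluster `points` (tuples of floats) into `k` groups with Lloyd's algorithm."""
    # Assumption for determinism: seed the centroids with the first k points.
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments no longer change the centroids
        centroids = new_centroids
    return centroids


def predict(point, centroids):
    """Return the index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


# Hypothetical sample data: two visually separate groups of 2-D points.
data = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
model = kmeans(data, k=2)       # the "model" is just the list of centroids
label = predict((1.5, 1.5), model)  # index of the cluster the new point falls into
```

Here `kmeans` plays the role of the model-building step and `predict` the role of the step that applies a stored model to new records. In the Talend components the model is written out and read back between the two steps, which this sketch skips by keeping the centroids in memory.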