APPLICATION OF PART-OF-SPEECH FILTERING TO FEATURE SELECTION IN THE SUPPORT VECTOR MACHINE METHOD FOR TWITTER SENTIMENT ANALYSIS OF THE 2017 DKI JAKARTA GUBERNATORIAL ELECTION

Fregy Damara, Hartatik
UNIVERSITAS AMIKOM YOGYAKARTA, 2017

ABSTRACT

In this research, the authors investigate the effect of feature selection on the Support Vector Machine (SVM) when classifying the sentiment of Twitter tweets. The input space given to the SVM consists of features that have passed through a Part-of-Speech (POS) Filtering stage, which selects the word classes that are appropriate for the model's learning process from theoretical and linguistic perspectives. Four tags are selected: noun (NN), verb (VB), adjective (JJ), and adverb (RB).
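The filtering step described above can be illustrated with a minimal sketch. It assumes the tweet has already been tokenized and tagged by some POS tagger; the function name `pos_filter`, the exact-match rule on the four tags, and the hand-tagged English example are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of POS filtering: keep only words tagged NN, VB, JJ, or RB.
# The tag set comes from the abstract; everything else here is hypothetical.
SELECTED_TAGS = {"NN", "VB", "JJ", "RB"}


def pos_filter(tagged_tokens):
    """Return the words whose POS tag is one of the selected tags.

    tagged_tokens: iterable of (word, tag) pairs produced by any POS tagger.
    """
    return [word for word, tag in tagged_tokens if tag in SELECTED_TAGS]


# Hypothetical, hand-tagged example (not taken from the thesis data set).
tagged_tweet = [
    ("the", "DT"),
    ("honest", "JJ"),
    ("candidate", "NN"),
    ("spoke", "VB"),
    ("clearly", "RB"),
]

filtered = pos_filter(tagged_tweet)
print(filtered)  # ['honest', 'candidate', 'spoke', 'clearly']
```

Only the determiner is dropped here; in practice the filter would reduce the feature list to content-bearing words before any weighting takes place.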

The input space is first processed by calculating TF-IDF weights. After the TF-IDF weights of each tweet have been calculated, the next step is to measure the similarity between the tweet's TF-IDF weights and each of the positive, negative, and neutral feature lists by calculating Cosine Similarity weights. These three Cosine Similarity weights are then classified by the Support Vector Machine.
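The weighting steps above can be sketched as follows. This is a rough illustration under several assumptions that the abstract does not confirm: each feature list is treated as one document alongside the tweet, IDF uses a smoothed form, and the helper names (`tf_idf_vectors`, `cosine_similarity`, `similarity_features`) and the toy word lists are hypothetical.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Compute sparse TF-IDF vectors (term -> weight) for a list of token lists."""
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in docs:
        term_counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            # Assumed smoothed IDF: log((1 + N) / (1 + df)) + 1
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in term_counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_features(tweet_tokens, feature_lists):
    """Return [positive, negative, neutral] cosine similarities for one tweet."""
    labels = ["positive", "negative", "neutral"]
    docs = [tweet_tokens] + [feature_lists[label] for label in labels]
    tweet_vec, *list_vecs = tf_idf_vectors(docs)
    return [cosine_similarity(tweet_vec, vec) for vec in list_vecs]


# Hypothetical feature lists and tweet (not taken from the thesis data set).
feature_lists = {
    "positive": ["good", "fair", "honest"],
    "negative": ["bad", "corrupt", "unfair"],
    "neutral": ["debate", "vote", "candidate"],
}
tweet = ["debate", "good", "fair"]

features = similarity_features(tweet, feature_lists)
print(features)  # three weights, ordered positive, negative, neutral
```

Each tweet is thereby reduced to a three-dimensional vector. Such vectors, paired with sentiment labels, could then be passed to any SVM implementation (for example, scikit-learn's `SVC`) for training and prediction.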

In addition, a comparison of the two models (with and without POS Filtering) showed that the model without POS Filtering outperformed the model with POS Filtering, reaching an accuracy of 99.25%, whereas the model with POS Filtering obtained 96.66%. The prediction accuracy of each model was 53.33% for the model without POS Filtering and 56.66% for the model with POS Filtering. This indicates that the number of features in the feature lists used in the Cosine Similarity weighting process affects the classification performed by the Support Vector Machine.

Keywords: Support Vector Machine, Part of Speech, Part of Speech Filtering, Text Mining, Filtering Feature Selection, Sentiment Analysis.

Category: Undergraduate Thesis
Posted Date: (undocumented)
Modified Date: 11 September 2017