TF-IDF
TF-IDF stands for Term Frequency-Inverse Document Frequency. Basically, TF-IDF produces a vector of word weights for every document. To calculate TF-IDF, there are a few steps to follow:
- Create the term frequency matrix of the words in each document.
Example sentence: “I am very sleepy, and i want to go to bed”
The term frequency matrix (counting “I” and “i” as the same word) is:
I = 2
am = 1
very = 1
sleepy = 1
and = 1
want = 1
to = 2
go = 1
bed = 1
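
The counts above can be reproduced with a few lines of Python. This is only a minimal sketch: the helper name `term_frequency` is made up for illustration, and it assumes that words are lowercased and punctuation is dropped before counting.

```python
import re
from collections import Counter

def term_frequency(text):
    """Count how often each word occurs, ignoring case and punctuation."""
    # Lowercase first so that "I" and "i" are counted as the same word.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)

tf = term_frequency("I am very sleepy, and i want to go to bed")
print(tf["i"], tf["to"], tf["bed"])  # 2 2 1
```

With more than one document, the same function would be called once per document, giving one row of the term frequency matrix each time.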

- Compute the inverse document frequency.
For every word, take the logarithm of the total number of rows (documents) divided by the number of rows in which the count of that word is more than 0.
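
As a rough sketch of that formula, assuming the natural logarithm and one dictionary of word counts per document: the function name `inverse_document_frequency` and the three tiny documents are invented for illustration only.

```python
import math
from collections import Counter

def inverse_document_frequency(tf_rows):
    """tf_rows holds one mapping of word -> count per document (one row each)."""
    n_docs = len(tf_rows)
    # Only words that actually occur somewhere end up in the vocabulary.
    vocabulary = {w for row in tf_rows for w, c in row.items() if c > 0}
    idf = {}
    for word in vocabulary:
        # Number of rows (documents) in which the count of the word is above 0.
        doc_freq = sum(1 for row in tf_rows if row.get(word, 0) > 0)
        idf[word] = math.log(n_docs / doc_freq)
    return idf

# Hypothetical three-document corpus, already turned into word counts.
tf_rows = [
    Counter("i want to go to bed".split()),
    Counter("i am very sleepy".split()),
    Counter("go to bed".split()),
]
idf = inverse_document_frequency(tf_rows)
print(idf["sleepy"] > idf["to"])  # True: a rarer word gets a larger idf
```

Some libraries add smoothing terms to this formula, so their numbers can differ from the plain version sketched here.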

Calculate the TF-IDF
After that, multiply the tf matrix by the idf values: each word's term frequency times that word's idf.
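
A minimal sketch of this step, assuming the term frequencies and idf values are already available as dictionaries. The function name `tf_idf` and the two small documents are hypothetical, chosen so that one word appears in every document.

```python
import math

def tf_idf(tf_rows, idf):
    """Multiply each term frequency by the idf of its word, row by row."""
    return [
        {word: count * idf[word] for word, count in row.items()}
        for row in tf_rows
    ]

# Hypothetical inputs: two documents, "bed" occurs in both of them.
tf_rows = [{"to": 2, "bed": 1}, {"sleepy": 1, "bed": 1}]
idf = {
    "to": math.log(2 / 1),      # appears in 1 of 2 documents
    "sleepy": math.log(2 / 1),  # appears in 1 of 2 documents
    "bed": math.log(2 / 2),     # appears in both documents, so idf is 0
}
scores = tf_idf(tf_rows, idf)
print(scores[0]["bed"])  # 0.0
```

A word that occurs in every document ends up with a score of 0, which is the point of the idf factor: it pushes down words that do not help to tell documents apart.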

Normalize the TF-IDF
As you can see, the TF-IDF result is not normalized. So we need 2 steps to normalize it:
- find the weight (the divider) of each document from the tf-idf values of its words.

- divide the TF-IDF values of every document by the weight of that document
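
The two steps can be sketched as follows. The post does not say which weight is used, so this sketch assumes the Euclidean length (L2 norm) of each document's tf-idf row, which is a common choice. The function name `normalize_rows` and the sample values are made up for illustration.

```python
import math

def normalize_rows(tfidf_rows):
    """Scale every document row so that its Euclidean length becomes 1."""
    normalized = []
    for row in tfidf_rows:
        # Step 1: the weight (divider) of the document.
        # Assumption: the L2 norm of its tf-idf values.
        weight = math.sqrt(sum(v * v for v in row.values()))
        if weight == 0:
            # An all-zero row cannot be scaled, so it is kept unchanged.
            normalized.append(dict(row))
        else:
            # Step 2: divide every tf-idf value by the document weight.
            normalized.append({w: v / weight for w, v in row.items()})
    return normalized

rows = normalize_rows([{"to": 3.0, "bed": 4.0}])
print(rows[0])  # {'to': 0.6, 'bed': 0.8}
```

After this step long and short documents become comparable, because every non-empty row has the same length.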
after we...
Keynote 1 - Digital Democracy: How the Web is empowering citizens, and how Big Data Analytics affords better understanding of government - Hisar Maruli Manurung, Ph.D (slide)
This opening talk was given by Mr. Hisar Ruli Manurung. He first told the story of how social criticism has existed since 1991 through the “apakabar” mailing list, and then he talked about the 2014 election. He described how he launched the initial idea of safeguarding the election (pengawalan pemilu), and he presented a draft document on using crowdsourcing for data input from...




