Deep Inside Learning

I work as an Android developer. I am currently learning machine learning, data science, and life too. This is my personal blog; any views expressed here are mine and not my employer's.

Read this first

TF IDF

TF-IDF stands for Term Frequency-Inverse Document Frequency. Basically, TF-IDF produces a vector of word weights for every document. To calculate TF-IDF there are some steps to follow:

  1. Create the term frequency matrix of the words in each document.

Example sentence: “I am very sleepy, and i want to go to bed”
The term frequency matrix is:
I = 2
am = 1
very = 1
sleepy = 1
and = 1
want = 1
to = 2
go = 1
bed = 1
xac.png

  2. Compute the inverse document frequency: idf(word) = log(N / n)

That is, take the logarithm of the number of rows (documents) N divided by the number of rows n in which the count of the word is more than 0.
xac.png

  3. Calculate the tf-idf
    After that, multiply the tf matrix by the idf matrix.
    xac.png

  4. Normalize the tf-idf
    As you can see, the result of tf-idf is not normalized, so we need 2 steps to normalize it:

  • Find the weight / divider for every word. xac.png

xac.png

  • Divide the tf-idf of every document by the weight of the document. xac.png
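
The four steps above can be sketched in Java roughly as follows. This is only a minimal sketch under some assumptions that are not in the post: the class and method names are made up, words are lower-cased and split on non-alphanumeric characters, the logarithm is the natural log, and the "weight of the document" is assumed to be the L2 norm (square root of the sum of squared tf-idf values) of that document.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper class, for illustration only.
final class TfIdfSketch {

    // Step 1: count how often each lower-cased word occurs in one document.
    static Map<String, Integer> termFrequency(String document) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : document.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Step 2: idf(word) = ln(N / number of documents that contain the word).
    static Map<String, Double> inverseDocumentFrequency(List<Map<String, Integer>> tfs) {
        Map<String, Integer> docCount = new LinkedHashMap<>();
        for (Map<String, Integer> tf : tfs) {
            for (String word : tf.keySet()) {
                docCount.merge(word, 1, Integer::sum);
            }
        }
        Map<String, Double> idf = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : docCount.entrySet()) {
            idf.put(e.getKey(), Math.log((double) tfs.size() / e.getValue()));
        }
        return idf;
    }

    // Steps 3 and 4: multiply tf by idf, then divide by the document's L2 norm
    // (the L2 norm is an assumption; the post does not name the weight it uses).
    static Map<String, Double> normalizedTfIdf(Map<String, Integer> tf, Map<String, Double> idf) {
        Map<String, Double> weights = new LinkedHashMap<>();
        double sumOfSquares = 0.0;
        for (Map.Entry<String, Integer> e : tf.entrySet()) {
            double w = e.getValue() * idf.getOrDefault(e.getKey(), 0.0);
            weights.put(e.getKey(), w);
            sumOfSquares += w * w;
        }
        double norm = Math.sqrt(sumOfSquares);
        if (norm > 0.0) {
            for (Map.Entry<String, Double> e : weights.entrySet()) {
                e.setValue(e.getValue() / norm);
            }
        }
        return weights;
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> tfs = new ArrayList<>();
        tfs.add(termFrequency("I am very sleepy, and i want to go to bed"));
        tfs.add(termFrequency("I want coffee")); // hypothetical second document
        Map<String, Double> idf = inverseDocumentFrequency(tfs);
        for (Map<String, Integer> tf : tfs) {
            System.out.println(normalizedTfIdf(tf, idf));
        }
    }
}
```

Note that a word that appears in every document gets idf = ln(1) = 0, so it contributes nothing to the vector.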

after we...

Continue reading →


STUFF I LEARNED AT SUMMER SCHOOL 2015: WEB SCIENCE AND BIG DATA ANALYTICS

Some time ago, Hafiz Badrie Lubis and I wrote about “SUMMER SCHOOL 2015: WEB SCIENCE AND BIG DATA ANALYTICS”. This time I will continue that post by giving a summary of each talk. Here are the summaries of the keynotes delivered by the speakers.

Keynote 1 - Digital Democracy: How the Web is empowering citizens, and how Big Data Analytics affords better understanding of government - Hisar Maruli Manurung, Ph.D (slide)

This opening talk was delivered by Pak Hisar Ruli Manurung. He first explained that social criticism has existed since 1991 through the “apakabar” mailing list, and then he talked about the 2014 election. He described how he launched the initial idea of guarding the election, and he presented a draft document about crowdsourcing the data input from...

Continue reading →


Cosine Similarity

As you know, there are several ways to measure the similarity between 2 data sets. Maybe you are familiar with Euclidean distance, which measures the straight-line distance between two data sets, or with Pearson correlation, which measures how strongly they are linearly related.

Another method is cosine similarity. It differs from a distance measure, which ranks similarity by the smallest distance between the data (in other words, two data sets are the same when their distance is 0): with cosine similarity, the more similar the data are, the closer the result should be to 1.

cos(θ) = (A · B) / (‖A‖ × ‖B‖)

That's the formula of cosine similarity: you calculate the dot product of the 2 data sets and divide it by the product of their norms (magnitudes). The more similar the data, the closer the result will be to 1. BUT, something to remember is that we must clean the data first, which means the data we use must be the intersection of the two data sets.
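
As a minimal sketch of that formula (not the implementation from the post), the calculation can be written in plain Java. The class and method names are made up, and the two arrays are assumed to already hold only the intersection of the two data sets, aligned index by index. Returning 0 for an all-zero vector is a convention chosen here, since the formula is undefined in that case.

```java
// Hypothetical helper class, for illustration only.
final class CosineSimilaritySketch {

    // cos(theta) = (a . b) / (|a| * |b|); both arrays must have the same length.
    static double cosineSimilarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0;
        double sumSquaresA = 0.0;
        double sumSquaresB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            sumSquaresA += a[i] * a[i];
            sumSquaresB += b[i] * b[i];
        }
        if (sumSquaresA == 0.0 || sumSquaresB == 0.0) {
            return 0.0; // convention chosen here: a zero vector is similar to nothing
        }
        return dot / (Math.sqrt(sumSquaresA) * Math.sqrt(sumSquaresB));
    }

    public static void main(String[] args) {
        double[] first = {1, 2, 3};
        double[] second = {2, 4, 6}; // same direction as first, twice as long
        System.out.println(cosineSimilarity(first, second));
    }
}
```

Because only the angle between the vectors matters, two data sets that differ only by scale still count as the most similar.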

The implementation of Cosine Similarity in Java can be...

Continue reading →


N Gram analysis

It’s been 2 weeks since I posted anything on my blog. Not because nothing interesting happened in these 2 weeks, but because they were taken up by my research project proposal, some Summer School, and also participating in Hackathon4Nation.

Now it’s Sunday night, and Serie A and the Premier League are off. I got bored and I don’t want my brain to sit idle, so I’m doing some research into N-Gram analysis. Why am I researching this? Because it relates to my research project about comparing text. I wrote the code in Java, did it in TDD (YEAYYYYY), and have tests for it. You can check it on GitHub and see how to use it in the tests.

OK, first let me explain N-gram. According to Wikipedia, an n-gram is a contiguous sequence of n items from a given sequence of text. So basically it will map a sequence of text, group it, and, having a key, map it into...
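
To make the definition concrete, here is a minimal Java sketch. It is not the code from the GitHub repository. It assumes the "items" are whitespace-separated words and that the grouping is a count per n-gram, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, for illustration only.
final class NGramSketch {

    // Returns every contiguous run of n words, in order of appearance.
    static List<String> wordNGrams(String text, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        String trimmed = text.trim();
        String[] words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= words.length; i++) {
            grams.add(String.join(" ", Arrays.copyOfRange(words, i, i + n)));
        }
        return grams;
    }

    // Groups identical n-grams, using the n-gram as the key and its count as the value.
    static Map<String, Integer> countNGrams(String text, int n) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String gram : wordNGrams(text, n)) {
            counts.merge(gram, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(wordNGrams("to be or not to be", 2));
        System.out.println(countNGrams("to be or not to be", 2));
    }
}
```

A text with fewer than n words simply produces no n-grams.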

Continue reading →


Preparing kerup.uk

Today I bought the domain kerup.uk. The purpose of this domain is to serve as a marketplace for selling snacks to users in Indonesia. Stay tuned for our launch. And yes, we are hiring. Contact goman@kerup.uk for more info.

View →


Official p.hasibuan@acm.org

Since yesterday, 1 August 2015, I’m officially registered as an ACM member, and this is my email address: p.hasibuan@acm.org. My purpose in joining is to be able to access the journals in ACM. I also plan to attend some ACM conferences, such as the KDD conference and the RecSys conference, and I’m planning to have 1 or 2 publications at ACM conferences next year.

I hope to contribute more to ACM.

View →


MultiPart Upload File using Retrofit

We can upload a file to a server using several techniques, but here I want to demonstrate how to upload a file using Retrofit. To do so, you need to wrap your file in a TypedFile. How do you do it?

Basically, TypedFile takes 2 parameters in its constructor, the MIME type and the File:

TypedFile uploadFile = new TypedFile("video/mp4", new File("/sdcard/path/to/video.mp4"));

If you want to read the MIME type, you can use this utility class.
This is the code in your service interface:

@Multipart
@POST("/upload/file")
Observable<String> upload(@Part("file") TypedFile file);

First you need to add the @Multipart annotation to your method, and then use @Part to mark each parameter that is a part of the multipart request and to give its parameter name.

I’m using RxJava for Android. The code above uploads a file to the server, and after the upload finishes it returns the URL of the uploaded file. Easy, right?

But what...

Continue reading →


Porter/Duff in Android

2 days ago, I had a problem blending 2 views. I looked at the Android documentation about Porter/Duff and said OMG, what the hell is this, this is not helping.

pd.jpg

After searching, I found a good blog post explaining Porter/Duff. Its writer is a software engineer at Google.

As you can see, the basis of Porter/Duff is a division into 4 regions (Both, Source, Dest, Neither). But don’t forget we have 2 images, source and destination.

source.png dest.png

The four regions:

diagram.png

One region where only the source is present, one where only the destination is present, one where both are present, and one where neither is present.

By deciding on what happens in each of the four regions, various effects can be generated. For example, if the destination-only region is treated as blank, the source-only region is filled with the source color, and the ‘both’ region is filled with the destination color like this:

destatop-diagram.png
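
The region rule described above matches what Android calls PorterDuff.Mode.DST_ATOP (the name is inferred here from the diagram's file name). As a minimal sketch, its documented formula can be applied to a single colour channel in plain Java. The class and method names are made up, and values are assumed to be premultiplied by alpha and to lie in the range 0 to 1.

```java
// Hypothetical helper class, for illustration only.
final class DstAtopSketch {

    // Returns {alpha, color} of the blended pixel for one colour channel.
    // DST_ATOP: alphaOut = alphaSrc, colorOut = alphaSrc * colorDst + (1 - alphaDst) * colorSrc
    static float[] dstAtop(float srcAlpha, float srcColor, float dstAlpha, float dstColor) {
        float outAlpha = srcAlpha;
        float outColor = srcAlpha * dstColor + (1f - dstAlpha) * srcColor;
        return new float[] {outAlpha, outColor};
    }

    public static void main(String[] args) {
        // "Both" region: the destination colour wins.
        System.out.println(java.util.Arrays.toString(dstAtop(1f, 0.2f, 1f, 0.8f)));
        // Source-only region: the source colour is kept.
        System.out.println(java.util.Arrays.toString(dstAtop(1f, 0.5f, 0f, 0f)));
        // Destination-only region: the result is blank (fully transparent).
        System.out.println(java.util.Arrays.toString(dstAtop(0f, 0f, 1f, 0.7f)));
    }
}
```

Each of the three calls in main corresponds to one of the regions that contain a pixel; the "neither" region stays transparent because both alphas are 0.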

the result will...

Continue reading →


The Optimal Way to Use List

public List<String> getNames(List<Human> humans) {
    List<String> names = new ArrayList<>();
    for(Human human : humans) {
        names.add(human.name);
    }
    return names;
}

I bet you understand what this code will do: it gets the name of every human in the list. Yeah, and so? According to Jake Wharton's slides and Android Performance Patterns, we can improve this code so that it performs better. How?

  1. As you can see, we copy from the list of humans to a list of strings, so the size of the names list and the size of the humans list are the same. And yes, you can change it to:

    public List<String> getNames(List<Human> humans) {
        int humansSize = humans.size();
        List<String> names = new ArrayList<>(humansSize);
        for(Human human : humans) {
            names.add(human.name);
        }
        return names;
    }
    

    Why do we need this? Do you know how a list works? It grows its backing array, copying every element into a larger one (about 1.5 times the current size in OpenJDK's ArrayList), if...
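
To get a feel for the cost, the growth can be simulated. This is only a rough model, not ArrayList's real code: it assumes a starting capacity of 10 and the growth step used by OpenJDK's ArrayList (old capacity plus half of it), and the class and method names are made up for illustration.

```java
// Hypothetical helper class, for illustration only.
final class ListGrowthSketch {

    // Counts how many times the backing array would have to be reallocated
    // (and every element copied) while adding elementCount items one by one.
    static int countGrowths(int elementCount, int initialCapacity) {
        int capacity = Math.max(initialCapacity, 1); // avoid a zero capacity that could never grow
        int growths = 0;
        for (int size = 0; size < elementCount; size++) {
            if (size == capacity) {
                capacity = capacity + (capacity >> 1); // grow by about 1.5x
                growths++;
            }
        }
        return growths;
    }

    public static void main(String[] args) {
        int humansSize = 1000;
        System.out.println("default capacity: " + countGrowths(humansSize, 10) + " reallocations");
        System.out.println("presized list:    " + countGrowths(humansSize, humansSize) + " reallocations");
    }
}
```

With the capacity passed to the constructor up front, the loop never hits a full array, so no copying happens at all.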

Continue reading →


DON’T GIVE UP

“I give up.” That’s what comes to my mind when I fail at something, when I don’t get what I want, and when I get something that I didn’t expect and don’t want.

And then I feel like I’m at the bottom of the sea and cannot breathe. And then I read this: http://tinybuddha.com/blog/why-we-dont-always-get-what-we-want
It’s a great blog to read. From that blog I learned something: if I fail and stop there, I learn nothing from it (even though you are very upset). Yes, I know that when you feel upset you feel like nothing, you don’t want to eat anything, and your eyes feel sleepy all the time. Yes, you need some buffer time to get from being upset back to being in good spirits. But how much buffer time do you actually need?

I don’t know. It depends on what your failure is, how well you can mentally accept it, how big the impact is, and whether you are jealous of other people who succeeded at the same thing.

I remember when I was hiking Gunung Gde, I...

Continue reading →