Douban crawler

As I mentioned in my previous article, I extended my program to collect data from the Douban site.

I started by collecting user data from group member lists. It’s not hard; I just need to wait a short interval between requests because of the site’s time-out protection mechanism. After trying a few groups I had data for 21,000 users, which is enough for me.
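To show what I mean by waiting between requests, here is a minimal sketch, not my actual script. The method name `crawl_member_pages`, the `delay_seconds` parameter, and the two-second default are all made up for illustration. The page fetch is passed in as a block, so the pacing logic does not depend on any particular HTTP library or on how Douban's pages are laid out.

```ruby
# Hypothetical sketch: walk a paginated member list and pause between requests.
# The block receives a zero-based page index and must return an array of user ids.
def crawl_member_pages(page_count, delay_seconds: 2.0)
  members = []
  page_count.times do |page|
    members.concat(yield(page))
    # Pause before the next request, but not after the last page.
    sleep(delay_seconds) if page < page_count - 1
  end
  # The same user can show up on more than one page, so drop duplicates.
  members.uniq
end
```

In real use the block would download and parse one member-list page; the right delay is something to find by trial and error.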

The other part is movie data. I plan to get all the favorite movies of the users I have collected. I am still parsing the ugly HTML, so I think it’s a good time for me to practice Ruby’s rescue keyword.
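A rough sketch of the kind of `rescue` practice I have in mind, under some assumptions: `ParseError` is a stand-in error class I define here, not something a real parser or HTTP client raises, and `with_retries` with its parameters is a made-up helper. It retries a block a few times and returns nil if it keeps failing, so one bad page does not stop the whole crawl.

```ruby
# Hypothetical error class standing in for whatever a real parser would raise.
class ParseError < StandardError; end

# Run the block, retrying on ParseError up to max_attempts times in total.
# Returns the block's value, or nil once every attempt has failed.
def with_retries(max_attempts: 3, wait_seconds: 1.0)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue ParseError => e
    if attempts < max_attempts
      sleep(wait_seconds)
      retry # re-runs the begin block from the top
    end
    warn "giving up after #{attempts} attempts: #{e.message}"
    nil
  end
end
```

Rescuing only one specific error class keeps real bugs (a typo, a nil where I did not expect one) visible instead of silently swallowing them.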

I still need to run some tests to see whether the script is robust.

There are still many things to do. I want to practice d3.js to display the relationships between users and movies, and after that maybe try impressive.js.

I saw a website that uses impressive.js for its web design. Really impressive!! Maybe I will try to create one. :)

 