iGoogle _IG_FetchContent supports Big5

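A few days ago I noticed that the iGoogle gadget API functions _IG_FetchContent, _IG_FetchXmlContent and _IG_FetchFeedAsJSON have started to support Big5! Previously, when writing a gadget, these functions were unusable whenever you had to fetch a Big5 page, because all three assumed the page content was UTF-8. But a few days ago I fetched a Big5 page and found that the content came out fine. It looks like they now guess the page's encoding for you! This makes gadget development a lot more convenient.

The same goes for Mapplets, since a mapplet can also use the gadget API.

Maybe the contest Google is running will bring out more ideas.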

Parallel::ForkManager

How do you parallelize a program like this in Perl? Assume each iteration is independent.

foreach $data (@all_data) {
    eat_cpu($data);
}

Super simple:


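use Parallel::ForkManager; # !!
$pm = Parallel::ForkManager->new($MAX_PROCESSES);

foreach $data (@all_data) {
    $pid = $pm->start and next;  # parent: a child was forked, go on to the next item
    eat_cpu($data);              # child: do the work
    $pm->finish;                 # child: exit
}
$pm->wait_all_children;          # parent: wait for the children still running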
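
For the curious, here is a rough C sketch of the pattern that start / finish hides: fork one child per item, but never keep more than a fixed number alive. This is not the module's actual implementation; run_parallel() and reap_one() are names made up for this sketch, and eat_cpu() is a hypothetical stand-in for the real work.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the post's eat_cpu(): burn some CPU time. */
static void eat_cpu(int data)
{
    volatile unsigned long sink = 0;
    unsigned long rounds = data > 0 ? 100000UL * (unsigned long)data : 0;
    for (unsigned long i = 0; i < rounds; i++)
        sink += i;
}

/* Reap one child; returns 1 if it exited with status 0, otherwise 0. */
static int reap_one(void)
{
    int status;
    if (wait(&status) < 0)
        return 0;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

/*
 * Call work(items[i]) for every item, each in its own child process,
 * with at most max_procs children alive at once.
 * Returns the number of children that exited cleanly, or -1 if fork() failed.
 */
int run_parallel(void (*work)(int), const int *items, int n, int max_procs)
{
    assert(max_procs > 0);
    int running = 0, succeeded = 0, failed = 0;

    for (int i = 0; i < n && !failed; i++) {
        if (running == max_procs) {   /* pool is full: block until a slot frees up */
            succeeded += reap_one();
            running--;
        }
        pid_t pid = fork();
        if (pid == 0) {               /* child: the part between start and finish */
            work(items[i]);
            _exit(0);
        } else if (pid > 0) {         /* parent: go on to the next item */
            running++;
        } else {
            failed = 1;               /* fork() failed: stop handing out work */
        }
    }
    while (running-- > 0)             /* the equivalent of wait_all_children */
        succeeded += reap_one();
    return failed ? -1 : succeeded;
}
```

Calling run_parallel(eat_cpu, data, n, 4) would process n items with at most 4 children at a time. Since every item runs in a separate process, results have to come back through exit codes, pipes or files, not through shared variables.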

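The inventor of C++ on C++

Bjarne Stroustrup, the inventor of C++: The Problem with C++

I have a lot of thoughts that I don't know how to express. Anyway, the more C++ I understand, the less I want to use it. If I can avoid C++, I will Orz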

InvSqrt()

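I came across the fast inverse square root trick (InvSqrt(), 1 / sqrt(x)), and it is really great. I wrote some code to test it: looping over 1..1000000000, 1 / sqrtf(i) took a little over 9 seconds on average, while InvSqrt() took close to 3 seconds. As for numerical accuracy, I fed every value in 1 .. MAX to both methods and looked at the difference. The larger the number, the more accurate the result, and the value from InvSqrt() is "slightly smaller than" 1 / sqrtf(x).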

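Let f1 = 1 / sqrtf(x), f2 = InvSqrt(x)
In this table, the first column is MAX, the second is \sum_{x=1}^{MAX} (f1 - f2), the third is the same sum with the absolute value taken inside, \sum_{x=1}^{MAX} |f1 - f2|, and the fourth is the average error.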

1000000000: 8.0000000000 8.0000000000 0.0000000800
10000000: 5.7150559425 5.7150559425 0.0000005715
1000000: 1.8854961395 1.8854961395 0.0000018855
100000: 0.5647696257 0.5647696257 0.0000056478
10000: 0.1864424646 0.1864424646 0.0000186461

Also, here are the accumulated relative error (f1 - f2) / f1 and its average:

1000000: 944.7488403320 0.0009447498
100000: 85.7057800293 0.0008570664
10000: 9.4323215485 0.0009433264
1000: 0.9615601301 0.0009625227

A relative error below 1/1000 looks pretty good. I will probably find a use for this someday XD
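
For reference, a minimal sketch of the routine, assuming InvSqrt() here means the widely circulated version with the magic constant 0x5f3759df and a single Newton step (the code actually benchmarked above is not shown in this post). The memcpy calls replace the pointer cast of the classic version so that the bit reinterpretation stays within defined C behavior.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Approximate 1 / sqrt(x) for a positive, finite float x. */
float InvSqrt(float x)
{
    assert(x > 0.0f);

    float half = 0.5f * x;
    uint32_t bits;

    memcpy(&bits, &x, sizeof bits);      /* reinterpret the float as an integer */
    bits = 0x5f3759df - (bits >> 1);     /* magic first guess at 1 / sqrt(x) */

    float y;
    memcpy(&y, &bits, sizeof y);         /* back to a float */

    return y * (1.5f - half * y * y);    /* one Newton-Raphson refinement step */
}
```

A Newton step of this form approaches the true value from below, which would match the observation that InvSqrt() comes out slightly smaller than 1 / sqrtf(x).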

Hates…

I found that a small, simple program makes g++ 3.3 unhappy, while 3.4 (and later) is fine with it.


#include <string>

int main(void)
{
    std::basic_string ustr;
    ustr.append(ustr);
    return 0;
}

But after I filed a bug report, they told me the 3.3 branch has been closed.

What really makes me feel bad is that when I tried to remove gcc-3.3 and g++-3.3 from my box, I found that many things depend on them. It's really sad that I have to let this bug keep living on my box :(

Self-reproduce

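I saw a post on Jserv's blog:
自我複製程式理論依據 (the theoretical basis of self-reproducing programs)
It mentions zao's article: http://s88.dyndns.org/index.php?p=144#more-144 , in which he uses
a Turing Machine to prove and implement a self-reproducing program.

Nicely written :)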

makecontext(3)

I can't believe something like this exists orz

#include <ucontext.h>

void makecontext(ucontext_t *ucp, void (*func)(), int argc, ...);
int swapcontext(ucontext_t *oucp, const ucontext_t *ucp);
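
To get a feel for it, here is a small sketch of a hand-rolled coroutine on top of these calls. The names coroutine(), record() and run_coroutine_demo() are made up for this example, and the 64 KB stack size is an arbitrary assumption. Control bounces between the main flow and the coroutine, and the trace array records the order in which the steps ran.

```c
#include <assert.h>
#include <ucontext.h>

static ucontext_t main_ctx, co_ctx;
static char co_stack[64 * 1024];   /* private stack for the coroutine */
static int trace[8];
static int trace_len = 0;

/* Remember which step just ran. */
static void record(int step)
{
    assert(trace_len < 8);
    trace[trace_len++] = step;
}

static void coroutine(void)
{
    record(1);
    swapcontext(&co_ctx, &main_ctx);   /* yield back to the main flow */
    record(3);
    /* returning here resumes uc_link, which is main_ctx */
}

/* Returns the number of recorded steps, or -1 if getcontext() failed. */
int run_coroutine_demo(void)
{
    trace_len = 0;

    if (getcontext(&co_ctx) < 0)       /* initialize before makecontext */
        return -1;
    co_ctx.uc_stack.ss_sp = co_stack;
    co_ctx.uc_stack.ss_size = sizeof co_stack;
    co_ctx.uc_link = &main_ctx;        /* where to go when coroutine() returns */
    makecontext(&co_ctx, coroutine, 0);

    swapcontext(&main_ctx, &co_ctx);   /* run the coroutine until it yields */
    record(2);
    swapcontext(&main_ctx, &co_ctx);   /* resume it; it finishes and comes back */
    record(4);
    return trace_len;
}
```

After run_coroutine_demo() returns, trace holds 1, 2, 3, 4: each swapcontext saves the current context into its first argument and jumps to the second.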

TIME_WAIT state

While doing my Computer Networks homework, I found that binding to the same port fails after a client has connected and the httpd is then stopped. When I run netstat, I see the connection in the TIME_WAIT state.

After closing a connection, the side that closed it should wait in the TIME_WAIT state for 2 x MSL (/maximum segment lifetime/). The MSL suggested in RFC 1122 is 120 seconds, which means 240 seconds in TIME_WAIT. But the Linux implementation follows BSD, which sets the MSL to 30 seconds. There's a comment in net/ipv4/tcp_minisocks.c:

* [ BTW Linux. following BSD, violates this requirement waiting
*   only for 60sec, we should wait at least for 240 secs.
*   Well, 240 consumes too much of resources 8)
* ]

I timed it, and it really is about 1 minute Q_Q So I should take the port as an argument :/

BTW, Unix Network Programming is really a great book!

Update:
Using setsockopt to enable SO_REUSEADDR solves this problem.
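
A minimal sketch of that fix, assuming an IPv4 TCP server listening on all interfaces. make_listener() is a name made up for this example, not something from the homework code. The important part is that setsockopt is called before bind.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a listening TCP socket on the given port; returns the fd or -1. */
int make_listener(uint16_t port, int backlog)
{
    assert(backlog > 0);

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Must be set before bind(): lets us bind even while old
     * connections on this port are still sitting in TIME_WAIT. */
    int on = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on) < 0) {
        close(fd);
        return -1;
    }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

With this in place, restarting the httpd right away on the same port should no longer fail in bind, even though netstat still shows the old connection in TIME_WAIT.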