Saturday, November 24, 2007

Flickr API undocumented limitation

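I'm currently working on a school project that needs to pull a large amount of data from Flickr, including photo metadata such as tags and owners. But I noticed that the data I was getting looked off: some very common tags had only a few hundred entries. The API I'm using is flickr.photos.search, which accepts many search criteria and returns the matching photos; two of those criteria are the earliest and latest upload time. So I had no choice but to shrink the time interval of each query by a lot.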

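Then yesterday I received a threat^H^H^H^H^H^Hwarning email stating that the offset of the results returned by a query cannot exceed 4000; in other words, you can get at most about 4000 results per query. The API documentation doesn't mention this, and the status returned by every query was normal ...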
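One way to live with the 4000-result cap is to keep halving the upload-time window until every window reports no more than 4000 photos, and then page through each window separately. Below is a minimal Python sketch of that idea, not the code I actually ran. `split_windows` and `count_photos` are hypothetical names: `count_photos(start, end)` stands for whatever returns the number of matching photos in a window (for example, the total reported by flickr.photos.search with the minimum and maximum upload date set, assuming that total is reliable). How Flickr treats photos exactly on a window boundary is also an assumption here.

```python
from datetime import datetime, timedelta
from typing import Callable, List, Tuple

MAX_RESULTS = 4000  # cap mentioned in the Flickr Team email

Window = Tuple[datetime, datetime]


def split_windows(
    start: datetime,
    end: datetime,
    count_photos: Callable[[datetime, datetime], int],
    limit: int = MAX_RESULTS,
    min_span: timedelta = timedelta(seconds=1),
) -> List[Window]:
    """Split [start, end) into windows that each report at most `limit` photos.

    `count_photos` is a hypothetical callback supplied by the caller.
    A window no longer than `min_span` is returned as-is even if it is
    still over the limit, so the recursion always terminates.
    """
    total = count_photos(start, end)
    if total <= limit or (end - start) <= min_span:
        return [(start, end)]

    # Too many results: bisect the window and handle each half separately.
    mid = start + (end - start) / 2
    return (
        split_windows(start, mid, count_photos, limit, min_span)
        + split_windows(mid, end, count_photos, limit, min_span)
    )
```

Each returned window can then be queried page by page without the offset ever going past 4000. Popular tags simply end up with many short windows, quiet periods with a few long ones.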

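Good thing I was crawling aggressively enough (?). If I had never received this email, my experiments for school could have ended up wrong :(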


Greetings!
We have noticed that the api key 72157602728288126
registered to you is sending very large offset queries to
us.
example (offset=16893165)
Can you please check your usage and reconsider using these
high offsets? They are heavy/costly queries that tax our
backends.

Also to note: the search backend currently will not return
any results greater than offset 4000.

If you could limit the max offset to below 4000, we would
greatly appreciate it (and not have to disable the key -
not because we want to, but because our backends can't take
very much of it).

Thanks!
- Flickr Team