I am writing an internet data crawler. The logic is simple: a producer finds URLs to crawl and puts batches of them into a circular queue; a consumer picks links from the queue one by one, loads the HTML document at each link, and processes it.
producer -> urls batch -> queue -> 1 url -> consumer
The problem is that I need to add a delay to the consumer, because it makes download requests too fast. The servers think I am trying to DDoS them and block my requests.
I need some kind of countdown. When the consumer picks a URL from the queue, it starts a 200 ms countdown. After it has finished processing the URL, it checks the timer; if 200 ms have not yet passed, it must wait out the remainder before picking a new URL.
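A minimal sketch of that countdown in plain Java, in case it helps frame the question. The class name MinInterval and its methods are made up for illustration; only System.nanoTime and TimeUnit come from the JDK.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper: enforces a minimum interval between request starts.
// Intended for a single consumer thread (it is not synchronized).
class MinInterval {
    private final long intervalNanos;
    private long lastStartNanos;
    private boolean started;

    MinInterval(long interval, TimeUnit unit) {
        this.intervalNanos = unit.toNanos(interval);
    }

    // Records that a request is starting now (this starts the countdown).
    void markStart() {
        lastStartNanos = System.nanoTime();
        started = true;
    }

    // Nanoseconds left on the countdown; 0 if it has expired or never started.
    long remainingNanos() {
        if (!started) {
            return 0;
        }
        return Math.max(0, intervalNanos - (System.nanoTime() - lastStartNanos));
    }

    // Blocks until the countdown has expired, then starts a new one.
    void awaitTurn() throws InterruptedException {
        long remaining;
        while ((remaining = remainingNanos()) > 0) {
            TimeUnit.NANOSECONDS.sleep(remaining);
        }
        markStart();
    }
}
```

The consumer would create one gate with new MinInterval(200, TimeUnit.MILLISECONDS) and call gate.awaitTurn() right before taking each URL from the queue. If processing a URL already took longer than 200 ms, the call returns immediately.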
Type "java delayqueue example" into Google and... magic! Here it is!
You have an object (maybe a downloaded URL or whatever) that implements the Delayed interface. You have a producer that keeps putting data into the BlockingQueue, and at the end a single consumer. It is a simple solution for concurrency. There is a tutorial with an example here: https://examples.javacodegeeks.com/core-java/util/concurrent/delayqueue/java-util-concurrent-delayqueue-example/
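A rough sketch of what that could look like, assuming the URLs are plain strings. DelayedUrl and DelayQueueCrawler are hypothetical names for this example; DelayQueue, Delayed and TimeUnit are the real java.util.concurrent types.

```java
import java.util.List;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper: a URL that becomes available only after its delay.
class DelayedUrl implements Delayed {
    final String url;
    private final long readyAtNanos;

    DelayedUrl(String url, long delay, TimeUnit unit) {
        this.url = url;
        this.readyAtNanos = System.nanoTime() + unit.toNanos(delay);
    }

    @Override
    public long getDelay(TimeUnit unit) {
        // Zero or negative means the element may be taken from the queue.
        return unit.convert(readyAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                            other.getDelay(TimeUnit.NANOSECONDS));
    }
}

class DelayQueueCrawler {
    // Producer side: schedules each URL spacingMillis after the previous one.
    static DelayQueue<DelayedUrl> schedule(List<String> urls, long spacingMillis) {
        DelayQueue<DelayedUrl> queue = new DelayQueue<>();
        long delay = 0;
        for (String url : urls) {
            queue.put(new DelayedUrl(url, delay, TimeUnit.MILLISECONDS));
            delay += spacingMillis;
        }
        return queue;
    }
}
```

The consumer then just loops on queue.take(), which blocks until the next URL's delay has expired. Note that this spaces out the moments when URLs become available; it does not measure how long each download took.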
If you share your code with us, maybe we can help with your own implementation ;)
Edit
As you wrote in your comment, it is not working for you. Share the code you wrote here. It will work if you have implemented it correctly. I don't know how you are storing the URLs, how you are getting them, or which collection you use. The question is very general.