Neo4j: How to cache all the nodes and relationships in RAM to improve query performance


I have installed the APOC procedures and ran "CALL apoc.warmup.run()".

The result is as follows:

 pagesize      8192
 nodesperpage  546
 nodestotal    156255221
 nodesloaded   286182
 nodestime     21
 relsperpage   240
 relstotal     167012639
 relsloaded    695886
 relstime      8
 totaltime     30

It looks like the Neo4j server caches only part of the nodes and relationships. I want to cache all nodes and relationships in order to improve query performance.

First of all, for all the data to be cached, you need a page cache that is large enough.
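As a rough way to judge "large enough", here is a minimal Python sketch that estimates how many bytes the node and relationship stores occupy, using only the figures from the warmup output in the question (pagesize, nodesperpage, nodestotal, relsperpage, relstotal). The helper name `store_bytes` is made up for illustration. It is a lower bound only: it ignores properties, indexes and any other store files, which also compete for the page cache.

```python
def store_bytes(total_records: int, records_per_page: int, page_size: int) -> int:
    """Bytes needed to keep every page of one store in the page cache."""
    # Integer ceiling division: a partially filled last page still costs a full page.
    pages = -(-total_records // records_per_page)
    return pages * page_size


PAGE_SIZE = 8192  # "pagesize" from the warmup output

node_bytes = store_bytes(156_255_221, 546, PAGE_SIZE)  # nodestotal, nodesperpage
rel_bytes = store_bytes(167_012_639, 240, PAGE_SIZE)   # relstotal, relsperpage
total_bytes = node_bytes + rel_bytes

print(f"node store:         {node_bytes / 2**30:.2f} GiB")
print(f"relationship store: {rel_bytes / 2**30:.2f} GiB")
print(f"lower bound:        {total_bytes / 2**30:.2f} GiB")
```

With the numbers from the question this comes to about 7.5 GiB for nodes and relationships alone, so the configured page cache would need to be at least that, plus room for everything the sketch leaves out.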

Then, the problem is not that Neo4j does not cache everything it can; it's more of a bug in the apoc.warmup.run procedure: it retrieves the number of nodes (resp. relationships) in the database, and expects them to have ids between 1 and the number of nodes (resp. relationships). However, that's not true if you've had some churn in the DB, such as creating more nodes and then deleting some of them.

I believe that could be fixed by using this query instead:

 MATCH (n) RETURN count(n) AS count, max(id(n)) AS maxId

as profiling shows the same number of DB hits as the number of nodes, and it takes 650 ms on my machine for 1.4 million nodes.
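To make the id gap problem concrete, here is a small Python simulation (not Neo4j code). It assumes, as a simplification, that ids are handed out sequentially from 0 and that ids of deleted nodes are not reused. It compares scanning ids 1..count with what is actually in the store.

```python
# Create 10 nodes (ids 0..9), delete 4 of them, then create 2 more.
ids = set(range(10))
ids -= {1, 2, 3, 4}   # churn: deletions leave holes in the id space
ids |= {10, 11}       # new nodes get fresh ids in this simplified model

count = len(ids)      # what count(n) would return
max_id = max(ids)     # what max(id(n)) would return

# A scan that assumes ids run from 1 to count never looks past id `count`.
assumed_range = set(range(1, count + 1))
missed = ids - assumed_range

print(f"count={count}, max_id={max_id}, missed={sorted(missed)}")
```

Here count is 8 but the highest id is 11, so a scan bounded by the count never reaches the newest nodes, which is why the query above returns both values.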


Update: I've opened an issue on the subject.


Update 2

While the issue with the ids is real, I missed the real reason why the procedure reports reading far fewer nodes: it only reads one node per page (assuming they're stored sequentially), since it's pages that get cached. With the current values, that means trying to read 1 node every 546 nodes. It so happens that 156255221 ÷ 546 = 286181 (rounded down), and with node 0 that makes 286182 nodes loaded.
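The same arithmetic can be checked in a few lines of Python. This is only a sketch of the calculation described above, not the procedure's actual code; the function name `records_loaded` is made up for illustration.

```python
def records_loaded(total_records: int, records_per_page: int) -> int:
    """One read per page: ids 0, records_per_page, 2 * records_per_page, ..."""
    # Integer division counts the full strides; the +1 accounts for id 0.
    return total_records // records_per_page + 1


nodes_loaded = records_loaded(156_255_221, 546)  # nodestotal, nodesperpage
rels_loaded = records_loaded(167_012_639, 240)   # relstotal, relsperpage

print(nodes_loaded, rels_loaded)
```

Both results match the nodesloaded and relsloaded values in the warmup output (286182 and 695886), which supports the one-read-per-page explanation for relationships as well as nodes.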

