I've been exploring thrudb, a new document-oriented database service. I wanted to see how well thrudb performs with a large dataset, so I fed it the DMOZ catalog, which contains information about 4,600,000 websites from all over the world.
I also wrote a small Django application that accepts a keyword and queries thrudb for the links relevant to it. You can check it out at http://ec2-72-44-40-221.z-2
As you use the application, you may notice that thrudoc takes much longer on each query than thrudex does. This is because, for the sake of simplicity, I used the disk backend for thrudoc and, as both Ross and Jake have said, the disk backend is not suitable for a large dataset. I'm going to load the same dataset into other backends, such as MySQL or BDB, to see how they perform. I'll post the results on this blog when I'm done, so stay tuned.
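For anyone curious how the two times can be measured separately, here is a minimal sketch rather than the application's actual code. It assumes a search first asks thrudex for matching document ids and then fetches those documents from thrudoc. The functions `search_index` and `fetch_documents` are hypothetical stand-ins that return canned data; they are not the real thrudb client API.

```python
import time


def timed(fn, *args):
    """Call fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Hypothetical stand-ins: replace these with real calls to the services.
def search_index(keyword):
    # Assumption: the index (thrudex) returns ids of matching documents.
    return ["doc-1", "doc-2"]


def fetch_documents(doc_ids):
    # Assumption: the document store (thrudoc) returns one record per id.
    return [{"id": doc_id} for doc_id in doc_ids]


def search(keyword):
    """Run both stages and report how long each one took."""
    ids, index_secs = timed(search_index, keyword)
    docs, fetch_secs = timed(fetch_documents, ids)
    return docs, {"thrudex": index_secs, "thrudoc": fetch_secs}
```

Timing each stage on its own makes it easy to see which service dominates a slow query, and to compare backends later without touching the rest of the code.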
In the meantime, please help test thrudb's performance by running some random searches at the link above. Please note that, in order to gather performance data, I log all queries (but not your IP address).
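To give an idea of what such a log entry might contain, here is a rough sketch using Python's standard `logging` module. The logger name `query_log`, the record fields, and the helper names are my own assumptions for illustration, not the application's actual format. The point is that only the keyword and the timings are written, and nothing about the client.

```python
import json
import logging
import sys


def make_query_logger(stream=sys.stderr):
    """Build a logger that writes one timestamped line per query."""
    logger = logging.getLogger("query_log")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers on repeated setup
    logger.propagate = False  # keep entries out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
    logger.addHandler(handler)
    return logger


def log_query(logger, keyword, timings):
    """Record the keyword and per-service timings; no client address."""
    logger.info(json.dumps({"keyword": keyword, "timings": timings}))
```

A call such as `log_query(make_query_logger(), "python", {"thrudex": 0.01, "thrudoc": 0.5})` writes a single line that can be parsed later to compare backends.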