DB Growth

Mar 16, 2016 at 11:27 PM
Can someone explain why the db grows to an outrageous gigabyte size after updating a counter within the database a couple of thousand times?

/c
Coordinator
Mar 17, 2016 at 9:34 AM
It is basically a multi-threading performance tradeoff. This explains the different store types: http://brightstardb.readthedocs.org/en/latest/Store_Persistence_Types/

I guess you are using the append-only store (the default setting), which is built to support concurrent readers and writers without locking. As a consequence the store grows significantly over time; how much that matters depends on your application (it works best for large, relatively infrequent writes with concurrent reads). The append-only structure also supports nice features like being able to query any previous state of the store.

http://brightstardb.readthedocs.org/en/latest/Admin_API/#consolidating-the-store explains how to free up all of the space used by old transactions. Alternatively you could switch to the read/write store which has its own trade-offs (in particular long writes can effectively block reads).
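For reference, choosing the read/write (rewrite) store through the API might look roughly like the sketch below. The connection string and store name are placeholders, and the namespace of PersistenceType is an assumption, so check both against the version of BrightstarDB you are running. As I understand it, the persistence type only takes effect when the store is created.

```csharp
using BrightstarDB.Client;
using BrightstarDB.Storage; // assumed location of the PersistenceType enum

class CreateRewriteStore
{
    static void Main()
    {
        // Placeholder values for illustration only; substitute your own.
        var connectionString = @"type=embedded;storesdirectory=c:\brightstar";
        var storeName = "counter_store";

        var client = BrightstarService.GetClient(connectionString);

        // Only create the store if it is not already there; an existing
        // store keeps whatever persistence type it was created with.
        if (!client.DoesStoreExist(storeName))
        {
            client.CreateStore(storeName, PersistenceType.Rewrite);
        }
    }
}
```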

I'm actually considering changing (for BrightstarDB 2.0) to a hybrid approach similar to that used in LightningDB that dynamically reuses old pages to minimize store growth, but with the option to "pin" a version so that it is always accessible. Still very much in the design stage at the moment though (i.e. there ain't no running code yet ;-)

Cheers

Kal
Mar 18, 2016 at 12:42 AM
Hey Kal.

You know, I have tried BrightstarDB.PersistenceType in web.config and I also tried setting rewritable with the API when I create the store.
Once the store is created, how can I confirm the persistence type?

Thanks
Chris
Coordinator
Mar 18, 2016 at 9:13 AM
Hi Chris,

Hmmm...actually it looks like there is no easy way to do that. That is a bit of an oversight in the API!

Right now I think the only ways to test the store type are either to build B* and trace through a call to open the store or, if you don't want to go that way, to use Polaris to query a past commit point. Select the store in question in Polaris, right-click and choose New > History View. This will display a pane with the history of updates to the store (so choose a store with at least two or three updates in its history). Select a timestamp (other than the most recent one) and enter a query in the text field on the right (something like SELECT * WHERE { ?s ?p ?o } LIMIT 10). Then press the blue play button in the ribbon at the top of the Polaris window. If the store is append-only it should execute the query and generate some results. If the store is a rewrite store then you should see an error message like:
BrightstarDB.Polaris.ViewModel.SparqlQueryException: An error occurred while executing the SPARQL query. 
---> BrightstarDB.Client.BrightstarClientException: Error querying store test@11 with expression SELECT * WHERE { ?s ?p ?o }. Query of past commit points is not supported by the binary page persistence type 
---> BrightstarDB.Client.BrightstarClientException: Query of past commit points is not supported by the binary page persistence type
I've logged this as an enhancement request: https://github.com/BrightstarDB/BrightstarDB/issues/273
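
The same Polaris check can probably be approximated from code by querying a past commit point and seeing whether the call is rejected. This is only a sketch: the helper name LooksAppendOnly is made up, it assumes GetCommitPoints returns the most recent commit first, and it assumes the failure surfaces as a BrightstarClientException when ExecuteQuery is called (that is the exception type in the trace above, but other failures would be caught too).

```csharp
using System.Linq;
using BrightstarDB.Client;

static class PersistenceProbe
{
    // Hypothetical helper, not part of the BrightstarDB API.
    // Returns true if a past commit point could be queried (suggests append-only),
    // false if the query was rejected (suggests rewrite),
    // and null if the store has no past commit point to test with.
    public static bool? LooksAppendOnly(IBrightstarService client, string storeName)
    {
        // Skip the latest commit point (assumed to come first) and take the one before it.
        var pastCommit = client.GetCommitPoints(storeName, 1, 1).FirstOrDefault();
        if (pastCommit == null)
        {
            return null;
        }

        try
        {
            // The result stream is not needed; we only care whether the call succeeds.
            using (client.ExecuteQuery(pastCommit, "SELECT * WHERE { ?s ?p ?o } LIMIT 10"))
            {
            }
            return true;
        }
        catch (BrightstarClientException)
        {
            // Any client error lands here, so treat false as a hint and read the message.
            return false;
        }
    }
}
```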

Cheers
Mar 19, 2016 at 11:43 PM
Thanks, it is indeed 'rewrite'.
BrightstarDB.Polaris.ViewModel.SparqlQueryException: An error occurred while executing the SPARQL query. ---> BrightstarDB.Client.BrightstarClientException: Error querying store kp_data@99 with expression SELECT * WHERE { ?s ?p ?o } LIMIT 10. Query of past commit points is not supported by the binary page persistence type ---> BrightstarDB.Client.BrightstarClientException: Query of past commit points is not supported by the binary page persistence type
Mar 19, 2016 at 11:45 PM
So let me run some updates on an increment counter and I will get back to you with some db stats.
May 5, 2016 at 2:07 AM
So I have been getting familiar with Polaris and looking around the data a bit.
I have disabled transaction logging.
DB is set to rewrite.

data.bs keeps growing A LOT by just incrementing the same triple object value by 1 every 5-10 seconds.

What else can I look into to find out why this is happening?

/c
Coordinator
May 5, 2016 at 8:56 AM
Hi Chris,

One thing to note is that regardless of rewrite/append-only settings, literal values are never automatically culled from the database. So if you increment that counter, each value that it has had will remain in the database and I suspect that this is where your database growth may be coming from. If that is the case I would also expect that consolidating the store (http://brightstardb.readthedocs.io/en/latest/Admin_API/#consolidating-the-store) would drastically reduce the size of the database.
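
To make the effect concrete, here is a toy model of the behaviour described above. It is not BrightstarDB's storage code; the ToyStore class and its members are invented purely to show how a single counter triple can leave thousands of stale literal values behind until something rebuilds the table.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: this does NOT reflect BrightstarDB's real data structures.
class ToyStore
{
    // Every literal ever written gets a slot here; updates never remove old slots.
    private List<string> _literals = new List<string>();

    // The single counter triple points at one slot in the literal table.
    private int _counterSlot = -1;

    public int LiteralCount
    {
        get { return _literals.Count; }
    }

    public void SetCounter(int value)
    {
        // A new literal is appended and the triple is repointed at it.
        // The previous literal stays in the table, unreferenced.
        _literals.Add(value.ToString());
        _counterSlot = _literals.Count - 1;
    }

    public void Consolidate()
    {
        if (_counterSlot < 0)
        {
            return; // nothing stored yet
        }

        // Rebuild the table keeping only the literal that is still referenced.
        var live = _literals[_counterSlot];
        _literals = new List<string> { live };
        _counterSlot = 0;
    }
}

class Program
{
    static void Main()
    {
        var store = new ToyStore();
        for (var i = 1; i <= 2000; i++)
        {
            store.SetCounter(i);
        }

        Console.WriteLine(store.LiteralCount); // prints 2000
        store.Consolidate();
        Console.WriteLine(store.LiteralCount); // prints 1
    }
}
```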

This feels like it would all be good information to put into a B* FAQ, so thanks for asking!

Cheers

Kal
May 6, 2016 at 11:54 AM
WOW!
var client = BrightstarDB.Client.BrightstarService.GetClient(defaultConnectionString);
var consolidateJob = client.ConsolidateStore(storeName);
From 70MB to 800K!
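
Since ConsolidateStore hands back a job rather than a finished result, it may be worth waiting for the job to finish before checking the file size. A minimal sketch, assuming the IJobInfo flags and GetJobInfo call behave as in the BrightstarDB job examples; the ConsolidateAndWait helper and the 500 ms polling interval are my own choices:

```csharp
using System.Threading;
using BrightstarDB.Client;

static class StoreMaintenance
{
    // Hypothetical helper: starts a consolidation job and polls until it finishes.
    public static IJobInfo ConsolidateAndWait(IBrightstarService client, string storeName)
    {
        var job = client.ConsolidateStore(storeName);

        // Refresh the job status until it reports success or failure.
        while (!(job.JobCompletedOk || job.JobCompletedWithErrors))
        {
            Thread.Sleep(500); // arbitrary polling interval
            job = client.GetJobInfo(storeName, job.JobId);
        }

        // Caller can inspect JobCompletedOk and StatusMessage on the result.
        return job;
    }
}
```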