DotNetRdfDataObjectContext store and unit tests

Jan 23, 2014 at 11:36 AM
Edited Jan 23, 2014 at 1:03 PM
In unit tests there is this pattern:
    using (var dataObjectStore = _dataObjectContext.CreateStore(storeName))
    {
        using (var context = new MyEntityContext(dataObjectStore))
    ....................
    using (var dataObjectStore = _dataObjectContext.OpenStore(storeName))
    {
        using (var context = new MyEntityContext(dataObjectStore))
The concept of a store is there (I assume) because that is how bsdb works, and initially the EF API only worked with it.
What I would suggest is to change DotNetRdfDataObjectContext so that it behaves well in unit tests.
  1. DeleteStore could clear all data in the store, and CreateStore could call DeleteStore. This should make the unit tests usable with the dnr context, even though create/delete don't really make sense for a dnr context.
    Or
  2. Change the unit tests to work on a single store that is cleared before each test. Of course, tests that use multiple stores should not be changed.
There is also the issue of lazy loading. I would like to be able to write a custom query that loads a person and his father (so 2 related entities), then clear the store and test whether navigating to "person.Father" returns a non-null value. I think right now this would lazy load the father.
Coordinator
Jan 23, 2014 at 6:31 PM
Most of the unit tests use unique store names so there is a different pattern that makes use of this in (for example) BrightstarDB.Tests.EntityFramework.LinqTests. Would that work ?

However, I have been thinking that the DNR DataObjectContext implementation really could do with changing to support multiple stores. There is no reason why you cannot have a configuration file with multiple stores in it, then use the URI of the configuration as the store name.

DeleteStore could do a clear as you suggest but I don't think it is safe to always assume that a store in a DNR context is "clearable" - which may in turn lead to some unexpected exceptions at run-time if you try to call DeleteStore. Though it may be that it is possible to test that at runtime through the DNR APIs - I'll have to check that.
Jan 24, 2014 at 8:19 AM
Edited Jan 24, 2014 at 11:25 AM
Most of the unit tests use unique store names so there is a different pattern that makes use of this in (for example) BrightstarDB.Tests.EntityFramework.LinqTests. Would that work ?
No. I have 2 SPARQL endpoints for query/update in config and no stores, and I don't really want to create stores at runtime. I could do that, but it would get complicated and require a lot of time.
Ideally I would like to have one connection string for all tests, so that when I change it all tests run against that connection. This would mean the second TestFixture in all tests should point to a single location instead of carrying its own connection string as it does now.
However, I have been thinking that the DNR DataObjectContext implementation really could do with changing to support multiple stores. There is no reason why you cannot have a configuration file with multiple stores in it, then use the URI of the configuration as the store name.
By "stores" here, do you mean triple stores/databases or another unit of separation inside a triple store (similar to bsdb create store)? I'm not sure of the meaning of the word in this context, but I assume it's a triple store db.
Maybe it would make sense to have connections to Stardog, Virtuoso, etc. and be able to run tests against multiple dbs. I have Systap Bigdata and Stardog, so I can test with both (assuming I can connect directly with just an HTTP endpoint).
DeleteStore could do a clear as you suggest but I don't think it is safe to always assume that a store in a DNR context is "clearable" - which may in turn lead to some unexpected exceptions at run-time if you try to call DeleteStore. Though it may be that it is possible to test that at runtime through the DNR APIs - I'll have to check that.
You mean a kind of read-only store? I'm thinking I can issue an update command "delete where {?s ?p ?o}" and this should clear the database.
There is also the issue of lazy loading. I would like to be able to write a custom query that loads a person and his father (so 2 related entities), then clear the store and test whether navigating to "person.Father" returns a non-null value. I think right now this would lazy load the father.
Also, you did not answer this case. I mean that while the context is connected and the store has all the data, the EF API will lazy load the relations behind the scenes. I want to unit test these scenarios to make sure that data is eager loaded (for example, in the case of a custom query that brings back both parent and child) and that no calls are made to the db when accessing relation properties.
Jan 24, 2014 at 3:33 PM
feugen24 wrote:
DeleteStore could do a clear as you suggest but I don't think it is safe to always assume that a store in a DNR context is "clearable" - which may in turn lead to some unexpected exceptions at run-time if you try to call DeleteStore. Though it may be that it is possible to test that at runtime through the DNR APIs - I'll have to check that.
You mean a kind of read-only store? I'm thinking I can issue an update command "delete where {?s ?p ?o}" and this should clear the database.
There are a couple of things to consider with regard to clearing a dotNetRDF store:
  • If it supports IUpdateableStorage then you could issue a DROP ALL update to clear the store; this will be much more efficient than the suggested DELETE WHERE
  • If it supports ListGraphs() and DeleteGraph() you could list the graphs and delete each graph, plus issue a DeleteGraph() on the default graph
For stores that don't support deleting graphs (whether because they are read-only or their APIs don't allow it) there is nothing you can do to clear them.
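
To make the decision order above concrete, here is a minimal sketch of a "clear if possible" helper. It deliberately does not reference dotNetRDF: IClearableStore, StoreCleaner and FakeStore are hypothetical stand-ins written for this example, and treating a null graph URI as "the default graph" is an assumption. Only the names ListGraphs/DeleteGraph and the DROP ALL update come from the discussion; a real implementation would need to check the actual dotNetRDF interfaces.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the capabilities discussed above. It is NOT the
// dotNetRDF API; only the names ListGraphs/DeleteGraph come from the thread.
public interface IClearableStore
{
    bool IsReadOnly { get; }
    bool UpdateSupported { get; }      // true if SPARQL Update can be issued
    bool ListGraphsSupported { get; }
    bool DeleteSupported { get; }
    void Update(string sparqlUpdate);
    IEnumerable<Uri> ListGraphs();
    void DeleteGraph(Uri graphUri);    // assumption: null means the default graph
}

public static class StoreCleaner
{
    // Returns true if the store was emptied, false if it cannot be cleared.
    public static bool TryClear(IClearableStore store)
    {
        if (store.IsReadOnly) return false;

        if (store.UpdateSupported)
        {
            store.Update("DROP ALL");  // a single update clears every graph
            return true;
        }

        if (store.ListGraphsSupported && store.DeleteSupported)
        {
            // Copy the list first so deletions cannot disturb the enumeration.
            foreach (Uri graph in new List<Uri>(store.ListGraphs()))
            {
                if (graph != null) store.DeleteGraph(graph);
            }
            store.DeleteGraph(null);   // finally clear the default graph
            return true;
        }

        return false;                  // nothing we can do for this store
    }
}

// Tiny in-memory fake used only to exercise the sketch.
public sealed class FakeStore : IClearableStore
{
    public List<string> Calls { get; } = new List<string>();
    public List<Uri> Graphs { get; } = new List<Uri>();
    public bool IsReadOnly { get; set; }
    public bool UpdateSupported { get; set; }
    public bool ListGraphsSupported { get; set; }
    public bool DeleteSupported { get; set; }

    public void Update(string sparqlUpdate)
    {
        Calls.Add("update:" + sparqlUpdate);
        Graphs.Clear();
    }

    public IEnumerable<Uri> ListGraphs() { return Graphs; }

    public void DeleteGraph(Uri graphUri)
    {
        Calls.Add("delete:" + (graphUri == null ? "default" : graphUri.ToString()));
        Graphs.Remove(graphUri);
    }
}
```

Returning false rather than throwing keeps the "not clearable" case explicit, so a DeleteStore built on top of it could decide for itself whether that is an error.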

One other thought on the multiple stores idea is that there is the IStorageServer API, which represents servers that manage multiple stores (currently there are only Stardog and Sesame implementations for this). So you could imagine loading an IStorageServer instance from a configuration file and then using the ListStores() method to determine available stores and the GetStore() method to access stores. This API also allows for creating new stores programmatically, which may be useful to some people.
Coordinator
Jan 24, 2014 at 4:59 PM
feugen24 wrote:
Most of the unit tests use unique store names so there is a different pattern that makes use of this in (for example) BrightstarDB.Tests.EntityFramework.LinqTests. Would that work ?
No. I have 2 SPARQL endpoints for query/update in config and no stores, and I don't really want to create stores at runtime. I could do that, but it would get complicated and require a lot of time.
Ideally I would like to have one connection string for all tests, so that when I change it all tests run against that connection. This would mean the second TestFixture in all tests should point to a single location instead of carrying its own connection string as it does now.
I think you misunderstand (or I didn't explain). The test implementation is written so that for the DNR binding the store name is updated in the connection string - it means there is only one configuration; it just gets its assigned name changed. And because the implementation is in-memory, it is cleared on each test.
However, I have been thinking that the DNR DataObjectContext implementation really could do with changing to support multiple stores. There is no reason why you cannot have a configuration file with multiple stores in it, then use the URI of the configuration as the store name.
By "stores" here, do you mean triple stores/databases or another unit of separation inside a triple store (similar to bsdb create store)? I'm not sure of the meaning of the word in this context, but I assume it's a triple store db.
Maybe it would make sense to have connections to Stardog, Virtuoso, etc. and be able to run tests against multiple dbs. I have Systap Bigdata and Stardog, so I can test with both (assuming I can connect directly with just an HTTP endpoint).
DeleteStore could do a clear as you suggest but I don't think it is safe to always assume that a store in a DNR context is "clearable" - which may in turn lead to some unexpected exceptions at run-time if you try to call DeleteStore. Though it may be that it is possible to test that at runtime through the DNR APIs - I'll have to check that.
You mean a kind of read-only store? I'm thinking I can issue an update command "delete where {?s ?p ?o}" and this should clear the database.
Yes, I am thinking of read-only datastores as being the exception here. I guess if SPARQL UPDATE is supported then you can use this (or the graph store equivalent as suggested by Rob). I would be interested to see the performance of a DeleteStore command implemented on different backends though.
There is also the issue of lazy loading. I would like to be able to write a custom query that loads a person and his father (so 2 related entities), then clear the store and test whether navigating to "person.Father" returns a non-null value. I think right now this would lazy load the father.
Also, you did not answer this case. I mean that while the context is connected and the store has all the data, the EF API will lazy load the relations behind the scenes. I want to unit test these scenarios to make sure that data is eager loaded (for example, in the case of a custom query that brings back both parent and child) and that no calls are made to the db when accessing relation properties.
You could do this by mocking the store and testing the SPARQL that gets executed - this is probably a better way to go in any case.
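
As a rough illustration of the mocking idea, the sketch below uses a recording test double to check that reading a relation issues no query once it has been populated eagerly. Everything in it is a toy model written for this example: ISparqlRunner, RecordingRunner and this Person class are hypothetical and are not BrightstarDB or dotNetRDF types, and the http://example.org/ URIs are placeholders.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical seam: whatever the context would use to run SPARQL.
public interface ISparqlRunner
{
    IList<string> Query(string sparql);
}

// Test double that records every query and returns an empty result.
public sealed class RecordingRunner : ISparqlRunner
{
    public List<string> Executed { get; } = new List<string>();

    public IList<string> Query(string sparql)
    {
        Executed.Add(sparql);
        return new List<string>();
    }
}

// Toy entity with a lazily loaded relation, only to demonstrate the test idea.
public sealed class Person
{
    private readonly ISparqlRunner _runner;
    private Person _father;
    private bool _fatherLoaded;

    public string Id { get; }

    public Person(string id, ISparqlRunner runner)
    {
        Id = id;
        _runner = runner;
    }

    // Called by an eager-loading query that already fetched the father.
    public void SetFather(Person father)
    {
        _father = father;
        _fatherLoaded = true;
    }

    public Person Father
    {
        get
        {
            if (!_fatherLoaded)
            {
                // Lazy path: goes back to the store, which the test can detect.
                _runner.Query("SELECT ?f WHERE { <" + Id + "> <http://example.org/father> ?f }");
                _fatherLoaded = true;
            }
            return _father;
        }
    }
}
```

A test would then assert that Executed stays empty after reading Father on an eagerly loaded person, and that it contains exactly one query on the lazy path. No store needs to be cleared for either check.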


In summary:
  • Yep we totally can do some sort of support for "deleting" a store, where "deleting" is redefined as "making it empty".
  • It shouldn't be hard to push into the next release
  • But I would like that aligned with changing the DNR DataObjectContext to support multiple stores, because I think that is a genuinely useful feature and it feels like the existing implementation is unnecessarily crippled, so I probably won't make the DeleteStore implementation change in isolation.
  • In the specifics of unit testing eager vs lazy loading, it may make more sense to use mocks
Coordinator
Jan 24, 2014 at 5:05 PM
RobVesse wrote:
feugen24 wrote:
DeleteStore could do a clear as you suggest but I don't think it is safe to always assume that a store in a DNR context is "clearable" - which may in turn lead to some unexpected exceptions at run-time if you try to call DeleteStore. Though it may be that it is possible to test that at runtime through the DNR APIs - I'll have to check that.
You mean a kind of read-only store? I'm thinking I can issue an update command "delete where {?s ?p ?o}" and this should clear the database.
There are a couple of things to consider with regard to clearing a dotNetRDF store:
  • If it supports IUpdateableStorage then you could issue a DROP ALL update to clear the store; this will be much more efficient than the suggested DELETE WHERE
  • If it supports ListGraphs() and DeleteGraph() you could list the graphs and delete each graph, plus issue a DeleteGraph() on the default graph
For stores that don't support deleting graphs (whether because they are read-only or their APIs don't allow it) there is nothing you can do to clear them.
Is there a store implementation that doesn't support ListGraphs()? I guess maybe a generic SPARQL endpoint wouldn't, but then that would be a read-only thing, and maybe a generic SPARQL update endpoint wouldn't, unless it supports the graph store stuff?
One other thought on the multiple stores idea is that there is the IStorageServer API, which represents servers that manage multiple stores (currently there are only Stardog and Sesame implementations for this). So you could imagine loading an IStorageServer instance from a configuration file and then using the ListStores() method to determine available stores and the GetStore() method to access stores. This API also allows for creating new stores programmatically, which may be useful to some people.
That would be another way to go too. It might be nice to be able to support both specifying a collection of stores in the configuration file as well as being able to just specify an IStorageServer instance - maybe just through a slight variance in the connection string syntax...
Jan 25, 2014 at 9:29 AM
techquila wrote:
feugen24 wrote:
Most of the unit tests use unique store names so there is a different pattern that makes use of this in (for example) BrightstarDB.Tests.EntityFramework.LinqTests. Would that work ?
No. I have 2 SPARQL endpoints for query/update in config and no stores, and I don't really want to create stores at runtime. I could do that, but it would get complicated and require a lot of time.
Ideally I would like to have one connection string for all tests, so that when I change it all tests run against that connection. This would mean the second TestFixture in all tests should point to a single location instead of carrying its own connection string as it does now.
I think you misunderstand (or I didn't explain). The test implementation is written so that for the DNR binding the store name is updated in the connection string - it means there is only one configuration; it just gets its assigned name changed. And because the implementation is in-memory, it is cleared on each test.
Ok. Consider these connection strings, assuming dataObjectStoreConfig.ttl has the correct configs:
 1. [TestFixture("type=dotnetrdf;configuration={0}dataObjectStoreConfig.ttl;storeName={1};store=http://www.brightstardb.com/tests#emptyStore")]

 2. [TestFixture("type=dotnetrdf;configuration={0}dataObjectStoreConfig.ttl;storeName={1};store=http://www.brightstardb.com/tests#emptyStore; query=http://example.org/configuration#sparqlQuery;update=http://example.org/configuration#sparqlUpdate;")]

 3. [TestFixture("type=dotnetrdf;configuration={0}dataObjectStoreConfig.ttl;storeName={1};query=http://example.org/configuration#sparqlQuery;update=http://example.org/configuration#sparqlUpdate;")]
I run "TestLinqCount" test and:
  1. Works but sparql endpoint is not hit (no data in database)
  2. Works but sparql endpoint is not hit (no data in database);
  3. Works, inserts data in database but if I run it a second time it fails because store is not cleared of previous run data
In variant 2, "query" and "update" are added to the connection string; in variant 3, "store" is removed.
Assuming I have not missed something in the setup, the dnr connection string template should treat "store" as optional and "query"/"update" as optional too, so that the TestFixture only holds the common part and the connection string is transformed depending on a "Configuration.UseSparqlEndpoint" setting. In addition, the store needs to be cleared after each test (already suggested in the summary).
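
A minimal sketch of what I mean by transforming the connection string: the fixture keeps only the common part, and a helper appends either the "store" or the "query"/"update" parameters. DnrTestConnection and its Build method are hypothetical names for this example, and the useSparqlEndpoint argument stands in for the proposed "Configuration.UseSparqlEndpoint" setting, which does not exist yet. The parameter values are the ones from variants 1 and 3 above.

```csharp
using System;

public static class DnrTestConnection
{
    // Common part shared by every fixture (as in the variants listed above).
    public const string Common =
        "type=dotnetrdf;configuration={0}dataObjectStoreConfig.ttl;storeName={1};";

    // Appended when tests should run against SPARQL endpoints (variant 3).
    public const string EndpointPart =
        "query=http://example.org/configuration#sparqlQuery;" +
        "update=http://example.org/configuration#sparqlUpdate;";

    // Appended when tests should run against the configured store (variant 1).
    public const string StorePart =
        "store=http://www.brightstardb.com/tests#emptyStore";

    // useSparqlEndpoint mirrors the proposed (hypothetical) configuration switch.
    public static string Build(string configDir, string storeName, bool useSparqlEndpoint)
    {
        string head = string.Format(Common, configDir, storeName);
        return head + (useSparqlEndpoint ? EndpointPart : StorePart);
    }
}
```

With something like this, switching all tests from the in-memory store to an endpoint would be a one-line configuration change, with no edits to the individual TestFixture attributes.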
There is also the issue of lazy loading. I would like to be able to write a custom query that loads a person and his father (so 2 related entities), then clear the store and test whether navigating to "person.Father" returns a non-null value. I think right now this would lazy load the father.
Also, you did not answer this case. I mean that while the context is connected and the store has all the data, the EF API will lazy load the relations behind the scenes. I want to unit test these scenarios to make sure that data is eager loaded (for example, in the case of a custom query that brings back both parent and child) and that no calls are made to the db when accessing relation properties.
You could do this by mocking the store and testing the SPARQL that gets executed - this is probably a better way to go in any case.
good point
Coordinator
Jan 25, 2014 at 3:29 PM
Ah, OK I see now. The thing is the test setup I have written thus far is only using the in-memory DNR store, which gets initialized when you initialize the DNR context. I'm not quite sure why your example (1) doesn't work, though I guess it depends on what the store http://www.brightstardb.com/tests#emptyStore is configured as, and also if your test is holding on to the DNR context - if you let it go out of scope, the in-memory store will also go out of scope - perhaps that is why it looks like there is no data in the store... or perhaps there really is a bug there :-) If you have a chance to write a short reproducible test that I could look at, that would be great.
Jan 25, 2014 at 6:31 PM
Edited Jan 26, 2014 at 7:15 AM
I'm not explaining this well, so I'll try to explain the problem again:

I want to connect to a triple store represented by the SPARQL endpoint "http://127.0.0.1:8081/sparql" and run unit tests against it.
I have configured the .ttl file as described in the docs.
For this to work I need to provide a connection string that has the "query"/"update" params but does not have the "store" param.
The connection string would look like:
"type=dotnetrdf;configuration={0}dataObjectStoreConfig.ttl;query=http://example.org/configuration#sparqlQuery;update=http://example.org/configuration#sparqlUpdate;"
If it had a "store" param, then in DotNetRdfDataObjectContext at line 69 it would take the wrong if branch (it should take the else branch):
    if (!String.IsNullOrEmpty(connectionString.DnrStore))
    ........
    else
    {
        if (String.IsNullOrEmpty(connectionString.DnrQuery) ||
            String.IsNullOrEmpty(connectionString.DnrUpdate))
        {
            throw new BrightstarClientException("DotNetRDF connection requires either a Store property or a Query and an Update property.");
        }
So the TestFixture should look like option 3 in my previous post:
3. [TestFixture("type=dotnetrdf;configuration={0}dataObjectStoreConfig.ttl;storeName={1};query=http://example.org/configuration#sparqlQuery;update=http://example.org/configuration#sparqlUpdate;")]
but the template in the latest version looks like option 1 in my previous post:
1. [TestFixture("type=dotnetrdf;configuration={0}dataObjectStoreConfig.ttl;storeName={1};store=http://www.brightstardb.com/tests#emptyStore")]
So by changing configuration I should be able to obtain form (3) of the connection string, and this requires some minor changes to the code.
A simple way to test this is to debug and put a breakpoint in DotNetRdfDataObjectContext on the "else" branch mentioned above (currently line 110). A test should be able to hit that breakpoint through a configuration change alone (without actually changing any unit test code).
Optionally, a check could be added:
    if (!String.IsNullOrEmpty(connectionString.DnrStore))
    {
        if (!String.IsNullOrEmpty(connectionString.DnrQuery) ||
            !String.IsNullOrEmpty(connectionString.DnrUpdate))
        {
            throw new BrightstarClientException("DotNetRDF connection requires either a Store property or a Query and an Update property, but not both.");
        }
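
Put together, the two checks (the existing one and the optional one) could be sketched as a single standalone helper like the one below. This is only an illustration of the rule "either store, or query and update, but not both": DnrOptionCheck is a hypothetical name, the three string arguments stand in for connectionString.DnrStore, DnrQuery and DnrUpdate, and ArgumentException is used instead of BrightstarClientException so that the snippet compiles on its own.

```csharp
using System;

public static class DnrOptionCheck
{
    // Returns "store" or "endpoints" to say which branch a connection would take.
    // Throws if the combination of options is ambiguous or incomplete.
    public static string Validate(string store, string query, string update)
    {
        bool hasStore = !String.IsNullOrEmpty(store);
        bool hasQuery = !String.IsNullOrEmpty(query);
        bool hasUpdate = !String.IsNullOrEmpty(update);

        if (hasStore)
        {
            if (hasQuery || hasUpdate)
            {
                // The optional extra check proposed above.
                throw new ArgumentException(
                    "DotNetRDF connection requires either a Store property or a Query and an Update property, but not both.");
            }
            return "store";
        }

        if (!hasQuery || !hasUpdate)
        {
            // Same condition as the existing else branch.
            throw new ArgumentException(
                "DotNetRDF connection requires either a Store property or a Query and an Update property.");
        }
        return "endpoints";
    }
}
```

Failing fast on the mixed case would have made variant 2 from my previous post an explicit error instead of silently using the in-memory store.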

I could create a fork and make the changes, but I'm not sure how you want to fix this.
Jan 27, 2014 at 6:24 PM
I just saw the other post with "Connection strings for other stores /Configuration Scenarios", so I suppose the above problem only applies to the current implementation.
Coordinator
Jan 28, 2014 at 10:51 AM
OK - sounds like just doing the updates suggested on the other thread, and providing more in the way of constructor injection is the way to go.