Polaris: some issues and suggestions

May 26, 2015 at 5:30 PM
Hi there,

I was looking for a simple, easy-to-use tool for rearranging some RDF data, one that would let me perform just three steps: importing the files, running the SPARQL query, and getting the results out.

Polaris 1.10.0.0 is indeed rather simple and usable without any knowledge of the architecture of BrightstarDB. However, I've encountered some issues which I'd like to share with you:
  1. The status bar sometimes behaves confusingly. With "Import jobs", the program runs for several minutes (and uses several GB of memory) before it even says "Job started". That is especially confusing when a job ends with an error after a previous import job: in that case, the "successful" line from the previous job is still shown.
  2. I had problems with some values in some triples. I didn't need those anyway, so I ended up deleting all triples which caused problems in Polaris. However, an option to "ignore and omit all problematic triples" would save a lot of time in cases like this.
  3. The error message "Job error" for an import job is not very helpful.
  4. I got an "Out of memory" error while, according to Windows, 20 gigabytes of memory were still unused.
  5. It would be quite useful if there were a batch import mode enabling the user to import several files at once.
  6. While the "Select file" window for "Import file" remembers the last directory used, it doesn't remember the last file extension used, so I had to switch from .nt to .ttl each time.
  7. In the "Load a SPARQL query" window you can only select .sq files. That is an unnecessary inconvenience: SPARQL query files do not always end in ".sq"; I guess .rq is just as common.
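Regarding point 2, a pre-filter could serve as a workaround until such an option exists: since N-Triples puts one statement per line, malformed lines can be dropped before the file ever reaches Polaris. A rough Python sketch (the pattern below is only a loose approximation of the N-Triples grammar, not a real validator):

```python
import re

# Loose approximation of one N-Triples statement:
# subject (IRI or blank node), predicate (IRI),
# object (IRI, blank node, or literal), terminated by " ."
TRIPLE = re.compile(
    r'^\s*(<[^>]*>|_:\S+)\s+'                                      # subject
    r'<[^>]*>\s+'                                                  # predicate
    r'(<[^>]*>|_:\S+|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[\w-]+)?)'   # object
    r'\s*\.\s*$'
)

def filter_ntriples(lines):
    """Split input lines into (kept, dropped).

    Blank lines, comments, and lines that look like valid triples
    are kept; everything else is dropped so it can be inspected.
    """
    good, bad = [], []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or TRIPLE.match(line):
            good.append(line)
        else:
            bad.append(line)
    return good, bad
```

Writing the dropped lines to a side file instead of discarding them would make it easy to see what the importer would have choked on.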
If Polaris is still under development and some developers are reading along, then maybe this list is of some use.

Best regards.
Coordinator
May 26, 2015 at 9:11 PM
Hi!

Firstly, thank you for taking the time to provide your feedback - it is all really useful. One thing that would help to know is what sort of connection you are using in Polaris - is it an embedded connection or a REST connection?

See other comments inline below.

cisfyrst wrote:
Hi there,

I was looking for a simple, easy-to-use tool for rearranging some RDF data, one that would let me perform just three steps: importing the files, running the SPARQL query, and getting the results out.

Polaris 1.10.0.0 is indeed rather simple and usable without any knowledge of the architecture of BrightstarDB. However, I've encountered some issues which I'd like to share with you:
  1. The status bar sometimes behaves confusingly. With "Import jobs", the program runs for several minutes (and uses several GB of memory) before it even says "Job started". That is especially confusing when a job ends with an error after a previous import job: in that case, the "successful" line from the previous job is still shown.
This may depend on which sort of import you are using. A local import will first parse the file locally to construct NQuads and then send those to the server - that local parsing could account for the delay before you see "Job started". There should definitely be some user feedback at the start, like "Preparing file for server", that makes it clear something is happening on the client machine. If you have large files and easy access to the BrightstarDB server, a Remote import would be more efficient - you drop the files into the import folder, specify a remote import in Polaris, and provide just the file name; the BrightstarDB server then parses the file from the import folder.
  2. I had problems with some values in some triples. I didn't need those anyway, so I ended up deleting all triples which caused problems in Polaris. However, an option to "ignore and omit all problematic triples" would save a lot of time in cases like this.
That is a great idea! I'll have to see if it is possible with the underlying parsers. It might depend a bit on the format (e.g. it would be far easier to support skipping invalid data in NTriples than it would in RDF/XML).
  3. The error message "Job error" for an import job is not very helpful.
Totally agree. The annoying thing is that more information is available; it just doesn't make it as far as the UI. I'm going to log this as a bug.
  4. I got an "Out of memory" error while, according to Windows, 20 gigabytes of memory were still unused.
That definitely should not happen. I'll need to try and reproduce it here. It would be good to get a better idea of what you were doing at the time if you can - e.g. was it during one big import, or after lots of little imports?
  5. It would be quite useful if there were a batch import mode enabling the user to import several files at once.
That would definitely be a nice addition. It would allow you to kick off a bunch of imports to run overnight for example. I'll log an enhancement request for that one.
  6. While the "Select file" window for "Import file" remembers the last directory used, it doesn't remember the last file extension used, so I had to switch from .nt to .ttl each time.
Hopefully that one should be easy to fix!
  7. In the "Load a SPARQL query" window you can only select .sq files. That is an unnecessary inconvenience: SPARQL query files do not always end in ".sq"; I guess .rq is just as common.
Good point. I'll add a .rq option, and there should also be an "All Files" option for this dialog and for the import file selection dialog (if there isn't already).
If Polaris is still under development and some developers are reading along, then maybe this list is of some use.
Really useful, thank you!

Cheers

Kal
May 27, 2015 at 2:23 PM
Hi Kal,

Thanks for your quick reply!

techquila wrote:
One thing that would help to know is what sort of connection you are using in Polaris - is it an embedded connection or a REST connection?
I used an embedded connection with "local" import jobs. I thought that if client and server run on the same machine, there wouldn't be much difference. But as I said: I know nearly nothing about the architecture of BrightstarDB.

techquila wrote:
It would be good to get a better idea about what you were doing at the time if you can - e.g. was during one big import, or after lots of little imports.
I was trying to import one big file (about 2.5 GB). Other files of less than 1 GB worked fine for me.
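In case it helps as a workaround while you try to reproduce it: since N-Triples keeps one statement per line, the big file can be split on line boundaries into smaller chunks that each import on their own. A rough Python sketch of such a splitter (the chunk size is an arbitrary choice, and the naming scheme is just for illustration):

```python
import os

def split_ntriples(path, lines_per_chunk=1_000_000):
    """Split an N-Triples file into numbered chunk files.

    Safe for N-Triples because every statement is on its own line,
    so cutting on line boundaries never breaks a triple.
    """
    base, ext = os.path.splitext(path)
    chunk_paths = []
    with open(path, encoding="utf-8") as src:
        chunk, n = None, 0
        for i, line in enumerate(src):
            if i % lines_per_chunk == 0:
                # Start a new chunk file, e.g. data.part001.nt
                if chunk:
                    chunk.close()
                n += 1
                out_path = f"{base}.part{n:03d}{ext}"
                chunk_paths.append(out_path)
                chunk = open(out_path, "w", encoding="utf-8")
            chunk.write(line)
        if chunk:
            chunk.close()
    return chunk_paths
```

The resulting chunks could then be imported one after another; the same approach would not work for Turtle or RDF/XML, where a statement can span multiple lines.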
Best regards