
to Brightstar or not to Brightstar, please help.

Dec 11, 2013 at 7:37 AM
I am small potatoes: a 5-table DB, fewer than 10,000 records, but it will grow. I use VB.NET or C#, pure JavaScript, AJAX-style calls without an AJAX library, a JSON-like data format without actual JSON (yet), no jQuery, and SQL Server.

I am looking to drop SQL Server, and thinking of adding JSON (possibly). As you can see, I run lean. I have started to look at libraries that allow conditional includes, so I don't have to load an entire library just to use a few functions.
I do not want to store images in an RDBMS anymore. I want to revert to reading them from disk, possibly via an RDF object if that is possible.

I am leaning towards BrightstarDB out of a gut feeling, not out of clear understanding (I am totally new at this). I must make a decision fast. I have no time and little know-how to carry out a proper evaluation, so I must rely on others for advice.

Comparing Brightstar to two other products

mongoDB (replacing heavy iron ORACLE)
...it was evaluated by the Apollo Group for the University of Phoenix for a migration from ORACLE: TEXT
They are experts at the top of the food chain, and they picked MongoDB out of some 150 NoSQL platforms to replace ORACLE; that's impressive.
mongoDB has superb documentation, web and pdf.
I downloaded the product free and started to explore how to put it to use.
That one tempts me a lot, because with that kind of documentation I could be up and running in reasonable time with hardly any help.

SchemaFreeDB: ZERO indexing management and NO install,
as it is a pure JSON-over-HTTP web API approach. All that is needed is that HTTP and JSON are supported (and that's about 99% of all systems, I think).
They claim their approach is superior to all others: ZERO indexing (all handled automatically).
Best you take a look at this page yourself: TEXT
as it lists all the comparative items
So I wondered: where is the engine? Then I saw this (between the dashed lines below), which tells me that they run the engine on their servers, and my data would be on their servers, not on my host company's database server (or host)...
--- again best you go to the page yourself: TEXT
---------------------------------------------------------------------------------------------
Connecting to your Database (SchemaFreeDB)
Connect to your database using a simple, secure HTTPS Request/Response.
Send an HTTP POST to the SchemaFreeDB service URL [TEXT] (e.g. https://svc.schemafreedb.com/api).
Set the value of HTTP header x-sfdb-access-key to your account access key.
Set the POST payload according to the JSON-formatted Request Payload Format (see below).

Receive the JSON HTTP response.
---------------------------------------------------------------------------------------------
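From that quoted section, the whole client contract appears to be a single HTTPS POST. Purely as a sketch (the URL and the x-sfdb-access-key header name are taken from their quoted docs; the payload shape is my own placeholder, since the real SFQL request format is on their page), building such a request in plain JavaScript might look like:

```javascript
// Hypothetical sketch of the request described in the SchemaFreeDB quote.
// The service URL and header name come from the quote above; the payload
// shape ("q" holding an SFQL command) is an invented placeholder.
function buildSfdbRequest(accessKey, payload) {
  return {
    method: "POST",
    url: "https://svc.schemafreedb.com/api",
    headers: {
      "Content-Type": "application/json",
      "x-sfdb-access-key": accessKey, // your account access key
    },
    body: JSON.stringify(payload), // JSON-formatted request payload
  };
}

// The resulting object is what you would hand to fetch()/XMLHttpRequest.
const sfdbReq = buildSfdbRequest("MY-ACCESS-KEY", { q: "...SFQL here..." });
```

(No server call is made here; this only shows the shape of the request their docs describe.)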

That removes so much of the hassle: no need to accommodate the product on the host; it's all JavaScript, essentially embedding SFQL commands and getting JSON back.
Extremely tempting, if I can get some assurance about their reputation for protecting customer data. ...but I do wonder about the difference between the data traveling over the net versus a database server (or BrightstarDB?) sitting in-house on an intranet (solid, and faster I would think, with less chance of 'jitter' in the flow).
Note: checking out Xornet Inc. (the creators), I was not able to find much open information.

BrightstarDB
I have downloaded the product and started to get familiar with it...
The documentation may be good for experts, but it is nearly useless to me. It will take me weeks to decipher the basics of how to visualize my RDBMS data structure in an RDF model, and to understand the nuts and bolts of the basic workflow and which tool is used where! Sure, you may say I could peek at the MongoDB docs since much of it might be parallel, but I am not so sure.
I still believe BrightstarDB is my answer, because it seems to be parallel to MongoDB but it does not require me to move my data to a new house.

Can someone more knowledgeable than I give their thoughts on this scenario?
What would you do, and why? As part of the discussion, please include answers to the following BrightstarDB questions: can I just install a DLL (the C# or VB.NET API) in the bin folder under the web site root, and create folders for the RDF stores under a master folder parallel to the root, without having to involve the host ISP's personnel? And is that all that would be needed for the site's operation? (Assuming all the tools I use on the local dev machine also work remotely for any manual management over the line.)

Thank you for understanding that placing all of these issues in separate posts would not make sense, as they all relate to a single decision: to Brightstar or not to Brightstar.
Coordinator
Dec 11, 2013 at 10:25 AM
Hi,

I think there are probably four dimensions to your analysis:
  1. Where does the process run
  2. How easy is it to connect to from server-side
  3. How easy is it to connect to from client-side
  4. What is the data model like and how easy is it to use
MongoDB
  • Runs in a separate process. That means if you install on a host you need some way to manage the MongoDB process (start and stop the server).
  • Server and client-side connections can be made through HTTP if you are willing to expose your service in that way.
  • Otherwise you may need to write some server-side code to handle requests from the client and verify them before executing.
  • Note that there are also a number of vendors offering MongoDB as a service, which then makes it similar to the SchemaFreeDB offering I think
  • Mongo's document structure is quite intuitive for simple data models, but things like many-to-many relations can be tricky if you are coming from an RDBMS background. The documentation is exceptional and you will find a lot of information out there. I actually like MongoDB quite a lot :)
SchemaFreeDB

I've never heard of this before. But it sounds like a DB-as-a-service offering.
  • No need to manage processes. Though you have to think about management of API keys.
  • Again you can connect from the server-side or the client side (though connecting from the client would expose your API keys so it might not be a good idea)
  • Again if you want to verify client requests before they hit the database you need some server-side code.
  • No idea about the model here - it sounds like document storage, something similar to RavenDB perhaps (which you should also look at). I would say if you have a hard time getting information out of the vendors at this stage that should set off warning bells.
BrightstarDB
  • Runs either in a separate process or embedded in the server process. So it meets your requirement to just add DLLs and create a folder on the server to get things running.
  • If you run as an embedded process, the server has direct API calls into a DLL, but for the client you would need some server-side code to proxy the requests. You could look at the BrightstarDB service as a fully functional example of this that uses the Nancy framework for creating its RESTful API.
  • If you run it as a separate process, you are back to the issue of needing to manage stopping and starting the service. This can work OK if you are on a dedicated host (or a dedicated VM), as it can run as a Windows Service, but it's problematic if you are on shared hosting, as you would need to involve your service provider. Again, you could choose to expose the service publicly so that connections could be made from the client side or the server side, but I think for security reasons you would probably want to create server-side code to verify requests from the client.
  • The core data model is RDF, which is essentially a graph structure. That makes it very flexible, but it can be a bit hard to get your head around at first, which is why we created the Entity Framework layer. This basically allows you to model your data as a collection of interfaces (maybe one per table... it depends on your data) with properties on them for the values and for the relationships between the tables. The Entity Framework then lets you use LINQ queries and a simple .NET API for updates, so you don't need to learn anything about RDF. The project is much younger than MongoDB, and there is nowhere near the same community around BrightstarDB (yet!), but I'm always happy to help if you have further questions.
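To make that last bullet a little more concrete: the sketch below is not BrightstarDB's actual API (the Entity Framework layer is C# interfaces plus LINQ); it is just a toy JavaScript illustration of the underlying idea, namely that an entity's property values and its relationships to other entities are stored uniformly as (subject, predicate, object) triples.

```javascript
// Toy illustration (not BrightstarDB code): flatten an "entity" -
// roughly one interface/table - into triples. Scalar properties and
// relationships end up in the same (subject, predicate, object) form.
function entityToTriples(id, props) {
  const triples = [];
  for (const [predicate, value] of Object.entries(props)) {
    for (const v of Array.isArray(value) ? value : [value]) {
      triples.push([id, predicate, v]);
    }
  }
  return triples;
}

const personTriples = entityToTriples("person/1", {
  name: "Ann",                     // a value property
  knows: ["person/2", "person/3"], // a relationship property
});
// -> [["person/1", "name", "Ann"],
//     ["person/1", "knows", "person/2"],
//     ["person/1", "knows", "person/3"]]
```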
At the end of the day I can't recommend for you, and if I did you would be right to treat any recommendation with suspicion. MongoDB is easy enough to evaluate on a local PC, as are BrightstarDB and RavenDB (ravendb.net). I would recommend that before you make your final commitment you spend a day or two with each of these to get a feel for them and what they do. I would avoid a software-as-a-service provider that is not absolutely fanatical about support.

Cheers

Kal
Dec 11, 2013 at 6:18 PM
I appreciate your answer, not only informative but lightning quick.

I would not blame you if you did offer a recommendation, though I believe it is too early to know at this stage of the discussion, unless one takes a blanket approach to close the matter.

RavenDB is not free; that puts it out of contention. I have a feeling SchemaFreeDB is not free either... their demo is timed; if it were free they would let you start storing in production and let you fly. (I think)
So that leaves me with mongoDB and BrightStarDB.

I am still leaning towards BrightstarDB because
  • it is free(at least on the onset)
  • I believe it will be better than MongoDB. (I now forget what I read in the site presentation or docs that makes me think that; perhaps you might remind me. I think it was a statement that sets BrightstarDB apart from all other NoSQL implementations in its power to address certain data relationships, perhaps the same allusion you seem to have implied in the last bullet of your MongoDB block.)
  • I am not looking for the heavy-iron tool yet, but a tool that will allow me to get the basic model/prototype in place and working in production, and, as icing on the cake, a tool that I will stay with.
I will not mind paying for the tool at some point once I have some success; I don't believe in profiting from another's effort without compensation. But aside from budget, one of my dilemmas is to make sure I don't embark on a trek that will fall short before I put at least the basics of my project in place. I am loyal and like to stay with an existing tool as long as its support (growth and help) does not come to a screechin' halt. I see that as possible with a young project, since my requirements are rather novice: no volume, no scalability concerns at all, and no enterprise needs.
When it gets to that level, I won't be the person to update the system; a person like you will have been handed the gig.

You bet I have questions, a quarter-millllion of them(lol, I explain below).

The long-winded part of this post below is so that you are aligned with my WEAK level of understanding of OOP and all the related mechanisms that make a web app work. My subsequent posts will try to avoid burning your ears and will ask direct, separate, point-addressed questions, even at the risk of sounding a numpty.

Although I have the brain to do this, my other stumbling blocks are lingo, abstractions, and limited tool knowledge. I tend to learn on the fly as I start gnawing at a tool, or at a way to use a tool. I consider classes/objects as entities that come with their own toolbox; 'batteries and switches included' would be a good way to show my OOP penmanship, so to speak.

Thus when I read your response I spend much time reading it 100 times, while I look up areas mentioned in the response so that I may tie the meaning of the abstractions to something concrete. foo and foobar are fleeping shite to me! ...and yes, I understand abstraction is the higher form of the language that allows one to describe objects and processes generically and rapidly, but when one reads black-box lingo, one is on the wrong end of the paddle if ignorant of the content/functionality of the black box.

Let me digest your answer more thoroughly... expect a stream of directed questions.

Again, thank you for your answer.
Dang I wish this post box was stretchable, but the preview does the trick!
Dec 11, 2013 at 6:43 PM
Hi NIQ,

Just a couple of thoughts from someone who has used BrightstarDB on a development machine for a couple of months.

I think the Entity Framework integration is great, and definitely gives you an edge on ease of development if you use this system. I haven't gotten far enough along to do any kind of performance benchmarks for the types of functionality I am making use of.

One caveat to using BrightstarDB is that it's still a relatively new project compared to others, so don't expect everything to work exactly the way you expect it to, though I haven't really found any actual bugs myself. If you find something isn't working as expected, Kal is very quick about responding to issues on here. Personally, I prefer to debug the source myself, which is also a not-too-difficult option since it's C#.

And don't forget that this uses the MIT license, which is really the one that gives you the most permissions.

Even with the above caveat I still think it's a solid choice for .NET developers. I would only consider something else right now if I were not using .NET.
Dec 12, 2013 at 8:03 AM
@Nuzz604 - thank you for that input. see my question on separate post (node.js & bDB)
Dec 12, 2013 at 10:38 PM
I dug up the article section that I mentioned in the 2nd bullet on my Wed 2:18pm post.
While exploring how to convert from RDBMS to RDF, and facing my initial short list (4 or 5 entries, out of what in the end turned out to be 150 so-called NoSQL engines), that paragraph is what kept me investigating bDB.
It is a most important point that belongs in the context of this thread... so here it is:

In the Docs: TEXT
                 3rd bullet - "Why BrightStarDB?"
                 2nd section - "An Associative Model"
                 3rd paragraph
"... Few existing NoSQL databases offer a data model that understands, and automatically manages relationships between data entities. Most NoSQL databases require the application developer to take care of updating ‘join’ documents, or adding redundant data into ‘document’ representations, or storing extra data in a key value store. This makes many NoSQL databases not particularly good at dealing with many real word data models, such as social networks, or any graph like data structure. ..."
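A toy example of what that paragraph is claiming (my own sketch, not from the docs): in a triple/graph store, a relationship is just one more triple, so traversing a social-network-style structure needs no 'join' documents and no redundant embedded copies to keep in sync.

```javascript
// Toy triple store: relationships are plain triples, no join documents.
const store = [];
const add = (s, p, o) => store.push([s, p, o]);

add("alice", "friendOf", "bob");
add("bob", "friendOf", "carol");
add("alice", "friendOf", "dave");

// Who are Alice's friends-of-friends? Just follow the edges twice.
function friendsOfFriends(person) {
  const friends = store
    .filter(([s, p]) => s === person && p === "friendOf")
    .map(([, , o]) => o);
  const result = new Set();
  for (const f of friends) {
    store
      .filter(([s, p]) => s === f && p === "friendOf")
      .forEach(([, , o]) => result.add(o));
  }
  return [...result];
}
```

Adding or removing a friendship is a single triple operation; nothing else in the store has to be updated.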

Note: Although I am a novice with ZERO knowledge of RDF, after a few self-study cram sessions I am able to say that 'NoSQL' is a misleading term. While 'NoSQL' is the keyword I used in my early searches for a data storage management system that allows me to track objects in an OS hierarchy, I translate the meaning of the term as 'RID YOURSELF OF THE RDBMS MONSTER'. An RDF-based system still requires a sort of query (SQL-like) language, or query code, to query/filter the data based on a criterion. NoSQL should really be NoSchema... but I can see why 'NoSQL' relates more to the novice ear at the outset, and therefore serves as a selling catch-phrase.
I have an eerie feeling this 'Note' is going to come back and bite me. (lol)
Coordinator
Dec 13, 2013 at 7:16 AM
You are correct - really BrightstarDB is a schema-free database solution. In fact, NoSQL is also misleading, because the origins of the NoSQL movement are in the trade-off between ACID properties and performance/scalability; BrightstarDB remains stubbornly ACID at its core. Of course, in the early days (by which I mean a year and a half ago!) we felt that no one would understand or care about the difference, and that NoSQL was a good label to apply. It's probably time to revisit that when I get a bit of time to think about revamping the website.

The query language is another thing entirely. SPARQL is more like a pattern-matching language in which you describe the shape of the solution you want, with variables where you want the query engine to pick out the values from your data. It is designed to work without any knowledge of a schema in the relational-database sense of the word, in a world where you have to consider the possibility that anyone can add any property to any thing (i.e. there is total flexibility in the data).
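Since SPARQL may look alien at first, here is a toy JavaScript sketch (mine, not BrightstarDB code) of the pattern-matching idea described above: the 'query' is a triple in which '?'-prefixed strings are variables, and matching returns the variable bindings that make the pattern fit the data - essentially what a SPARQL basic graph pattern does.

```javascript
// A pattern is a triple in which "?x"-style strings are variables.
// match() returns every set of variable bindings that fits the data -
// the essence of a SPARQL basic graph pattern.
const data = [
  ["ann", "livesIn", "paris"],
  ["ben", "livesIn", "rome"],
  ["ann", "age", 34],
];

function match(pattern, triples) {
  const isVar = (t) => typeof t === "string" && t.startsWith("?");
  const results = [];
  for (const triple of triples) {
    const bindings = {};
    let ok = true;
    pattern.forEach((term, i) => {
      if (isVar(term)) bindings[term] = triple[i];
      else if (term !== triple[i]) ok = false;
    });
    if (ok) results.push(bindings);
  }
  return results;
}

// "Who lives where?" - roughly: SELECT ?who ?city WHERE { ?who livesIn ?city }
const rows = match(["?who", "livesIn", "?city"], data);
```

Note there is no schema anywhere: the third triple uses a property ("age") that the others lack, and nothing breaks.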
Dec 13, 2013 at 7:38 AM
Edited Dec 13, 2013 at 7:39 AM
The 5 important things to me that I am looking for in this DB are:

Raw I/O Performance

I have yet to learn the most efficient ways to query the API to retrieve related objects. I'm not sure if using the Entity Framework API is the most efficient way to directly load an object given its ID and marshal it into an entity model, or whether I will need to do this via the low-level interface.

API Learning Curve

I am happy with the documentation that's here so far. I'd like to see more of it, in greater depth! But I understand that is very time-consuming and not easy for one person to do.

Activity of Project

I'd like to see this database survive long-term. I'm sure we'll see some more applications using it in production at some point.

Security (which can be dealt with later in my case)

I know there's no security layer right now, but there will definitely have to be one eventually, at least in my case. Preferably robust enough to stop everything but the most top-notch malware from penetrating it (wishful thinking?)

Horizontal Scalability / Distributable Computing

I haven't decided whether I need to use bDB sharding (if implemented) or whether my software will work better with individual embedded instances of bDB on multiple machines.
Dec 30, 2013 at 2:02 AM
OUCH! I am apprehensive about requesting so much...

I know, much of what was discussed earlier might partially, if not fully, answer the gaps I mention here. But obviously I am missing something. I hope this helps generate clarifications that others can benefit from, not just me.

IIS is powerful. It offers so many features that can be leveraged, such as AUTHENTICATION for example, so we take the route of pushing everything through IIS if we can.

Is not the IIS context a RESTful HTTP scenario by definition? Thus the statement in the documentation, "...under a Windows Service implementation that exposes a RESTful HTTP service endpoint...", is not an exclusive meaning but a substitute: a mimicking of a web scenario without the need to get involved with the IIS monster?

So, in the context of RDF and bDB as a 'GRAPH' model, with IIS hosting (in my case), and with a willingness to plunge in via the low-level APIs...
I am starting to get a grasp of this RDF approach, and I see the low-level APIs are the gate to bDB's way of addressing the data content in the same manner an object is enumerated in OOP, out of the box, so to speak. But I don't have a clear up-front vision of how to go about structuring the data model to accomplish that. Could someone point me to a good sample that will place the perennial visualization in mind, even if explained at the level of the Entity Framework API, with some cross-reference clues to the RAW API level?

...pardon my ignorance; I know the bDB engine may automatically create that structure, since it is innate to the model and the schema is living (dynamic and automatic), but my concern is to understand enough of it before I embark on a possibly painful data conversion. This will be my last major data conversion; I won't be on Earth for the next, but the project will live on.

MORE detail: As I ponder this new approach to data storage, as against the SQL relational model where all my data is buried (including pictures - what a pain), I am visualizing the RDF model (mechanism) as 2 interlaced parts.
[1.] A technical major block of relations (pointers, URIs where applicable) stored as 'TRIPLE' grammar that essentially points to
[2.] assemblies that are stored separately, perhaps also embedded in the RDF block, but still 'readily accessible' by tools outside the RDF block (a key point, mentioned below). Examples of assemblies: a document would be a small example of an assembly; a picture another; the list of all the components (and their respective parts) associated with the engine/running train of a vehicle, yet another...

...portion [1.] is driven by the bDB engine and the structure it expects on the storage device.
...portion [2.] rests on the already existing worldwide storage mechanism that allows humans (and machines) to peruse data with simple tools such as Windows Explorer (or a browser linked to a Facebook-like engine, ...) without the need to know anything about interrogating an RDBMS (SQL) or an RDF-based block - granted, with less power of manipulation, but with ready access for quick jobs and temporary usage, if not just plain usage. (Key point: a feature needed in our case, to remove as much as possible of the HUGE RDBMS wall which requires a programmer between the data and the average normal human being.)

At this point of my understanding I have gaps that need to be cleared up. For example: I have read that the TRIPLE model expects a URI in all 3 of its components (subject, predicate, object), with the possibility that the object allows sub-components/collections... I know, I might be lost here, help! ...and with my poor visualization of this RDF GRAPH model, I suspect many tables in the SQL world become folders in the RDF world...??! But not necessarily a 1-to-1 conversion, and possibly in some cases hardly 1-to-1, given that the RDBMS being converted from, even though functional, may not have been laid out according to best practice or standard rules.

[...dumb question, as I have no idea how this is accomplished...] Are indices and other supporting items stored as bDB objects under each piece of the file system hierarchy? One concern is that while inspecting and working with a file hierarchy, system personnel, even though trained and conscious to walk carefully through the gallery, could inadvertently demolish a section - heaven forbid a section that drives a large portion of the data store.

Thus, including answers to the above questions and addressing both :
  • [assemblies] that require their respective tool to arrive at the presentation stage(ex: Adobe Reader, will read a pdf MIME),
  • and [assemblies] that require data 'drilling' (ex: a distilled list from block[1.] which points to other assemblies),
...can someone 'dissert' a short overview of how the data is stored, pointing out which parts of the model in each scenario are or are not READILY accessible via the simple tools mentioned, and give some general idea as to what type of pieces will tend to reside in which block [1.][2.]? (Perhaps adding some clarity as to why bDB and its particular implementation of the GRAPH model is more suited than other schema-less DBs - this last point due to observing, in my quest, other experts recommending bDB.)

Scenario: database is down...
  • Just to show the idea, even if a bad example, a simple scenario we may face... we have no way to look at the picture of a stocked item, or to get a needed list, until the system is up again. Solution: we simply store the pictures on the file system instead of inside the RDBMS block, which allows for wild-card searches in Windows Explorer.
  • A more complicated issue: we need a list, or an idea of a list, of items... is the RDF format so open that it allows one to traverse the file system hierarchy manually (visually), or even via the SHELL command line, to generate the list using the regular-expressions engine, for instance? (even if painful)
...and incidentally, do the bDB APIs expose the SHELL? ...with a return value, not just via 'env' variables?! If at all.

Thank you.
Dec 30, 2013 at 10:09 AM
Some points:

In a triple, the object part (subject predicate object) can be a literal value, not a URI (23, "A name", a date, etc.).

Like any database, bDB stores data (triples + indexes) in files using an internal format that you can't read without knowing the format (with Notepad, for example). You should have nothing to do with those files.
The RDF format is open, but how the DB stores it internally is a different matter (you can't read SQL Server data with Notepad either).
You could periodically export the DB content to a readable RDF file in any format you like (e.g. Turtle, N3). RDF file formats are general standards, so such a file can be imported into any other triple store.
I'm not sure if it can store BLOBs, but if you have documents you can store them in the file system, separate from the DB.
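On the export point: N-Triples is about the simplest of those standard serializations - one 'subject predicate object .' line per triple. A rough sketch of its shape (simplified: real N-Triples mandates absolute URIs, typed literals, and escaping rules this toy version ignores):

```javascript
// Simplified N-Triples-style serializer: URIs go in <...>, literals are
// quoted. Real N-Triples has stricter rules; this only shows the shape
// of an exported file.
function toNTriples(triples) {
  const term = (t, isLiteral) =>
    isLiteral ? JSON.stringify(String(t)) : `<${t}>`;
  return triples
    .map(([s, p, o]) => {
      // crude heuristic: anything without a scheme is treated as a literal
      const oIsLiteral = !/^[a-z]+:\/\//.test(String(o));
      return `${term(s)} ${term(p)} ${term(o, oIsLiteral)} .`;
    })
    .join("\n");
}

const doc = toNTriples([
  ["http://ex.org/ann", "http://ex.org/name", "Ann"],
  ["http://ex.org/ann", "http://ex.org/knows", "http://ex.org/bob"],
]);
// doc:
// <http://ex.org/ann> <http://ex.org/name> "Ann" .
// <http://ex.org/ann> <http://ex.org/knows> <http://ex.org/bob> .
```

Because the format is a plain line-per-triple text file, it is exactly the kind of export you could inspect, grep, or re-import into another triple store.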

About relational tables, in short: each table cell transforms into a triple (rowid, columnname, cellvalue), so one row with 12 columns means 12 triples. The DB schema is also transformed into triples. So all the data - restrictions, tables, anything - goes into triples.
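The cell-to-triple transformation described above can be sketched in a couple of lines (again just an illustration of the idea, not how bDB literally stores things): a row with N columns yields N triples keyed by the row's ID.

```javascript
// One relational row -> triples: every cell becomes (rowid, columnname, cellvalue).
function rowToTriples(rowId, row) {
  return Object.entries(row).map(([column, value]) => [rowId, column, value]);
}

const rowTriples = rowToTriples("customer/42", {
  name: "Ann",
  city: "Paris",
  zip: "75001",
});
// 3 columns -> 3 triples, e.g. ["customer/42", "name", "Ann"]
```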

You need to get an understanding of basic RDF... so try a tutorial or book; it's way too much to explain here.
Dec 31, 2013 at 6:15 PM
Thank you so much... I am doing exactly what you suggest about RDF, going through my early stages digesting the write-ups and tutorials I can put my hands on. But your overview is so much better - gold to me. The few words you laid out on the subject have opened up my visualization of this model majorly. You have no idea how this answer of yours fills gaps in my understanding. Your short, well-exemplified description of the parallelism with the schema model went straight to the heart of my doubts in the translation between the two 'languages', so to speak. Reading about RDF will now be orders of magnitude easier thanks to your kind help. Again, thank you so much.