In this post I use my MLDB (soon to be MLJS) JavaScript MarkLogic API to draw graphs from aspects of search results. Read on for how to get great charts simply with MarkLogic…
An obvious extension to creating a search application is in building a summary. These are most effective when that summary is visual and instantly understood. This is where charts come in useful. Thankfully, MarkLogic includes a free license of HighCharts. You can use this for any application where the data comes from a MarkLogic system – it’s all covered under your MarkLogic license.
HighCharts is probably one of the easiest graphing libraries to use, and I’ve found it to be very visually appealing by default, which makes a refreshing change from some JavaScript libraries!
In this post I’ll extend our previously configured search interface to include two summary charts between the results and the search bar. You’ll need to follow those instructions first to get the full app, or start from a blank new hybrid Roxy application. Before we configure our new pages though, we’ll need some data.
In the MLDB mldbwebtest Roxy web application I include a sample at /mldbtest/populate.html (on the file system at /mldbwebtest/src/public/js/mldbtest/page-populate.js). I’ll walk you through that first.
Roxy file structure
A quick refresher on how Roxy structures its files. Roxy is a Ruby on Rails-like web framework for XQuery, developed by the Vanguard team at MarkLogic. You create a controller – in our case mldbtest – which resides at mldbwebtest/src/app/controllers/mldbtest.xqy . Within here there are several methods; by default there is just one, called main. Our method, populate, is in here too. We pass no details to the view it uses, as we’re doing everything we need in JavaScript.
The view used is xhtml by default, so it will be mldbwebtest/src/app/views/mldbtest/populate.html.xqy . In here we include any JavaScript and CSS files we need (unless included via the global template already), and a html element to hold any log output.
Note that in here we refer to a page-populate.js file. I always create a master JavaScript page control file external to the HTML code. This is generally good practice, I believe, as it externalises all JavaScript. It is particularly useful in the XQuery world, though, where the { and } characters used extensively in JavaScript also delimit inline XQuery expressions – so it is best to keep any actual JavaScript outside the view to avoid doubling them up as {{ and }} everywhere.
Populating the database
Go ahead and create the above files. You do this by invoking ml in the root of your web app:-
./ml create mldbtest/populate html
Type ‘yes’ then [Enter] when requested.
Edit the files for the controller and the view to reflect the above code – or grab it from here: [controller] and [view]
Now create your page-populate.js file under src/public/js/mldbtest/ and edit it to include [this page-populate.js javascript code]
You’ll notice here that I have an array of JSON documents that I’m adding to MarkLogic when this page loads. I’m also logging completion in to a HTML element on the page.
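For readers without the linked file to hand, the shape of that step can be sketched roughly as below. The field names (animal, family, age) are inferred from the paths and searches used later in this post. The sample values, the URI scheme and the buildPutRequest helper are illustrative assumptions, not the contents of page-populate.js. The sketch only describes writes against the REST API's /v1/documents endpoint; it does not call MLDB's own save function.

```javascript
// Hypothetical sample documents; only the field names come from the post.
const docs = [
  { animal: "penguin", family: "bird", age: 4 },
  { animal: "penguin", family: "bird", age: 6 },
  { animal: "dog", family: "mammal", age: 9 }
];

// Describe one REST API write. The /animals/<n>.json URI scheme is an assumption.
function buildPutRequest(doc, index) {
  const uri = "/animals/" + (index + 1) + ".json";
  return {
    method: "PUT",
    url: "/v1/documents?uri=" + encodeURIComponent(uri) + "&collection=animals",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(doc)
  };
}

// One request description per document, ready to hand to any HTTP client.
const requests = docs.map(buildPutRequest);
```

Putting every document in the animals collection at write time is what lets the search later be restricted to that collection.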
Go ahead and deploy your changes using this command:-
./ml local deploy modules
Now navigate to http://yourserver:port/mldbtest/populate.html . After a few seconds you should get confirmation that the docs have all been created.
A note on JSON vs. XML results
In the MarkLogic REST API you have two options for the results of any function. You can receive results as JSON or as XML. You accomplish this by using an Accept header or the ?format=json|xml parameter. The documents themselves are not altered by what content type you accept. The complication comes when your database contains both JSON documents (saved via REST API, stored internally as XML, but rendered automatically as JSON) and XML documents.
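As a small illustration of the two mechanisms, the helpers below build either a query-string or a header-based request for the REST API's /v1/search endpoint. The function names are made up for this sketch, and MLDB normally does this for you.

```javascript
// Option 1: choose the response format through the ?format= parameter.
function searchUrl(query, format) {
  return "/v1/search?q=" + encodeURIComponent(query) + "&format=" + format;
}

// Option 2: leave the URL alone and ask for a format through the Accept header.
function acceptHeader(format) {
  return { Accept: format === "json" ? "application/json" : "application/xml" };
}
```

Either approach changes only the search response wrapper, not the stored documents.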
By default MLDB will perform all requests in JSON. This means that the search results wrapper object will be a JSON object. Any JSON document results are stored as a JSON object under results[i].content whereas any result that is an XML document is returned as a string within the same element. This means your result processing code needs to be aware of this fact.
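A minimal sketch of what that awareness might look like in your own result-processing code is shown below. It assumes only the response shape described above (a results array whose content is either an object or a string); partitionResults is a hypothetical helper, not part of MLDB.

```javascript
// Split a JSON search response into JSON documents and raw XML strings.
function partitionResults(response) {
  const jsonDocs = [];
  const xmlDocs = [];
  (response.results || []).forEach(function (result) {
    if (typeof result.content === "string") {
      xmlDocs.push(result.content);   // XML documents arrive as a string
    } else if (result.content !== null && typeof result.content === "object") {
      jsonDocs.push(result.content);  // JSON documents arrive as an object
    }
  });
  return { jsonDocs: jsonDocs, xmlDocs: xmlDocs };
}
```

Anything that only understands JSON, such as a charting step, can then work from jsonDocs alone.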
The MLDB search results object and its default content renderers do check whether a document is JSON or XML, and perform default rendering accordingly. Currently, though, the HighCharts widget only supports JSON documents. We’re going to restrict all search results to the animals collection to prevent any problems.
Chart search results page
We’re going to create a results page with just a search bar and two charts. We can plug in other widgets if we like, including search results, facets, paging and sort, but for now I’ll stick to just these two for simplicity.
Now create the page:-
./ml create mldbtest/chartsearch html
Type ‘yes’ then [Enter] when requested.
Now edit your controller code so it matches the first example, at the top of this page. Also edit the /src/app/views/mldbtest/chartsearch.html.xqy page so it looks like this example [chartsearch.html.xqy].
Note the divs using 960 Grid System classes to lay out the search bar along the top, with two charts immediately below, each taking 50% of the page width.
Create a page-chartsearch.js file under /src/public/js/mldbtest/ as usual and edit it to reflect the example page-chartsearch.js file.
Let’s assess this JavaScript file in order. Lines 3 and 4 create an MLDB connection using the default (same app context) connection settings. Line 6 names the search options we’re going to save. MarkLogic 6 requires search options be persisted before they are used, hence the naming – the search bar widget will save these automatically prior to a search. MarkLogic 7 drops this restriction, allowing you to send the options with the search itself.
Lines 8-19 create a line chart using a spline (curved line rather than angular) to join the data points together. Line 8 creates a new HighCharts widget instance attached to the splineline div. Line 9 instructs the widget to hardcode the Series name to ‘Animals’ using a prepended # character. This is useful if the data itself does not have a field to use for the series name.
Note that we also provide the category path and the value path. These are dot-delimited path statements used by the internal jsonExtractValues function to step down the object graph of the result. In this example we merely extract the ‘animal’ and ‘age’ values from the top level of the object. If these were in another contained object called details, we would specify ‘details.animal’ and ‘details.age’ here instead. This is useful when storing complex JSON documents. NB Currently arrays within the path are not supported.
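To make the path idea concrete, here is a minimal sketch of dot-delimited extraction. It is not the library's jsonExtractValues implementation; extractValue is a hypothetical stand-in that shows the same stepping behaviour, and like the widget it makes no attempt to handle arrays.

```javascript
// Follow a dot-delimited path (e.g. "details.age") down an object graph.
// Returns undefined as soon as any step along the path is missing.
function extractValue(doc, path) {
  return path.split(".").reduce(function (current, key) {
    return current !== null && typeof current === "object" ? current[key] : undefined;
  }, doc);
}
```

So extractValue(doc, "age") reads a top-level field, while extractValue(doc, "details.age") steps into a contained details object first.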
Line 10 instructs the chart widget to summarise the data using a mean average. I should mention at this point that the current implementation of the HighCharts widget calculates these aggregations by reading all the documents available in the result set. The out of the box Application Builder chart widget, however, pulls these from facet configuration only. In the future I’ll update the HighCharts widget to allow this too. For now though this provides a quick way to get a results summary – but does require ALL documents be sent to the client before calculating an aggregation.
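The client-side approach amounts to grouping the fetched documents by category and reducing each group. The sketch below illustrates that idea with a mean and a count. It assumes flat documents and a hypothetical aggregate helper, and is not the widget's actual code.

```javascript
// Group values by category, then reduce each group with the named aggregate.
function aggregate(docs, categoryKey, valueKey, fn) {
  const groups = {};
  docs.forEach(function (doc) {
    const category = doc[categoryKey];
    (groups[category] = groups[category] || []).push(doc[valueKey]);
  });
  const result = {};
  Object.keys(groups).forEach(function (category) {
    const values = groups[category];
    if (fn === "count") {
      result[category] = values.length;
    } else if (fn === "mean") {
      const sum = values.reduce(function (a, b) { return a + b; }, 0);
      result[category] = sum / values.length;
    } else {
      throw new Error("Unsupported aggregate: " + fn);
    }
  });
  return result;
}
```

Because every value has to be present in the browser, the result is only as complete as the page of search results that was fetched.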
Line 11 tells the widget to figure out category names itself. By default the graph widget assumes you pass hard coded values. Think of months of the year, for example, where you may want ‘February’ to be shown even though no data point mentions February. Setting auto categories to true means you don’t have to provide this information. This makes sense in our use case because the Animal name could be anything and is not limited to a finite set by our UI code.
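The difference between the two modes can be sketched as follows. Both helper names are hypothetical, and the sorting of automatic categories is my own choice for the example rather than documented widget behaviour.

```javascript
// Fixed categories: every listed name is plotted, with null where no data exists.
function seriesForCategories(categories, valuesByCategory) {
  return categories.map(function (name) {
    return Object.prototype.hasOwnProperty.call(valuesByCategory, name)
      ? valuesByCategory[name]
      : null;
  });
}

// Automatic categories: the names are derived from the data itself.
function autoCategories(valuesByCategory) {
  return Object.keys(valuesByCategory).sort();
}
```

With fixed month names, a month with no data still gets a slot on the axis; with automatic categories, only the animals actually present in the results appear.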
Lines 12-14 set various titles for the chart. Line 15 sets the chart type. Line 16 provides the HighCharts-specific configuration: in this example, how to generate a tooltip, with data labels disabled. See the HighCharts documentation for the full set of supported options.
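For orientation, a HighCharts options object along these lines might look like the sketch below. The option names (chart.type, title, xAxis.categories, tooltip.formatter, plotOptions, series) are standard HighCharts options, but the title text and the splineOptions helper are assumptions for illustration, not the configuration the widget builds.

```javascript
// Build a plain HighCharts options object for a spline chart.
function splineOptions(categories, data) {
  return {
    chart: { type: "spline" },
    title: { text: "Average age by animal" },       // hypothetical title text
    xAxis: { categories: categories, title: { text: "Animal" } },
    yAxis: { title: { text: "Age" } },
    tooltip: {
      // HighCharts calls this with the hovered point's values bound to `this`.
      formatter: function () { return this.x + ": " + this.y; }
    },
    plotOptions: { spline: { dataLabels: { enabled: false } } },
    series: [{ name: "Animals", data: data }]
  };
}
```

An object like this is what ultimately gets passed to HighCharts together with a target element, for example via Highcharts.chart(containerId, options) in recent releases.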
Lines 21-30 do much the same, but this time count the number of each type of animal rather than taking a mean of their ages per type. You’ll see that lines 28 and 29-30 are the only ones significantly different, with data labels enabled and some styling set, and of course the type set to column.
Lines 32-37 set up the search options object using the helper method provided in MLDB. Note the page length of 100 rather than 10. This is because right now our widget doesn’t use the values() methods to calculate aggregations, but rather document content itself. To ensure I get all the documents, I set page length to 100. A bit of a hack, and one I hope to replace in the future.
Lines 39-40 create the search options object JSON from our helper class.
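At the REST level, the options and the MarkLogic 6 "save first, then search by name" pattern look roughly like the sketch below. The options name is made up, and the helper class in MLDB may well emit more than just a page length, so treat this as an assumption-laden outline rather than the JSON the example file produces.

```javascript
// Hypothetical name under which the options are persisted.
const optionsName = "chart-search-options";

// Search options in the REST API's JSON form, raising the page length to 100.
const searchOptions = { options: { "page-length": 100 } };

// Step 1: persist the options under a name.
const saveRequest = {
  method: "PUT",
  url: "/v1/config/query/" + optionsName,
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify(searchOptions)
};

// Step 2: search, referring to the saved options by name and
// restricting results to the animals collection.
const searchRequest = {
  method: "GET",
  url: "/v1/search?q=" + encodeURIComponent("family:bird") +
       "&options=" + optionsName + "&collection=animals&format=json"
};
```

The page length matters here only because the charts are computed from whichever documents come back in that single page.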
We then create our search bar widget on lines 42-44. Both our widgets are fed by this, as you can see in lines 46-49.
Finally, lines 51-54 save the search options and execute the search (thus also drawing our charts).
Now deploy this using:-
./ml local deploy modules
And go to http://yourserver:port/mldbtest/chartsearch.html
You will see after a couple of seconds that the search options are saved, the search is executed, and both charts appear. Try typing ‘family:bird’ in the search bar and clicking search to alter the chart results.
In summary
MLDB provides a simple way to generate charts for search results in JSON documents. It does this without requiring the use of range indexes, but is restricted to building charts on only the result documents fetched from MarkLogic in their entirety.
For the June release of MLDB – to be renamed MLJS – I will add support for extracting values from XML documents and for using search options to aggregate values (so you don’t have to pull back all documents). I’ll also allow charts to dynamically render within a search page widget, much like the Application Builder allows today.
Hopefully this example has shown how easy it is to visualise data held within documents stored in MarkLogic 6. If you have any other requirements or requests, please add a comment below.