Friday, 15 August 2014

Cayley: A Graph Engine Inspired by Google’s Knowledge Graph

Barak Michener, a Software Engineer working for the Google Knowledge Team, has open sourced a personal project called Cayley, a graph database inspired by Freebase and the Google Knowledge Graph, the latter powering Google’s search engine. Freebase is a collection of free structured data, currently at ~2.7B facts and counting, and an API for querying this data.
Cayley provides a way to append and query complex semantic data stored in various back-end stores such as LevelDB, MongoDB or in-memory. According to Michener, the graph store was written in Go for performance reasons:
“Cayley is written in Go, which was a natural choice. As a backend service that depends upon speed and concurrent access, Go seemed like a good fit. Go did not disappoint; with a fantastic standard library and easy access to open source libraries from the community, the necessary building blocks were already there. Combined with Go’s effective concurrency patterns compared to C, creating a performance-competitive successor to graphd became a reality.”
Cayley exposes a RESTful API and a REPL, along with a query editor and visualizer that can be tried online. The query engine supports a JavaScript DSL modeled on Gremlin, the property-graph traversal language, and a simplified version of MQL, Freebase’s query language. Cayley can be extended with more back-end stores and query languages if needed.
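To make the idea of chained graph traversal more concrete, here is a minimal sketch in Python of an in-memory triple store with a Gremlin-style path object. It is an illustration only: the names `TripleStore`, `Path`, `v`, `out`, `in_` and `all` are hypothetical and are not Cayley's API, and a real engine would add indexing, persistence and a query planner.

```python
from collections import defaultdict


class TripleStore:
    """Toy in-memory store of (subject, predicate, object) triples."""

    def __init__(self):
        # Forward index: (subject, predicate) -> set of objects
        self._out = defaultdict(set)
        # Reverse index: (object, predicate) -> set of subjects
        self._in = defaultdict(set)

    def add(self, subject, predicate, obj):
        """Append one fact to the graph."""
        self._out[(subject, predicate)].add(obj)
        self._in[(obj, predicate)].add(subject)

    def v(self, *nodes):
        """Start a traversal from the given nodes."""
        return Path(self, set(nodes))


class Path:
    """A set of current nodes plus chainable traversal steps."""

    def __init__(self, store, nodes):
        self._store = store
        self._nodes = nodes

    def out(self, predicate):
        """Follow `predicate` edges forward from every current node."""
        reached = set()
        for node in self._nodes:
            reached |= self._store._out.get((node, predicate), set())
        return Path(self._store, reached)

    def in_(self, predicate):
        """Follow `predicate` edges backward into every current node."""
        reached = set()
        for node in self._nodes:
            reached |= self._store._in.get((node, predicate), set())
        return Path(self._store, reached)

    def all(self):
        """Return the current nodes as a sorted list."""
        return sorted(self._nodes)


store = TripleStore()
store.add("alice", "follows", "bob")
store.add("bob", "follows", "carol")
store.add("dani", "follows", "bob")

# Whom do the people alice follows follow?
print(store.v("alice").out("follows").out("follows").all())  # ['carol']
# Who follows bob?
print(store.v("bob").in_("follows").all())  # ['alice', 'dani']
```

Keeping both a forward and a reverse index is a simple way to make traversal cheap in either direction, at the cost of storing each fact twice.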
Cayley is currently not a Google project, but it is “created and maintained by a Googler, with permission from and assignment to Google, under the Apache License, version 2.0,” according to this disclaimer.

Oracle Database Gets In-Memory

Oracle Database 12c Release 1 (12.1.0.2) is now available and includes the much-anticipated “In-Memory” feature, along with several other improvements.
Some of the important features introduced:
  • In-Memory Column Store: stores objects in memory in a columnar format, with much better performance for scans, joins and aggregates
  • In-Memory Aggregation: improves the performance of star queries and reduces CPU usage
  • Advanced Index Compression
  • APPROX_COUNT_DISTINCT(): significantly faster than exact aggregation for large volumes of data, with negligible deviation from the exact result
  • Attribute Clustering: allows logically related data to be stored in close physical proximity, which can greatly reduce the amount of data to be processed and lead to better compression ratios
  • Full-Database Caching: can be used to cache the entire database in memory when the cache is larger than the whole database
  • JSON Support: support for storing, querying and indexing JSON data, with the database able to enforce that the stored JSON is well formed
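The trade-off behind an approximate distinct count such as APPROX_COUNT_DISTINCT() can be illustrated with a small sketch. The code below uses HyperLogLog, a well-known estimation technique, purely as an example; the source does not describe how Oracle computes its estimate, and the class name `HyperLogLog` and the precision parameter `p` are assumptions made for this illustration.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog estimator (not Oracle's implementation)."""

    def __init__(self, p=12):
        self.p = p                  # number of hash bits used as register index
        self.m = 1 << p             # number of registers (4096 when p=12)
        self.registers = [0] * self.m

    def add(self, value):
        # Derive a 64-bit hash from the value's string form.
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                  # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)     # remaining 64 - p bits
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def estimate(self):
        # Bias-correction constant, valid for m >= 128.
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        # For small cardinalities, fall back to linear counting.
        if raw <= 2.5 * self.m and zeros:
            return self.m * math.log(self.m / zeros)
        return raw


if __name__ == "__main__":
    hll = HyperLogLog()
    values = [f"user-{i % 50000}" for i in range(200000)]  # many duplicates
    for v in values:
        hll.add(v)
    print("exact:   ", len(set(values)))
    print("estimate:", round(hll.estimate()))
```

The exact count has to remember every distinct value it has seen, whereas the sketch keeps only a fixed array of small registers regardless of input size, which is where the speed and memory savings of this kind of approximation come from.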
Kevin Closson points out that the In-Memory feature, which is licensed separately, is enabled by default after the upgrade and could therefore be used accidentally; he recommends caution.
You can read the “new features” guide for a detailed list of improvements in this release.