First of all, everything runs in a "Dgraph", which is the term for the MDEX Engine process. Data in MDEX Engine is not accessible without a corresponding running Dgraph. Relevant data from various sources is extracted, transformed, and loaded into MDEX Engine. Compared to a traditional RDBMS or OLAP cubes, MDEX Engine structures its data in a different way.
The data model in the MDEX Engine consists of records and attributes.
- Records are the fundamental units of data.
- Attributes are the fundamental units of a record schema which describes the data model of Records.
For a data record, an assignment on an attribute (also known as a key/value pair) provides information about that record. For example, for a list of bike records, an assignment on the "Category" attribute contains the category description (e.g. mountain) of the bike record. Each attribute is identified by a unique name.
Each attribute on a data record is itself represented by a record that describes this attribute. Following the bike records example, there is a record that describes the "Category" attribute. A collection of these records that describe attributes forms a schema for your records. The aspects of the attribute on a data record are configured in the schema. For example, an attribute on any data record can be searchable or not.
Let's have a look at an example which may help you to digest these concepts.
In an MDEX Engine that stores bike information, a typical data record looks like this:
- TxnID = 12324
- ProductID = 506
- Category = Mountain Bike
- Amount = $499.99
- Suspension = Fox 32 F-Series
- FrameType = Aluminium
- Saddle = Bontrager SSR
- Mountain Accessories = Fork and shock sag meter
- Mountain Accessories = Water Bottle
- Review = A great bike for off road. Smooth ride over the bumps
- ReviewSentiment = Positive
- ReviewTerm = Great
- ReviewTerm = Off Road
- ReviewTerm = Smooth
- ReviewTerm = Bumps
A system record that describes the "Category" attribute looks like this:
- Name = Category
- Type = String
- Display Name = Category
- Searchable = Yes
- Sort = Ascendant
The collection of such system records is called the schema.
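As a rough illustration of the model described above, the sketch below represents a data record as a list of attribute/value assignments and the schema as one describing record per attribute. This is plain Python, not an Endeca API; the names `bike_record`, `schema`, `values_of` and `is_searchable` are made up for the example.

```python
# A data record as a list of (attribute, value) assignments.
# The same attribute may be assigned more than once on one record.
bike_record = [
    ("TxnID", "12324"),
    ("ProductID", "506"),
    ("Category", "Mountain Bike"),
    ("Mountain Accessories", "Fork and shock sag meter"),
    ("Mountain Accessories", "Water Bottle"),
    ("ReviewTerm", "Great"),
    ("ReviewTerm", "Off Road"),
]

# The schema: one describing record per attribute (only "Category" shown).
schema = {
    "Category": {
        "Type": "String",
        "Display Name": "Category",
        "Searchable": "Yes",
    },
}


def values_of(record, attribute):
    """Return every value assigned to `attribute` on this record."""
    return [value for name, value in record if name == attribute]


def is_searchable(schema, attribute):
    """Look up the 'Searchable' setting in the attribute's schema record."""
    return schema.get(attribute, {}).get("Searchable") == "Yes"


print(values_of(bike_record, "Mountain Accessories"))
print(is_searchable(schema, "Category"))
```

A list of pairs is used instead of a dictionary so that repeated assignments such as "Mountain Accessories" or "ReviewTerm" are kept without extra machinery.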
In MDEX Engine, data records do not have to conform to a fixed container. Null-valued key/value pairs such as "AttributeName = Null" are not allowed. For example, if a source relational database record has a NULL value in the "Suspension" column, loading it into MDEX Engine inserts a new MDEX data record but creates no "Suspension" attribute on that record. So it is not unusual for "jagged records" to exist in MDEX Engine, even though they describe the same business entity.
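The NULL handling above can be sketched in a few lines of Python. This only mimics the described behaviour and is not how the actual load process is implemented; `row_to_record` is a hypothetical helper, and the second row (ProductID 507, Road Bike) is invented sample data.

```python
def row_to_record(row):
    """Keep only the columns that have a value; NULL (None) columns are dropped."""
    return {column: value for column, value in row.items() if value is not None}


# Two source rows from the same relational table (sample data).
rows = [
    {"ProductID": 506, "Category": "Mountain Bike", "Suspension": "Fox 32 F-Series"},
    {"ProductID": 507, "Category": "Road Bike", "Suspension": None},
]

records = [row_to_record(row) for row in rows]

for record in records:
    print(sorted(record))
```

The two resulting records describe the same kind of business entity but carry different sets of attributes, which is the "jagged" shape mentioned above.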
Data Records in MDEX Engine may be loaded from structured, semi-structured or unstructured data sources.
For structured data, each tuple becomes a data record and each column (except for columns with a NULL value) becomes an attribute.
As the key differentiator, EID extends BI analysis to unstructured data such as text documents or social data. In MDEX Engine, unstructured data can be stored as its own records for "side-by-side" analysis, or it can be linked to existing data records by any available key.
Any unstructured attribute can be enriched using text analytics to expand the structure of its containing record. Common techniques include, but are not limited to, automatic tagging, named entity extraction, sentiment analysis, and term extraction.
Beyond these data records, which consolidate information from databases, XML documents, Facebook, and so on, MDEX Engine also creates hierarchy/relationship graphs and indexes for the attributes and attribute values. These graphs and indexes are essential: without them, information discovery cannot be performed effectively and efficiently on MDEX data records.
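To make the two options concrete, here is a small Python sketch, assuming "ProductID" is the shared key. It is an illustration only, not an EID feature or API; `link_reviews` and the second review are hypothetical.

```python
def link_reviews(records, reviews, key="ProductID"):
    """Attach each review to the record that shares its key.

    Reviews with no matching record are returned so they can be kept
    as records of their own for side-by-side analysis.
    """
    by_key = {record[key]: record for record in records if key in record}
    standalone = []
    for review in reviews:
        target = by_key.get(review.get(key))
        if target is None:
            standalone.append(review)
        else:
            target.setdefault("Review", []).append(review["Text"])
    return standalone


records = [{"ProductID": 506, "Category": "Mountain Bike"}]
reviews = [
    {"ProductID": 506, "Text": "A great bike for off road. Smooth ride over the bumps"},
    {"Text": "Looking for a commuter bike"},  # no key available (sample data)
]

standalone = link_reviews(records, reviews)
print(records[0]["Review"])
print(len(standalone))
```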
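As a toy example of such enrichment, the sketch below adds "ReviewTerm" values to a record by matching its review text against a fixed vocabulary. Real text analytics is far more involved; this naive substring lookup is only a stand-in, and `extract_terms`, `enrich` and the vocabulary are assumptions made for the example.

```python
def extract_terms(text, vocabulary):
    """Return the vocabulary terms that occur in the text (case-insensitive)."""
    lowered = text.lower()
    return [term.title() for term in vocabulary if term.lower() in lowered]


def enrich(record, vocabulary):
    """Return a copy of the record with ReviewTerm values derived from Review."""
    enriched = dict(record)
    terms = extract_terms(record.get("Review", ""), vocabulary)
    if terms:  # no empty attribute is created when nothing matches
        enriched["ReviewTerm"] = terms
    return enriched


vocabulary = ["great", "off road", "smooth", "bumps"]  # hypothetical term list
record = {
    "ProductID": 506,
    "Review": "A great bike for off road. Smooth ride over the bumps",
}

print(enrich(record, vocabulary)["ReviewTerm"])
```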
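The role of such indexes can be illustrated with a simple inverted index from attribute/value pairs to record ids. This is a deliberately simplified stand-in, not the actual MDEX index or graph structures; `build_index`, `navigate` and the sample records are hypothetical.

```python
from collections import defaultdict


def build_index(records):
    """Map each (attribute, value) pair to the ids of the records that carry it."""
    index = defaultdict(set)
    for record_id, record in enumerate(records):
        for attribute, value in record:
            index[(attribute, value)].add(record_id)
    return index


def navigate(index, *selections):
    """Return the ids of records that match every selected (attribute, value)."""
    matches = [index.get(selection, set()) for selection in selections]
    return set.intersection(*matches) if matches else set()


# Sample records as lists of (attribute, value) assignments.
records = [
    [("Category", "Mountain Bike"), ("FrameType", "Aluminium")],
    [("Category", "Mountain Bike"), ("FrameType", "Carbon")],
    [("Category", "Road Bike"), ("FrameType", "Aluminium")],
]

index = build_index(records)
print(navigate(index, ("Category", "Mountain Bike")))
print(navigate(index, ("Category", "Mountain Bike"), ("FrameType", "Aluminium")))
```

Each additional selection narrows the result by intersecting sets of ids, so no record has to be scanned at query time.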
In summary, the Endeca MDEX Engine stores information in data records as a series of attribute/value pairs. Data records can be structured differently from one another. With patented mechanisms for managing the navigation graph of attribute relationships and hierarchies, users can quickly navigate through different attributes, search for keywords, or, as a more conventional approach, create queries. With MDEX Engine, no data is left behind.
Until next time, stay intelligent, stay agile.