
Do you have a corporate taxonomy? Probably not.

How many times have you been asked about your "Corporate Taxonomy"?
I'll bet you dollars to doughnuts you don't have one.

Do you know what a taxonomy is? Can you explain how it differs from an ontology? How about the difference between a taxonomy and a namespace?

Most can't accurately answer these questions, so don't feel bad. But if you want to have meaningful discussions about your metadata namespace management approaches (and if you don't, you shouldn't be in Knowledge Management!) then you should have a clear understanding of the terms.

I was recently asked to build a glossary and FAQ about Knowledge Management, to serve as a standard to help guide us through our incredibly complex merger of two large, complicated and distinct organizations – as we build a centralized Knowledge Management practice across the enterprise.

I thought this might help others in the field, so I’m putting it here (with slight variations – for protecting IP and all that shit)…

P.S. A few of the terms (such as 'semi-managed namespace' and 'co-managed namespace') are strictly internal terms that I made up... and that's an important point!
It's MORE important to standardize on language if there is no standard industry term!

Anyway... I hope it's of value to at least one person...

Knowledge Management Definitions & FAQ

Knowledge Management Intro

Knowledge Management is the systematic management of an organization's knowledge assets for the purpose of creating value and meeting tactical & strategic requirements; it consists of the initiatives, processes, strategies, and systems that sustain and enhance the storage, assessment, sharing, refinement, and creation of knowledge.
Knowledge management (KM) therefore implies a strong tie to organizational goals and strategy, and it involves the management of knowledge that is useful for some purpose and which creates value for the organization.

Information Management and Governance Terms & Distinctions
1.   Content Management System (CMS)
Software used to create, manage and publish content (unstructured documents)
2.   Digital Asset Management (DAM)
Similar to a CMS, but optimized for digital content (multimedia/rich media)
3.   Enterprise Content Management (ECM)
Centralized platform of tools and strategies, used to capture, manage and deliver content. The ECM is the platform that enables and connects the CMS, DAM and other sources of content – along with content enrichment strategies, such as taxonomical/ontological structures.
4.   Knowledge Management (KM)
The active, consumer-centric management of strategies, processes and systems to ensure ECM is providing strategic value to the organization. KM extends the scope of content management to include and leverage other sources of data and information. KM sits between the ECM and the people who generate, curate, consume and benefit from an organization’s information capital. Where ECM focuses on the systems housing Explicit Knowledge, KM extends that scope out toward Tacit Knowledge.
5.   Explicit Knowledge
Knowledge that can be (has been) captured as sharable content. From full user guides, to specific knowledge articles, to targeted snippets – all knowledge that can be captured and consumed by human beings in one or another form of content.
6.   Tacit Knowledge
Knowledge that cannot be captured as sharable content.
There have been decades of debate on how to define and characterize “Tacit Knowledge” (or “Tacit Knowing”) since the terms were introduced in 1958. What may qualify as Tacit Knowledge ranges from applied wisdom to the mental equivalent of muscle memory, and there is a broad spectrum of knowledge between Tacit and Explicit.
While, strictly speaking, Tacit Knowledge can’t be captured in written form, a primary focus of Knowledge Management as a practice is to push what we can capture ever further toward Tacit Knowledge.
7.   Information Governance (IG)
Information Governance is the application of policies and controls over information storage, retention, retrieval, security and all other aspects of information liability. Information Governance ensures regulatory compliance, legal compliance and corporate policy compliance, through application of IT policies. Knowledge Management, as a discipline, operates within the constraints and controls of the wider-ranging application of Information Governance.

Knowledge, Information and Structures Terms & Distinctions
1.   Metadata
Additional information about content, which can help improve search, retrieval, classification, categorization and management of the content. Metadata can be temporal (when it was created, modified, last accessed, etc.), contextual (purpose, application, usage, filename, etc.), content-based (descriptive of the subject and scope) or any other useful descriptor. Metadata can also be programmatically extracted through linguistic analysis techniques, such as text analytics or Natural Language Processing, to pull out key concepts, entities and other critical information that conveys the gist of a document.
2.   Tagging
Method of manually attaching additional metadata to content. Users can add a term to a document’s metadata, often describing the content, purpose or audience. Tagging may be open to all users or restricted audiences – such as content creators or internal users, only. Tagging may be restricted to a set of allowed tags (such as what is defined in a namespace/taxonomy) or it may be unrestricted, allowing users to create any tags they find appropriate – or some hybrid of the two. One common approach, as an example, is to limit tags that would impact search relevancy for the entire user community, but allow users to create whatever private tags they see fit.
3.   Metadata Namespace
The full set of metadata used within a given domain (document repository, system, community space, environment, etc.) is referred to as that domain’s metadata namespace.

Effective management of metadata namespaces across the enterprise allows us to build connections between the many namespaces, resulting in vast potential for improvements in customer experience management, operational efficiency, cross-functional collaboration & communication, and many other areas.

Levels of control over a metadata namespace range from Folksonomy (or allowing the user community complete freedom to create/manage their own metadata tags) to Managed Taxonomy (strictly-controlled, structured formal taxonomy) and there are numerous options between (and endless hybrid approaches).

          Folksonomy: Open meta-tagging defined by the user community
          Managed Namespace: Defined by those who own/manage the environment.
          Semi-Managed Namespace: Hybrid between a managed namespace and folksonomy. Allows for folksonomy (internally audited) as well as managing for SEO & connectivity
          Co-Managed Namespace: Broadly defined by the platform vendor, but allowing some flexibility for the environment manager

Metadata namespaces are (generally speaking) either unstructured, strictly structured in a taxonomy, loosely structured in an ontology, or some hybrid approach across these options.
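
As a rough illustration of the range of control described above, the following Python sketch treats a semi-managed namespace as a managed vocabulary plus user-created tags that are held for internal audit. The `SemiManagedNamespace` class, its methods and the tag names are all hypothetical and are not taken from any particular platform.

```python
class SemiManagedNamespace:
    """Managed vocabulary plus audited, user-created (folksonomy) tags."""

    def __init__(self, managed_tags):
        self.managed = {t.lower() for t in managed_tags}
        self.pending = set()  # user-created tags awaiting internal audit

    def tag(self, document, term):
        """Attach a tag; unknown terms are allowed but queued for audit."""
        term = term.strip().lower()
        document.setdefault("tags", set()).add(term)
        if term not in self.managed:
            self.pending.add(term)
        return term in self.managed

    def approve(self, term):
        """Promote an audited folksonomy tag into the managed vocabulary."""
        term = term.strip().lower()
        self.pending.discard(term)
        self.managed.add(term)


namespace = SemiManagedNamespace(["billing", "installation"])
doc = {"title": "Resetting a router"}
namespace.tag(doc, "Installation")   # already in the managed vocabulary
namespace.tag(doc, "wifi-dropouts")  # folksonomy tag, queued for audit
namespace.approve("wifi-dropouts")   # audited and promoted
```

Under the same assumptions, a pure folksonomy would skip the audit queue entirely, while a strictly managed namespace would reject unknown terms outright.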

4.   Taxonomy
A formal language classification model of a given domain (such as an enterprise, or user community within an enterprise) in a managed tree structure.

It is important to note that (like the taxonomy of the animal kingdom, for example) each branch in a taxonomy tree shares all the characteristics of the preceding branches, and no entity can exist on more than one branch in a single tree. (See Taxonomy vs. Ontology distinction in FAQ.)

Taxonomies are often used to structure website navigation, but the two are distinct.
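
The single-branch rule can be sketched in a few lines of Python. This is a minimal illustration, assuming a hypothetical `Taxonomy` class in which every term has exactly one parent; the animal terms are only examples.

```python
class Taxonomy:
    """A managed tree: each term sits on exactly one branch."""

    def __init__(self, root):
        self.root = root
        self.parent = {root: None}  # term -> its single parent

    def add(self, term, parent):
        if parent not in self.parent:
            raise KeyError(f"unknown parent: {parent}")
        if term in self.parent:
            # No entity may exist on more than one branch.
            raise ValueError(f"{term} already exists on a branch")
        self.parent[term] = parent

    def path(self, term):
        """Return the branch from the root down to the given term."""
        branch = []
        while term is not None:
            branch.append(term)
            term = self.parent[term]
        return list(reversed(branch))


animals = Taxonomy("Animal")
animals.add("Mammal", "Animal")
animals.add("Canine", "Mammal")
animals.add("Domestic Dog", "Canine")
animals.add("Chihuahua", "Domestic Dog")
print(" > ".join(animals.path("Chihuahua")))
```

Trying to add "Chihuahua" a second time under a different parent raises an error, which is the constraint that separates a taxonomy from looser structures.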

5.   Ontology
A language categorization model (often informal and dynamic) of a given domain (such as an enterprise, or user community within an enterprise) expressed as relationships between entities.

Whereas “Chihuahua” may appear in a taxonomy under “Animal > Mammal > Canine > Domestic Dog”, in an ontology “Chihuahua” would be an entity (expressed as a node) and may have relationships (or edges) to many other nodes, such as: Type of > Dog; May have color > Brown; Relative size > Small… (See Taxonomy vs. Ontology distinction in FAQ.)

A comprehensive ontology can be used effectively as an approach to manage and integrate multiple namespaces/taxonomies, by overlaying ontological relationships.
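
A minimal sketch of the node-and-edge idea, assuming a hypothetical `Ontology` class that stores each relationship as a (subject, relationship, object) triple. The Chihuahua relationships come from the example above; "Great Dane" is added purely for illustration.

```python
from collections import defaultdict


class Ontology:
    """Entities (nodes) connected by named relationships (edges)."""

    def __init__(self):
        # (subject, relationship) -> set of related objects
        self.edges = defaultdict(set)

    def relate(self, subject, relationship, obj):
        self.edges[(subject, relationship)].add(obj)

    def related(self, subject, relationship):
        """Objects that the subject points to via the relationship."""
        return self.edges.get((subject, relationship), set())

    def subjects_with(self, relationship, obj):
        """Subjects that point to the object via the relationship."""
        return {s for (s, r), objs in self.edges.items()
                if r == relationship and obj in objs}


onto = Ontology()
onto.relate("Chihuahua", "Type of", "Dog")
onto.relate("Chihuahua", "May have color", "Brown")
onto.relate("Chihuahua", "Relative size", "Small")
onto.relate("Great Dane", "Type of", "Dog")      # illustrative addition
onto.relate("Great Dane", "Relative size", "Large")

print(onto.subjects_with("Type of", "Dog"))
```

Unlike a taxonomy, nothing here stops an entity from taking part in any number of relationships, and a new relationship type can be added at any time.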

Knowledge Management Measures
1.   Time To Resolve (TTR)
The amount of time it takes for a customer Service Request to be resolved – the duration between impact and restoration of service. (Also known in the industry as MTRS – Mean Time to Restore Service – when measuring collectively.)
2.   Deflection
When a customer is experiencing a service outage – or facing a potential service outage, due to some event – Knowledge Management’s goal is to deliver information that enables them to resolve the incident without costly escalation to Customer Support. Each successful avoidance is one deflection.
Deflection is notoriously challenging to measure accurately. We can measure certain specific deflections with very high confidence, while others require a lot of assumptions. Knowledge Management continues to strive toward better measurement of deflection (as we do with all measures and metrics). Our current approach is to focus on measuring other directly observable, customer-impacting factors as accurately as possible, while applying the industry-standard assumption that 10% of customer views of a knowledge resource result in a deflection (that is, one deflection per ten views).
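
The 10% assumption reduces to simple arithmetic. The sketch below is only an illustration: the function name, the article IDs and the view counts are made up, and the one-in-ten ratio is the assumption stated above, not a measured value.

```python
def estimated_deflections(customer_views, views_per_deflection=10):
    """Estimate deflections, assuming one deflection per N customer views.

    Integer division is used, so partial deflections are dropped.
    """
    return customer_views // views_per_deflection


# Hypothetical monthly customer views per knowledge article.
monthly_views = {"KB-1001": 1200, "KB-1002": 340, "KB-1003": 75}

per_article = {article: estimated_deflections(views)
               for article, views in monthly_views.items()}
total = sum(per_article.values())

print(per_article)
print(total)
```

Keeping the ratio as a parameter makes it easy to rerun the estimate if a better-supported figure becomes available.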

Search and Delivery Terms & Distinctions
1.   Search Engine Optimization (SEO)
SEO is concerned with facilitating optimized delivery of controlled information assets through public internet search engines – i.e. how we perform on Google searches. SEO employs a wide variety of tools and techniques to improve content placement and brand management. One of the techniques is strategic use of Branded and Unbranded keywords – whether within the content or applied through metadata. Someone may search for “hybrid cloud solutions” or “Pivotal Cloud Foundry” – we want to perform well, either way.
2.   Indexing
When performing a search (whether on the internet, or internally) what your query runs against is an index of the content sources. An index often consists of document metadata and a pointer to the document source.
Some key differentiators between indexing platforms are how well they integrate with existing security, whether and how well they extract information through linguistic analysis, and which CMS integrations they already offer.
3.   Unified vs. Federated Searches
Though these terms are often used interchangeably (and, to the user, there’s little-to-no distinction between equally matched platforms) there are important differences between the approaches.

A Federated Search strategy uses API calls and other integrations to query existing search indices within the various repositories, then effectively stitches all the result sets together for presentation.

A Unified Search platform, on the other hand, maintains its own universal index of content, through direct ingestion and processing of the content.
There are pros and cons to each approach, and the choice is far less cut and dried than most vendors would have customers believe.
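
The difference between the two approaches can be sketched in Python. This is a toy illustration under stated assumptions: the `Repository` and `UnifiedIndex` classes, the URIs and the substring matching are all hypothetical stand-ins for real search APIs and indexing pipelines.

```python
class Repository:
    """A content source with its own built-in search (hypothetical)."""

    def __init__(self, name, documents):
        self.name = name
        self.documents = documents  # {uri: text}

    def search(self, term):
        return [uri for uri, text in self.documents.items()
                if term.lower() in text.lower()]


def federated_search(repositories, term):
    """Query each repository's own search at request time, then stitch."""
    results = []
    for repo in repositories:
        results.extend(repo.search(term))
    return results


class UnifiedIndex:
    """Ingest content up front into one index; queries hit only that index."""

    def __init__(self):
        self.index = {}  # pointer (uri) -> processed text

    def ingest(self, repositories):
        for repo in repositories:
            for uri, text in repo.documents.items():
                self.index[uri] = text.lower()

    def search(self, term):
        return [uri for uri, text in self.index.items()
                if term.lower() in text]


kb = Repository("kb", {"kb://reset-router": "How to reset a router after an outage"})
forum = Repository("forum", {"forum://thread-42": "Outage in region east",
                             "forum://thread-43": "Billing question"})

unified = UnifiedIndex()
unified.ingest([kb, forum])

print(federated_search([kb, forum], "outage"))
print(unified.search("outage"))
```

Both calls return the same pointers here, which is why users rarely see the difference. The trade-offs show up elsewhere: the federated version depends on every repository answering at query time, while the unified version depends on its index being re-ingested when content changes.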


Are text analytics, text mining and Natural Language Processing the same thing?
While it’s, admittedly, an oversimplification, it’s convenient to talk about two general approaches to extracting data from unstructured text…

Text Analytics/Text Mining breaks the textual input into digestible chunks of string variables and uses statistical modeling techniques to find patterns in those variables. Text Analytics cares nothing about the content itself. It’s not concerned with discerning meaning or context – just statistical patterns.

The ideal of Natural Language Processing is to develop a translation engine between human language and machine language. NLP uses some of the same statistical modeling approaches as Text Analytics, but goes much further by applying semantic and syntactic analysis to extract meaning, intention, sentiment and key concepts (among other things) included in the text.

Text Analytics is faster, cheaper and easier to implement. NLP provides more accurate, contextually consistent results.
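
To make the distinction concrete, here is a small Python sketch of the purely statistical view of text, in keeping with the oversimplification admitted above. The `term_counts` function and the example sentences are made up for illustration; real text analytics tools use far richer statistical models than word counts.

```python
import re
from collections import Counter


def term_counts(text):
    """Break text into word tokens and count them, ignoring order and meaning."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


first = term_counts("The patch caused the outage")
second = term_counts("The outage caused the patch")

# Same words, same counts: a purely statistical view cannot tell these apart.
print(first == second)
```

The two sentences say opposite things about cause and effect, yet their word counts are identical. Telling them apart requires the kind of syntactic and semantic analysis attributed to NLP above.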

So… What about Machine Learning, Artificial Intelligence and Data Science?

What’s the difference between Content Management and Knowledge Management?
Content Management is all about managing the content. It’s putting systems in place to allow people to publish, edit and serve content to others. It may include some functionality to relate different content systems together. It may have taxonomical support. It could (though it’s relatively rare) offer ontological support.

Knowledge Management is the layer of business processes, information processing, linguistic analysis, business logic and many other aspects in place to ensure the content consumers (internal or external) and the business units optimize the value of our business services. Knowledge Management is also responsible for optimizing the experience for content creators & consumers (internal or external).

What are the purposes of, and lines of demarcation between, Forums, Online Communities and Social Media?
Forum: An online space for people to ask questions or share information in threaded discussion format.
Online Community: Socially-enabled (tags, likes, shares, comments…) online space to share information and interact with others. In general terms, an online community can contain a forum, but does not have to.
Social Media: A public online community outside the domain of our organizational control. Examples include Facebook, Twitter, Reddit and Tumblr.

How does Taxonomy differ from Ontology in function and application?
Taxonomy: Classification of entities into distinct, managed tree structures. Every branch on the tree has common properties (or attributes) with the preceding branches. Each leaf (or node) can fit into only one classification. Consider the Plant Taxonomy below…

All Flowering Plants and all Conifers necessarily bear seeds, and necessarily do not bear spores. If a plant is discovered that bears both seeds AND spores, it can’t be placed in both branches, so a new classification tree must be created at the top branch.

Taxonomies are useful for strictly-controlled namespaces and exclusive classifications. They require detailed planning – or will require a great deal of rework. They must be manually maintained and updated as new terms are added.

Ontologies tend to lack the level of control Taxonomies have, but they excel in more fluid applications and with dynamic updating (through machine learning, for example).

Ontology: Entities (nodes) are reflected through their relationships (edges) to other entities. Ontologies are usually more dynamic and fluid than Taxonomies. Nodes generally have their own attributes and, depending on the platform and application, edges may also have attributes.

The ‘Employee’ type node (like ‘Bob Jones’) may have attributes such as ‘Hourly Rate’ and ‘Location’, whereas the ‘Has a contract with’ edge may have attributes such as ‘Contract Expiration Date’ and ‘Primary Contact’.

To start tracking what skills employees may have, this Ontology may include a ‘Skill’ type node, and perhaps a few new edges for ‘Expert In’, ‘Experienced In’, and ‘Novice In’…
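
The node and edge attributes described above can be sketched as a small in-memory property graph in Python. This is an illustration only: the `PropertyGraph` class, the 'Client' node "Acme Corp", the 'Python' skill and all attribute values are hypothetical, and a real deployment would more likely use a graph database than a Python dictionary.

```python
class PropertyGraph:
    """Nodes and edges that each carry their own attributes."""

    def __init__(self):
        self.nodes = {}  # name -> {"type": ..., plus attributes}
        self.edges = []  # list of {"source", "label", "target", plus attributes}

    def add_node(self, name, node_type, **attributes):
        self.nodes[name] = {"type": node_type, **attributes}

    def add_edge(self, source, label, target, **attributes):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append({"source": source, "label": label,
                           "target": target, **attributes})

    def targets(self, source, label):
        """Nodes reachable from source over edges with the given label."""
        return [e["target"] for e in self.edges
                if e["source"] == source and e["label"] == label]


graph = PropertyGraph()
graph.add_node("Bob Jones", "Employee", hourly_rate=85, location="Austin")
graph.add_node("Acme Corp", "Client")  # hypothetical counterpart node
graph.add_edge("Bob Jones", "Has a contract with", "Acme Corp",
               contract_expiration_date="2019-06-30",
               primary_contact="Jane Smith")

# Extending the model to track skills needs only a new node type and edge label.
graph.add_node("Python", "Skill")
graph.add_edge("Bob Jones", "Expert In", "Python")

print(graph.targets("Bob Jones", "Expert In"))
```

Adding the 'Skill' node type required no change to the existing nodes or edges, which is the flexibility the text attributes to ontologies.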

Relational Database Systems are not well-suited for storing, managing and implementing a robust ontological framework. If the ontological structure is expected to be vast/complex, an organization should consider housing it in a Graph Database System.

May 18, 2018, 12:18:47 pm
