A Serious Insights How To: Creating A Maturity Model Assessment
For several years I worked with GlobalEnglish, a data-driven English learning company (now part of Learnship) that assessed people’s English competencies and offered learning experiences to shore up their deficiencies and take them to new levels of mastery.
As an industry analyst, I have developed assessments for business areas like knowledge management and collaboration to help organizations understand where they sit against a maturity model—and assessments for vendors to ascertain how their tools stack up against customer needs. Based on the results of a survey, I offer guidance on how to move people and organizations from one state to another within the specific areas of the model.
This assessment work taught me that the fundamental starting point for any assessment is a model. A model defines the attributes of the assessment; it acts as the framework for discussing maturity. I also learned that the data isn’t just about the assessment participants but about the model and the assessment itself. Those who create assessments often learn more from preparing and maintaining the assessment than they do from analyzing its results.
Creating a maturity model assessment: build a model
Assessments cannot exist without a model of what good looks like. A model must consist of components, the logical subgroups of the model, and a maturity or competency scale for each component.
Individuals will likely be evaluated against competency and organizations against maturity.
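To make that structure concrete, here is a minimal sketch in Python of one way a model might be represented: a set of named components, each carrying an ordered scale of maturity levels. The component names and levels below are hypothetical, included only for illustration.

```python
from dataclasses import dataclass

@dataclass
class Component:
    """One logical subgroup of the model, with an ordered maturity scale."""
    name: str
    levels: list[str]  # ordered from least to most mature

# A hypothetical innovation maturity model with three components
innovation_model = [
    Component("Governance", ["Ad hoc", "Defined", "Embedded", "Optimized"]),
    Component("Incentives", ["None", "Informal", "Goal-based", "Role-weighted"]),
    Component("Portfolio management", ["Reactive", "Tracked", "Balanced", "Strategic"]),
]

for component in innovation_model:
    print(f"{component.name}: {' -> '.join(component.levels)}")
```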
Although standard models exist for many things, a standard model is better than a non-standard one only with respect to the investments already made in testing the model and developing model adherents. Standard models, however, may require licenses; in return, their proprietors may offer pre-built assessments, saving the time and cost of building one.
As an analyst, I am often in the position of not being able to leverage existing models. If a client reaches out to me to build an assessment, they have likely already decided not to leverage an existing model.
Building new models, while it can take time, also offers more control over the analysis. A proprietary model may offer insights that cannot be garnered from an existing model. New models emerge because their creators bring different perspectives and because they will likely include updated technologies and concepts.
For example, as I learned quickly during my time at GlobalEnglish, there are many English language learning companies, each with its own assessment tools. Many individuals may assess their proficiency only against their own goals, such as being prepared for a trip to another country. An organization may acquire an English language training solution to help employees become more proficient in a business-related area, such as customer support in a global bank. That organization buys into the solution’s learning model, and as long as employees demonstrate progress against the model and customers report continued or improved satisfaction with their interactions, it is happy.
Some organizations, and individual job hunters, however, may seek assessment against a standard English proficiency model. IELTS [International English Language Testing System], TOEFL [Test of English as a Foreign Language], and PTE [Pearson Test of English] represent three models for assessing English competency. These assessments are driven by different models and, therefore, different scoring systems.
Pearson, a long-time leader in English language training and one-time owner of GlobalEnglish, created its PTE standard in 2009. TOEFL dates back to the early 1960s. There were obviously perceived deficiencies in existing models, opportunities for differentiation in a large market, and synergy between testing and learning that could lead to client lock-in around the models.
Those examples offer insights for those looking to build a business around assessment. While creating a proprietary model can differentiate, challenge, and disrupt, it is easier to assess against an existing model. It also means more people can do it. Creating a proprietary model offers a barrier to entry, but it also requires more marketing in order to build confidence in the model’s ability to reflect customer needs, perhaps even some form of regulatory or industry adoption. Learning experiences created against an existing model would differentiate on unique ways to meet the model’s stated goals, such as speed to proficiency, or the breadth of knowledge acquisition, such as offering supplemental learning for industry concepts and vocabulary in areas like technology or medicine.
As with any standard, product differentiation against maturity models arrives at the edges, where the model either fails to keep up with current needs, missed important areas to begin with, or proves too general or too specific in practice.
Regardless of the topic, a model should include a set of levels for each component against which the assessment determines maturity. These levels drive the development of assessment questions, which range from a single question to a set of questions used in aggregate to assess maturity. Good models remain hypotheses, open to improvement through testing and perhaps even abandonment when the model no longer offers relevancy. A model is a product and needs to be managed like one.
Develop a hypothesis for maturity on the model components
Model components and their maturity reflect a perspective. They hold a truth only through their alignment with a consensus reality. That is why competing assessments exist. If only a single view of English language competency existed, the world would need only one model.
The same is true of approaches to moving people from one competency level to the next. If only one model existed, all paths would be the same.
Different English language assessments and learning approaches exist because, in a concept as large as the English language, multiple paths prove relevant enough to coexist and, therefore, compete.
Any consultancy seeking to establish its credibility on a topic that includes an assessment must deliver a point of view on what maturity means. They must then offer a systematic approach to moving organizations from one level of maturity to the next. Without the model and associated guidance, an assessment does little more than shout an observation to someone lost in a canyon. It offers no rope to bring them to safety.
The maturity levels will always be a hypothesis. With complex concepts, there can be no definitive answer to either the categories or the assessments against them, but over time feedback can calibrate them more precisely. They get tested, and they either prove resilient or they fail. If they fail, the model needs to be revisited, bringing into question the associated assessment questions, which may require revision or elimination, or the creation of new questions to align with the updated model.
Be cautious of models that have been around too long without revision. While they may appear stable, they are more likely obsolete. Other factors may impinge on a model, and if its creators do not remain vigilant in the face of change, they will find their work no longer relevant.
There is nothing wrong with a hypothesis. A good working hypothesis acts as a set of parameters. Most of the time, the organization or individual being assessed will fall within those parameters. The model will find them a place. Associated guidance will instruct stakeholders on how to move from one place to another, how to become more conversational—or more innovative—or more strategic.
Each component should have its own maturity rating, as an overall rating does little to provide insight or offer a framework for guidance. An organization assessing its supply-chain needs to know, for instance, that poor partner management is impacting its overall performance. That way, it can concentrate on correcting a specific problem.
Physicians diagnose human ailments in the same way. We don’t have an overall health score. Assessments of the respiratory, lymphatic, and digestive systems point to issues with specific systems. Without a current understanding of what good looks like for those systems, a diagnosis cannot lead to treatment and improved function.
Overall scores do, however, offer meaning. In individual competencies like English, being a communicator involves not only written and spoken competency but also cultural sensitivity and awareness of idiom and dialect. An aggregate score, however, should be just that: a reflection of individual components that remain discrete. Assessments that offer only a high-level diagnosis are more likely marketing tools aimed at eliciting an emotional response to a perceived deficiency than they are diagnostic tools.
In business, interconnected systems drive performance. The supply-chain partner management issue may be part of a bigger partner management issue that also impacts innovation. Organizations need to assess themselves across a broad range of activities in order to discover their real opportunities for improvement.
Develop assessment questions
Questions built without a model generate information with no clear way to assess its meaning. A set of questions that requires extensive interpretation usually represents an undocumented model, one held closely by a consultant but not articulated. This does little to foster confidence, as the lack of transparency makes it very difficult for the assessor to convince a client that the assessment can result in meaningful action.
The process of developing assessment questions is always iterative. The goal should be to ask the fewest questions needed to determine the state of whatever is being assessed. But rather than start with a few questions, I always suggest capturing a large number of questions and honing them down over time.
The first inclination is to create questions with yes/no or a range of responses. Analyzing an early question set will likely lead to questions that consolidate responses. A typical example is a list of do-you-do-this-or-not questions that can be coded in an assessment as “select all of the following that you currently do.” Several questions become one question.
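To illustrate that consolidation, here is a minimal sketch in Python of how a single multi-select question can be scored in place of several yes/no questions. The practice names and weights are assumptions made up for illustration, not part of any particular model.

```python
# Hypothetical multi-select question replacing several yes/no questions:
# "Select all of the following that you currently do."
PRACTICES = {
    "Maintain an innovation idea backlog": 1,
    "Run regular experimentation sprints": 1,
    "Fund ideas through a formal review board": 2,
    "Track innovation outcomes against business metrics": 2,
}

def score_multi_select(selected: list[str]) -> int:
    """Sum the weights of the practices a participant selects."""
    return sum(PRACTICES.get(item, 0) for item in selected)

response = ["Maintain an innovation idea backlog",
            "Track innovation outcomes against business metrics"]
print(score_multi_select(response))  # 3 out of a possible 6
```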
Attempt to ask questions that demonstrate the desired state rather than ask about it directly. An innovation assessment question that asks, “Do you personally have innovation goals that you must perform to?” proves much more powerful than asking, “Has your organization embedded innovation goals into the performance system?” The latter reflects an intention, not a success. A scaled version of the personal question might read:
Are you evaluated against innovation goals during performance evaluations?
- I am not evaluated against innovation goals
- I have innovation goals, but I am primarily evaluated against my core work
- I have innovation goals
- I am evaluated against my innovation goals proportionate to the innovation expectations of my role
The question about personal innovation goals, however, also includes a weakness in that it does not inquire about the adequacy or quality of the goals themselves; it simply ascertains whether an individual perceives that innovation performance affects them directly. For most high-level assessments, that question may prove sufficient to determine the maturity of an organization. Deeper assessments may be required, however, to determine the efficacy of components.
Each model component should have at least one question to determine its maturity. Some may have more than one. When multiple questions are needed, the assessment should be designed to aggregate and reconcile across the questions so that the data reinforces or augments, rather than conflicts with, the rest of the dataset.
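As a minimal sketch of what that reconciliation might look like, the Python below normalizes several question scores for one component, averages them with equal weight, and maps the result onto the component’s ordered levels. The equal weighting, the linear mapping, and the question names are assumptions for illustration, not a prescribed method.

```python
def component_maturity(scores: dict, max_scores: dict, levels: list) -> str:
    """Reconcile several question scores into one maturity level.

    Each question's score is normalized against its maximum, averaged,
    and mapped onto the component's ordered levels.
    """
    normalized = [scores[q] / max_scores[q] for q in scores]
    average = sum(normalized) / len(normalized)
    index = min(int(average * len(levels)), len(levels) - 1)
    return levels[index]

levels = ["Ad hoc", "Defined", "Embedded", "Optimized"]
scores = {"Q1": 3, "Q2": 1, "Q3": 4}        # hypothetical responses
max_scores = {"Q1": 4, "Q2": 4, "Q3": 6}    # maximum possible per question
print(component_maturity(scores, max_scores, levels))  # Embedded
```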
I often include questions that add supplemental information alongside those that lead directly to a maturity assessment. In most cases, if a participant makes any selection for those questions, the answer is considered a positive indicator of greater maturity. Selecting all of them usually reflects the highest level of maturity, but because of mixed responses or ambiguities of execution, it is often difficult during an assessment to determine the maturity of these components without additional research.
Wherever possible, ask questions that demonstrate evidence of a state, rather than ask an opinion about placement within the maturity model.
Test the model and the maturity hypothesis
Pilot assessments if you can, and I highly suggest you do. I have participated in few surveys where, after looking at some initial data, I didn’t say to myself, “Hmm, I wish I had asked about that.”
Too many people who create one-off assessments do so with the view that the assessment must be perfect at its onset. I suggest a more incremental product approach. An early version of an assessment can be considered a prototype, the next version a pilot, and the initially released version an MVP. Assessments work well with agile cycles applied to their development, execution, and ongoing improvement. These cycles should take place between deployments unless a revision will significantly change the outcome.
It is critical that assessment developers evaluate questions for their accuracy in determining the maturity level of the participant. It should be clear which questions lead to insight about maturity and which offer additional information. Assessing the questions can lead to pruning, which makes the assessment more efficient to deploy. Always strive for the fewest questions needed to reach insight. This is the stage where data, rather than instinct, will inform the pruning process.
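One way data might inform that pruning, sketched below in Python, is to flag questions whose answers barely vary across a pilot, since a question everyone answers the same way adds length without adding insight. The variance threshold and the pilot data are illustrative assumptions, not recommended cutoffs.

```python
import statistics

def flag_low_value_questions(responses: dict, variance_floor: float = 0.1) -> list:
    """Flag questions whose answers barely vary across participants."""
    return [question for question, answers in responses.items()
            if statistics.pvariance(answers) < variance_floor]

# Hypothetical pilot data: question -> scores from each participant
pilot = {
    "Q1": [3, 3, 3, 3, 3],   # no variation: candidate for pruning
    "Q2": [1, 4, 2, 3, 4],
    "Q3": [2, 2, 3, 2, 2],
}
print(flag_low_value_questions(pilot))  # ['Q1']
```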
Deploy the assessment
This is the easy part. The design targets an audience, either individuals or groups of people. An assessment should be part of a broader business initiative, and therefore, access to the target audience, at least some portion of them, should be relatively easy.
A SurveySparrow Overview
I recently adopted SurveySparrow for some experimental assessment work. (In full disclosure, links to SurveySparrow may result in a partner payment to Serious Insights.)
SurveySparrow offers a wide range of question types, which helps keep people engaged. It also helps designers craft questions that provide several insights in a single question. The analytics are clear, and the data easily exported.
For example, rather than asking if a company is using a technology, their grouping question type can be used to create buckets (such as investigating/researching, actively developing, or no activity) that participants can drag and drop choices into. A visual input that takes the place of many questions helps increase not only engagement but also completion rates. [Note: This question type, however, would benefit from additional formatting options for narrow screens to eliminate the need for horizontal scrolling on the categories.]
Digital assessment development depends on a strong tool that offers the most options for design. Each question I crafted required a thoughtful choice of how best to ask it in the tool. I often found myself rewriting questions to take advantage of the tool’s capabilities rather than to work around its constraints.
For instance, it would be easy to ask a series of questions that each indicated some level of maturity against a set of related items, such as, “Do you think your innovation is disruptive or incremental?” Rather than rating them individually, I crafted a Constant Sum question that allowed the areas of innovation to be rated against each other in one question: “Weight your perception of the types of innovation your organization produces (must add to 100%).”
I have looked at and used a number of survey tools. None of them are perfect, but I found SurveySparrow to be pushing the edges of the conversational and traditional models with a clean and clear user interface. It is the tool I am using to explore perceptions about the metaverse, innovation, and working from home.
Serious Insights will be running a full review of SurveySparrow soon.
Assessments, unlike surveys, don’t suffer from fatigue as they are either used to help the individual accomplish some goal, or they are part of a corporate initiative that requires people to participate. If your organization is trying to determine its state against an innovation or digital transformation model, it should be easy to entice people to participate.
Gather feedback on the assessment
Assessment deployment will point out flaws in even the most thoughtfully designed assessments. Deploying the assessment furthers an understanding of its accuracy and completeness, as well as its experience design.
Individual, repeatable assessments, such as those associated with language mastery, prove the easiest to evaluate because, whichever standard applies, it exists with a degree of rigor against which results can be measured. Rapid feedback leads to timely interventions aimed at nullifying the assessment’s flaws. Over time the assessment likely becomes fixed and proven, which can be its long-term downfall, but this post does not address that issue.
While individual assessments may provide some insight into the model, models often do not belong to the assessment creator, making changes in the model difficult. These types of models tend to change slowly over time. Those creating assessments usually master the model and may even create assessments that qualify others on the conceptual model, as well as its application.
For IT and business assessments where the model owner also creates the assessment, feedback can occur rapidly, with the model reflecting learning with each response. Designers typically benefit from allowing patterns to form before making a modification, though at times a flaw announces itself in the data so obviously that a small sample is enough to uncover it.
Examples of data that suggest the model or the assessment may be misaligned include:
- Open-text questions that frequently receive similar answers. A choice or a model component may be missing.
- Frequently skipped questions, or points where the assessment is abandoned. Look at the phrasing (perhaps it offends), the time it takes to complete the question, or the vagueness of the question with regard to the stated goals of the assessment.
- Overwhelming convergence on a choice or set of choices, which suggests the designers should already have known the answer and that the assessment might improve by assessing at a deeper level of detail.
When these issues occur, address them by changing the assessment questions or updating the model, whichever offers a more logical path to incorporating the feedback.
Also watch for increasing numbers of incompletes or rising rates of negative feedback, as either may mean the idea behind the model no longer applies.
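As a minimal sketch of how two of the signals above might be computed from raw response data, the Python below flags frequently skipped questions and questions where answers converge overwhelmingly on a single choice. The thresholds and sample responses are assumptions for illustration; a real deployment would tune them against its own data.

```python
from collections import Counter

def assessment_signals(responses, skip_threshold=0.3, convergence_threshold=0.9):
    """Flag frequently skipped questions and questions where answers
    converge overwhelmingly on one choice. None marks a skipped answer;
    both thresholds are illustrative assumptions."""
    questions = {q for response in responses for q in response}
    flagged = {"skipped": [], "converged": []}
    for q in sorted(questions):
        answers = [r.get(q) for r in responses]
        blanks = answers.count(None)
        if blanks / len(answers) > skip_threshold:
            flagged["skipped"].append(q)
        counts = Counter(a for a in answers if a is not None)
        if counts and counts.most_common(1)[0][1] / sum(counts.values()) > convergence_threshold:
            flagged["converged"].append(q)
    return flagged

# Hypothetical responses: question -> chosen option (None = skipped)
responses = [
    {"Q1": "Yes", "Q2": None,  "Q3": "Agree"},
    {"Q1": "Yes", "Q2": "No",  "Q3": "Disagree"},
    {"Q1": "Yes", "Q2": "Yes", "Q3": "Agree"},
]
print(assessment_signals(responses))
# {'skipped': ['Q2'], 'converged': ['Q1']}
```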
In the 1990s, expert system teams could assess their maturity in knowledge engineering, systems integration, and development environments. Today, you would not find anyone to administer such an assessment to, let alone anyone interested in its results. Assessments tie closely to their context. They have a lifecycle that includes retirement when the assessment no longer offers value.
Assessments and Sample Size
For assessments that focus on individual skills, like English competency, the sample size is one. Learning how well the assessment works, however, requires looking across all participants to determine if the assessment accurately reflects current skill levels and if the analysis of the results places the person at the right level.
For organizational assessments, the broader the reach, the better. Small sample sizes introduce bias into the results, especially if the target participants are closely aligned with the topic. Asking the innovation governance team, for instance, to assess the organization’s innovation capability will likely produce answers that reflect a need to report success where the broader organization might disagree, or answers that are overly critical in areas where an individual feels the team isn’t doing what it should and sees the assessment as a way of making that point.
If you ask, for instance, “Does the organization compensate people adequately for their innovation contribution?” the governance team may state emphatically yes, as they created a compensation policy and implemented it. A broader survey of the organization may reveal that those covered by the compensation policy don’t feel it adequately compensates them. The same holds for embedding innovation principles, which a central team will likely rate well but which the broader population may not, because people don’t experience innovation encouragement or empowerment as part of their day-to-day work.
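A simple way to surface that divergence, sketched below in Python under the assumption of 1-to-5 agreement scores, is to compare a central team’s average rating with the broader organization’s average for the same question. The data and the notion of a large gap are hypothetical.

```python
from statistics import mean

def perception_gap(team_scores: list, org_scores: list) -> float:
    """Difference between a central team's average rating and the broader
    organization's average for the same question (positive values mean
    the team rates itself higher)."""
    return mean(team_scores) - mean(org_scores)

# Hypothetical 1-5 agreement scores for:
# "The organization compensates people adequately for innovation contributions."
governance_team = [5, 5, 4, 5]
broader_org = [2, 3, 2, 4, 3, 2, 3, 2]
print(f"Perception gap: {perception_gap(governance_team, broader_org):.2f}")
# A large positive gap warrants a closer look at the component.
```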
Developing guidance
Assessments must serve a purpose. In the case of English language mastery, they provide insight on mastery that informs learner placement against a competency model. Each level of the model includes a curriculum that will migrate the learner from one level to the next. Smaller assessments along the way will test for direction. A deeper assessment at the conclusion of a learning module will confirm understanding, and movement against the model may suggest the next level of curriculum—or the completion of a particular path.
Consultants developing assessments for IT skill mastery or concept maturity will also need guidance on what to do depending on where people land on the model. Those familiar with Microsoft or Cisco certifications will recognize placement examinations aimed at helping people prove their competency in a technology domain, piece of hardware, or application, or discover their knowledge gaps and find suggested courses or readings that will enhance their knowledge.
Where assessments go wrong
Assessments should never exist in a vacuum. Assessments go wrong when they assess without informing, or they offer insights without an accompanying action. They provide a score without a clear understanding of what the person taking the assessment should do next.
The purpose of an assessment is to evaluate performance against a standard for an interested third party, or to assess a state against a model in order to inform investments toward better alignment with that model.
In the first case, an English assessment may tell an employer where a candidate stands against a job requirement. If the job requires a certain level of English competency, the assessment will inform the employer of the candidate’s competency relative to the need. It may also tell the participant what would be required to attain that level of competency should they not meet it.
In a business assessment, the results inform the sponsor of weaknesses against a model and where the organization stands against that model. Assessments without a model confuse rather than enlighten—they appear uninformed, detached—and any advice offered from analysis appears arbitrary.
Assessments also go wrong when used as a weapon rather than a tool for improvement. Assessments should not be used to punish but to inform, as a way to understand in order to move forward, not as a tool to deride.
Consultants staking their future on assessments as a business tool should not over-invest in the model or the maturity levels unless they plan to build a business, either proprietary or as a consortium, large enough to conduct the ongoing research needed to keep the model and the maturity levels relevant. Assessments need a model, and models need maturity levels. But you don’t need Six Sigma proof of your model’s accuracy. Think of it like a civil lawsuit: one that requires a preponderance of evidence rather than proof beyond a reasonable doubt.
Assessments can be powerful tools for focusing attention on the most significant issues facing an individual or an organization. They can go very wrong when they become institutional ways to gauge movement against internal goals. Over time, repetitive assessments can be gamed, like any measurement system that employs judgment and opinion. People can also become desensitized to assessments that ask the same things in the same way over time. Ideally, other measurements, more embedded in process and practice, supplant assessments in mature organizations.
As data science continues to progress, assessments will connect to systems in order to gain knowledge of them and data about the business’s performance. At that point, many business assessments will become transformative—and the assessment world will also transform. Assessments will no longer be standalone efforts, but ongoing, embedded feedback as conceptual models and their states become input to machine learning algorithms. We see hints of this future in personal computers.
Operating systems, and the security software installed on them, monitor for state. An immature computer might be filled with junk, unused applications, and an overflowing trash can. Security software performs an assessment, against a model, of what a good computing environment looks like. It shares its assessment, and then likely offers the choice of emptying cache and trash cans, uninstalling old applications, or deleting duplicate files—and perhaps other activities, including changing policies.
Imagine a future where assessments for innovation, digital transformation, supply chain, or the remote work environment, start with an analysis of systems and their configurations, a reading of policy and a check for its implementation, and an analysis of feedback from customers and partners in CRM systems and on social media—evidence that exists without the need to ask people for their thoughts or perceptions at all. Algorithms glean the data about maturity from the operating environment.
Those future assessments will offer much greater detail and likely much faster interventions and mitigations for certain issues, but people will remain a critical part of any assessment, especially in areas that cannot be so easily automated, like innovation, marketing, and talent-related areas. At the interfaces between people, technology, and systems, understanding the model of the business and figuring out how to assess the current state of implementation against that model will remain critical to how businesses diagnose practices, discover the causes of challenges, and start on the path to overcoming them. Machine learning may be able to identify many patterns, but some human perceptions will likely remain beyond its grasp.
Creating a maturity model assessment: assessing the assessment approach
I offer the approach and insights outlined above from years of experience in developing and deploying assessments. They include a bias toward personal experience, not academic rigor. There are dozens of books and papers available that may reinforce my approach, and probably an equal number that will offer alternatives.
While benchmarking often proves the goal of model and assessment builders, keep in mind that benchmarking data remains relevant only as long as the model does. As concepts come and go and one technology replaces another, old data becomes just that, old data—no longer relevant to inform the model or provide insight about practice. Assessments that promise benchmarks likely only do so for a limited period of time.
I find it more valuable to adopt a worthy model that matches your aspirations and use the assessment to improve practice rather than see how you stack up against the firm down the road—they may share certain abstract parameters about an area, but they do not share your strategic goals, your operating model, or your history.
It is well and good to know how you stack up against some anonymous group of other people, but that momentary connection to external data doesn’t mean nearly as much as an analysis that links your strategic goals, be they personal or organizational, to your current capacity and capability. Individuals already know at least some other people are better at whatever topic they seek to master, and businesses know that too. The important thing for a business is becoming good at what it needs to be good at, not what someone else has decided it needs to be good at.
Differentiation plays an important role in reading back the results of a business assessment. Those conducting the assessment need to report not just against the ideal of the model, but against the relevancy of the model to the client they are working with. Only then can an assessment be used to move a specific business toward its own very specific goals.
Did you enjoy A Serious Insights How To: Creating A Maturity Model Assessment? Have a question? Leave us a comment.
For more serious insights on strategy, click here.