Wednesday, June 13, 2012

Myths of Measurement: Do Measures Reflect Reality?


In the last blog post we discussed the mental models that inform our understanding of talent. Today’s post will examine how measures make mental models explicit and useful, in talent management as in other fields.

I’d also like to discuss how easy it is to misunderstand talent measures as concrete entities. Just as there was a danger in reifying our mental models of talent, it’s easy to forget that measurement results are just a numerical representation of a model. The model is not “real,” and the measures, for all their predictive or descriptive strength, are just a representation of the model.

Mental and Mathematical Models

When measuring talent, we develop mathematical models to represent our mental models.  Often we start with a conceptual model, which is a sketchy idea. An operational model, on the other hand, is precisely specified in mathematical language. Operational models often have good predictive or descriptive strength. 
This is similar to an architect’s process. An architect starts a project by drawing a conceptual sketch, and refines the sketch into a scale plan. Sometimes it turns out that the original ideas don’t work. Sometimes the scale plan makes the concepts more workable. Scaling the concept mathematically makes it more predictive, more descriptive, and more useful.  

Refining Measures

Operational measures and scales are strong tools, and often work well to summarize personality, results, potential, or competency. The numerical values of the scales can be compared and linked to other values such as compensation. They can also be tested.

Just as an architect may find that her concept won’t work in practice, we may find that a talent measure does not work as we conceptualized it. For example, if we compare measures of performance and personality to investigate our mental model that extroverts are better at sales, we may find that personality does not relate to performance as we expected.

Statistics can help us refine and strengthen our talent measures. If we find that an employee engagement survey is only weakly related to customer satisfaction, we can add survey questions to strengthen the relationship. Adding questions about the organizational climate, such as “my co-workers really care about the customer’s experience,” is likely to increase the correlation. Examining statistical correlations can help us develop a measure that’s quite important to the business.  
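To make the idea of checking a correlation concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the team-level scores are invented, the names (`engagement_original`, `climate_item`, `customer_satisfaction`) are hypothetical, and averaging the new item into the old score is only one simple way a survey might be rescored. It shows the mechanics of the comparison, not real results.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical team-level scores, invented purely for illustration.
engagement_original = [3.1, 3.8, 3.4, 4.0, 3.6, 3.2]    # original survey items
climate_item = [2.0, 3.0, 4.5, 4.0, 2.5, 3.5]           # added climate question
customer_satisfaction = [3.0, 3.5, 4.2, 4.1, 3.2, 3.8]  # outcome we care about

# Assumed rescoring: average the new item with the original score.
engagement_revised = [
    (e + c) / 2 for e, c in zip(engagement_original, climate_item)
]

r_original = pearson_r(engagement_original, customer_satisfaction)
r_revised = pearson_r(engagement_revised, customer_satisfaction)

print(f"original survey vs. satisfaction: r = {r_original:.2f}")
print(f"revised survey vs. satisfaction:  r = {r_revised:.2f}")
```

With these made-up numbers the revised score tracks customer satisfaction more closely than the original one. With real data the same comparison would show whether an added question actually strengthens the relationship.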

Personality assessments are among the most refined talent measures. Many personality instruments have been revised over the years—the state of the art, in some cases, is astounding. The Hogan Personality Inventory (HPI), which defines personality as social reputation, has now undergone 30 years of refinements. It was developed by correlating respondents’ answers to survey questions with friends’ and co-workers’ descriptions of the respondents (social reputation). Today, the 206 questions of the survey—questions such as “I would like to be a race-car driver”—allow surprisingly accurate assessment and precise differentiation between different aspects of personality. 

Many assessment participants feel that the HPI can read their minds, but the “wow” factor is simply produced by probabilistic relationships between survey questions and reputation. In a sense, it’s the magic of statistics—“any sufficiently advanced technology is indistinguishable from magic” (Arthur C. Clarke). However, participants’ feelings that the HPI personality instrument can see their true selves can easily lead to reification.

Of course, not all personality instruments are as well refined as the HPI, and it’s important to remember that even the HPI is probabilistic. These instruments are accurate nearly all the time, but not always. Because they are “right” so often, their imperfections are easy to overlook. Overlooking the imperfections, however, has dangers.

How Reification Happens

There is something about putting numbers on a model that makes the model seem real and unquestionable. But this presents a problem. When we can’t ask questions about our models, we can’t learn.  

For some reason, it’s easy to accept mathematical talent measurement results as the truth, and not look beyond the numbers. I have some theories about why this reification happens.
  • Some people aren’t as comfortable with numbers as they are with words. If it’s a lot of work for an individual to understand a chart or a report full of numbers, it’s likely that the person will only review the measures superficially. It’s also less likely that the person will ask questions.
  • The basis of talent measures isn’t always made clear. When providing HPI feedback, we don’t explain conceptually or computationally how the scales were developed or scored. In fact, the calculation methods are a secret known only to the Hogans. In one sense, it’s not important to know these details. But in another sense, not understanding how a measure works, or having no access to the mechanism behind the measures, could lead to reification.
  • When the talent measures are rigidly used for decision making, for example compensation or selection, they are in a sense real. Certainly they control real outcomes.
 

Reification and the History of Intelligence Testing

The danger of measure reification is obvious in the long and often sad history of intelligence testing. In 1905, Alfred Binet proposed a method to measure intelligence in children. A careful scientist, he noted the method’s limitations:

“This scale properly speaking does not permit the measure of … intelligence, because intellectual qualities … cannot be measured as linear surfaces are measured.”

Binet intended to develop a tool to classify children needing attention. He tried not to reify the underlying capability.

Since then, intelligence has been reified and recast as a real and invariable human attribute—an attribute that describes a limit of human potential. The application of intelligence testing has limited access to immigration, schools, and jobs.  

When we reify a measure, we extend the measure beyond its original design. In this case, research indicates that intelligence does change. In addition, capabilities such as emotional intelligence are more important for some jobs. Making decisions based solely on employee intelligence is a mistake.  Intelligence quotient is not a real thing. It is a measure developed for a specific and narrow task: identifying children who need attention to succeed academically. Use in industry, and for immigration, came much later.

While many would argue with me, I assert that intelligence must be combined with other measures to be useful in business.

Reification and the Danger of Self-Fulfilling Prophecies

Reifying measures can lead to self-fulfilling prophecies. For example, designating an employee as “high potential” one year often means they will continue to be seen as high potential in future years, regardless of changes in performance. This is similar to calling a student “gifted.”
When a manager gives a low performance rating to an employee, there can be similar long-term consequences. People often conform to expectations. This is called the Pygmalion effect, which is well studied in schools. The Pygmalion effect also happens in organizations.

Reification and the Danger of Limited Thinking

Unquestioning acceptance of any representative model is a problem because it limits our ability to think broadly about a situation. We tend to think that a talent measure describes talent completely. If we do this, we fall into the trap of mistaking the map for the territory.

Early sea charts were representations of mariners’ mental models. They were crude but adequate for coastal navigation at the time. Today they seem wildly imaginative and mostly decorative. But partly as a result of the maps’ reification of these mental models, sailors stayed close to shore to avoid the monsters, whirlpools, and other dangers that became very real to them—including the danger of sailing over the edge of the world.  

Sometimes, we stay close to what is familiar. If we’re familiar with the idea of intelligence, we refer to someone as smart. If we’re familiar with descriptions of personality, we may refer to a person as an introvert.  But there is much more to a person than our mental models, and our measures, would suggest.

Recognizing the Limits of Measures Is the Key to Using Them Well

Ultimately, talent measures are just representations of mental models. The underlying talent is always much more complicated. Any representation, or model, is necessarily a simplification.
I am concerned that we take measures as better, and more, than they actually are. If we don’t consider the limits of the tools, the limits of the tools become our limits.

I don’t think we should look for more perfect measures of talent. I am certain they do not exist. For one thing, the available technology reflects our current understanding of talent. 

So, throwing out our current talent measures is probably not helpful. Instead, we can do better by increasing our understanding of the current measures. This is an evolutionary process, and probably one that must be done in collaboration with others. How else can we examine our assumptions, and question both our measures and the underlying mental models on which they’re based? (I’ll be talking extensively about building shared meaning of measures in future blog posts.)

If we’re to use our measures intelligently, we won’t expect them to be more perfect than they are—even if they’re mathematically correct 95% of the time. We’ll remember that measures are never true representations of reality: A measure can never contain the whole truth, the total complexity of a person, or an entire situation. And we won’t allow ourselves to be daunted by the “truth” of numerical measures, which leads us to accept them superficially. Instead, we can use measures as a starting point for thoughtful exploration and deeper communication. 

It’s important to remember that all measures represent someone’s theory. The theory may not be appropriate in the current context, and may not be measured well.
