Searching for Better Alternatives to General Mental Ability Tests: Is There Such a Thing?

Source: https://www.sciencedirect.com/science/article/pii/S0160289624000862

This research explored whether it is mathematically possible to develop an alternative test that measures general cognitive ability without producing subgroup differences (e.g., racial differences). The authors conclude that such a replacement cannot exist, for several reasons:

  1. g is still the best predictor of job performance or academic success.

  2. Every g-loaded test shows subgroup differences, and those differences are attributable to g itself, not to s (specific abilities). In other words, score differences across race/national origin (RNO) groups trace back to g, while s is essentially unrelated to RNO.

  3. g-tests are about equally valid across groups in educational, employment, and other settings. Once g is accounted for, specific abilities add little incremental validity, and any attempt to reduce subgroup differences also tends to lower the test’s predictive validity.

Overall, the findings confirm that g-tests add incremental validity over substantial-validity non-cognitive tests (SVNCTs). In other words, measuring g improves the accuracy of predictions even when non-cognitive tests already have strong predictive power. So, both types of tests should be combined rather than g-tests being replaced altogether.
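To make "incremental validity" concrete, here is a minimal sketch with synthetic data (the numbers are assumptions for illustration, not values from the paper): adding a g score to a regression that already includes a strong non-cognitive predictor raises the in-sample R², and that gain is the incremental validity.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical synthetic data: a criterion (e.g. job performance)
# driven by both g and a non-cognitive trait, plus noise.
g = rng.standard_normal(n)
noncog = rng.standard_normal(n)
performance = 0.5 * g + 0.4 * noncog + rng.standard_normal(n)

def r_squared(X, y):
    """In-sample R^2 of an OLS fit (intercept included)."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

r2_noncog = r_squared(noncog[:, None], performance)
r2_both = r_squared(np.column_stack([noncog, g]), performance)

# Incremental validity of g over the non-cognitive predictor:
print(f"R^2 (non-cognitive only): {r2_noncog:.3f}")
print(f"R^2 (non-cognitive + g):  {r2_both:.3f}")
print(f"Delta R^2 (incremental):  {r2_both - r2_noncog:.3f}")
```

The exact Delta R² depends entirely on the made-up coefficients above; the point is only that the two-predictor model cannot do worse than the one-predictor model, and the gap is what "incremental validity" quantifies.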

Imagine recruiters or admissions committees removing intelligence tests from the selection process: would interviews, personality tests, or work portfolios suffice on their own? These assessments coexist for a reason: each has individual strengths and weaknesses that together capture a person’s potential.

So, do you think intelligence tests should still be used for job hiring or school admissions? Or are there better ways to determine a person’s capacity without causing adverse impact?

For recruiters and admissions committees, the message here is to combine predictors. If structured interviews and work samples have lower adverse impact but GMA still adds predictive power, the solution isn’t removal, but optimal weighting. We need more research on how to best calibrate the impact of each measure to achieve high validity and legal defensibility.
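The "optimal weighting" idea can be sketched numerically. The correlations below are assumptions picked for illustration (not estimates from the paper): a composite score blends GMA and a structured interview, and sweeping the GMA weight shows validity changing smoothly, so the trade-off against adverse impact is a dial, not a switch.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Hypothetical predictors (assumed correlations, for illustration only):
# GMA and a structured interview both predict the criterion, GMA a bit more.
gma = rng.standard_normal(n)
interview = 0.3 * gma + np.sqrt(1 - 0.3**2) * rng.standard_normal(n)
criterion = 0.5 * gma + 0.3 * interview + rng.standard_normal(n)

def validity(w):
    """Correlation between a weighted composite and the criterion."""
    composite = w * gma + (1 - w) * interview
    return np.corrcoef(composite, criterion)[0, 1]

# Sweep the GMA weight from 0 (interview only) to 1 (GMA only).
for w in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"GMA weight {w:.2f} -> validity {validity(w):.3f}")
```

Under these assumed numbers, validity falls gradually as GMA weight drops rather than collapsing, which is the calibration question the comment raises: how much validity is an organization willing to trade for lower adverse impact?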

@charles-clayburg763 I agree, but perhaps the emphasis should shift more heavily to maximizing the weight of measures with minimal adverse impact first, before worrying about GMA’s incremental power. If structured interviews and work samples can deliver a predictive validity of approx .40 (based on recent estimates), the incremental gain from GMA, while real, might not justify the resulting adverse impact. We should focus on refining the validity of the clean predictors (like situational judgment tests or biodata) as much as possible.

I’m having trouble with the idea that specific skills have no added impact once you factor in g. Doesn’t success in many jobs, like coding or mechanics, rely heavily on very specific talents that aren’t purely g? Maybe the way they measured those specific abilities in the study wasn’t detailed enough. What if a highly specialized test could beat g for a highly specialized job?

The structural equation models in Figure 1 are key: they show subgroup differences load on g (.514 path from g to RNO), not on specific abilities or measurement error. This means you can’t engineer away group differences without removing g itself, which defeats the purpose since g is what predicts outcomes. Multi-method assessment (cognitive + structured interviews + work samples) remains the evidence-based approach for selection.

@homesicksterling The adverse impact problem is real, but removing g-tests entirely backfires. Unstructured interviews and subjective evaluations have WORSE subgroup bias and lower validity. The solution isn’t abandoning cognitive testing but using it appropriately: as one component in a holistic assessment, with proper validation, and potentially with banding or threshold approaches rather than strict rank ordering. We can’t pretend the g differences don’t exist in the construct itself.

I’ve read in some studies that adverse impact isn’t evidence that a test is biased in the psychometric sense, but it is a legitimate concern from legal and organizational perspectives. The research does show tension between maximizing validity and reducing group differences, but organizations still need to balance multiple considerations, not just predictive power, but also legal requirements, fairness perceptions, and broader equity goals. It’s less about abandoning g-tests and more about thoughtfully navigating these competing priorities.

To your point, you’re right that job-specific tests (like actual coding challenges for programmers) often do add value beyond g. These capture domain knowledge and learned skills, not just cognitive ability. That’s exactly why the original post concludes that multiple assessment types should be combined, not that we should rely on g alone.